Bio Namespace |
Class | Description | |
---|---|---|
AATree<T> |
Arne Andersson Self Balancing Binary Search Tree.
| |
AATree<TKey, TValue> |
Dictionary-like implementation using AATree.
| |
Alphabets |
The currently supported and built-in alphabets for sequence items.
| |
AmbiguousDnaAlphabet |
Ambiguous symbols in DNA.
| |
AmbiguousProteinAlphabet |
Ambiguous characters in the Protein.
| |
AmbiguousRnaAlphabet |
Ambiguous symbols in the RNA.
| |
BigArray<T> | ||
BigList<T> |
Represents a strongly typed list of objects.
Uses BigArray to store objects.
| |
CloneLibrary |
Reads library information from a resource file.
Uses the singleton design pattern so that only one instance of the class is created.
| |
DerivedSequence |
This is a temporary implementation of DerivedSequence to support reversing and complementing a sequence.
| |
DifferenceNode |
Node that tracks difference between the two sequences.
| |
DnaAlphabet |
The basic alphabet that describes symbols used in DNA sequences.
This alphabet allows not only for the four base nucleotide symbols,
but also for various ambiguities, termination, and gap symbols.
The character representations come from the NCBI4na standard and are used in many sequence file formats. The NCBI4na standard is the same as the IUPACna standard with only the addition of the gap character. The entries in this dictionary are (Symbol - Name): A - Adenine; C - Cytosine; M - A or C; G - Guanine; R - G or A; S - G or C; V - G or C or A; T - Thymine; W - A or T; Y - T or C; H - A or C or T; K - G or T; D - G or A or T; B - G or T or C; - - Gap; N - A or G or T or C. | |
IndexedItem<T> |
IndexedItem holds an item and its index.
The index is the zero-based position of the item.
This class is used in SparseSequence to get the known sequence items with their positions.
This class implements the IComparable interface, and all comparisons are based on the index
and not on the item.
| |
MetadataListItem<T> |
It is common for a biological sequence file to contain lists of certain types of metadata,
such as features or references, which can be stored as MetadataListItems. A
MetadataListItem contains a key (which might not be unique), a free-text field of top-level
information (such as a sequence location), and a list of sub-items, each consisting of
a key and a data field of type T. If the sub-items have unique keys, a string type can be
used for T. But if the sub-item keys are not unique, a list of strings should be used
for T.
| |
PlatformManager |
Platform manager - this holds all the platform specific services.
| |
ProteinAlphabet |
The basic alphabet that describes symbols used in sequences of amino
acids that come from codon encodings of RNA. This alphabet allows for
the twenty amino acids as well as a termination and gap symbol.
The character representations come from the NCBIstdaa standard and are used in many sequence file formats. The NCBIstdaa standard has all the same characters as NCBIeaa and IUPACaa, but adds Selenocysteine, termination, and gap symbols to the latter. The entries in this dictionary are (Symbol - Extended Symbol - Name): A - Ala - Alanine; C - Cys - Cysteine; D - Asp - Aspartic Acid; E - Glu - Glutamic Acid; F - Phe - Phenylalanine; G - Gly - Glycine; H - His - Histidine; I - Ile - Isoleucine; K - Lys - Lysine; L - Leu - Leucine; M - Met - Methionine; N - Asn - Asparagine; O - Pyl - Pyrrolysine; P - Pro - Proline; Q - Gln - Glutamine; R - Arg - Arginine; S - Ser - Serine; T - Thr - Threonine; U - Sel - Selenocysteine; V - Val - Valine; W - Trp - Tryptophan; Y - Tyr - Tyrosine; * - Ter - Termination; - - --- - Gap. | |
QualitativeSequence |
This class holds quality scores along with the sequence data.
| |
RnaAlphabet |
The basic alphabet that describes symbols used in RNA sequences.
This alphabet allows not only for the four base nucleotide symbols,
but also for various ambiguities, termination, and gap symbols.
The symbol representations come from the NCBI4na standard and are used in many sequence file formats. The NCBI4na standard is the same as the IUPACna standard with only the addition of the gap symbol. The entries in this dictionary are (Symbol - Name): A - Adenine; C - Cytosine; M - A or C; G - Guanine; R - G or A; S - G or C; V - G or C or A; U - Uracil; W - A or U; Y - U or C; H - A or C or U; K - G or U; D - G or A or U; B - G or U or C; - - Gap; N - A or G or U or C. | |
Sequence |
This is the standard implementation of the ISequence interface. It contains
the raw data that defines the contents of a sequence. Sequence exposes its items
as an enumerable of bytes, which can be accessed as follows:
Sequence mySequence = new Sequence(Alphabets.DNA, "GATTC");
foreach (byte symbol in mySequence) { ... }
The results will be based on the Alphabet associated with the
sequence. Common alphabets include those for DNA, RNA, and Amino Acids.
For users who wish to get at the underlying data directly, Sequence provides
a means to do this as well. This may be useful for those writing algorithms
against the sequence where performance is especially important. For these
advanced users access is provided to the encoding classes associated with the
sequence.
| |
SequenceEqualityComparer |
Provides an equality comparer for sequences.
| |
SequenceRange |
A SequenceRange holds the data necessary to represent a region within
a sequence defined by its start and end index without necessarily holding
any of the sequence item data. At a minimum, an ID, start index, and end
index are required. Additional metadata can be stored as well using a
generic key value pair.
| |
SequenceRangeGrouping |
A grouping of SequenceRange objects sorted by their ID values. The
purpose of these groups is to allow a set of SequenceRange objects
to be associated together by bucketing them into groups where each
bucket has a unique SequenceRange ID and all SequenceRange objects
within the bucket have that same ID.
| |
SequenceStatistics |
SequenceStatistics is used to keep track of the number of occurrences of each symbol within
a sequence.
| |
SimpleConsensusResolver |
Calculates the consensus for a list of symbols using a simple frequency fraction method.
Normal (non-gap) symbols are given a weight of 100.
The confidence of a symbol is the sum of weights for that symbol,
divided by the total number of symbols occurring at that position.
If any symbols have confidence >= threshold, the symbol corresponding
to the set of these high-confidence symbols is used.
If no symbol meets the threshold, the symbol corresponding
to the set of all the symbols at that position is used.
For ambiguous symbols, the corresponding set of base symbols is retrieved, and for the frequency calculation each base symbol is given a weight of (100 / number of base symbols). | |
SnpItem |
Represents a single nucleotide polymorphism (SNP) at a particular
position for a certain chromosome, with the two possible allele
values for that position.
| |
SparseSequence |
SparseSequence can hold a discontinuous sequence. Use this class to store sequence items
together with their known positions in a long continuous sequence. This class uses a SortedDictionary to store
the sequence items with their positions. Positions are the zero-based indexes at which the sequence items
are present in the original continuous sequence.
For example:
To store sequence items at positions 10, 101, 200, and 1501, this class can be used as shown in the code below.
// Create a SparseSequence by specifying the Alphabet.
SparseSequence mySparseSequence = new SparseSequence(Alphabets.DNA);
// By default, Count is zero. To insert a sequence item at a position greater than zero,
// Count has to be set to a value greater than the maximum position value.
// Trying to insert a sequence item at a position beyond the count causes an exception.
// You can limit the SparseSequence length by setting Count to the desired value. In this example it
// will be 1502, as the maximum index is 1501.
mySparseSequence.Count = 1502;
// To access the values in a SparseSequence, use the indexer or an enumerator as shown below.
// Accessing SparseSequence using the indexer.
byte seqItem1 = mySparseSequence[10]; // returns the sequence item stored at position 10 (A in this example).
byte seqItem2 = mySparseSequence[1501]; // returns the sequence item stored at position 1501 (G in this example).
byte seqItem3 = mySparseSequence[102]; // no sequence item is stored at this position; a byte cannot be null, so no valid symbol is returned.
// Accessing SparseSequence using an enumerator.
foreach(byte seqItem in mySparseSequence) {…}
| |
StringListValidator |
A validator for string values that has a specific list of allowed values.
| |
WordMatch |
WordMatch stores the region of similarity between two sequences.
|
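The weighting described for SimpleConsensusResolver can be illustrated with a small standalone sketch. It does not use the Bio classes and is not the library's implementation: `baseSets`, `symbolForSet`, and `ResolveColumn` are hypothetical names, the lookup table covers DNA symbols only, gaps are not handled, and the "total number of symbols" is assumed to be the number of symbols in the column. It is written as C# top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical lookup table for this sketch: IUPAC DNA symbol -> base symbols it stands for.
var baseSets = new Dictionary<char, string>
{
    ['A'] = "A", ['C'] = "C", ['G'] = "G", ['T'] = "T",
    ['M'] = "AC", ['R'] = "AG", ['W'] = "AT", ['S'] = "CG", ['Y'] = "CT", ['K'] = "GT",
    ['V'] = "ACG", ['H'] = "ACT", ['D'] = "AGT", ['B'] = "CGT", ['N'] = "ACGT"
};

// Reverse lookup: alphabetically sorted base set -> symbol that represents it.
var symbolForSet = baseSets.ToDictionary(pair => pair.Value, pair => pair.Key);

// Resolves one alignment column (no gaps) to a single consensus symbol.
char ResolveColumn(IEnumerable<char> column, double threshold)
{
    List<char> symbols = column.ToList();
    var weights = new Dictionary<char, double>();

    foreach (char symbol in symbols)
    {
        // An unambiguous symbol contributes 100; an ambiguous one splits 100 across its bases.
        string bases = baseSets[symbol];
        foreach (char baseSymbol in bases)
        {
            weights.TryGetValue(baseSymbol, out double current);
            weights[baseSymbol] = current + 100.0 / bases.Length;
        }
    }

    // Confidence = summed weight divided by the number of symbols in the column.
    List<char> confident = weights
        .Where(pair => pair.Value / symbols.Count >= threshold)
        .Select(pair => pair.Key)
        .ToList();

    // Fall back to every base seen in the column when nothing meets the threshold.
    List<char> chosen = confident.Count > 0 ? confident : weights.Keys.ToList();
    string key = new string(chosen.OrderBy(c => c).ToArray());
    return symbolForSet[key];
}

Console.WriteLine(ResolveColumn("AARG", 50)); // A (A has confidence 62.5, G has 37.5)
Console.WriteLine(ResolveColumn("AARG", 30)); // R (both A and G meet the threshold)
```

In the column A, A, R, G the ambiguous symbol R contributes 50 each to A and G, so A totals 250 and G totals 150 over four symbols.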
Structure | Description | |
---|---|---|
CloneLibraryInformation |
Stores Information of Library.
| |
DifferenceNodeCompareFeature |
Structure that maintains node structure for feature list.
|
Interface | Description | |
---|---|---|
IAlphabet |
An alphabet defines a set of symbols common to a particular representation
of a biological sequence. The symbols in these alphabets are those you would find
as the individual sequence items in an ISequence variable.
The symbols in an alphabet may represent a particular biological structure or they may represent information helpful in understanding a sequence. For instance, gap symbols, termination symbols, and symbols representing items whose definition remains ambiguous are all allowed. | |
IConsensusResolver |
Framework to compute the consensus for a list of symbols.
For example, one can construct the consensus for a set of aligned sequences in the following way: Sequence 1: A G T C G A; Sequence 2: A G G C - A; Sequence 3: A G G T G -; Consensus: A G G C G A. In this example, we might choose the character that occurs the maximum number of times as the consensus. This means that the consensus for the characters at position 1, {A, A, A}, is A, while the consensus for the characters at position 3, {T, G, G}, is G, and so on. This interface provides the framework for consensus generation. Implement this interface to provide different implementations for building a consensus. | |
IParameterValidator |
A simple interface to an object that can check a value
for conformance to any required validation rules.
| |
IQualitativeSequence |
Sequence with qualitative data
| |
ISequence |
Implementations of ISequence make up one of the core sets
of data structures in Bio. It is these sequences that store
data relevant to DNA, RNA, and Amino Acid structures. Several
algorithms for alignment, assembly, and analysis take these items
as their basic data inputs and outputs.
| |
ISequenceRange |
A SequenceRange holds the data necessary to represent a region within
a sequence defined by its start and end index without necessarily holding
any of the sequence item data. At a minimum, an ID, start index, and end
index are required. Additional metadata can be stored as well using a
generic key value pair.
|
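The alignment example given for IConsensusResolver can be reproduced with a short standalone sketch. It does not use the Bio interfaces: a `Func` delegate stands in for a pluggable resolver, and `mostFrequent` and `BuildConsensus` are hypothetical names chosen for illustration. The rows are assumed to be already aligned and of equal length, and ties go to the symbol seen first in the column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a resolver: maps the symbols in one alignment column to one symbol.
Func<IReadOnlyList<char>, char> mostFrequent = column =>
    column.GroupBy(symbol => symbol)
          .OrderByDescending(bucket => bucket.Count()) // stable sort: ties keep first-seen order
          .First()
          .Key;

// Walks the alignment column by column and applies the supplied resolver to each column.
string BuildConsensus(IReadOnlyList<string> alignedSequences, Func<IReadOnlyList<char>, char> resolve)
{
    int length = alignedSequences[0].Length; // assumes equal-length, already aligned rows
    var consensus = new char[length];

    for (int position = 0; position < length; position++)
    {
        List<char> column = alignedSequences.Select(sequence => sequence[position]).ToList();
        consensus[position] = resolve(column);
    }

    return new string(consensus);
}

var aligned = new[] { "AGTCGA", "AGGC-A", "AGGTG-" };
Console.WriteLine(BuildConsensus(aligned, mostFrequent)); // AGGCGA
```

Passing a different delegate changes how each column is resolved without touching the column walk, which is the kind of separation the interface description points to.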
Enumeration | Description | |
---|---|---|
FastQFormatType |
A FastQFormatType specifies the format of quality scores.
| |
IntersectOutputType |
This enum indicates the type of output an intersect operation should return.
| |
SubtractOutputType |
This enum indicates the type of output a subtract operation should return.
|