Bio Namespace
The base namespace for .NET Bio
Classes
  Class  Description
Public class AATree<T>
Arne Andersson self-balancing binary search tree.
Public class AATree<TKey, TValue>
Dictionary-like implementation using AATree.
Public class Alphabets
The currently supported and built-in alphabets for sequence items.
Public class AmbiguousDnaAlphabet
Ambiguous symbols in DNA.
Public class AmbiguousProteinAlphabet
Ambiguous symbols in proteins.
Public class AmbiguousRnaAlphabet
Ambiguous symbols in RNA.
Public class BigArray<T>
Public class BigList<T>
Represents a strongly typed list of objects. Uses BigArray to store objects.
Public class CloneLibrary
Reads library information from a resource file. The singleton design pattern ensures that only one instance of the class is created.
Public class DerivedSequence
This is a temporary implementation of DerivedSequence to support reversing and complementing a sequence.
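To illustrate what reversing and complementing mean for a DNA sequence, here is a minimal sketch on plain strings. It does not use DerivedSequence or any other .NET Bio type; the class name DnaTransformSketch and its methods are hypothetical, and only the four unambiguous bases plus the gap symbol are handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of .NET Bio.
public static class DnaTransformSketch
{
    // Watson-Crick pairs for the four unambiguous DNA bases; the gap maps to itself.
    private static readonly Dictionary<char, char> Complement = new Dictionary<char, char>
    {
        ['A'] = 'T', ['T'] = 'A', ['C'] = 'G', ['G'] = 'C', ['-'] = '-'
    };

    // Returns the symbols in the opposite order.
    public static string Reversed(string dna)
    {
        char[] symbols = dna.ToCharArray();
        Array.Reverse(symbols);
        return new string(symbols);
    }

    // Replaces every symbol with its complement.
    // Throws KeyNotFoundException for symbols outside the small table above.
    public static string Complemented(string dna)
    {
        return new string(dna.Select(symbol => Complement[symbol]).ToArray());
    }

    // Complements first, then reverses the result.
    public static string ReverseComplemented(string dna)
    {
        return Reversed(Complemented(dna));
    }
}
```

For the input "GATTC", Reversed gives "CTTAG", Complemented gives "CTAAG", and ReverseComplemented gives "GAATC".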
Public class DifferenceNode
Node that tracks the difference between two sequences.
Public class DnaAlphabet
The basic alphabet that describes symbols used in DNA sequences. This alphabet allows not only for the four base nucleotide symbols, but also for various ambiguities, termination, and gap symbols.

The character representations come from the NCBI4na standard and are used in many sequence file formats. The NCBI4na standard is the same as the IUPACna standard with only the addition of the gap character.

The entries in this dictionary are:

Symbol - Name
A - Adenine
C - Cytosine
M - A or C
G - Guanine
R - G or A
S - G or C
V - G or C or A
T - Thymine
W - A or T
Y - T or C
H - A or C or T
K - G or T
D - G or A or T
B - G or T or C
- - Gap
N - A or G or T or C

Public class IndexedItem<T>
IndexedItem holds an item and its index. The index is the zero-based position of the item. This class is used in SparseSequence to get the known sequence items with their positions. It implements the IComparable interface, and all comparisons are based on the index, not on the item.
Public class MetadataListItem<T>
It is common for a biological sequence file to contain lists of certain types of metadata, such as features or references, which can be stored as MetadataListItems. A MetadataListItem contains a key (which might not be unique), a free-text field of top-level information (such as a sequence location), and a list of sub-items, each consisting of a key and a data field of type T. If the sub-items have unique keys, a string type can be used for T. If the sub-item keys are not unique, a list of strings should be used for T.
Public class PlatformManager
Holds all the platform-specific services.
Public class ProteinAlphabet
The basic alphabet that describes symbols used in sequences of amino acids that come from codon encodings of RNA. This alphabet allows for the twenty amino acids as well as a termination and gap symbol.

The character representations come from the NCBIstdaa standard and are used in many sequence file formats. The NCBIstdaa standard has all the same characters as NCBIeaa and IUPACaa, but adds Selenocysteine, termination, and gap symbols to the latter.

The entries in this dictionary are:

Symbol - Extended Symbol - Name
A - Ala - Alanine
C - Cys - Cysteine
D - Asp - Aspartic Acid
E - Glu - Glutamic Acid
F - Phe - Phenylalanine
G - Gly - Glycine
H - His - Histidine
I - Ile - Isoleucine
K - Lys - Lysine
L - Leu - Leucine
M - Met - Methionine
N - Asn - Asparagine
O - Pyl - Pyrrolysine
P - Pro - Proline
Q - Gln - Glutamine
R - Arg - Arginine
S - Ser - Serine
T - Thr - Threonine
U - Sel - Selenocysteine
V - Val - Valine
W - Trp - Tryptophan
Y - Tyr - Tyrosine
* - Ter - Termination
- - --- - Gap

Public class QualitativeSequence
This class holds quality scores along with the sequence data.
Public class RnaAlphabet
The basic alphabet that describes symbols used in RNA sequences. This alphabet allows not only for the four base nucleotide symbols, but also for various ambiguities, termination, and gap symbols.

The symbol representations come from the NCBI4na standard and are used in many sequence file formats. The NCBI4na standard is the same as the IUPACna standard with only the addition of the gap symbol.

The entries in this dictionary are:

Symbol - Name
A - Adenine
C - Cytosine
M - A or C
G - Guanine
R - G or A
S - G or C
V - G or C or A
U - Uracil
W - A or U
Y - U or C
H - A or C or U
K - G or U
D - G or A or U
B - G or U or C
- - Gap
N - A or G or U or C

Public class Sequence
This is the standard implementation of the ISequence interface. It contains the raw data that defines the contents of a sequence. Sequence is an enumerable of bytes, so its items can be accessed as follows:

    Sequence mySequence = new Sequence(Alphabets.DNA, "GATTC");
    foreach (byte symbol in mySequence) { ... }

The results will be based on the alphabet associated with the sequence. Common alphabets include those for DNA, RNA, and amino acids. Users who want to get at the underlying data directly can do so as well, which may be useful when writing algorithms against the sequence where performance is especially important. For these advanced users, access is provided to the encoding classes associated with the sequence.
Public class SequenceEqualityComparer
Provides an equality comparer for sequences.
Public class SequenceRange
A SequenceRange holds the data necessary to represent a region within a sequence defined by its start and end index without necessarily holding any of the sequence item data. At a minimum, an ID, start index, and end index are required. Additional metadata can be stored as well using generic key-value pairs.
Public class SequenceRangeGrouping
A grouping of SequenceRange objects sorted by their ID values. The purpose of these groups is to allow a set of SequenceRange objects to be associated together by bucketing them into groups where each bucket has a unique SequenceRange ID and all SequenceRange objects within the bucket have that same ID.
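The bucketing idea can be sketched without the library. The types below (RangeSketch, RangeGroupingSketch) are hypothetical stand-ins, not the actual SequenceRange or SequenceRangeGrouping classes; the sketch assumes each range has a string ID and uses a SortedDictionary so that buckets come out ordered by ID.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a range with an ID, a start index, and an end index.
public sealed class RangeSketch
{
    public RangeSketch(string id, long start, long end)
    {
        Id = id;
        Start = start;
        End = end;
    }

    public string Id { get; }
    public long Start { get; }
    public long End { get; }
}

// Hypothetical helper for illustration only; not part of .NET Bio.
public static class RangeGroupingSketch
{
    // Buckets ranges by ID. Each bucket holds only ranges that share its key,
    // and the SortedDictionary keeps the buckets ordered by ID.
    public static SortedDictionary<string, List<RangeSketch>> GroupById(IEnumerable<RangeSketch> ranges)
    {
        var groups = new SortedDictionary<string, List<RangeSketch>>(StringComparer.Ordinal);
        foreach (RangeSketch range in ranges)
        {
            if (!groups.TryGetValue(range.Id, out List<RangeSketch> bucket))
            {
                bucket = new List<RangeSketch>();
                groups.Add(range.Id, bucket);
            }
            bucket.Add(range);
        }
        return groups;
    }
}
```

Grouping three ranges with the IDs "chr2", "chr1", and "chr2" yields two buckets, "chr1" with one range and "chr2" with two.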
Public class SequenceStatistics
SequenceStatistics is used to keep track of the number of occurrences of each symbol within a sequence.
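As a rough illustration of per-symbol counting, the sketch below tallies the characters of a plain string. It is not the SequenceStatistics API; the class SymbolCountSketch and its method are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of .NET Bio.
public static class SymbolCountSketch
{
    // Returns the number of occurrences of each symbol in the sequence.
    public static Dictionary<char, long> CountSymbols(string sequence)
    {
        var counts = new Dictionary<char, long>();
        foreach (char symbol in sequence)
        {
            // TryGetValue leaves current at 0 when the symbol has not been seen yet.
            counts.TryGetValue(symbol, out long current);
            counts[symbol] = current + 1;
        }
        return counts;
    }
}
```

For "GATTC" this produces four entries, with T counted twice and G, A, and C counted once each.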
Public class SimpleConsensusResolver
Calculates the consensus for a list of symbols using a simple frequency-fraction method. Normal (non-gap) symbols are given a weight of 100. The confidence of a symbol is the sum of the weights for that symbol divided by the total number of symbols occurring at that position. If any symbols have confidence >= threshold, the symbol corresponding to the set of these high-confidence symbols is used. If no symbol meets the threshold, the symbol corresponding to the set of all the symbols at that position is used.

For ambiguous symbols, the corresponding set of base symbols is retrieved, and for the frequency calculation each base symbol is given a weight of (100 / number of base symbols).
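The method described above might look roughly like the following for DNA symbols. This is a sketch under stated assumptions, not the SimpleConsensusResolver implementation: the class FrequencyConsensusSketch is hypothetical, symbols are plain chars using the IUPAC nucleotide codes, and gaps are assumed to be skipped entirely (the description does not say how gaps are weighted or counted).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of .NET Bio.
public static class FrequencyConsensusSketch
{
    // IUPAC nucleotide codes mapped to their base symbols (bases listed alphabetically).
    private static readonly Dictionary<char, string> BasesBySymbol = new Dictionary<char, string>
    {
        ['A'] = "A", ['C'] = "C", ['G'] = "G", ['T'] = "T",
        ['M'] = "AC", ['R'] = "AG", ['W'] = "AT", ['S'] = "CG", ['Y'] = "CT", ['K'] = "GT",
        ['V'] = "ACG", ['H'] = "ACT", ['D'] = "AGT", ['B'] = "CGT", ['N'] = "ACGT"
    };

    // Resolves one alignment column to a single symbol. The threshold is on a 0-100 scale.
    public static char Resolve(IEnumerable<char> column, double threshold)
    {
        // Assumption: gaps are skipped and do not count toward the total.
        List<char> symbols = column.Where(symbol => symbol != '-').ToList();
        if (symbols.Count == 0)
        {
            return '-';
        }

        // Each symbol contributes a weight of 100, split evenly across its base symbols.
        var weights = new SortedDictionary<char, double>();
        foreach (char symbol in symbols)
        {
            string bases = BasesBySymbol[symbol];
            foreach (char b in bases)
            {
                weights.TryGetValue(b, out double current);
                weights[b] = current + 100.0 / bases.Length;
            }
        }

        // Confidence = summed weight / number of symbols at this position.
        List<char> confident = weights
            .Where(pair => pair.Value / symbols.Count >= threshold)
            .Select(pair => pair.Key)
            .ToList();

        // Fall back to every observed base when none reaches the threshold.
        List<char> chosen = confident.Count > 0 ? confident : weights.Keys.ToList();

        // Look up the symbol that stands for exactly this set of bases.
        string chosenBases = new string(chosen.ToArray());
        return BasesBySymbol.First(pair => pair.Value == chosenBases).Key;
    }
}
```

With a threshold of 50, the column A, A, A, G resolves to A (confidence 75 for A, 25 for G), the column A, A, G, G resolves to R (both bases reach 50), and the column A, C, G, T falls back to N because no base reaches the threshold.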

Public class SnpItem
Represents a single nucleotide polymorphism (SNP) at a particular position for a certain chromosome, with the two possible allele values for that position.
Public class SparseSequence
SparseSequence can hold a discontinuous sequence. Use this class to store sequence items with their known positions from a long continuous sequence. This class uses a SortedDictionary to store the sequence items with their positions. Positions are the zero-based indexes at which the sequence items are present in the original continuous sequence. For example, to store sequence items at positions 10, 101, 200, and 1501, this class can be used as shown in the code below.

    // Create a SparseSequence by specifying the alphabet.
    SparseSequence mySparseSequence = new SparseSequence(Alphabets.DNA);

    // By default, Count is set to zero. To insert a sequence item at a position greater than zero,
    // Count has to be set to a value greater than the maximum position value.
    // Trying to insert a sequence item at a position greater than Count causes an exception.
    // You can limit the SparseSequence length by setting Count to the desired value.
    // In this example it is 1502, as the maximum index is 1501.
    mySparseSequence.Count = 1502;

    // To access the values in a SparseSequence, use the indexer or an enumerator as shown below.
    // Accessing a SparseSequence using the indexer.
    byte seqItem1 = mySparseSequence[10];   // this will return sequence item A.
    byte seqItem2 = mySparseSequence[1501]; // this will return sequence item G.
    byte seqItem3 = mySparseSequence[102];  // this will return null as there is no sequence item at this position.

    // Accessing a SparseSequence using an enumerator.
    foreach (byte seqItem in mySparseSequence) {…}
Public class StringListValidator
A validator for string values that have a specific list of allowed values.
Public class WordMatch
WordMatch stores the region of similarity between two sequences.
Structures
  Structure  Description
Public structure CloneLibraryInformation
Stores library information.
Public structure DifferenceNodeCompareFeature
Structure that maintains the node structure for a feature list.
Interfaces
  Interface  Description
Public interface IAlphabet
An alphabet defines a set of symbols common to a particular representation of a biological sequence. The symbols in these alphabets are those you would find as the individual sequence items in an ISequence variable.

The symbols in an alphabet may represent a particular biological structure or they may represent information helpful in understanding a sequence. For instance, gap symbols, termination symbols, and symbols representing items whose definition remains ambiguous are all allowed.

Public interface IConsensusResolver
Framework to compute the consensus for a list of symbols.

For example, one can construct a consensus for a set of aligned sequences in the following way:

Sequence 1: A G T C G A
Sequence 2: A G G C - A
Sequence 3: A G G T G -
Consensus : A G G C G A

In the example here, we might choose the character that occurs the maximum number of times as the consensus. This means that the consensus for the characters at position 1, {A, A, A}, is A, while the consensus for the characters at position 3, {T, G, G}, is G, and so on.

This interface provides the framework for consensus generation. Implement this interface to provide different implementations for building consensus.
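The most-frequent-character rule from the example can be sketched as follows. This is a standalone illustration, not an implementation of IConsensusResolver: the class MajorityConsensusSketch is a hypothetical name, rows are plain strings of equal length, gaps are counted like any other character, and ties go to the character seen first in the column.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of .NET Bio.
public static class MajorityConsensusSketch
{
    // Returns the character that occurs most often in one alignment column.
    // GroupBy keeps first-seen order and OrderByDescending is stable,
    // so a tie is resolved in favor of the character seen first.
    public static char MostFrequent(IEnumerable<char> column)
    {
        return column
            .GroupBy(symbol => symbol)
            .OrderByDescending(group => group.Count())
            .First()
            .Key;
    }

    // Builds the consensus column by column. All rows are assumed to have the same length.
    public static string Consensus(IReadOnlyList<string> alignedRows)
    {
        var consensus = new StringBuilder();
        for (int i = 0; i < alignedRows[0].Length; i++)
        {
            int column = i;
            consensus.Append(MostFrequent(alignedRows.Select(row => row[column])));
        }
        return consensus.ToString();
    }
}
```

Calling Consensus with the rows "AGTCGA", "AGGC-A", and "AGGTG-" returns "AGGCGA", matching the consensus line in the example.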
Public interface IParameterValidator
A simple interface to an object that can check a value for conformance to any required validation rules.
Public interface IQualitativeSequence
Sequence with qualitative data.
Public interface ISequence
Implementations of ISequence make up one of the core sets of data structures in Bio. It is these sequences that store data relevant to DNA, RNA, and amino acid structures. Several algorithms for alignment, assembly, and analysis take these items as their basic data inputs and outputs.
Public interface ISequenceRange
A SequenceRange holds the data necessary to represent a region within a sequence defined by its start and end index without necessarily holding any of the sequence item data. At a minimum, an ID, start index, and end index are required. Additional metadata can be stored as well using generic key-value pairs.
Enumerations
  Enumeration  Description
Public enumeration FastQFormatType
A FastQFormatType specifies the format of quality scores.
Public enumeration IntersectOutputType
This enum indicates the type of output an intersect operation should return.
Public enumeration SubtractOutputType
This enum indicates the type of output a subtract operation should return.