Bio.IO.GenBank Namespace |
Class | Description | |
---|---|---|
Attenuator |
Region of DNA at which regulation of termination of transcription occurs,
which controls the expression of some bacterial operons.
Sequence segment located between the promoter and the first structural gene that
causes partial termination of transcription.
| |
CaatSignal |
CAAT box; part of a conserved sequence located about 75 bp upstream of the start point
of eukaryotic transcription units which may be involved in RNA polymerase binding.
Consensus=GG(C or T)CAATCT.
| |
CitationReference |
Citations for all articles containing data reported in this sequence.
Citations in PubMed that do not fall within Medline's scope will have only
a PUBMED identifier. Similarly, citations that *are* in Medline's scope but
which have not yet been assigned Medline UIs will have only a PUBMED identifier.
If a citation is present in both the PubMed and Medline databases, both a
MEDLINE and a PUBMED line will be present.
| |
CodingSequence |
Coding sequence (CDS); sequence of nucleotides that corresponds with the sequence of amino acids
in a protein (location includes stop codon); feature includes amino acid conceptual translation.
| |
CrossReferenceLink |
CrossReferenceLink provides cross-references to resources that support the existence
of a sequence record, such as the Project Database and the NCBI
Trace Assembly Archive.
| |
DisplacementLoop |
Displacement Loop (D-Loop): A region within mitochondrial DNA in which a short stretch of RNA is paired with one strand
of DNA, displacing the original partner DNA strand in this region; also used to describe the
displacement of a region of one strand of duplex DNA by a single stranded invader in the
reaction catalyzed by RecA protein.
| |
Enhancer |
A cis-acting sequence that increases the utilization of (some) eukaryotic promoters,
and can function in either orientation and in any location (upstream or downstream)
relative to the promoter.
| |
Exon |
An exon is a region of the genome that codes for a portion of spliced mRNA, rRNA and tRNA; it may contain the 5'UTR,
all CDSs and the 3'UTR.
| |
FeatureItem |
Stores a sequence feature present in the metadata.
All qualifiers of the feature are stored as sub-items.
| |
FivePrimeUtr |
Region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein.
| |
GcSingal |
GC box; a conserved GC-rich region located upstream of the start point of eukaryotic transcription
units which may occur in multiple copies or in either orientation.
Consensus=GGGCGG.
| |
GenBankAccession |
Accession is the identifier assigned to each GenBank sequence record.
It contains the primary accession number and may contain secondary accession numbers.
| |
GenBankFormatter |
Writes an ISequence to a particular location, usually a file. The output is formatted
according to the GenBank file format. A method is also provided for quickly accessing
the content in string form for applications that do not need to first write to file.
| |
GenBankLocusInfo |
Locus provides a short mnemonic name for the sequence entry in the GenBank
database, chosen to suggest the sequence's definition.
It also contains information such as the sequence type, strand type and division code.
| |
GenBankLocusTokenParser |
Not all third-party programs respect the GenBank locus format, so we cannot expect each item to lie at exact
indices within the locus. To parse this information from tokens we have to make certain
assumptions about the locus data; however, the format is well documented, and for all but the ID field we know what the data type
will be and what values it may contain.
| |
GenBankLocusTokenParserLocusConstants |
List of text to enumeration mappings to better organize and contain variable information with respect to parsing
the locus.
| |
GenBankMetadata |
The GenBankMetadata class holds the metadata provided
by the GenBank flat file format.
| |
GenBankParser |
A GenBankParser reads from a source of text that is formatted according to the GenBank flat
file specification, and converts the data to in-memory ISequence objects. For advanced
users, the ability to select an encoding for the internal memory representation is
provided. There is also a default encoding for each alphabet that may be encountered.
Documentation for the latest GenBank file format can be found at
ftp.ncbi.nih.gov/genbank/gbrel.txt
| |
GenBankVersion |
A compound identifier consisting of the primary accession number and
a numeric version number associated with the current version of the
sequence data in the record. This is followed by an integer key
(a "GI") assigned to the sequence by NCBI.
| |
Gene |
The gene feature describes the interval of DNA that corresponds to a genetic trait or phenotype.
It is a region of biological interest identified as a gene and for which a name has been assigned.
This class is meant to represent a region where the gene is located.
| |
InterveningDna |
Intervening DNA (iDNA) is DNA that is eliminated through any of several kinds of recombination,
for example in the somatic processing of immunoglobulin genes.
| |
Intron |
A segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences
(exons) on either side of it.
| |
Location |
Location holds the feature location information.
This is the default implementation of the ILocation interface.
It holds the Start and End points of the location.
If the location refers to some other sequence (for example, J00194.1:1..150),
the accession number information should be stored in the Accession property.
The Resolver property is used to resolve any ambiguity in the location start-data and end-data.
By default it is set to an instance of the LocationResolver class.
| |
LocationBuilder |
This is the default implementation of the ILocationBuilder interface.
This class builds a location from the specified location string,
and a location string from the specified location instance.
| |
LocationRange |
Holds the start and end position of a feature in a sequence.
For example:
If the location of a feature is "join(1..100,J00194.1:100..202)"
then we need two LocationRange instances to hold this location.
The first LocationRange will be
Accession - empty
StartPosition - 1
EndPosition - 100
The second LocationRange will be
Accession - J00194.1
StartPosition - 100
EndPosition - 202
Note that a GenBank feature location can be parsed using the static method "GetLocationRanges" in the GenBankMetadata class.
For example:
GenBankMetadata.GetLocationRanges("join(1..100,J00194.1:100..202)") returns a list of LocationRanges.
| |
LocationResolver |
This is the default implementation of ILocationResolver.
This class resolves the start and end positions of a location.
The following table shows how this class resolves ambiguities in start and end data.
Start/End Data    Resolved Start    Resolved End
12.30             12                30
>30               30                30
<30               30                30
23^24             23                24
100^1             100               1
| |
LongTerminalRepeat |
Long terminal repeat (LTR), a sequence directly repeated at both ends of a defined sequence,
of the sort typically found in retroviruses.
| |
MaturePeptide |
Mature peptide or protein coding sequence; coding sequence for the mature or final peptide or protein product following
post-translational modification; the location does not include the stop codon (unlike the corresponding CDS).
| |
MessengerRna |
Messenger RNA (mRNA); includes 5 prime un-translated region (5'UTR), coding sequences (CDS, exon)
and 3 prime un-translated region (3'UTR).
| |
Minus10Signal |
Pribnow box; a conserved region about 10 bp upstream of the start point of bacterial transcription units
which may be involved in binding RNA polymerase.
Consensus=TAtAaT.
| |
Minus35Signal |
A conserved hexamer about 35 bp upstream of the start point of bacterial transcription units.
Consensus=TTGACa or TGTTGACA.
| |
MiscBinding |
Site in nucleic acid which covalently or non-covalently binds another moiety that cannot be described
by any other binding key (primer_bind or protein_bind).
| |
MiscDifference |
Feature sequence is different from that presented in the entry and cannot be described by any
other Difference key (conflict, unsure, old_sequence, variation, or modified_base).
| |
MiscFeature |
Region of biological interest which cannot be described by any other feature key; a new or rare feature.
| |
MiscRecombination |
Site of any generalized, site-specific or replicative recombination event where there is a breakage and
reunion of duplex DNA that cannot be described by other recombination keys or qualifiers of source key (/proviral).
| |
MiscRna |
Any transcript or RNA product that cannot be defined by other RNA keys (prim_transcript, precursor_RNA,
mRNA, 5'UTR, 3'UTR, exon, CDS, sig_peptide, transit_peptide, mat_peptide, intron, polyA_site, ncRNA, rRNA and tRNA).
| |
MiscSignal |
Any region containing a signal controlling or altering gene function or expression that cannot be described
by other signal keys (promoter, CAAT_signal, TATA_signal, -35_signal, -10_signal, GC_signal, RBS, polyA_signal,
enhancer, attenuator, terminator, and rep_origin).
| |
MiscStructure |
Any secondary or tertiary nucleotide structure or conformation that cannot be described by
other Structure keys (stem_loop and D-loop).
| |
ModifiedBase |
The indicated nucleotide is a modified nucleotide and should be substituted for by the
indicated molecule (given in the ModifiedNucleotideBase qualifier value).
| |
NonCodingRna |
A non-protein-coding gene (ncRNA), other than ribosomal RNA and transfer RNA, the functional
molecule of which is the RNA transcript.
| |
OperonRegion |
An operon is a region containing a polycistronic transcript with genes that encode enzymes
in the same metabolic pathway, together with regulatory sequences.
| |
OrganismInfo |
Provides Genus, Species and taxonomic classification levels of the sequence.
| |
PolyASignal |
Recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation.
Consensus=AATAAA.
| |
PolyASite |
Site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation.
| |
PrecursorRna |
Any RNA species that is not yet the mature RNA product; may include 5' un-translated region (5'UTR),
coding sequences (CDS, exon), intervening sequences (intron) and 3' un-translated region (3'UTR).
| |
ProjectIdentifier |
The identifier of a project (such as a Genome Sequencing Project)
to which a GenBank sequence record belongs.
This is obsolete and was removed from the GenBank flat file format
after Release 171.0 in April 2009.
| |
Promoter |
Region on a DNA molecule involved in RNA polymerase binding to initiate transcription.
| |
ProteinBindingSite |
Non-covalent protein binding site on nucleic acid.
| |
RepeatRegion |
Region of genome containing repeating units.
| |
ReplicationOrigin |
Origin of replication (rep_origin); starting site for duplication of nucleic acid to give two identical copies.
| |
RibosomalRna |
Mature ribosomal RNA (rRNA); RNA component of the ribonucleoprotein particle (ribosome)
which assembles amino acids into proteins.
| |
RibosomeBindingSite |
Ribosome binding site (RBS).
In prokaryotes it is known as the Shine-Dalgarno sequence and is located 5 to 9 bases upstream of the initiation codon.
Consensus=GGAGGT.
| |
SequenceFeatures |
Contains information about genes and gene products,
as well as regions of biological significance reported
in the sequence.
| |
SequenceSegment |
Segment provides the information on the order in which this entry appears in a
series of discontinuous sequences from the same molecule.
| |
SequenceSource |
Source provides the common name of the organism, or the name most frequently used
in the literature, along with the taxonomic classification levels.
| |
SignalPeptide |
Signal peptide coding sequence; coding sequence for an N-terminal domain of a secreted protein; this
domain is involved in attaching nascent polypeptide to the membrane leader sequence.
| |
StandardFeatureKeys |
Static class to hold standard feature keys.
| |
StandardFeatureMap |
Class to map each standard feature key to the class which can hold that feature.
Note that any class that holds a feature must be derived from the FeatureItem class.
| |
StandardQualifierNames |
Static class to hold standard qualifier names.
| |
StemLoop |
Hairpin; a double-helical region formed by base-pairing between adjacent (inverted) complementary sequences
in a single strand of RNA or DNA.
| |
TataSignal |
TATA box; Goldberg-Hogness box; a conserved AT-rich septamer found about 25 bp before the start point
of each eukaryotic RNA polymerase II transcript unit which may be involved in positioning the enzyme
for correct initiation; consensus=TATA(A or T)A(A or T).
| |
Terminator |
Sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription.
| |
ThreePrimeUtr |
ThreePrimeUTR (3'UTR) is a region at the 3' end of a mature transcript (following the stop codon) that
is not translated into a protein.
| |
TransferMessengerRna |
Transfer messenger RNA; tmRNA acts as a tRNA first, and then as an mRNA that encodes a peptide tag;
the ribosome translates this mRNA region of tmRNA and attaches the encoded peptide tag to the
C-terminus of the unfinished protein; this attached tag targets the protein for destruction or proteolysis.
| |
TransferRna |
Mature transfer RNA (tRNA), a small RNA molecule (75-85 bases long) that mediates the translation of
a nucleic acid sequence into an amino acid sequence.
| |
TransitPeptide |
Transit peptide coding sequence (transit_peptide); coding sequence for an N-terminal domain of a nuclear-encoded organellar protein;
this domain is involved in post-translational import of the protein into the organelle.
| |
UnsureSequenceRegion |
UnsureSequenceRegion (Unsure) is a region in which the author is unsure of the exact sequence.
| |
Variation |
A related strain contains stable mutations from the same gene (e.g., RFLPs, polymorphisms, etc.)
which differ from the presented sequence at this location (and possibly others).
|
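The LocationRange entry in the class table above describes how a location such as "join(1..100,J00194.1:100..202)" breaks down into ranges. The Python sketch below illustrates that decomposition only; it is not the Bio.IO.GenBank API. The names `LocationRangeSketch` and `split_location_ranges` are hypothetical, and the sketch assumes a flat `join(...)` of `start..end` spans with an optional `accession:` prefix (no nesting, complements or ambiguous positions).

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class LocationRangeSketch:
    """Hypothetical stand-in for one range: accession, start and end."""
    accession: str
    start: int
    end: int


# One span: optional "ACCESSION:" prefix, then "start..end".
_SPAN = re.compile(
    r"^(?:(?P<acc>[A-Za-z0-9_.]+):)?(?P<start>\d+)\.\.(?P<end>\d+)$"
)


def split_location_ranges(location: str) -> List[LocationRangeSketch]:
    """Split a simple location string into one range per span."""
    text = location.strip()
    # Strip a single surrounding join(...) if present.
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]
    ranges = []
    for part in text.split(","):
        match = _SPAN.match(part.strip())
        if match is None:
            raise ValueError(f"unsupported span: {part!r}")
        ranges.append(
            LocationRangeSketch(
                accession=match.group("acc") or "",
                start=int(match.group("start")),
                end=int(match.group("end")),
            )
        )
    return ranges
```

Under these assumptions, `split_location_ranges("join(1..100,J00194.1:100..202)")` yields two ranges: one with an empty accession covering 1 to 100, and one with accession `J00194.1` covering 100 to 202.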
Interface | Description | |
---|---|---|
ILocation |
Interface to hold location information.
| |
ILocationBuilder |
Interface to build a location from a location string, and a location string from a location object.
| |
ILocationResolver |
Interface to resolve the start and end positions of a location.
Classes that implement this interface should resolve any ambiguity in
the start and end positions of a location.
See LocationResolver for the default implementation of this interface.
|
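To make the idea of resolving ambiguous start and end data concrete, here is a minimal Python sketch based on one reading of the rules listed under LocationResolver: a leading `<` or `>` is dropped, and for data of the form `a.b` or `a^b` the first number is used as the start and the second as the end. The function name `resolve_position` and its `as_start` parameter are hypothetical; this is not the ILocationResolver signature.

```python
def resolve_position(data: str, as_start: bool) -> int:
    """Resolve one piece of start/end data to a single integer position.

    as_start=True resolves the data as a start position,
    as_start=False resolves it as an end position.
    """
    text = data.strip()
    # "<30" and ">30" both resolve to 30 for start and end alike.
    if text.startswith(("<", ">")):
        return int(text[1:])
    # "23^24" and "12.30" carry two numbers: first for start, second for end.
    for separator in ("^", "."):
        if separator in text:
            first, second = text.split(separator, 1)
            return int(first) if as_start else int(second)
    # Plain number: nothing to resolve.
    return int(text)
```

For instance, `resolve_position("23^24", as_start=True)` gives 23 and `resolve_position("23^24", as_start=False)` gives 24, while `">30"` resolves to 30 in both roles. A custom resolver could apply a different policy (for example, taking the midpoint of `12.30`), which is presumably why the resolution step sits behind an interface.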
Enumeration | Description | |
---|---|---|
CrossReferenceType |
A CrossReferenceType specifies whether the DBLink
refers to a project or a Trace Assembly Archive.
| |
LocationOperator |
Enum for location operators.
| |
MoleculeType |
A MoleculeType specifies which type of biological sequence is stored in an ISequence.
| |
SequenceDivisionCode |
A DivisionCode specifies which family a sequence belongs to.
| |
SequenceStrandTopology |
A StrandTopology specifies whether the strand is linear or circular.
| |
SequenceStrandType |
A StrandType specifies whether the sequence is single-stranded,
double-stranded or mixed-stranded.
|