Help Documentation

Annotation Walkthrough

Background
Records in ORegAnno are annotated by users who have created an account for the database. New records can be submitted through the annotation button in the user menu:


Alternatively, an administrator can submit new records en masse. Administrators have access to the batch functionality in the ORegAnno database, as seen below:


This walkthrough describes how to annotate a new record from the User Menu.
Annotating Walkthrough
  1. Once you have selected "annotate" from the "User Menu", you will be asked to choose the type of record you are annotating. The currently supported record types are promoter, transcription factor binding site, regulatory polymorphism, and regulatory haplotype (see Understanding Record types). More types can be accepted on request, as the database is easy to upgrade to support them. You will also need to have the paper you are annotating checked out from the publication queue (in state "OPEN") before you can begin annotation (to understand ORegAnno's Publication Queue, see Understanding the Queue). If you are not logged in, clicking on "annotate" will redirect you to the login page.


  2. Once you have chosen a valid annotation type and selected the "Annotate" button, you will be redirected to the "Add a new record" page.


  3. We will now go through the steps required to enter a "Regulatory Polymorphism" record into the database.
    1. STABLE ID: The stable ID is an unchanging identifier for this record. ORegAnno will offer a stable ID for your record. If you prefer a different one, click the "Refresh Stable ID" button. Stable IDs for records take the form OREG1234567.
    2. DATASET: The dataset field links this new annotation to an existing dataset. If this record is not associated with an established dataset, the ORegAnno field should be selected.
    3. GENE: To annotate the gene that is targeted by the regulatory sequence, you can use an existing EnsEMBL gene ID of the form ENSG1234567 (note that the ENSG identifier must match the associated database chosen), an NCBI Entrez Gene ID, or a user-defined name (not recommended unless the record's gene target is unknown). To choose how you want to enter a gene, select the radio button for the source, wait for the page to refresh, and enter the appropriate information. All entered information is cross-referenced against NCBI or EnsEMBL before a record is saved to the database.


    4. TRANSCRIPTION FACTOR: In the same manner as entering the target gene, the relevant transcription factor must be entered in the database.
    5. PROMOTER NAME: Some enhancers/promoters have names assigned to them. As there is no standardization for this practice, ORegAnno accepts this information if it is available. (This field is optional.)
    6. TARGET SPECIES: Using a taxonomy ID from the NCBI Taxonomy database, enter the species of interest for this record. Newly entered taxonomy IDs will be cross-referenced against NCBI before the record is saved.
    7. SEQUENCE: Different types of records have different requirements for sequence. The three types that are required are "SEQUENCE", "SEQUENCE WITH FLANK", and "SEARCH SPACE". "SEQUENCE" is the regulatory sequence that has been identified in the literature. "SEQUENCE WITH FLANK" is the regulatory sequence plus enough flanking sequence to make it mappable to the genome using a tool like BLAST. "SEARCH SPACE" is the sequence that was experimentally assayed. It acts as a negative control sequence for the detection of the regulatory sequence when compared with the experimental evidence.
    8. REGULATORY VARIATION SEQUENCES: These sequences are used when annotating a REGULATORY POLYMORPHISM or REGULATORY HAPLOTYPE record. They are the reference genome sequence, the variant sequence (with the variation in place), and the corresponding variation ID (if available from dbSNP). As an example of a REGULATORY POLYMORPHISM annotation, a SNP (T nucleotide) which is confirmed to lower gene expression may be written as ...aaaaTaaaa... in the reference sequence and ...aaaaTaaaa... in the variant sequence (that is, if the reference genome sequence possesses the variant which alters/lowers expression). REGULATORY HAPLOTYPES are simply a set of these variant sequences which are in linkage disequilibrium (one of which is likely a REGULATORY POLYMORPHISM, but it has not been experimentally identified). Each variation is specified as either GERMLINE, SOMATIC, or ARTIFICIAL.
    9. REFERENCE: A valid PubMED ID for a curated publication.
    10. EVIDENCE: The evidence that describes how this regulatory sequence was discovered in the publication. For more detailed information on Evidence types, subtypes, and classes, plus those currently supported in ORegAnno, read the Evidence help page here. If known, information on the cell type can be supplied using the eVOC: Cell type ontology. Multiple lines of evidence can be added or removed using the "Add Evidence Description" or "Remove Evidence Description" buttons.
    11. SNP SOURCE: If a regulatory variant has an associated dbSNP ID, it can be entered here. This information is optional and is cross-referenced against dbSNP before the record is saved.
    12. OUTCOME: The outcome of the experiment is recorded as either a POSITIVE, NEUTRAL, or NEGATIVE outcome. A positive outcome describes an annotation where the sequence was confirmed to be regulatory, a neutral outcome is one where the literature directly states that the results were inconclusive, and a negative outcome is one where the primary literature confirms the sequence as not being regulatory (even though it may be regulatory under different conditions). The outcome in all cases should be translated as directly as possible from the associated literature's conclusions.
    13. META DATA: Some types of records have special meta data that can be attached to them. Browse the list of available Meta Data tags, and then add new elements for all the extra attributes that you see fit. (An example for D. melanogaster is the FlyBASE expression id term.)
    14. USER: The date and annotator are stored with the annotation.
  4. Once an annotation has been entered and successfully cross-referenced against the necessary external databases, the "Review Record" page will be displayed to show the annotation in its final state.
  5. From the "Review Record" page, the annotator clicks the "Commit Record" button to commit the record to the database, email the record to the ORegAnno guts mailing list, and index it in the search engine.


  6. CONCLUDING NOTES: Once an annotation has been entered, users with the "Validator" role (read the help page about roles for more information) will be able to review, score, and/or deprecate this record.
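
For annotators who prepare records outside the web form, a few of the field conventions from step 3 can be sketched in Python. This is a minimal illustration and not ORegAnno's own code. The helper names are hypothetical, the seven-digit length of the stable ID is assumed from the example OREG1234567, and the flank check assumes that "SEQUENCE WITH FLANK" contains "SEQUENCE" as an exact, case-insensitive substring.

```python
import re
from enum import Enum

# Assumption: "OREG" plus exactly seven digits, inferred from the example OREG1234567.
STABLE_ID_PATTERN = re.compile(r"OREG\d{7}")


class Outcome(Enum):
    """The three outcomes listed under OUTCOME."""
    POSITIVE = "POSITIVE"
    NEUTRAL = "NEUTRAL"
    NEGATIVE = "NEGATIVE"


def is_stable_id(value: str) -> bool:
    """Return True if value looks like a stable ID (hypothetical helper)."""
    return STABLE_ID_PATTERN.fullmatch(value) is not None


def flank_contains_sequence(sequence: str, sequence_with_flank: str) -> bool:
    """Return True if SEQUENCE occurs inside SEQUENCE WITH FLANK (hypothetical helper).

    The comparison ignores case, and an empty SEQUENCE is rejected.
    """
    seq = sequence.strip().upper()
    flank = sequence_with_flank.strip().upper()
    return bool(seq) and seq in flank


print(is_stable_id("OREG1234567"))                  # True
print(flank_contains_sequence("acgt", "ttACGTtt"))  # True
```

Checks like these only catch formatting slips before submission. They do not replace the cross-referencing against NCBI, EnsEMBL, or dbSNP that ORegAnno performs before a record is saved.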