Gene Annotation Template

From GcatWiki
Revision as of 01:51, 9 September 2008 by Salt (talk | contribs)

Gene Annotation Log - Template

Basic Information:

DNA Coordinates:

DNA Sequence (FASTA format):

Protein Sequence (FASTA format):

Isoelectric Point:
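As a rough illustration of the FASTA layout requested for the two sequence fields above, the sketch below writes a record as a ">" header line followed by the sequence wrapped to a fixed width. The function name `to_fasta`, the identifier, the line width of 60, and the example sequence are all assumptions made for illustration; they are not part of the template.

```python
def to_fasta(identifier, sequence, width=60):
    """Return one FASTA record: a '>' header line, then wrapped sequence lines."""
    # Remove any whitespace pasted in with the sequence and normalize the case.
    cleaned = "".join(sequence.split()).upper()
    # Slice the sequence into fixed-width lines.
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return ">" + identifier + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical identifier and sequence, for illustration only.
    print(to_fasta("example_gene_0001", "atggctaaa ggtcat", width=10))
```

The same function can be used for the protein sequence; only the header text and the sequence alphabet differ.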



Similarity Data (Sequence-Based):

BLAST Data:
- Gene Product Name:
- Top hit – organism:
- Length, Score, E-value, Identity, Positives and Gaps
NCBI Statistics
- Alignment of Top Hit and Query Sequence
Alignment Scoring
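
To make the Identity and Gaps fields more concrete, here is a minimal sketch that counts them from two already-aligned sequences, with "-" marking a gap. It assumes BLAST-style reporting, in which both counts are given relative to the full alignment length. The function name `alignment_stats` and the example sequences are hypothetical. Positives are left out because they depend on the substitution matrix used for the search.

```python
def alignment_stats(query_aln, subject_aln):
    """Count identities and gap columns in a pairwise alignment."""
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    length = len(query_aln)
    columns = list(zip(query_aln, subject_aln))
    # An identity is a column where both residues match and neither is a gap.
    identities = sum(1 for q, s in columns if q == s and q != "-")
    # A gap column has a '-' in either sequence.
    gaps = sum(1 for q, s in columns if q == "-" or s == "-")
    return {
        "length": length,
        "identities": identities,
        "gaps": gaps,
        "percent_identity": 100.0 * identities / length if length else 0.0,
        "percent_gaps": 100.0 * gaps / length if length else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    stats = alignment_stats("MKT-AYIAK", "MKTQAYLAK")
    print(stats["identities"], "identities and", stats["gaps"], "gap in",
          stats["length"], "columns")
```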

CDD: Conserved Domain Database
- Significant COG Hits:
Definition of COG
- Names of COGs:
- Score:
- E-value:
CDD website

PDB: Protein Data Bank
- Significant Structure Hits:
This database holds information about protein structures, and its sequence search also performs a BLAST alignment against them.
o Length
o Score
o E-value
o Identities
o Positives
o Gaps
o Alignment
PDB website

T-Coffee:
- Multi-Sequence Alignment
T-Coffee website
This is a useful tool, but it is confusing to use.



Cellular Localization Data:

TMHMM:
http://www.cbs.dtu.dk/services/TMHMM-2.0/
This tool predicts the number of transmembrane helices in a protein.
- Number of Predicted TMHs
- Transmembrane Topology graph and comment

SignalP:
http://www.cbs.dtu.dk/services/SignalP/
This tool predicts whether or not a protein carries an N-terminal signal peptide, and where that peptide would be cleaved.
- Signal Peptide Probability
- Signal Peptide Graph

PSORT:
http://psort.ims.u-tokyo.ac.jp/form.html
This tool predicts protein localization sites.
- Cytoplasmic Score:
- Cytoplasmic Membrane Score:
- Periplasmic Score:
- Outer Membrane Score:
- Extracellular Score:
- Final Prediction for Protein Location (of the above listed):

Phobius:
http://phobius.sbc.su.se/
This tool predicts the locations of transmembrane helices and the intervening loop regions.
Note: If the report labels the protein as non-cytoplasmic or cytoplasmic, this only means that no transmembrane helices are predicted. The label should not be used as a predictor of location.

- Enter Graph:

Final Hypothesis: Where do you expect to find this protein?



Alternative Open Reading Frames:

Proposed DNA Coordinates:

Reasoning:



Structure-Based Evidence of Function:

Pfam-A:
- Significant Matches:
- Pfam Name:
- Pairwise Alignment:
- HMM logo:
- Key Functional Residues:

PDB:
- Significant Structure Hits:
o Length
o Score
o E-value
o Identities
o Positives
o Gaps
- Alignment:



Pathways:

KEGG:
Two KEGG resources are useful here:

- KEGG Pathway is a collection of pathway maps that represent molecular interaction and reaction networks for:

     1. Metabolism
     2. Genetic Information Processing
     3. Environmental Information Processing
     4. Cellular Processes
     5. Human Diseases

- KEGG Module is a collection of pathway modules, molecular complexes, and other functional units.

EcoCyc:
This is a bioinformatics database that describes the genome and the biochemical machinery of E. coli K-12 MG1655. It can serve as a reference against which to compare our findings.

E.C. Number:



Duplication and Degradation:


Duplication:
Paralogs are homologous genes within a single species that arose by gene duplication. Through analysis of paralogs, we can determine which genes may have been duplicated.

You can search for paralogs of an individual gene as follows:
1. Scroll to the bottom of the Gene Detail page.
2. Under "Homolog Display", find the "Homolog selection" drop-down box.
3. Select "Paralogs / Orthologs."

JGI requests certain information about the top paralog hit:
- Gene Object ID
- Length (bp)
- Score
- E-value
- Identity
- Positives
- Gaps
- Alignment of Top Hit and Query Sequence:
Alignment Instructions

Other possible information:
- Number of paralogs above a certain Bit Score.
- How could we measure Degradation?
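
The "number of paralogs above a certain Bit Score" item above could be tallied with a small filter like the one below. This is only a sketch: the function name `count_hits_above`, the gene IDs, the scores, and the cutoff of 50 are invented for illustration, and the choice of cutoff is left to the annotator.

```python
def count_hits_above(hits, min_bit_score):
    """Count paralog hits whose bit score is at or above the cutoff."""
    return sum(1 for _gene_id, bit_score in hits if bit_score >= min_bit_score)


if __name__ == "__main__":
    # Hypothetical (gene object ID, bit score) pairs copied by hand
    # from a paralog list; these are not real results.
    paralog_hits = [
        ("paralog_A", 250.0),
        ("paralog_B", 80.5),
        ("paralog_C", 42.0),
    ]
    print(count_hits_above(paralog_hits, min_bit_score=50.0))
```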



Evidence of Horizontal Gene Transfer:

Phylogenetic Tree Diagram:

Gene Context:
- Ortholog Neighborhood Region of Organism:
- Examples of Similarities or Differences:
- Comment:

Chromosome Viewer GC Heat Map:
- Characteristic GC% of genome:
- Average GC% of gene:
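
If the heat map values need to be checked by hand, the GC% of a gene can be computed directly from its DNA sequence and compared with the characteristic genome value. The sketch below is one simple way to do this. The function name `gc_percent`, the example sequence, and the genome value are assumptions for illustration. A large difference between gene and genome GC% is commonly read as a possible hint of horizontal transfer, not as proof.

```python
def gc_percent(sequence):
    """Return the percentage of G and C among the A, C, G, T bases in a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        # No unambiguous bases to count (empty input or only N's).
        return 0.0
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


if __name__ == "__main__":
    # Hypothetical gene sequence and genome GC%, for illustration only.
    gene_gc = gc_percent("ATGGCGCCGTAA")
    genome_gc = 50.0
    print("Gene GC%: {:.1f}".format(gene_gc))
    print("Difference from genome GC%: {:+.1f}".format(gene_gc - genome_gc))
```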



RNA (Rfam):

RNA Family:
Bits Score:

Alignment: