Beta-galactosidase (Olivia Ho-Shing)
For H. mukohataei, I chose a predicted gene from a well-known family involved in sugar metabolism: 644033004, beta-galactosidase/beta-glucuronidase (EC:3.2.1.23).
To verify this predicted protein, I used:
- BLASTn
- BLASTp
- A search for a Shine-Dalgarno sequence within 50 bp upstream of the start codon
- GC Calculator
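The last two checks in this list can also be done by hand in a few lines of Python. The sketch below is only an illustration, not the tool I actually used: the function names are hypothetical, and the set of Shine-Dalgarno-like motifs (fragments of the AGGAGG consensus) is an assumption about what counts as a plausible match.

```python
import re


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Assumed motif set: the AGGAGG consensus and shorter fragments of it.
# Longer alternatives come first so the longest match at a position wins.
SD_PATTERN = re.compile(r"AGGAGG|GGAGG|AGGAG|GGAG|GAGG")


def find_shine_dalgarno(upstream: str, window: int = 50):
    """List (spacing, motif) for SD-like motifs in the last `window` bp.

    `upstream` is the sequence immediately 5' of the start codon, so its
    last base sits directly before the first base of the start codon.
    `spacing` is the number of bases between the motif and the start codon.
    """
    region = upstream.upper()[-window:]
    hits = []
    for match in SD_PATTERN.finditer(region):
        spacing = len(region) - match.end()
        hits.append((spacing, match.group()))
    return hits
```

For example, `find_shine_dalgarno(upstream_seq)` on the 50 bp before the TTG would return an empty list if no such motif is present, which is itself a useful result to record.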
BLASTn
Usually JGI highlights the start and stop codons in red, and any upstream or downstream sequence in green. However, with this nucleotide sequence, there was no start codon highlighted. The first codon of the sequence was TTG.
Here is the distribution and the alignment of the BLAST hits:
The first relevant BLAST hit for the sequence was "Synthetic construct beta-galactosidase (lacZnls12co) gene, complete cds":
Query coverage = 44%, Score = 206 bits (228), E-value = 4e-49, Identities = 658/1008 (65%), Gaps = 73/1008 (7%), Strand = Plus/Plus
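As a quick sanity check on the numbers in this hit, the percentages can be recomputed from the raw counts. This is just arithmetic on the values reported above; the helper name is hypothetical, and rounding to a whole percent is my assumption about how the figures were displayed.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a whole-number percentage."""
    return round(100 * numerator / denominator)


aligned_length = 1008  # alignment length, including gap positions
identities = percent(658, aligned_length)  # 658 matching positions
gaps = percent(73, aligned_length)         # 73 gap positions

print(f"Identities: {identities}%  Gaps: {gaps}%")
```

Note that these percentages only describe the aligned region; with 44% query coverage, more than half of the query is not part of this alignment at all.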
These BLAST hits weren't as well aligned as I expected for this protein, and I was surprised that there didn't seem to be a definitive start codon. The beginning of the query sequence did not align with the beginning of the hit described above either, but this could just mean that the gene is not well conserved at the 5' and 3' ends.
BLASTp
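Although the nucleotide sequence given by JGI did not begin with a highlighted start codon, the amino acid sequence still began with M. This is consistent with TTG acting as an alternative start codon: in prokaryotes, TTG and GTG can initiate translation in place of ATG, and the initiator tRNA still inserts methionine at that position. So JGI most likely reports M because TTG is being treated as the start codon, not because M is assigned by default regardless of the codon.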
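To illustrate how a TTG start can still give a protein that begins with M, here is a minimal translation sketch. It is not how JGI does it (I don't know their pipeline); the function name is hypothetical, the codon table is the standard genetic code, and treating only ATG, GTG, and TTG as start codons is a simplifying assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: only the most common prokaryotic start codons are accepted.
START_CODONS = {"ATG", "GTG", "TTG"}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence, reading a valid first codon as M."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if i == 0 and codon in START_CODONS:
            aa = "M"  # the initiator tRNA inserts methionine
        else:
            aa = CODON_TABLE.get(codon, "X")  # X for ambiguous codons
        if aa == "*":
            break  # stop codon ends translation
        protein.append(aa)
    return "".join(protein)
```

With this sketch, `translate_cds("TTGGCCTTGTAA")` gives `MAL`: the first TTG is read as M, while the internal TTG is read as leucine as usual.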