Difference between revisions of "Determining Unique and Conserved Proteins: How to Use Katie's Webpage"

From GcatWiki

Revision as of 05:21, 12 November 2009

First, your two sequences must be in FASTA format to use our Pairwise Genomic Comparison program. If your sequences are in GenBank format (or another format), visit Claudia's tutorial page first to learn how to convert them to FASTA.
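If you want a quick check that your text really is FASTA before pasting it anywhere, a minimal Python sketch like the one below may help. It is not part of Katie's program; the function name parse_fasta and the example protein names are made up for illustration, and it only assumes the basic FASTA layout (a header line starting with ">" followed by one or more sequence lines).

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence.

    Hypothetical helper for illustration only; it assumes each record
    starts with a '>' header line followed by sequence lines.
    """
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header line")
        else:
            chunks.append(line)
    # Store the final record.
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Made-up example input with two protein records.
example = """>protein_A hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>protein_B hypothetical protein
MSDNELLK
"""

records = parse_fasta(example)
for name, sequence in records.items():
    print(name, len(sequence))
```

If the text is not FASTA (for example, GenBank output pasted by mistake), the sketch raises a ValueError instead of silently returning nothing.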

Once you have the two sequences you want to compare in FASTA format, head to the Pairwise Genomic Comparison page. (Note: I'll add the link when Katie finalizes the webpage.)

Comparing Proteomes Online: For Smaller Proteomes
If each of your sequences is shorter than (Note: Add character limit here) characters, you can use our webpage to perform your comparison. Your sequences will likely be larger than our character limit, so I'll discuss how to download and use the Perl script later in this tutorial. If they aren't too long, enter your desired Expect (E) value, which acts as the threshold. Keep in mind that lower E values are more restrictive and yield fewer matches that occur by chance (so any matches found will be more statistically significant). For our purposes, an E value around 0.001 or 0.01 should be sufficient.
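To see how the E value threshold changes the outcome, here is a small Python sketch. It is only an illustration under stated assumptions, not the logic of the actual comparison program: it assumes each hit is a (query protein, matching protein, E value) tuple, and that a protein counts as conserved when at least one of its hits has an E value at or below the threshold. The function name classify_proteins and all the data are hypothetical.

```python
def classify_proteins(all_ids, hits, e_threshold=0.001):
    """Split protein IDs into (conserved, unique) sets.

    Assumption for illustration: a protein is 'conserved' if any of its
    hits has an E value at or below e_threshold; otherwise it is 'unique'.
    """
    conserved = {query for query, _subject, e_value in hits if e_value <= e_threshold}
    unique = set(all_ids) - conserved
    return conserved, unique


# Made-up proteins and hits: (query, subject, E value).
all_ids = ["protA", "protB", "protC", "protD"]
hits = [
    ("protA", "x1", 1e-30),  # very strong match
    ("protB", "x2", 0.005),  # passes at 0.01 but not at 0.001
    ("protC", "x3", 2.0),    # likely a chance match
]

strict_conserved, strict_unique = classify_proteins(all_ids, hits, e_threshold=0.001)
loose_conserved, loose_unique = classify_proteins(all_ids, hits, e_threshold=0.01)

print("E = 0.001:", sorted(strict_conserved), sorted(strict_unique))
print("E = 0.01: ", sorted(loose_conserved), sorted(loose_unique))
```

In this toy data, protB moves from the unique list to the conserved list when the threshold is loosened from 0.001 to 0.01, which is the trade-off described above.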

Next, copy and paste the sequence whose unique and conserved proteins you want to find (still in FASTA format) into the first box. In the second box, add the sequence you are comparing it against. This second sequence allows the program to determine which proteins are unique to the first species and which are conserved between the two species. Once your page looks something like this, you're ready to "submit":

File:ADD

The program will provide two Excel files: one listing the conserved proteins and their gene locations within the first proteome, and one listing the proteins unique to the first proteome. (Note: Is this correct?) It is helpful to compare your proteome of interest with several related proteomes in order to identify which genes are actually conserved and which are probably unique to your species.
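One way to combine the results of several comparisons is to keep only the proteins that come back as unique every time. The Python sketch below shows that idea; it assumes you have already copied the unique-protein names from each comparison into a list, and the function name consistently_unique and the species and protein names are invented for the example.

```python
def consistently_unique(unique_lists):
    """Return the proteins reported as unique in every comparison.

    unique_lists holds one list of protein names per comparison run
    (hypothetical input; how you export the names is up to you).
    """
    sets = [set(names) for names in unique_lists]
    if not sets:
        return set()
    return set.intersection(*sets)


# Made-up results of comparing one proteome against three relatives.
unique_vs_species_1 = ["protB", "protC", "protD"]
unique_vs_species_2 = ["protC", "protD", "protE"]
unique_vs_species_3 = ["protA", "protC", "protD"]

probably_unique = consistently_unique(
    [unique_vs_species_1, unique_vs_species_2, unique_vs_species_3]
)
print(sorted(probably_unique))
```

A protein that drops out of the intersection matched at least one related proteome, so it is less likely to be truly unique to your species.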

Downloading and Using the Perl Script for Larger Proteomes

If your proteome has more than (Note: add number) characters, then you'll have to download and run the program yourself. At the bottom of the program webpage you'll find a link to the Perl script for the comparison program. Make sure you have access to (PROGRAM NAME), which is needed to run programs written in Perl.