Difference between revisions of "Potential Gene Across-Species Analysis with Mr. Bayes"

== Potential Gene Across-Species Analysis with Mr. Bayes ==
  
=== Make databases of species genome FASTA files ===
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/makeblastdb -title testdb -dbtype nucl -in bb_latest_assembly.fasta
</pre>
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/makeblastdb -title testdb -dbtype nucl -in VVgenome
</pre>
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/makeblastdb -title testdb -dbtype nucl -in rice_communisTIGR_castorWGS_release_0.1.assembly.fsa
</pre>
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/makeblastdb -title testdb -dbtype nucl -in fragaria vescachr.fna
</pre>

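The four database builds above differ only in the input file, so they can be generated from one list. The following is a minimal sketch, not part of the original workflow: the helper name <code>makeblastdb_command</code> is hypothetical, the path and flags are copied from the commands above, and the file names are listed exactly as they appear there. It only prints the commands; it does not run BLAST.

```python
import shlex

# Path and flags are copied from the commands above.
MAKEBLASTDB = "/usr/local/ncbi/blast/bin/makeblastdb"

# File names exactly as written in the commands above.
GENOME_FILES = [
    "bb_latest_assembly.fasta",
    "VVgenome",
    "rice_communisTIGR_castorWGS_release_0.1.assembly.fsa",
    "fragaria vescachr.fna",
]


def makeblastdb_command(fasta_path, title="testdb"):
    """Return the argument list for one nucleotide database build (hypothetical helper)."""
    return [MAKEBLASTDB, "-title", title, "-dbtype", "nucl", "-in", fasta_path]


commands = [makeblastdb_command(path) for path in GENOME_FILES]
for args in commands:
    # shlex.join quotes any argument containing a space for the shell.
    print(shlex.join(args))
```

Keeping each command as an argument list means a file name containing a space stays a single argument if the list is later passed to <code>subprocess.run</code>.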
=== Search for desired gene from GenBank ===

Take the gene from the species most closely related to the target species (here Vitis vinifera, for the target Vaccinium corymbosum):<br><br>
[http://www.ncbi.nlm.nih.gov/gene/100256566 VINST1] <br><br>

=== Download FASTA file of nucleotides ===

[http://www.ncbi.nlm.nih.gov/nuccore/NW_002240906?report=fasta&from=206721&to=208614 VINST1 nucleotides] <br><br>

=== BLASTn FASTA file of gene against databases of species sequences ===
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/blastn -query VINST1.fasta -db bb_latest_assembly.fasta -outfmt "7 qacc sacc evalue qstart qend sstart send"
</pre>
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/blastn -query VINST1.fasta -db VVgenome -outfmt "7 qacc sacc evalue qstart qend sstart send"
</pre>
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/blastn -query VINST1.fasta -db rice_communisTIGR_castorWGS_release_0.1.assembly.fsa -outfmt "7 qacc sacc evalue qstart qend sstart send"
</pre>
<pre>
w10120:Desktop jamimms$ /usr/local/ncbi/blast/bin/blastn -query VINST1.fasta -db fragaria vescachr.fna -outfmt "7 qacc sacc evalue qstart qend sstart send"
</pre>

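With <code>-outfmt "7 qacc sacc evalue qstart qend sstart send"</code>, each hit is one tab-separated row in that field order, and lines starting with <code>#</code> are comments. The sketch below shows one way to pick the hit with the smallest e-value from such output. The names <code>Hit</code> and <code>top_hit</code> are hypothetical, and the sample text is invented for illustration; it is not real BLAST output from these genomes.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    qacc: str
    sacc: str
    evalue: float
    qstart: int
    qend: int
    sstart: int
    send: int


def top_hit(blast_output: str) -> Optional[Hit]:
    """Return the hit with the smallest e-value, or None if there are no hit rows."""
    hits = []
    for line in blast_output.splitlines():
        # Skip blank lines and the '#' comment lines that format 7 adds.
        if not line.strip() or line.startswith("#"):
            continue
        q, s, e, qs, qe, ss, se = line.split("\t")
        hits.append(Hit(q, s, float(e), int(qs), int(qe), int(ss), int(se)))
    # min() keeps the earliest row when e-values tie.
    return min(hits, key=lambda h: h.evalue) if hits else None


# Hypothetical output, invented for illustration only.
SAMPLE = "\n".join([
    "# BLASTN",
    "# Query: example_query",
    "# Fields: query acc., subject acc., evalue, q. start, q. end, s. start, s. end",
    "# 2 hits found",
    "example_query\tscaffold_A\t1e-50\t10\t400\t5200\t4810",
    "example_query\tscaffold_B\t3e-12\t120\t260\t77\t217",
])

hit = top_hit(SAMPLE)
print(hit)
```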
=== Copy and paste top BLASTn hit for each species with FASTA heading into [http://www.ebi.ac.uk/Tools/services/web_muscle/toolform.ebi MUSCLE window] ===

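The sstart and send columns of the BLASTn output give the position of the hit in the subject sequence, so the region to paste into MUSCLE can be cut out with them. This is a rough sketch under two assumptions: coordinates are 1-based and inclusive, and sstart greater than send marks a hit on the minus strand, which is then reverse-complemented. The function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string (A, C, G, T, N in either case)."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def hit_region(subject_seq: str, sstart: int, send: int) -> str:
    """Cut the hit out of the subject using 1-based inclusive coordinates."""
    if sstart <= send:
        return subject_seq[sstart - 1:send]
    # sstart > send: treat the hit as lying on the minus strand.
    return reverse_complement(subject_seq[send - 1:sstart])


def fasta_record(name: str, seq: str, width: int = 60) -> str:
    """Format one sequence with a FASTA heading, wrapped at `width` characters."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Toy subject sequence, invented for illustration.
toy = "ATGCCGTAAGCT"
plus = hit_region(toy, 3, 8)
minus = hit_region(toy, 8, 3)
print(fasta_record("toy_plus_hit", plus), end="")
print(fasta_record("toy_minus_hit", minus), end="")
```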
=== Download [http://mrbayes.csit.fsu.edu/download.php MrBayes] ===

=== Copy alignments into simple NEXUS text file ===

Save the file with the suffix .nexus, formatted as below.<br><br>
ntax is the number of taxa (sequences aligned).<br><br>
nchar is the number of characters in each aligned sequence.<br><br>

<pre>
#NEXUS

begin data;
    dimensions ntax=4 nchar=1200;
    format datatype=dna interleave=no gap=-;
    matrix
    Vaccinium_corymbosum AGTCGCGCTAGCGCTG…ATGCTCGGTAGATCG
    Vitis_vinifera       AGTCGCGCTCGCGCTG…ATGCTCGATAGATCG
    Fragaria_vesca       AGTCGCGCTCGCGCTG…ATGCCCGATTGATCG
    Ricinus_communis     AGTCGCGCTCGCGCTG…ATGCTCGATAGATCG
    ;
end;
</pre>

(Sequences shortened for demonstration; the file is named VINST1.nexus.)

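Copying the alignment by hand makes it easy to get ntax or nchar wrong. As a sketch only, the data block can be generated from aligned FASTA text, with ntax and nchar counted from the sequences. It assumes the alignment has been saved in FASTA format; the function names and the toy alignment are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs, keeping input order."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the heading as the taxon name.
            records.append([line[1:].split()[0], ""])
        elif records:
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def to_nexus(records):
    """Build a non-interleaved NEXUS data block from aligned (name, sequence) pairs."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    nchar = lengths.pop()
    width = max(len(name) for name, _ in records)
    lines = [
        "#NEXUS",
        "",
        "begin data;",
        f"    dimensions ntax={len(records)} nchar={nchar};",
        "    format datatype=dna interleave=no gap=-;",
        "    matrix",
    ]
    lines += [f"    {name.ljust(width)}  {seq}" for name, seq in records]
    lines += ["    ;", "end;"]
    return "\n".join(lines) + "\n"


# Toy alignment, invented for illustration.
ALIGNED = """>Taxon_A
ACGT-ACG
TA
>Taxon_B
ACGTTACG
TA
>Taxon_C
AC-TTACGTA
"""

nexus = to_nexus(read_fasta(ALIGNED))
print(nexus)
```

The length check raises an error when the sequences are not all the same length, which usually means unaligned sequences were pasted in by mistake.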
=== Open MrBayes executable ===

Execute .nexus file:
<pre>
execute VINST1.nexus
</pre>
Set evolutionary model (nst=6 selects the GTR substitution model; rates=invgamma adds gamma-distributed rate variation with a proportion of invariable sites):
<pre>
lset nst=6 rates=invgamma
</pre>
Set the chain length and sampling frequency (10000 generations sampled every 10 generations gives about 1000 samples):
<pre>
mcmc ngen=10000 samplefreq=10
</pre>
Summarize the parameter values (burnin=25% of your samples):
<pre>
sump burnin=250
</pre>
Summarize and visualize the trees:
<pre>
sumt burnin=250
</pre>
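
The burnin value is a number of samples, not generations, so it has to be recomputed whenever ngen or the sampling frequency changes. A small sketch of the arithmetic, with a hypothetical function name, counting only the samples taken after the start of the run:

```python
def burnin_samples(ngen, samplefreq, fraction=0.25):
    """Number of samples to discard as burn-in (hypothetical helper)."""
    samples = ngen // samplefreq  # samples taken after the start of the run
    return int(samples * fraction)


# Settings used above: 10000 generations, one sample every 10 generations.
burnin = burnin_samples(10000, 10)
print(f"sump burnin={burnin}")
print(f"sumt burnin={burnin}")
```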

Revision as of 13:57, 24 February 2011
