Parsing Blast Results from Your Favorite Database

From GcatWiki

Latest revision as of 22:46, 23 February 2011

This tutorial demonstrates how to quickly parse out hits from a blastn search against a local BLAST database. A Python script (blastParse.py) is used to parse the data. The script was developed to identify which pieces (scaffolds/contigs) of an unfinished blueberry genome contained the most hits to chloroplast or mitochondrial DNA.

NOTE: This tutorial assumes that you have BLAST version 2.2.24 installed and have already built a local BLAST database on your computer. It is written for Macintosh users; however, all scripts and tools are either Windows compatible or have similar programs for Windows.

  1. Open Terminal and navigate into the folder containing the BLAST query sequence and the BLAST database, using the Unix commands cd (change directory) and ls (list the contents of the current directory)
    cd OR ls
  2. Run your BLAST search using the command below (remember to replace the file names with your own)
    /usr/local/ncbi/blast/bin/blastn -query querySequence.fasta -db dataBase.fasta -outfmt "7 qacc sacc evalue qstart qend sstart send" -out blast_output.txt
    What does this command do? Each word beginning with "-" is an option; the text that follows it is the value for that option.
    • -query = query file
    • -db = database to search
    • -outfmt = output format (7 = tabular with comment lines, followed by the columns to report). DO NOT CHANGE THIS UNLESS YOU KNOW HOW TO EDIT blastParse.py
    • -out = output file for blast results
  3. Download blastParse.py (http://dl.dropbox.com/u/4936834/blastparse.py)
  4. Place blastParse.py into the same folder as your blast results file
  5. In terminal, navigate into the folder containing blastParse.py and your blast results file
  6. Run blastParse.py using the command
    python blastParse.py
  7. Follow the prompts of the BLASTPARSE program
  8. Results will be saved as tab-delimited data in text files. If you would like to visualize the data, open the files in Excel to make graphs (right click > open with > Excel).
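
For readers who want to see what parsing this output format involves, here is a minimal sketch. It is not the code of blastParse.py, whose internals are not shown in this tutorial. It assumes only the column order given in the -outfmt string in step 2 and that comment lines start with "#". The function names, the e-value cutoff of 1e-5, and the choice to count hits per subject sequence are illustrative assumptions.

```python
import csv
from collections import Counter

# Column order matches the -outfmt string used in step 2.
FIELDS = ["qacc", "sacc", "evalue", "qstart", "qend", "sstart", "send"]


def parse_hits(lines):
    """Yield one dict per hit, skipping '#' comment lines and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        values = line.split("\t")
        if len(values) != len(FIELDS):
            continue  # not a hit row in the expected layout
        hit = dict(zip(FIELDS, values))
        hit["evalue"] = float(hit["evalue"])
        for key in ("qstart", "qend", "sstart", "send"):
            hit[key] = int(hit[key])
        yield hit


def count_hits_per_subject(hits, max_evalue=1e-5):
    """Count hits per subject sequence, keeping hits at or below max_evalue.

    The default cutoff is an arbitrary value chosen for illustration.
    """
    counts = Counter()
    for hit in hits:
        if hit["evalue"] <= max_evalue:
            counts[hit["sacc"]] += 1
    return counts


def write_counts(counts, path):
    """Write subject/count pairs as tab-delimited text, most hits first."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["subject", "hits"])
        for subject, n in counts.most_common():
            writer.writerow([subject, n])
```

To try it on the file produced in step 2, something like the following should work (the output file name hits_per_subject.txt is made up for this example):

    with open("blast_output.txt") as handle:
        counts = count_hits_per_subject(parse_hits(handle))
    write_counts(counts, "hits_per_subject.txt")

The resulting file is tab-delimited, so it opens in Excel the same way as the files described in step 8.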


A video of this tutorial can be found at http://megaswf.com/serve/1028117.