Difference between revisions of "Parsing Blast Results from Your Favorite Database"
From GcatWiki
Latest revision as of 22:46, 23 February 2011
This tutorial demonstrates how to quickly parse the hits from a blastn search against a local BLAST database. A Python script, blastParse.py (http://dl.dropbox.com/u/4936834/blastparse.py), is used to parse the data. The script was developed to identify which pieces (scaffolds/contigs) of an unfinished blueberry genome contained the most instances of chloroplast or mitochondrial DNA.
NOTE: This tutorial assumes that BLAST version 2.2.24 is installed and that the local BLAST database has already been built on the user's computer. It is written for Macintosh users; however, all scripts and tools are Windows compatible or have Windows equivalents.
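- Open Terminal and navigate into the folder containing the BLAST query sequence and the BLAST database. Use the Unix command cd to change folders and ls to list the contents of the current folder, for example (the folder name below is a placeholder; use your own)
cd path/to/your/blast/folder
ls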
- Run your BLAST search using the following command (remember to replace the file and database names with your own)
/usr/local/ncbi/blast/bin/blastn -query querySequence.fasta -db dataBase.fasta -outfmt "7 qacc sacc evalue qstart qend sstart send" -out blast_output.txt
- What does this command do? Each word beginning with "-" is an option. The text that follows an option is the value given to that option.
- -query = query file
- -db = database to search
- -outfmt = output format. Here, 7 selects tabular output with comment lines, and the names that follow select the columns to report. DO NOT CHANGE THIS UNLESS YOU KNOW HOW TO EDIT blastParse.py
- -out = output file for BLAST results
- Download blastParse.py (http://dl.dropbox.com/u/4936834/blastparse.py)
- Place blastParse.py into the same folder as your BLAST results file
- In Terminal, navigate into the folder containing blastParse.py and your BLAST results file
- Run blastParse.py using the command
python blastParse.py
- Follow the prompts of the blastParse.py program
- Results are saved as tab-delimited data in text files. To visualize the data, open the files in Excel and make graphs (right click > open with > Excel).
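For readers curious about what parsing this output involves, here is a minimal sketch in Python 3. It is not the code of blastParse.py, whose internals are not shown in this tutorial. It assumes the column order given by the -outfmt string above and counts hits per subject accession. The function names and the E-value cutoff of 1e-5 are illustrative assumptions only.

```python
from collections import Counter

# Column order assumed from the tutorial's -outfmt string:
# "7 qacc sacc evalue qstart qend sstart send"
FIELDS = ["qacc", "sacc", "evalue", "qstart", "qend", "sstart", "send"]


def parse_blast_tabular(lines):
    """Yield one dict per hit row, skipping '#' comment lines and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        values = line.split("\t")
        if len(values) != len(FIELDS):
            continue  # row does not match the expected seven-column layout
        hit = dict(zip(FIELDS, values))
        hit["evalue"] = float(hit["evalue"])
        for key in ("qstart", "qend", "sstart", "send"):
            hit[key] = int(hit[key])
        yield hit


def count_hits_per_subject(hits, max_evalue=1e-5):
    """Count hits per subject accession; the cutoff is an illustrative assumption."""
    counts = Counter()
    for hit in hits:
        if hit["evalue"] <= max_evalue:
            counts[hit["sacc"]] += 1
    return counts


def summarize_file(path):
    """Read a BLAST results file and return hit counts per subject accession."""
    with open(path) as handle:
        return count_hits_per_subject(parse_blast_tabular(handle))
```

Calling summarize_file("blast_output.txt") returns a Counter, and its most_common() method lists the subject accessions with the most hits first. To count by query accession instead, replace "sacc" with "qacc" in count_hits_per_subject.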
A video of this tutorial can be found at http://megaswf.com/serve/1028117.
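If the search has to be repeated for many query files, the blastn command from the steps above can also be launched from Python. The following is a minimal sketch for Python 3, not part of blastParse.py. The function names are hypothetical, and the blastn path is the one used in this tutorial, which may differ on your computer.

```python
import subprocess

# Path and output format copied from the tutorial's command; adjust the path
# if blastn is installed elsewhere on your system.
BLASTN = "/usr/local/ncbi/blast/bin/blastn"
OUTFMT = "7 qacc sacc evalue qstart qend sstart send"


def build_blastn_command(query, db, out, blastn=BLASTN):
    """Return the blastn command as an argument list (no shell quoting needed)."""
    return [blastn, "-query", query, "-db", db, "-outfmt", OUTFMT, "-out", out]


def run_blastn(query, db, out):
    """Run blastn and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_blastn_command(query, db, out), check=True)
```

For example, run_blastn("querySequence.fasta", "dataBase.fasta", "blast_output.txt") performs the same search as the command in the tutorial. Passing the arguments as a list keeps the -outfmt value together as a single argument, which is what the quotation marks do on the command line.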