Difference between revisions of "General Statistical Calculation Tool"

From GcatWiki
(Features)
(Features)
Line 1: Line 1:
 
=Features=
 
=Features=
 
===Calculate Distribution of Read Lengths===
 
===Calculate Distribution of Read Lengths===
==Calculate How Many Substrings==
+
===Calculate How Many Substrings===
 
*http://genometools.org/index.html - provides both a k-mer calculation tool and a substring tool
 
*http://genometools.org/index.html - provides both a k-mer calculation tool and a substring tool
 
*[http://adenine.biz.fh-weihenstephan.de/cgi-bin/shustring/shustring.cgi.pl Shustring] (SHortest Unique subSTRING) [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1166540/ paper]
 
*[http://adenine.biz.fh-weihenstephan.de/cgi-bin/shustring/shustring.cgi.pl Shustring] (SHortest Unique subSTRING) [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1166540/ paper]
 
*[http://adenine.biz.fh-weihenstephan.de/cgi-bin/shulen/shulen.cgi.pl Shulen] - Program for Computing the Null-Distribution of Shortest Unique Substring Lengths in DNA Sequences
 
*[http://adenine.biz.fh-weihenstephan.de/cgi-bin/shulen/shulen.cgi.pl Shulen] - Program for Computing the Null-Distribution of Shortest Unique Substring Lengths in DNA Sequences
  
==Calculate N50==
+
===Calculate N50===
 
*http://code.google.com/p/biopieces/wiki/calc_N50 - part of biopieces
 
*http://code.google.com/p/biopieces/wiki/calc_N50 - part of biopieces
 
*[http://genomics-array.blogspot.com/2011/02/calculating-n50-of-contig-assembly-file.html Perl Script] - with a step-by-step explanation of the idea
 
*[http://genomics-array.blogspot.com/2011/02/calculating-n50-of-contig-assembly-file.html Perl Script] - with a step-by-step explanation of the idea
 
*[http://seqanswers.com/forums/showthread.php?t=2857 Python code]
 
*[http://seqanswers.com/forums/showthread.php?t=2857 Python code]
  
==Calculate GC Content==
+
===Calculate GC Content===
==Calculate k-mer Distributions==
+
===Calculate k-mer Distributions===
 
*http://genometools.org/index.html - provides both a k-mer calculation tool and a substring tool
 
*http://genometools.org/index.html - provides both a k-mer calculation tool and a substring tool
 
*[http://www.cbcb.umd.edu/software/jellyfish/ Jellyfish] - more recent k-mer counting tool
 
*[http://www.cbcb.umd.edu/software/jellyfish/ Jellyfish] - more recent k-mer counting tool
==Estimate Genome Size==
+
===Estimate Genome Size===
 
*http://seqanswers.com/forums/showthread.php?t=11434
 
*http://seqanswers.com/forums/showthread.php?t=11434
 
*[http://gsizepred.sourceforge.net/ GSP] - incorporates a Bayesian framework and the EM algorithm for genome size prediction
 
*[http://gsizepred.sourceforge.net/ GSP] - incorporates a Bayesian framework and the EM algorithm for genome size prediction

Revision as of 19:32, 1 June 2011

Features

Calculate Distribution of Read Lengths
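
A minimal sketch of what this feature could compute, assuming the reads arrive as FASTA text (the page does not specify an input format). The function name read_length_distribution is hypothetical and is not taken from any tool listed here.

```python
from collections import Counter


def read_length_distribution(fasta_lines):
    """Return a Counter mapping read length -> number of reads.

    Assumes FASTA input: a line starting with '>' opens a new record,
    and a sequence may span several lines.
    """
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return Counter(lengths)


if __name__ == "__main__":
    example = [">r1", "ACGT", "AC", ">r2", "ACG", ">r3", "ACGTAC"]
    for length, count in sorted(read_length_distribution(example).items()):
        print(length, count)
```

Any iterable of lines works as input, so an open file handle can be passed directly.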

Calculate How Many Substrings

Calculate N50
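
For reference, a short sketch of the usual N50 definition: sort contig lengths from longest to shortest and report the length at which the running total first reaches half of the total assembly length. This is an illustration only, not the code from the scripts linked for this feature, and the function name n50 is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Some tools use a strict "greater than half" comparison instead; the two conventions differ only when the running total lands exactly on the halfway point.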

Calculate GC Content
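
A minimal sketch of a GC content calculation, assuming GC content means the fraction of G and C among unambiguous bases (A, C, G, T), so that N and other ambiguity codes are ignored. That handling of ambiguous bases is an assumption, and the function name gc_content is hypothetical.

```python
def gc_content(sequence):
    """Return the GC fraction (0.0 to 1.0) of a DNA sequence.

    Only A, C, G and T are counted; other symbols such as N are skipped.
    Returns 0.0 when the sequence contains no countable bases.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    counted = gc + at
    return gc / counted if counted else 0.0


if __name__ == "__main__":
    print(gc_content("ATGCNN"))
```

Multiply the result by 100 to get a percentage such as the GC% mentioned under Motivation.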

Calculate k-mer Distributions
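
A small sketch of k-mer counting and of the resulting distribution (how many distinct k-mers occur once, twice, and so on). It keeps everything in memory, so it only suits short sequences; dedicated counters such as Jellyfish exist for real data sets. The function names kmer_counts and kmer_spectrum are hypothetical.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a single sequence."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


if __name__ == "__main__":
    counts = kmer_counts("ATATAT", 2)
    print(dict(counts))
    print(dict(kmer_spectrum(counts)))
```

This sketch counts k-mers on one strand only; whether a k-mer and its reverse complement should be merged is a design choice left open here.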

Estimate Genome Size

Motivation

http://phagesdb.org/sort/ - genome size and GC% appear to correlate roughly with phage cluster