Difference between revisions of "General Statistical Calculation Tool"
From GcatWiki
(→Calculate How Many Substrings) |
|||
(12 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
=Features= | =Features= | ||
− | ==Calculate Distribution of Read Lengths== | + | ===Calculate Distribution of Read Lengths=== |
− | ==Calculate How Many Substrings== | + | ===Calculate How Many Substrings=== |
*http://genometools.org/index.html - for both k-mer calculation tool and substring tool | *http://genometools.org/index.html - for both k-mer calculation tool and substring tool | ||
*[http://adenine.biz.fh-weihenstephan.de/cgi-bin/shustring/shustring.cgi.pl Shustring] (SHortest Unique subSTRING) [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1166540/ paper] | *[http://adenine.biz.fh-weihenstephan.de/cgi-bin/shustring/shustring.cgi.pl Shustring] (SHortest Unique subSTRING) [http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1166540/ paper] | ||
+ | *[http://adenine.biz.fh-weihenstephan.de/cgi-bin/shulen/shulen.cgi.pl Shulen] - Program for Computing the Null-Distribution of Shortest Unique Substring Lengths in DNA Sequences | ||
− | ==Calculate N50== | + | ===Calculate N50=== |
− | ==Calculate GC Content== | + | *http://code.google.com/p/biopieces/wiki/calc_N50 - part of biopieces |
− | ==Calculate k-mer Distributions== | + | *[http://genomics-array.blogspot.com/2011/02/calculating-n50-of-contig-assembly-file.html Perl Script] - and step by step idea |
+ | *[http://seqanswers.com/forums/showthread.php?t=2857 Python code] | ||
+ | |||
+ | ===Calculate GC Content=== | ||
+ | ===Calculate k-mer Distributions=== | ||
*http://genometools.org/index.html - for both k-mer calculation tool and substring tool | *http://genometools.org/index.html - for both k-mer calculation tool and substring tool | ||
*[http://www.cbcb.umd.edu/software/jellyfish/ Jellyfish] - more recent k-mer counting tool | *[http://www.cbcb.umd.edu/software/jellyfish/ Jellyfish] - more recent k-mer counting tool | ||
− | ==Estimate Genome Size== | + | ===Estimate Genome Size=== |
*http://seqanswers.com/forums/showthread.php?t=11434 | *http://seqanswers.com/forums/showthread.php?t=11434 | ||
+ | *http://www.dolphing.com/?p=508 | ||
+ | *[http://gsizepred.sourceforge.net/ GSP] - incorporate the Bayesian framework and EM algorithm for the genome size prediction | ||
+ | **People skeptical of this method http://seqanswers.com/forums/showthread.php?t=10988 | ||
+ | *Other Method - http://www.cmb.usc.edu/papers/msw_papers/msw-149.pdf | ||
+ | *http://www.nature.com/nature/journal/v463/n7279/extref/nature08696-s1.pdf | ||
+ | |||
+ | *https://banana-slug.soe.ucsc.edu/bioinformatic_tools:quake | ||
+ | |||
=Motivation= | =Motivation= | ||
http://phagesdb.org/sort/ - it looks like genome size and gc% roughly correlates to the phage cluster | http://phagesdb.org/sort/ - it looks like genome size and gc% roughly correlates to the phage cluster |
Latest revision as of 18:34, 9 June 2011
Features
Calculate Distribution of Read Lengths
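No tool is listed for this feature yet. As a rough sketch only, the distribution could be tallied straight from a FASTQ file. The function names below are hypothetical, and the code assumes the simple four-lines-per-record FASTQ layout (no wrapped sequence lines).

```python
from collections import Counter


def read_lengths(fastq_lines):
    """Yield the length of each read, assuming 4 lines per FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # second line of every record holds the sequence
            yield len(line.strip())


def length_distribution(fastq_lines):
    """Map read length -> number of reads with that length."""
    return Counter(read_lengths(fastq_lines))


if __name__ == "__main__":
    demo = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGTAC", "+", "IIIIII"]
    for length, n in sorted(length_distribution(demo).items()):
        print(length, n)
```

Any iterable of lines works as input, so an open file handle can be passed in directly.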
Calculate How Many Substrings
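- http://genometools.org/index.html - provides both a k-mer calculation tool and a substring tool
- Shustring (SHortest Unique subSTRING): http://adenine.biz.fh-weihenstephan.de/cgi-bin/shustring/shustring.cgi.pl - paper: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1166540/
- Shulen: http://adenine.biz.fh-weihenstephan.de/cgi-bin/shulen/shulen.cgi.pl - program for computing the null distribution of shortest unique substring lengths in DNA sequences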
Calculate N50
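- http://code.google.com/p/biopieces/wiki/calc_N50 - part of Biopieces
- Perl script with a step-by-step explanation: http://genomics-array.blogspot.com/2011/02/calculating-n50-of-contig-assembly-file.html
- Python code: http://seqanswers.com/forums/showthread.php?t=2857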
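For orientation, a minimal sketch of the usual N50 definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length. This is not taken from any of the linked scripts, and the function name is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:  # this contig crosses the 50% mark
            return length
    raise ValueError("no contig lengths given")
```

Contig lengths would typically come from parsing a FASTA assembly file; that step is left out here.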
Calculate GC Content
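No tool is listed for this feature yet. A minimal sketch, assuming GC content is defined as (G + C) / (A + C + G + T) with ambiguous bases such as N left out of the denominator. Other conventions exist, and the function name is hypothetical.

```python
def gc_content(seq):
    """Return the GC fraction of a DNA sequence, ignoring non-ACGT symbols."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / acgt
```

Multiply the result by 100 to get GC%.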
Calculate k-mer Distributions
- http://genometools.org/index.html - provides both a k-mer calculation tool and a substring tool
- Jellyfish: http://www.cbcb.umd.edu/software/jellyfish/ - a more recent k-mer counting tool
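To illustrate what a k-mer distribution is, here is a small in-memory sketch. It is not how the tools listed above work internally; they are built for much larger inputs. The function names are hypothetical, and k-mers are counted on the forward strand only (no reverse-complement merging).

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq (forward strand only)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_spectrum(counts):
    """Map occurrence count -> number of distinct k-mers seen that often."""
    return Counter(counts.values())
```

The spectrum (how many distinct k-mers occur once, twice, and so on) is the histogram that k-mer based genome size estimates usually start from.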
Estimate Genome Size
- http://seqanswers.com/forums/showthread.php?t=11434
- http://www.dolphing.com/?p=508
- GSP: http://gsizepred.sourceforge.net/ - incorporates a Bayesian framework and the EM algorithm for genome size prediction
  - Some people are skeptical of this method: http://seqanswers.com/forums/showthread.php?t=10988
- Another method: http://www.cmb.usc.edu/papers/msw_papers/msw-149.pdf
- http://www.nature.com/nature/journal/v463/n7279/extref/nature08696-s1.pdf
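One commonly used back-of-the-envelope approach divides the total number of k-mer occurrences by the depth at the main peak of the k-mer spectrum. The sketch below shows that idea only. It is an assumption that this matches any of the methods linked above, and the function name, the `min_depth` cutoff and the input format (depth mapped to number of distinct k-mers) are all hypothetical.

```python
def estimate_genome_size(spectrum, min_depth=2):
    """Estimate genome size as total k-mer occurrences / peak depth.

    spectrum maps depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth are dropped as likely sequencing errors
    (an assumed cutoff, not a recommended value).
    """
    usable = {d: n for d, n in spectrum.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=usable.get)  # depth with the most distinct k-mers
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth
```

The estimate is sensitive to repeats, heterozygosity and the choice of error cutoff, so treat it as a rough figure.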
Motivation
http://phagesdb.org/sort/ - genome size and GC% appear to correlate roughly with phage cluster