Explaining My Project
From GcatWiki
Shotgun Sequencing
Counting Kmers to Learn About the Genome
(taken from https://banana-slug.soe.ucsc.edu/bioinformatic_tools:jellyfish)
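Bad kmer rate = number of bad multiplicity kmers / total number of all kmers
Seq error rate = bad kmer rate / kmer size (a single wrong base corrupts up to kmer-size overlapping kmers, so dividing by the kmer size gives an approximate per-base rate)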
Genome coverage = the peak of a gamma distribution fitted to the good multiplicity values for the best kmer size (usually the largest). The fitted curve is the red line in the plot; here the peak is at about 47.11x.
Genome size = total number of good multiplicity kmers (each distinct kmer counted once per occurrence) / coverage