DM Notes 2.16.16

From GcatWiki

Latest revision as of 18:33, 18 February 2016

Continuing from last class:

  • Pull the GAF (GO annotation file) for each organism from the GO database. Saipriya and I pulled the organism names: there are 443 different ones, though some are viruses, fungi, bacteria, etc. The organisms that GO has account for ~75% of the genes in our file.
  • Pull gene names for each organism from the contig file and associate those with the GO IDs in the GAF
  • Associate GO IDs with terms
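
A minimal sketch of the last two steps, assuming the standard tab-separated GAF 2.x layout (comment lines start with "!", gene symbol in column 3, GO ID in column 5). The function names, the sample line, and the small go_names lookup are made up for illustration and are not from our actual script.

```python
from collections import defaultdict


def parse_gaf_lines(lines):
    """Map each gene symbol to the set of GO IDs annotated to it.

    Assumes GAF 2.x columns: index 2 = gene symbol, index 4 = GO ID.
    """
    symbol_to_go = defaultdict(set)
    for line in lines:
        # Skip blank lines and "!" header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 5:
            continue  # malformed row, ignore it
        symbol_to_go[cols[2]].add(cols[4])
    return dict(symbol_to_go)


def attach_terms(symbol_to_go, go_names):
    """Replace GO IDs with term names; unknown IDs are kept as-is."""
    return {
        symbol: sorted(go_names.get(go_id, go_id) for go_id in go_ids)
        for symbol, go_ids in symbol_to_go.items()
    }


# Hypothetical example row (not real annotation data).
sample = [
    "!gaf-version: 2.1",
    "DB\tID0001\tGENE1\t\tGO:0005515\tREF:0\tIPI\t\tF\t\t\tprotein\ttaxon:0\t20160216\tDB\t\t",
]
ids_by_symbol = parse_gaf_lines(sample)
terms_by_symbol = attach_terms(ids_by_symbol, {"GO:0005515": "protein binding"})
```

In practice the lines would come from an open GAF file, and go_names would be built from the GO ontology file rather than typed by hand.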

Last Thursday, we pulled the organisms referenced in the python genome file. We also counted how many contigs listed each organism, and found that the ~17 most frequently listed organisms accounted for >75% of the genes in the file.
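
The counting step can be sketched roughly as follows, assuming we already have one organism name per contig in a list. How those names get extracted from the genome file is not shown here, and the function name and threshold argument are hypothetical.

```python
from collections import Counter


def top_organisms_covering(organisms, threshold=0.75):
    """Return the most frequent organisms whose combined share of
    entries first reaches the given threshold."""
    counts = Counter(organisms)
    total = sum(counts.values())
    covered = 0
    selected = []
    # most_common() yields (name, count) pairs from highest count down.
    for name, n in counts.most_common():
        selected.append(name)
        covered += n
        if covered / total >= threshold:
            break
    return selected


# Placeholder names, not our real data.
example = ["organism A"] * 6 + ["organism B"] * 3 + ["organism C"]
top = top_organisms_covering(example)
```

With the placeholder list, the first two organisms together cover 9 of 10 entries, so the loop stops there.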

Today, we are writing a script that will pull the organism and the gene symbol from each contig, and put them into a dictionary of dictionaries. It will be formatted as follows:

organism_dictionary[organism name] = {gene symbols}

gene_symbols[gene symbol] = [GO terms associated w/ gene symbol]
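
A rough sketch of that layout in Python, assuming the organism and gene symbol have already been pulled from each contig as (organism, gene symbol) pairs. The helper names and placeholder values are hypothetical.

```python
def build_organism_dictionary(records):
    """Build {organism name: {gene symbol: [GO terms]}} from
    (organism, gene_symbol) pairs. GO term lists start empty."""
    organism_dictionary = {}
    for organism, symbol in records:
        gene_symbols = organism_dictionary.setdefault(organism, {})
        gene_symbols.setdefault(symbol, [])
    return organism_dictionary


def add_go_terms(organism_dictionary, organism, symbol, terms):
    """Append GO terms to one gene symbol's list, creating entries as needed."""
    gene_symbols = organism_dictionary.setdefault(organism, {})
    gene_symbols.setdefault(symbol, []).extend(terms)


# Placeholder records, not taken from the contig file.
records = [
    ("organism A", "GENE1"),
    ("organism A", "GENE2"),
    ("organism B", "GENE1"),
]
organism_dictionary = build_organism_dictionary(records)
add_go_terms(organism_dictionary, "organism A", "GENE1", ["term X"])
```

Keying the outer dictionary by organism keeps identical gene symbols from different organisms separate.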

This didn't entirely work, so we're formulating a new approach.

Back to home Dylan Maghini