JP Feb 18 16

From GcatWiki
Revision as of 19:24, 18 February 2016 by Jupreziosi (talk | contribs)

Julia Preziosi

Drs. C & H findings:

http://www.bio.davidson.edu/courses/Bio343/2016/Thursday_18Feb_2016.txt

Clustering used Euclidean distance. The values were changed from z-scale to absolute value.

They then looked at correlation distance (1 - correlation) instead; both the clustering and the dendrogram came out different.
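A minimal sketch of why the two distance measures can group genes differently. The gene names and expression values are made up for illustration and are not from the class data; this is not the code Drs. C & H used. Euclidean distance is sensitive to magnitude, while 1 - correlation only compares the shape of the profile.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_distance(a, b):
    """1 - correlation: 0 for identical shape, up to 2 for opposite shape."""
    return 1.0 - pearson(a, b)


# Hypothetical profiles (assumption: four time points per gene).
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [10.0, 20.0, 30.0, 40.0],  # same shape as gene_a, 10x larger
    "gene_c": [4.0, 1.0, 3.0, 2.0],      # similar size to gene_a, other shape
}


def nearest(name, dist):
    """Return the other gene closest to `name` under the distance `dist`."""
    others = [g for g in profiles if g != name]
    return min(others, key=lambda g: dist(profiles[name], profiles[g]))


print("Euclidean nearest to gene_a:  ", nearest("gene_a", euclidean))
print("Correlation nearest to gene_a:", nearest("gene_a", correlation_distance))
```

With these toy values the nearest neighbor of gene_a changes with the distance measure, which is the same kind of effect that changes a dendrogram.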

They are now working on supervised clustering; the plan is to make a CSV that exports gene names based on the clustering. Given a seed gene, it should output the correlated genes.

  • Transcription factors are not usually highly transcribed (small values); we are looking for things that are transcribed after feeding.
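The seed-gene idea could be sketched roughly as below. This is only an illustration under assumptions: the gene names, expression values, the `min_r` cutoff of 0.9, and the function names are all hypothetical, and the real pipeline may work differently.

```python
import csv
import io
import math


def pearson(a, b):
    """Pearson correlation; a flat profile is treated as 0.0 (assumption)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def correlated_genes(profiles, seed, min_r=0.9):
    """Return (gene, r) pairs with r >= min_r against the seed, best first."""
    seed_profile = profiles[seed]
    hits = []
    for gene, values in profiles.items():
        if gene == seed:
            continue
        r = pearson(seed_profile, values)
        if r >= min_r:
            hits.append((gene, r))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


def write_hits_csv(hits, handle):
    """Write the gene names and correlations as CSV to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["gene", "correlation"])
    for gene, r in hits:
        writer.writerow([gene, f"{r:.3f}"])


# Hypothetical expression profiles, not real snake data.
profiles = {
    "seed_gene": [1.0, 3.0, 2.0, 5.0],
    "gene_x": [2.0, 6.0, 4.0, 10.0],  # same pattern as the seed
    "gene_y": [5.0, 2.0, 3.0, 1.0],   # opposite pattern
    "gene_z": [1.0, 3.0, 2.0, 6.0],   # close to the seed pattern
}

hits = correlated_genes(profiles, "seed_gene")
buffer = io.StringIO()  # stands in for a real file opened with newline=""
write_hits_csv(hits, buffer)
print(buffer.getvalue())
```

Note that a correlation cutoff ignores how strongly a gene is expressed, so a lowly transcribed gene can still be picked up if its pattern matches the seed.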

GAMEPLAN:

Our snake 1-6 data (Excel files) were mapped to Todd's python genome (text file) to associate sequences with gene names. If we can run Todd's "protein of unknown function" sequences through Blast2GO, we can get gene names and GO terms for these unknown proteins. Because the sequences are labeled, we can match each label (e.g. "...unknown_function_20") to the entry with the same label in our output, along with its frequencies, and find the GO terms for it.
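The label-matching step might look something like the sketch below. Everything in it is a placeholder: the labels, frequencies, gene names, and GO term strings are invented for illustration, and the dictionary layout is an assumption about how the Blast2GO results and our output would be loaded, not the actual file formats.

```python
def annotate_counts(counts, annotations):
    """Join our per-label frequencies with annotations that share the label.

    counts:      {label: frequency} from our snake output (assumed layout)
    annotations: {label: {"gene_name": str, "go_terms": [str, ...]}}
                 parsed from an annotation export (assumed layout)
    Returns (matched rows, labels with no annotation).
    """
    matched, unmatched = [], []
    for label, frequency in counts.items():
        hit = annotations.get(label)
        if hit is None:
            unmatched.append(label)
            continue
        matched.append({
            "label": label,
            "frequency": frequency,
            "gene_name": hit["gene_name"],
            "go_terms": hit["go_terms"],
        })
    return matched, unmatched


# Placeholder data only; none of these labels or terms are real results.
counts = {
    "example_label_A": 12,
    "example_label_B": 3,
}
annotations = {
    "example_label_A": {
        "gene_name": "placeholder_gene_1",
        "go_terms": ["GO:placeholder_1", "GO:placeholder_2"],
    },
}

matched, unmatched = annotate_counts(counts, annotations)
for row in matched:
    print(row["label"], row["frequency"], row["gene_name"], row["go_terms"])
print("No annotation found for:", unmatched)
```

Keeping the unmatched labels separate makes it easy to see which unknown proteins still have no gene name or GO terms after the new search.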

We believe this method will yield new results because Todd did his genome a while ago, and new information may have become available since then.