February 18, 2016
Dr. C and Dr. Heyer made cool progress. We did not use Euclidean distance; we looked at correlation, and we can also use its absolute value. We used a p-value cutoff of 0.01 instead of 0.05 so that we get a shorter list of genes. The document has the code in it, including a piece of code for finding gene names: "write.csv(colnames(carp)....". If there is a seed gene, we want the genes most correlated with that one. There are a lot of really good candidate genes! Transcription factors are not always transcribed in high quantities because you don't need a lot of them, so look for change (on vs. off), not just quantity of expression. t in the code = transpose, i.e. swapping the x and y axes (rows and columns) of the data.
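To make the seed-gene idea concrete for myself, here is a rough Python sketch (not the code from the document). It transposes a samples-by-genes table, then ranks every other gene by its Pearson correlation with a seed gene, optionally by absolute value. All gene names and numbers are made up, the function names are my own, and the p-value cutoff is not included.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists.
    Assumes neither profile is constant (otherwise the denominator is zero)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def transpose(rows):
    """Swap rows and columns of a list-of-lists table."""
    return [list(col) for col in zip(*rows)]

def rank_by_seed(expr, seed, use_abs=False):
    """Rank all genes except the seed by correlation with the seed profile.
    With use_abs=True, strongly negative correlations rank high as well."""
    scores = []
    for gene, profile in expr.items():
        if gene == seed:
            continue
        r = pearson(expr[seed], profile)
        scores.append((gene, abs(r) if use_abs else r))
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Made-up toy table: each row is a sample, each column is a gene.
gene_names = ["seed", "geneA", "geneB", "geneC"]
samples = [
    [1.0, 2.0, 9.0, 5.0],
    [2.0, 4.1, 7.0, 5.5],
    [3.0, 6.0, 5.2, 4.0],
    [4.0, 7.9, 3.0, 6.0],
]

# Transpose so that each gene gets its own list of values across samples.
expr = dict(zip(gene_names, transpose(samples)))

ranked = rank_by_seed(expr, "seed")                    # signed correlation
ranked_abs = rank_by_seed(expr, "seed", use_abs=True)  # absolute correlation
```

With signed correlation, geneB (which goes down as the seed goes up) lands at the bottom of the list. With absolute value it moves up next to geneA, which is the point of trying absolute value: an "off when the seed is on" gene is still interesting.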
Take the sequences and put them into Blast2GO: put in a formatted list of sequences, then it runs them through mapping. This is the sequence-based method. The other group is taking gene names from the file and pairing the names of genes in the gene file to gene ontology terms. This is the name-based method.
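The name-based method is basically a lookup, so here is a small Python sketch of how I picture it. This is not the other group's actual code: the function name, the gene names, and the term labels are all placeholders I made up, and a real run would load the name-to-term table from an annotation file.

```python
def annotate_by_name(gene_names, name_to_terms):
    """Pair each gene name with its ontology terms by exact name lookup.
    Returns (annotated, unmatched) so missing names are easy to follow up."""
    annotated = {}
    unmatched = []
    for name in gene_names:
        terms = name_to_terms.get(name)
        if terms:
            annotated[name] = terms
        else:
            unmatched.append(name)
    return annotated, unmatched

# Hypothetical lookup table (placeholder names and term labels, not real data).
name_to_terms = {
    "geneA": ["term_1", "term_2"],
    "geneB": ["term_3"],
}

# Hypothetical list of gene names taken from the gene file.
our_genes = ["geneA", "geneB", "geneZ"]

annotated, unmatched = annotate_by_name(our_genes, name_to_terms)
```

Keeping the unmatched names in a separate list matters here, because a name-based method silently loses any gene whose name is spelled differently in the two files.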
All 6 are good liver samples. We still need to identify/verify intestine samples 3 and 6. Look in the Excel sheet Kathryn shared. Blasting the overrepresented genes gave us genes. Look at and cite Kathryn's page. Google the names on the list of housekeeping genes and see if we can verify/decode/match the differences in names, so that we can verify the tissue samples.
Housekeeping genes in pythons: NHE3 is a sodium transporter in the intestinal membrane. Let's see if it matches up with one of our samples. NHE3 article
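For matching names on the housekeeping list against names in our files (for example NHE3), one thing to try before Googling each one is ignoring case, spaces, and punctuation. A minimal Python sketch, assuming the only differences are formatting. The function names are mine, and every name other than NHE3 is invented. It will not catch true synonyms where the same gene has a completely different symbol.

```python
import re

def normalize(name):
    """Lowercase a gene name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def match_names(our_names, reference_names):
    """Map each of our names to the reference name with the same normalized
    form, or to None when nothing on the reference list matches."""
    reference_by_key = {normalize(ref): ref for ref in reference_names}
    return {name: reference_by_key.get(normalize(name)) for name in our_names}

# Reference list: NHE3 comes from the notes; GENEX1 is an invented placeholder.
housekeeping = ["NHE3", "GENEX1"]

# Invented spellings, as they might appear in one of our files.
ours = ["nhe-3", "Gene_X 1", "unknownGene"]

matches = match_names(ours, housekeeping)
```

Anything that comes back as None still has to be checked by hand.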