Difference between revisions of "February 23, 2016"

 

Revision as of 19:51, 23 February 2016

I think we are doing RStudio. Check Katherine's 2/16 notes. Get everything up to date by Spring Break.

Use "expected_count" in the code, NOT "FPKM", or else we are re-normalizing previously normalized data.

Strategic clustering. Filter to keep only those genes whose mean expression is >10; play with this value to see how the size of toSearch changes. Nothing is special about the number 10.

 myMeans <- apply(as.matrix(myCountData), 1, mean)
 toSearch <- myCountData[myMeans > 10 & !is.na(myMeans),]
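To see how the size of toSearch changes with the cutoff, something like the sketch below might help. The count table and the list of cutoffs are made up for illustration; only myCountData, myMeans, and the filtering rule come from the notes above.

```r
# Toy stand-in for myCountData (values are invented, not real snake data).
myCountData <- data.frame(
  snake1 = c(0, 5, 12, 40, NA),
  snake2 = c(1, 8, 15, 55, NA),
  snake3 = c(0, 9, 20, 35, NA),
  row.names = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Mean expression per gene (row), as in the notes.
myMeans <- apply(as.matrix(myCountData), 1, mean)

# Hypothetical cutoffs to try; nothing is special about any of them.
cutoffs <- c(1, 5, 10, 20)

# Number of genes that would end up in toSearch at each cutoff.
nKept <- sapply(cutoffs, function(k) sum(myMeans > k & !is.na(myMeans)))
names(nKept) <- cutoffs
print(nKept)
```

Running the same loop on the real myCountData would show how quickly the gene list shrinks as the cutoff gets stricter.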

We can differentiate between fed and non-fed with all of our knowledge. We need help with the Gene Ontology search: searching for a function instead of just a transcription factor.

Keep in mind the scientific goal: find genes that are differentially expressed between fed and non-fed and try to find candidates for genes that are at the beginning of the cascade.

We can coordinate with the other organ groups and see if we all converge on the same gene from different approaches.

See where our work takes us, and then hunt down/research the genes we find.


Big thanks to Elise and Kathryn for helping me conquer RStudio and for their patience.

Look at "toSearch" to find interesting genes and make new clusters by plugging those genes into the command line.

GENES I TRIED TODAY USING SUPERVISED CLUSTERING:

(What about the correlation cluster from the Feb 18 syllabus? Correlation: just give me what is similar to this gene in fed vs. non-fed.)

394 Contig110_GNA13_Guanine_nucleotide-binding_protein_subunit_alpha-13_Homo_sapiens_2
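A rough sketch of the correlation idea ("give me what is similar to this gene") is below. It assumes Pearson correlation across samples, which may not be what the Feb 18 syllabus method uses, and the matrix, gene names, and sample names are all invented for illustration.

```r
# Toy expression matrix (made-up values): rows are genes, columns are samples.
expr <- rbind(
  seedGene = c(10, 12, 11, 30, 33, 31),
  geneA    = c(20, 24, 22, 60, 66, 62),  # same profile as seedGene, scaled by 2
  geneB    = c(30, 33, 31, 10, 12, 11),  # opposite fed/non-fed pattern
  geneC    = c(15, 14, 16, 15, 16, 14)   # roughly flat
)
colnames(expr) <- c("nonfed1", "nonfed2", "nonfed3", "fed1", "fed2", "fed3")

# Hypothetical gene of interest to search around.
seed <- "seedGene"

# Pearson correlation of every gene's profile with the seed gene's profile.
corToSeed <- apply(expr, 1, function(g) cor(g, expr[seed, ]))

# Most similar genes first, excluding the seed itself.
ranked <- sort(corToSeed[names(corToSeed) != seed], decreasing = TRUE)
print(round(ranked, 2))
```

With real data, the seed would be one of the contig names from toSearch, and the top of the ranked list would be candidates to plug into a new cluster.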

Forkhead box proteins are a family of transcription factors.

Contig77_FOXO1_Forkhead_box_protein_O1_Homo_sapiens

It seems like our clustering is sensitive to the number of reads. Snake 4 has significantly fewer reads than the other snakes, and it has appeared as an outlier in each cluster.

The heat maps didn't show anything clearly significant, so I did not include them.
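One way to check the read-depth suspicion might be to total the reads per snake and compare each total to the median. The counts below are invented (snake4 is deliberately given fewer reads), and the "half the median" flag is an arbitrary assumption, not a rule from class.

```r
# Toy expected_count matrix (made-up values); snake4 is given fewer reads on purpose.
counts <- cbind(
  snake1 = c(100, 250, 40, 610),
  snake2 = c(120, 230, 55, 590),
  snake3 = c(90, 260, 35, 640),
  snake4 = c(20, 60, 10, 150)
)
rownames(counts) <- c("geneA", "geneB", "geneC", "geneD")

# Total reads per sample (column sums).
libSize <- colSums(counts)

# Each sample's depth relative to the median sample.
relDepth <- libSize / median(libSize)

# Hypothetical flag: samples with less than half the median depth.
lowDepth <- names(relDepth)[relDepth < 0.5]
print(libSize)
print(lowDepth)
```

If the real Snake 4 is flagged this way, that would be consistent with it showing up as an outlier because of depth rather than biology.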

I think we need to set a stricter threshold?


Ashlyn's Main Page