Difference between revisions of "February 23, 2016"

From GcatWiki
Line 1: Line 1:
I think we are doing R-studio
+
== Classwork == 
Check katherine's 2/16 notes.
 
Get everything up to date by Spring Break.
 
  
"expected_count" in coding NOT "FPKM" or else we are re-normalizing previously normalized data.  
+
'''Coding Notes:''' 
 +
*Use "expected_count" in coding NOT "FPKM" or else we are re-normalizing previously normalized data.
 +
*We have been filtering to keep only those genes whose mean expression is >10; however, for strategic clustering, play with this value to see how the size of toSearch changes, because nothing is special about the number 10.
 +
'''myMeans <- apply(as.matrix(myCountData), 1, mean) 
 +
toSearch <- myCountData[myMeans > 10 & !is.na(myMeans),]'''
 +
*Look at "toSearch" to find interesting genes and make new clusters by plugging those genes into the command line.
 +
*''Big thanks to Elise and Kathryn for helping me conquer Rstudio and for their patience!''
  
Strategic clustering: we are filtering to keep only those genes whose mean expression is >10. Play with this value to see how the size of toSearch changes; nothing is special about the number 10.
+
CHECK KATHRYN'S RSTUDIO NOTES.   
myMeans <- apply(as.matrix(myCountData), 1, mean) 
 
toSearch <- myCountData[myMeans > 10 & !is.na(myMeans),]
 
  
We can differentiate between fed and non-fed with all of our knowledge. We need help with the gene ontology search: searching for function instead of just a transcription factor.
 
  
Keep in mind scientific goal: find genes that are differentially expressed between fed and non-fed and try to find candidates for genes that are at the beginning of the cascade.  
+
== Gene Search: == 
 +
We can coordinate with other organ groups and see if we all converge on the same gene from different approaches. Although we can differentiate between fed and non-fed with all of our existing knowledge, we would like help with the gene ontology search so that we can search for gene function instead of just a transcription factor.
  
We can coordinate with the other organ groups and see if we all converge on the same gene from different approaches.
+
Remember to keep in mind the '''''scientific goal:''''' find genes that are differentially expressed between fed and non-fed snakes and try to find candidates for genes that are at the beginning of the cascade.
  
See where our work takes us, and then hunt down/research the genes we find.   
+
From there, we will see where our work takes us and hunt down the genes that we find.   
  
  
Big thanks to Elise and Kathryn for helping me conquer Rstudio and for their patience
+
==== Supervised Clustering Attempts: ==== 
Look at "toSearch" to find interesting genes and make new clusters by plugging those genes into the command line.
+
*394 Contig110_GNA13_Guanine_nucleotide-binding_protein_subunit_alpha-13_Homo_sapiens_2  
 +
*Contig77_FOXO1_Forkhead_box_protein_O1_Homo_sapiens (Forkhead box proteins are a family of transcription factors)   
  
GENES I TRIED TODAY USING SUPERVISED CLUSTERING: (What about the correlation cluster from the Feb 18 syllabus? Correlation: just give me what is similar to this in fed vs. non-fed.)
+
After clustering with a few different genes, it seems as though our clustering is sensitive to the number of reads. Snake 4 has significantly fewer reads than the other snakes and has appeared as an outlier in each cluster.
394 Contig110_GNA13_Guanine_nucleotide-binding_protein_subunit_alpha-13_Homo_sapiens_2
 
  
Forkhead box proteins are a family of transcription factors.  
+
The heat maps generated from the above clustering were poorly constructed and insignificant; therefore, they are not included. (I SHOULD MAYBE STILL ADD ONE THOUGH)  
Contig77_FOXO1_Forkhead_box_protein_O1_Homo_sapiens  
 
  
It seems like our clustering is sensitive to the number of reads. Snake 4 has significantly fewer reads than the other snakes, and it has appeared as an outlier in each cluster.
 
Heat maps didn't show anything super significant so I did not include them.
 
  
I think we need to set a more strict threshold?  
+
 
 +
=== Questions to Consider: === 
 +
*Should we set a stricter threshold?
 +
 
  
  
 
[http://gcat.davidson.edu/mediawiki-1.19.1/index.php/Ashlyn Ashlyn's Main Page]
 

Revision as of 15:31, 9 March 2016

Classwork

Coding Notes:

  • Use "expected_count" in coding NOT "FPKM" or else we are re-normalizing previously normalized data.
  • We have been filtering to keep only those genes whose mean expression is >10; however, for strategic clustering, play with this value to see how the size of toSearch changes, because nothing is special about the number 10.

myMeans <- apply(as.matrix(myCountData), 1, mean)
toSearch <- myCountData[myMeans > 10 & !is.na(myMeans), ]

  • Look at "toSearch" to find interesting genes and make new clusters by plugging those genes into the command line.
  • Big thanks to Elise and Kathryn for helping me conquer Rstudio and for their patience!
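One way to "play with" the cutoff from the coding notes is to loop over several values and record how many genes remain in toSearch each time. The sketch below assumes a tiny made-up myCountData (the real table is not shown in these notes), and the list of cutoffs is arbitrary.

```r
# Toy stand-in for myCountData: rows are genes, columns are samples.
# All values are invented for illustration only.
myCountData <- data.frame(
  snake1 = c(0, 5, 12, 40, 150, NA),
  snake2 = c(1, 8, 9, 55, 140, 3),
  snake3 = c(0, 11, 15, 35, 160, 2),
  row.names = paste0("gene", 1:6)
)

# Mean expression per gene (row); a row containing NA gives NA.
myMeans <- apply(as.matrix(myCountData), 1, mean)

# Count how many genes survive the filter at each candidate cutoff.
cutoffs <- c(1, 5, 10, 50, 100)
genesKept <- sapply(cutoffs, function(cutoff) {
  toSearch <- myCountData[myMeans > cutoff & !is.na(myMeans), ]
  nrow(toSearch)
})

# Side-by-side view of cutoff versus number of genes kept.
data.frame(cutoff = cutoffs, genesKept = genesKept)
```

Plotting genesKept against cutoff on the real data may make it easier to pick a value where the size of toSearch is manageable for clustering.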

CHECK KATHRYN'S RSTUDIO NOTES.


Gene Search:

We can coordinate with other organ groups and see if we all converge on the same gene from different approaches. Although we can differentiate between fed and non-fed with all of our existing knowledge, we would like help with the gene ontology search so that we can search for gene function instead of just a transcription factor.

Remember to keep in mind the scientific goal: find genes that are differentially expressed between fed and non-fed snakes and try to find candidates for genes that are at the beginning of the cascade.

From there, we will see where our work takes us and hunt down the genes that we find.
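As a rough first pass at the scientific goal, genes can be ranked by how much their average expression differs between fed and non-fed samples. The sketch below is only an illustration: the counts, gene names, and column names are made up, and a log2 ratio of group means is not a statistical test for differential expression.

```r
# Toy counts: rows are genes, columns are samples (invented values).
counts <- rbind(
  geneA = c(10, 12, 40, 44),
  geneB = c(100, 96, 102, 98),
  geneC = c(80, 84, 20, 21)
)
colnames(counts) <- c("nonfed1", "nonfed2", "fed1", "fed2")

# TRUE for fed samples, FALSE for non-fed (assumes this naming scheme).
isFed <- grepl("^fed", colnames(counts))

# log2 ratio of group means; the pseudocount of 1 avoids division by zero.
log2FC <- log2((rowMeans(counts[, isFed]) + 1) /
               (rowMeans(counts[, !isFed]) + 1))

# Genes ordered from largest to smallest absolute change.
rankedFC <- log2FC[order(abs(log2FC), decreasing = TRUE)]
rankedFC
```

Positive values mean higher expression in fed samples and negative values mean higher expression in non-fed samples.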


Supervised Clustering Attempts:

  • 394 Contig110_GNA13_Guanine_nucleotide-binding_protein_subunit_alpha-13_Homo_sapiens_2
  • Contig77_FOXO1_Forkhead_box_protein_O1_Homo_sapiens (Forkhead box proteins are a family of transcription factors)
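One simple way to ask "what looks similar to this gene?" is to rank every gene in toSearch by its correlation with a chosen seed gene. This is a minimal sketch on invented data, not necessarily the clustering method used in class; "seedGene" is a placeholder for a real contig name such as the FOXO1 entry above.

```r
# Toy expression matrix: rows are genes, columns are samples (invented values).
toSearch <- rbind(
  seedGene = c(10, 12, 50, 55),
  geneA    = c(20, 25, 98, 110),
  geneB    = c(60, 58, 15, 12),
  geneC    = c(30, 31, 29, 30)
)
colnames(toSearch) <- c("nonfed1", "nonfed2", "fed1", "fed2")

seed <- "seedGene"  # placeholder name for the gene of interest

# Pearson correlation of every gene's profile with the seed gene's profile.
corWithSeed <- apply(toSearch, 1, function(profile) {
  cor(profile, toSearch[seed, ])
})

# Genes ranked from most to least similar, leaving out the seed itself.
ranked <- sort(corWithSeed[names(corWithSeed) != seed], decreasing = TRUE)
ranked
```

Genes near +1 rise and fall with the seed gene across samples, while genes near -1 move in the opposite direction.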

After clustering with a few different genes, it seems as though our clustering is sensitive to the number of reads. Snake 4 has significantly fewer reads than the other snakes and has appeared as an outlier in each cluster.
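To check whether read depth could explain an outlier like Snake 4, the total count per sample can be compared across snakes. The numbers below are invented, and the "half the median" rule is an arbitrary assumption for illustration only.

```r
# Toy counts: rows are genes, columns are snakes (invented values).
counts <- cbind(
  snake1 = c(500, 300, 200),
  snake2 = c(520, 310, 190),
  snake3 = c(480, 290, 210),
  snake4 = c(60, 30, 20)
)

# Total counts per sample (library size).
librarySize <- colSums(counts)

# Flag samples whose total falls below half the median library size.
lowDepth <- names(librarySize)[librarySize < 0.5 * median(librarySize)]
lowDepth
```

If the same sample is flagged here and also shows up as the outlier in every cluster, that would be consistent with the clustering being driven by read depth rather than by biology.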

The heat maps generated from the above clustering were poorly constructed and insignificant; therefore, they are not included. (I SHOULD MAYBE STILL ADD ONE THOUGH)


Questions to Consider:

  • Should we set a stricter threshold?


Ashlyn's Main Page