Difference between revisions of "JP Feb 09 16"

From GcatWiki

Revision as of 19:03, 9 February 2016

Julia Preziosi

Clustering & Correlation Coefficient:

Statistical and biological significance are not the same thing. Statistically, we see significant changes; biologically, looking at the changes in the genes themselves, we may not think they're that significant.

Genes are correlated when their expression follows a shared trend across multiple samples. For instance, if Gene 1 and Gene 2 are both upregulated in the same sample (the magnitude doesn't matter), they agree for that sample. Over multiple samples, a correlation coefficient can be calculated; it is stronger when the two genes keep rising and falling together from sample to sample. A negative correlation indicates that as one gene increases, the other decreases. The two expression profiles don't have to be parallel lines. Small fluctuations can really change a correlation, especially if a gene's values are clustered around a very tight line.

  • Do we include all three fastings and feedings if small differences can change the clustering?
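The points above about the correlation coefficient can be illustrated with a minimal sketch. It assumes the Pearson coefficient is the one meant (the notes do not name a specific coefficient), and the gene names and expression values are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression values across five samples (not real data).
gene1 = [1.0, 2.0, 3.0, 4.0, 5.0]
gene2 = [10.0, 20.0, 30.0, 40.0, 50.0]  # rises with gene1, different magnitude
gene3 = [5.0, 4.0, 3.0, 2.0, 1.0]       # falls as gene1 rises

print(pearson(gene1, gene2))  # positive correlation
print(pearson(gene1, gene3))  # negative correlation

# A gene that barely varies: one small fluctuation swings the coefficient.
flat = [1.00, 1.01, 1.02, 1.03, 1.04]
flat_noisy = [1.00, 1.01, 1.02, 1.03, 1.00]  # last value lowered by only 0.04
print(pearson(gene1, flat))
print(pearson(gene1, flat_noisy))
```

In this toy example, lowering a single value by 0.04 drops the correlation with gene1 from about 1 to roughly 0.24, which is the kind of sensitivity the question above is worried about.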

Clustering: Grouping similar genes and samples together and presenting them in an order. We need to understand the algorithms used for clustering.
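The notes do not say which clustering algorithm is used. As one possible illustration, the sketch below shows average-linkage agglomerative (hierarchical) clustering with distance defined as 1 minus the Pearson correlation. The function names, gene names, and expression values are all hypothetical, and this is not necessarily the method used in the class.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def cluster_genes(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain.

    Distance between clusters is the average of (1 - Pearson r) over all
    pairs of genes drawn from the two clusters (average linkage).
    """
    clusters = [[name] for name in profiles]

    def distance(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical expression profiles across four samples (not real data).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.2, 3.9, 2.0],
}
clusters = cluster_genes(profiles, 2)
print(clusters)
```

Genes whose profiles rise together end up in one group and genes that fall together end up in the other, regardless of the magnitude of their values.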

Reasons to cluster: explore big data sets, pull out patterns, make predictions.

Gene expression can be expressed as a ratio between fed and nonfed expression levels, which generates "gene induction / repression" values. However, on a linear scale 1/16th looks less significant than 16 fold, even though both are changes of the same size. Use a log scale (base 2) instead: a ratio of 16 becomes a value of 4, and a ratio of 1/16 becomes -4. This lets you visualize repression and induction between genes - direct and indirect relationships can be assumed (coregulation).
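The log transformation can be sketched in a few lines. The function name and the expression levels below are illustrative assumptions; only the 16-fold example comes from the notes.

```python
import math

def log2_ratio(fed, nonfed):
    """Log base 2 of the fed / nonfed expression ratio.

    Positive values mean induction, negative values mean repression,
    and 0 means no change.
    """
    return math.log2(fed / nonfed)

print(log2_ratio(16.0, 1.0))  # 16-fold induction  -> 4.0
print(log2_ratio(1.0, 16.0))  # 16-fold repression -> -4.0
print(log2_ratio(5.0, 5.0))   # no change          -> 0.0
```

On the log scale, induction and repression of the same fold size sit at equal distances from zero, so neither looks less significant than the other.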