Difference between revisions of "JP Feb 09 16"

From GcatWiki

Revision as of 19:07, 9 February 2016

Julia Preziosi

Clustering & Correlation Coefficient:

Statistically, we see significant changes. Biologically, looking at how much the genes actually change, we may not consider those changes that meaningful.

Two genes are correlated when their expression follows a consistent trend across multiple samples. For instance, if Gene 1 and Gene 2 are both upregulated in the same sample (the magnitudes don't have to match), they agree for that sample. Over multiple samples, a correlation coefficient can be calculated, and it is stronger when the genes also move together in the other samples. A negative correlation indicates that as one gene increases, the other decreases. The two expression profiles don't have to be parallel lines. Small fluctuations can substantially change a correlation, especially if a gene's values are clustered around a very tight line.

  • Do we include all three fastings and feedings if small differences can change the clustering?
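
To make the correlation points above concrete, here is a minimal sketch that computes a Pearson correlation coefficient in plain Python. The notes do not say which coefficient was used, so Pearson is an assumption, and all expression values below are made-up numbers. It shows that magnitude does not matter, that opposite trends give a negative coefficient, and that a small change in one sample can flip the result for a gene whose values sit in a very narrow range.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression values across six samples (illustrative only).
gene1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene2 = [10.0, 30.0, 50.0, 70.0, 90.0, 110.0]  # same trend, larger magnitude
gene3 = [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]         # opposite trend

# A gene whose values barely vary, before and after a 0.1 shift in one sample.
flat_a = [5.00, 5.01, 5.02, 5.03, 5.04, 5.05]
flat_b = [5.00, 5.01, 5.02, 5.03, 5.04, 4.95]

r_same = pearson(gene1, gene2)         # close to +1: magnitude is irrelevant
r_opposite = pearson(gene1, gene3)     # close to -1: negative correlation
r_flat_before = pearson(gene1, flat_a) # close to +1
r_flat_after = pearson(gene1, flat_b)  # negative after one small fluctuation

print(r_same, r_opposite, r_flat_before, r_flat_after)
```

The last pair is the relevant one for the question above: a 0.1 change in a single sample is enough to turn a near-perfect positive correlation into a negative one when the gene's values span a very narrow range.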


Clustering: Grouping similar genes and samples together and presenting them in an order. We need to understand the clustering algorithms.

Reasons to cluster: explore big data sets, pull out patterns, make predictions.

Gene expression can be expressed as a ratio between fed and nonfed expression levels, which generates "gene induction / repression" values. However, on a linear scale a ratio of 1/16 looks less significant than a ratio of 16, even though both are 16-fold changes. Use a log scale (base 2) instead: a ratio of 16 becomes a value of 4, and a ratio of 1/16 becomes -4. This lets you visualize repression and induction between genes; direct and indirect relationships can be assumed (coregulation).

  • Analysis techniques - how to pull out negative correlations.
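
As a small illustration of the fed/nonfed ratio and the log scale described above, the sketch below converts a pair of expression levels into a log2 ratio. The function name `log2_ratio` and the input values are hypothetical, chosen only for this example.

```python
import math

def log2_ratio(fed, nonfed):
    """Return log2(fed / nonfed); positive = induction, negative = repression."""
    if fed <= 0 or nonfed <= 0:
        raise ValueError("expression levels must be positive to take a log ratio")
    return math.log2(fed / nonfed)

# Hypothetical expression levels (illustrative only).
induced = log2_ratio(16.0, 1.0)    # ratio 16   -> 4.0
repressed = log2_ratio(1.0, 16.0)  # ratio 1/16 -> -4.0
unchanged = log2_ratio(3.0, 3.0)   # ratio 1    -> 0.0

print(induced, repressed, unchanged)
```

On the log scale, a 16-fold induction and a 16-fold repression sit the same distance from zero, so neither looks less significant than the other.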

Clustering by gene expression profiles: "guilt by association". Compare the expression levels of the genes across samples. Note: the pattern is less meaningful here since we are not measuring expression over time.
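
One way to picture "guilt by association" is a small agglomerative clustering over expression profiles. The sketch below is only an illustration under assumptions: the notes do not specify an algorithm, so average linkage with a distance of 1 - r (Pearson) is a choice made here, and the gene names, log2 ratios, and the `max_distance` cutoff are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def cluster_genes(profiles, max_distance):
    """Average-linkage agglomerative clustering with distance = 1 - r.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than max_distance.
    """
    clusters = [[name] for name in profiles]

    def distance(c1, c2):
        # Average pairwise (1 - r) between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical log2 fed/nonfed ratios across four samples (illustrative only).
profiles = {
    "geneA": [2.0, 1.5, 2.2, 1.8],
    "geneB": [1.0, 0.7, 1.2, 0.9],
    "geneC": [-1.0, -0.4, -1.3, -0.8],
    "geneD": [-2.0, -0.9, -2.5, -1.6],
}

clusters = cluster_genes(profiles, max_distance=0.5)
print(clusters)
```

With these made-up values, geneA and geneB end up together, as do geneC and geneD. Because the distance is 1 - r, negatively correlated genes are pushed into different clusters; pulling out negative correlations would need a separate step, such as listing gene pairs with r close to -1.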