Difference between revisions of "Feb 9"

From GcatWiki
Revision as of 19:11, 9 February 2016

Clustering: Grouping items (here, genes or samples) using an algorithm with a given set of parameters

Why cluster? To explore very large data sets, extract patterns, and make predictions from those patterns (hypothesis generation and testing)

Gene expression data:

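On a raw ratio scale, induction looks much more dramatic than repression (be sure to remember this): changes that are equivalent in fold change look very dissimilar. For example, a 4-fold induction is a ratio of 4, while a 4-fold repression is a ratio of 0.25

A log transformation "normalizes" how fold changes look, so that induction and repression of the same fold change appear equal in size and opposite in sign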
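
The effect of the log transformation can be illustrated with a small sketch. The ratios below are made-up values chosen for illustration, and the base-2 logarithm is an assumption (the notes do not say which base was used); `log2_ratio` is a hypothetical helper name.

```python
import math

def log2_ratio(ratio):
    """Return the log2 of an expression ratio (experimental / reference)."""
    return math.log2(ratio)

# Hypothetical ratios: a 4-fold induction and a 4-fold repression.
induced_ratio = 4.0
repressed_ratio = 0.25

# On the raw scale the two changes sit at very different distances
# from 1.0 (no change), so induction looks far more dramatic.
raw_induction_offset = induced_ratio - 1.0
raw_repression_offset = 1.0 - repressed_ratio

# On the log2 scale they are equal in size and opposite in sign.
log_induced = log2_ratio(induced_ratio)
log_repressed = log2_ratio(repressed_ratio)

print(raw_induction_offset, raw_repression_offset)
print(log_induced, log_repressed)
```

After the transformation, no change maps to 0, and the two 4-fold changes land the same distance on either side of it.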

Negative correlations are as informative as positive correlations

Scatter/line plots are a different way to represent the same data as a heat map

Comparing Gene Expression Profiles or Guilt by expression:

Genes with similar expression profiles may be co-regulated or may directly regulate each other

Proximity Measures:

We want to understand the relationships between genes based on their expression levels over time or across samples

Correlation, Euclidean distance (the distance formula), inner product x · y, Hamming distance, L1 distance. Dissimilarities may or may not be metrics
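
The proximity measures listed above can be sketched in plain Python. This is a minimal illustration, not the method used in class: the function names and the two example profiles are hypothetical, and the correlation is assumed to be the Pearson correlation. The Hamming distance is shown on discretized calls (up/down/none), since it counts positions that differ.

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation of two equal-length profiles (undefined if either is constant)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def euclidean_distance(x, y):
    """Square root of the summed squared differences (the distance formula)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def inner_product(x, y):
    """Sum of element-wise products; a similarity, not a distance."""
    return sum(a * b for a, b in zip(x, y))

def hamming_distance(x, y):
    """Number of positions at which two sequences differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def l1_distance(x, y):
    """Sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

# Hypothetical log-scale profiles for two genes over four time points.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [4.0, 3.0, 2.0, 1.0]

# Hypothetical discretized calls for the same two genes.
calls_a = ["up", "up", "down", "none"]
calls_b = ["up", "down", "down", "up"]

print(pearson_correlation(gene_a, gene_b))  # perfectly anti-correlated profiles
print(euclidean_distance(gene_a, gene_b))
print(inner_product(gene_a, gene_b))
print(hamming_distance(calls_a, calls_b))
print(l1_distance(gene_a, gene_b))
```

The two example profiles are far apart by Euclidean and L1 distance yet perfectly negatively correlated, which is one way a negative correlation can carry as much information as a positive one. Euclidean, L1, and Hamming distances are metrics; the inner product is a similarity rather than a distance, and a dissimilarity built as 1 minus the correlation does not satisfy the triangle inequality in general.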