DM Notes 2.04.16
What do we want out of our research? What's the perfect outcome? How do we get there?
Goal: find all of the differentially expressed genes between samples. The perfect outcome is a few genes that look like they set this digestive process in motion. What does each gene encode? Why is it the trigger? What kind of protein starts atrophy/hyperplasia?
What do we need to do with each of our twelve data sets, to evaluate it/know how to treat it downstream?
So far, we've made comparisons in R: liver vs intestine, fed vs not fed. Dustin is going to compare all fed versus all not fed organs to get common expression across all organs (essentially, fed vs not fed, regardless of tissue). What exactly is this heat map representing? Go into the DESeq documentation and read up on exactly how it calculates log fold change and how it selects which genes to represent in the heat map.
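To make "log fold change" concrete while we read the documentation, here is a minimal sketch in Python. It is not DESeq's procedure (DESeq estimates size factors and fits a negative binomial model); it only shows the naive ratio idea. The gene names, the counts, and the pseudocount of 1 are all made up for illustration.

```python
import math

def log2_fold_change(mean_fed, mean_unfed, pseudocount=1.0):
    """Naive log2 fold change of fed over not fed.

    The pseudocount is an assumption of this sketch: it keeps the ratio
    finite when a gene has zero counts in one condition.
    """
    return math.log2((mean_fed + pseudocount) / (mean_unfed + pseudocount))

# Hypothetical normalized mean counts per gene: (fed, not fed).
genes = {
    "geneA": (799.0, 99.0),    # higher in fed
    "geneB": (49.0, 199.0),    # lower in fed
    "geneC": (120.0, 120.0),   # unchanged
    "geneD": (63.0, 0.0),      # not expressed at all when not fed
}

fold_changes = {name: log2_fold_change(fed, unfed)
                for name, (fed, unfed) in genes.items()}

for name, lfc in fold_changes.items():
    print(name, round(lfc, 2))
```

A positive value means higher expression in fed, a negative value means lower, and each step of 1 is a doubling. Without the pseudocount, geneD would give a division by zero, which is one reason to check how a real tool handles genes that are absent in one treatment.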
Starting questions we have
- What tissues do we have, specifically? Validate that we sampled as expected. Are our samples comparable? Look for housekeeping genes in those tissues. Ex: serosa won't have proteins involved in amino acid uptake. So, do we have mucosa, serosa, and from proximal, distal...
- What exactly is the heat map representing? How was it calculated? How does it select genes? We gave it a p-value.
- Genes that are expressed in one treatment, but not at all in the other: do they show up on the heat map?
- From the genes in our R heat map, can we bin those by function to see if any one cellular function is more prominent?
- Why does intestine_fed_4 show up as more closely related to intestine_no_1 than it does to other fed samples?
- Can we find a transcript for a transcription factor, that might trigger other proteins?
- Can we write our own script that normalizes read numbers, then compare expression across samples?
- We can also use R to generate a heat map, but it'd be interesting to set our own parameters.
- How do we account for genetic differences between individual snakes?
- Can we sort by genes that are transcription factors, then look for differences? (Function-first approach.) Use Gene Ontology (GO).
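On the question of writing our own normalization script: a minimal sketch, assuming the simplest scheme, counts per million (CPM), which divides each gene's reads by the sample's total reads. This is a swapped-in illustration, not what DESeq does. The sample names, gene names, and read counts are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: reads * 1_000_000 / total for gene, reads in counts.items()}

# Hypothetical raw read counts. sample_no was sequenced twice as deeply.
raw = {
    "sample_fed": {"geneA": 500, "geneB": 300, "geneC": 200},
    "sample_no":  {"geneA": 100, "geneB": 600, "geneC": 1300},
}

cpm = {sample: counts_per_million(counts) for sample, counts in raw.items()}

for gene in ("geneA", "geneB", "geneC"):
    print(gene, cpm["sample_fed"][gene], cpm["sample_no"][gene])
```

geneB has twice the raw reads in sample_no, but after normalization it is identical in both samples, so the raw difference was sequencing depth only. geneA and geneC still differ after normalization, which is the kind of difference we would want to follow up on.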
From class discussion: G proteins. Transcripts aren't always there. What will trigger a cascade that makes all of these transcripts? We'd like to see some sort of transcription activator, or is the G protein initiating a cascade of existing proteins that turns on some keystone activator? We can't look at proteins, but we can look at transcripts. Where do all the proteins that appear in the plasma membrane for uptake come from? This has a huge impact on our study. Look this up. Typically a cascade = a transcription factor that turns on a whole bunch of things. Can we find a transcript for transcription factors?
Exploring correlation: correl_explore.pdf, correl_explore_scenarios.xls
Correlation: not necessarily directional. Two variables/treatments change together. In our heat maps, how do we measure correlation? R^2: how well the data fit the linear regression. The sign of the slope tells whether the relationship is positive or negative. The same slope can come from tightly clustered points or widely scattered points, so slope does not indicate how good the correlation is.
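A small sketch of the slope versus R^2 point, using an ordinary least-squares fit written out by hand. The two data sets are made up: both are built around the line y = 2x, with scatter chosen so that the fitted slope stays the same while the spread changes.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, r_squared) for a least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    return slope, r * r

xs = [1, 2, 3, 4, 5, 6]
# Scatter pattern that sums to zero and is uncorrelated with xs,
# so adding any multiple of it leaves the fitted slope unchanged.
scatter = [1, -1, 0, 0, -1, 1]

tight_ys = [2 * x + 0.1 * s for x, s in zip(xs, scatter)]
wide_ys = [2 * x + 3.0 * s for x, s in zip(xs, scatter)]

slope_tight, r2_tight = linear_fit(xs, tight_ys)
slope_wide, r2_wide = linear_fit(xs, wide_ys)

print("tight:", round(slope_tight, 3), round(r2_tight, 3))
print("wide: ", round(slope_wide, 3), round(r2_wide, 3))
```

Both fits give a slope of about 2, but R^2 is close to 1 for the tight points and much lower for the wide points. The slope gives direction and steepness; R^2 gives how well the line describes the data.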
Dylan Maghini