DM Notes 1.28.16
Reads are nearly all 72 bp. We want to know which genes they're associated with, and how many reads correspond to each gene in a particular sample.
Some challenges ahead:
How do we count the reads for each gene and compare across time points? What is the standard procedure? Samples don't all have the same number of reads, so raw counts can't be compared directly. We could normalize to a particular housekeeping gene. (2 people)
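A minimal sketch of the two normalization ideas in Python. This is not the standard procedure we still need to look up, just an illustration of why raw counts mislead. The gene names and counts are made up for the example, and both function names are hypothetical. The first function scales by total reads in the sample (counts per million); the second divides by the count of a chosen housekeeping gene.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts by the sample's total read count."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


def normalize_to_housekeeping(counts, housekeeping_gene):
    """Express each gene's count relative to one reference gene."""
    ref = counts.get(housekeeping_gene, 0)
    if ref == 0:
        raise ValueError("housekeeping gene has no reads in this sample")
    return {gene: n / ref for gene, n in counts.items()}


# Made-up counts: sample_b was sequenced twice as deeply as sample_a,
# but the relative expression of every gene is identical.
sample_a = {"geneA": 50, "geneB": 150, "housekeeping": 800}
sample_b = {"geneA": 100, "geneB": 300, "housekeeping": 1600}

for name, sample in (("sample_a", sample_a), ("sample_b", sample_b)):
    cpm = counts_per_million(sample)
    rel = normalize_to_housekeeping(sample, "housekeeping")
    print(name, "raw:", sample["geneA"], "CPM:", cpm["geneA"], "relative:", rel["geneA"])
```

In this toy case the raw geneA counts differ by a factor of two, while both normalized values agree across the samples. The housekeeping approach only works if the reference gene really is expressed at a constant level across time points, which we would need to check.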
How do we know we have the right tissue (serosa, mucosa, or cross section)? We can look for a gene that we know is expressed only in a certain tissue subset. What can we do now? We have the FastQC reports, so we can BLAST the overrepresented sequences listed in them. We don't know what those sequences are, but a BLAST could give us some hints. (8 people will work on that.)
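To avoid copying sequences by hand, here is a rough Python sketch that pulls the "Overrepresented sequences" module out of a FastQC text report (fastqc_data.txt) and prints it as FASTA, which can be pasted into the BLAST web form. The function names and the `overrep` ID prefix are our own choices, and the report text below is a made-up example in the tab-separated layout FastQC uses, not data from our samples.

```python
def parse_overrepresented(lines):
    """Return (sequence, count, possible_source) tuples from a FastQC text report."""
    records = []
    in_module = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">>Overrepresented sequences"):
            in_module = True
            continue
        if in_module and line.startswith(">>END_MODULE"):
            break
        # Skip other modules, blank lines, and the '#Sequence ...' header row.
        if not in_module or not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue
        source = fields[3] if len(fields) > 3 else ""
        records.append((fields[0], int(fields[1]), source))
    return records


def to_fasta(records, prefix="overrep"):
    """Format the records as FASTA text with numbered IDs."""
    if not records:
        return ""
    out = []
    for i, (seq, count, _source) in enumerate(records, start=1):
        out.append(f">{prefix}_{i} count={count}")
        out.append(seq)
    return "\n".join(out) + "\n"


# Made-up report fragment for illustration only.
example_report = """>>Per base sequence quality\tpass
#Base\tMean
1\t32.0
>>END_MODULE
>>Overrepresented sequences\twarn
#Sequence\tCount\tPercentage\tPossible Source
ACGTACGTACGT\t1200\t0.35\tNo Hit
TTTTGGGGCCCC\t800\t0.23\tNo Hit
>>END_MODULE
"""

records = parse_overrepresented(example_report.splitlines())
print(to_fasta(records), end="")
```

For a real report, the lines would come from opening the fastqc_data.txt file inside the FastQC output folder instead of the example string.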
Campbell has gone ahead and asked Todd Castoe for the supplementary gene names from the paper that identified ~2000 differentially expressed intestinal genes.
SRP151827 - link to NCBI repository of all of their reads. There are lots of links and tools, but we're not entirely sure how to navigate them. (2 people)
Someone needs to go through the gene names list and change "protein of unknown function" to numbered, distinct names so there aren't duplicates. Find any other duplicates, and rename and number them too. (4 people)
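If the list is long, the renaming could be scripted instead of done by hand. A minimal Python sketch, assuming the gene names are available as a simple list in their original order. The `_1`, `_2` suffix format and the function name are our own choices, and the names below are placeholders, not entries from the real list.

```python
from collections import Counter


def number_duplicates(names):
    """Append _1, _2, ... to every name that occurs more than once."""
    totals = Counter(names)   # how many times each name appears overall
    seen = Counter()          # how many times we've met it so far
    result = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            result.append(f"{name}_{seen[name]}")
        else:
            result.append(name)  # unique names are left unchanged
    return result


# Placeholder names for illustration only.
gene_names = [
    "protein of unknown function",
    "geneA",
    "protein of unknown function",
    "geneB",
    "geneA",
]

for old, new in zip(gene_names, number_duplicates(gene_names)):
    print(f"{old} -> {new}")
```

One caveat: if the list already contains a name that looks like a numbered one (for example "geneA_1"), the new names could collide with it, so the output should still be checked for duplicates afterwards.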