Difference between revisions of "Jon Lim"

From GcatWiki

Latest revision as of 14:56, 23 March 2017

Jon Lim's scratchpad



Our immediate project: determine what parent-dependent transcription changes exist (in DS and N by themselves), and whether log-fold changes between DS and N are also parent-dependent.

Our five potential genes of interest:
HLCS
HMGN1
DYRK1A
BRWD1
RUNX1


Imprinting, the inheritance of maternal or paternal methylation, will be a huge part of our investigation. Note that paper 2_parent_of_origin reveals that sometimes imprinting doesn't affect transcription at all. This paper also gives great guidelines for searching for imprinting transcription effects. Note that there is a potential for one imprint to regulate transcription in an entire 'imprint-controlled region,' and the authors here used a 4-Mb region (pg 15). While this same paper points out WRB as an imprinted gene, we won't have epigenome data, only transcriptome data, so that finding isn't relevant.


What is the critical region of HSA21? See [www.ds-health.com/trisomy.htm], which states that Robertsonian translocations between chr21 and chr14 help to narrow down the critical region. That page also has a list of genes with known input...mine references from that page.

Where did the data come from?
cDNA is synthesized from the mRNA transcriptome of each biological source and tagged with a barcode that labels the source; all reads are then pooled. Reads are demultiplexed by barcode to sort them back into samples. Trimmomatic trims low-quality read ends and removes barcodes. RSEM maps the reads to Ensembl genes and counts reads. Dustin has preprocessed the data for input to DESeq2.
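The demultiplexing step can be pictured with a minimal sketch. It assumes the barcode is the first few bases of each read and must match exactly; the barcodes, sample labels, and function name are hypothetical, and the real pipeline may read barcodes from a separate index read instead.

```python
from collections import defaultdict

def demultiplex(reads, barcode_to_sample, barcode_len):
    """Group pooled reads by sample, using the leading barcode of each read.

    The barcode is left on the read here, since the notes say trimming
    happens in a later step. Unknown barcodes go to 'undetermined'.
    """
    by_sample = defaultdict(list)
    for seq in reads:
        barcode = seq[:barcode_len]
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(seq)
    return dict(by_sample)

# Hypothetical barcodes and sample labels, for illustration only.
barcodes = {"ACGT": "sample_A", "TTAG": "sample_B"}
pooled = ["ACGTGGGCCC", "TTAGAAATTT", "ACGTCCCGGG", "NNNNACACAC"]
demuxed = demultiplex(pooled, barcodes, barcode_len=4)
```

Exact matching is the simplest choice; tools used in practice often tolerate one mismatch in the barcode.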

02_03: Run DESeq2 comparison between trisomic and disomic, same parent-of-origin. DO THIS SOON SO IT HAS TIME TO RUN.

Ts65Dn genetic background
There are two Ts65Dn strains: 001924 (WT for Pde6b), and 005252 (mut for Pde6b). Pde6b is a protein-coding MMU5 gene, and mutations cause abnormal retinal physiology. Since 005252 is mutant for Pde6b, it is usually bred to C3Sn.BLiA-Pde6b+/DnJ)F1/J (003647) males to eliminate the mutant phenotype. The usual male strain for 001924, however, is B6EiC3SnF1/J (001875).

Ours is 001924, maintained by crossing 1924 females to F1 males, progeny of B6 and C3H. The male sterile phenotype is incompletely penetrant. Each CB# is an embryonic stem cell line, and the P# refers to the passage number. The maternal disomic lines are embryonic stem cells derived from embryos of the same 1924 carrier females, without the 17^16 chromosome. All of the paternal lines are embryonic stem cells from embryos in a colony with rare fertile 1924 males. By nature, ES cell lines are XY, and the karyotypes of the cell lines have been monitored for chromosomal aberrations.


ESC transcriptome
"A Meta-analysis" (Assou et al.) -- encompasses (Richards et al), (Li et al. 2006).
Human - Great meta-analysis; compiled a list of 1076 genes overexpressed in hESC with three or more references; interestingly, only 1 gene was found by all analyses. Published a database listing genes DE in hESC: http://amazonia.transcriptome.eu/myAmaZonia.php?section=list&zone=StemCells-HESC

"Transcriptome coexpression" (Li et al 2006) --
Human - Maps chromosomal domains of human ESC and embryoid body transcriptome changes. The only HSA21 DE domain found was in embryoid bodies, not embryonic stem cells.

"The Transcriptome Profile of Human Embryonic Stem Cells as Defined by SAGE" (Richards et al.) --
Human vs mouse - Introduction references papers showing differences between human and mouse ESCs; the paper might give some new information, too.

"Transcriptome analysis of Mouse Stem Cells and Early Embryos" (Sharov et al) --
Mouse - Contains a figure giving Signature Genes for Specific Groups of Early Embryos and Stem Cells.

"Transcriptome Profiling of Human and Murine ESCs Identifies Divergent Paths Required to Maintain the Stem Cell State" (Wei et al.) --
Human vs mouse - Compared hESCs and mouse ESCs. Examining major differences and conserved similarities, the authors found only a small (core) set of genes conserved between humans and mice. Also identified were major differences in leukemia inhibitory factor, transforming growth factor-beta, and Wnt and fibroblast growth factor signaling pathways, as well as in the expression of genes encoding metabolic, cytoskeletal, and matrix proteins.


Progress

1. Paternal all vs Maternal all (see Emilie Uffman for details)

2. Sorted paternal: tri vs di by p-value
Galaxy > Filter & Sort > Sort (column 6)
Observation: Some genes have values for base mean, fold change, and Wald statistic, but p-value and adjusted p-value are NA. These are genes with sample read values detected as outliers by Cook's distance (see DESeq2 documentation, page 18: https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.pdf).
3. We excluded these genes with outliers using a Galaxy filter.
Output:
Filtering with c6!='NA', kept 69.78% of 99567 valid lines (99567 total lines).Skipped 30091 invalid line(s)
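For working outside Galaxy, the sort and filter steps can be sketched in Python. This is only an illustration: it assumes a tab-separated table with no header row and the p-value in column 6 (matching c6), and the gene names and numbers below are made up.

```python
import csv
import io

def filter_and_sort(tsv_text, pvalue_col=6):
    """Drop rows whose p-value is 'NA', then sort ascending by p-value.

    pvalue_col is 1-based, matching Galaxy's c6 notation.
    Returns the kept rows and the total number of input rows.
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    idx = pvalue_col - 1
    kept = [r for r in rows if r[idx] != "NA"]
    kept.sort(key=lambda r: float(r[idx]))
    return kept, len(rows)

# Made-up rows: gene, base mean, log2FC, std err, Wald stat, p-value, adj. p-value
example = "\n".join([
    "geneA\t100.0\t1.5\t0.3\t5.0\t0.00001\t0.001",
    "geneB\t20.0\t-0.2\t0.4\t-0.5\tNA\tNA",
    "geneC\t55.0\t0.8\t0.4\t2.0\t0.0455\t0.2",
    "geneD\t75.0\t-2.0\t0.5\t-4.0\t0.00006\t0.002",
])
kept, total = filter_and_sort(example)
fraction_kept = len(kept) / total
```

As a check on the Galaxy output above: 99567 - 30091 = 69476 lines kept, and 69476 / 99567 is 69.78%.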

NEXT TIME: Use MGI to look up gene names for easier comparison.

ALSO: Try using DESeq to compare a reduced model (expression ~ condition only) vs a full model (expression ~ condition and parental origin) to find specific effects of parental origin.
See:
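Multi-Factor_Designs: https://www.bioconductor.org/packages/devel/bioc/vignettes/DESeq/inst/doc/DESeq.pdf
comparing_specific_conditions: https://support.bioconductor.org/p/56948/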
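The reduced-vs-full comparison is a likelihood-ratio test. The sketch below shows only the arithmetic of that test for one gene, not DESeq itself: the log-likelihood values are hypothetical, and it assumes parental origin adds exactly one parameter to the model (a two-level factor with no interaction term), so the statistic is compared against a chi-square distribution with 1 degree of freedom.

```python
import math

def lrt_pvalue(loglik_full, loglik_reduced):
    """Likelihood-ratio test for one extra parameter (1 degree of freedom).

    Returns the test statistic and its p-value.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # Chi-square survival function with df=1: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical log-likelihoods for a single gene under each model.
stat, p = lrt_pvalue(loglik_full=-120.0, loglik_reduced=-123.0)
```

A small p-value here would mean that adding parental origin to the model improves the fit for that gene beyond what condition alone explains.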


Galaxy has tools for displaying gene network associations, looking for correlations with GO terms
http://bio.davidson.edu/Courses/Bio343/2017/Galaxy_2013.pdf

DESeq notes

Base mean = mean of normalized read counts across all samples (not RPKM)

log2FC = log(2) of 1st category over 2nd category

P-value (not adjusted p-value) is the most relevant value when looking at a single, pre-chosen gene; the adjusted p-value is the one to use when screening all genes at once
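A quick worked example of the log2FC definition above, using made-up mean expression values. Note that the value DESeq2 reports is a model estimate, so it will not necessarily equal this raw ratio.

```python
import math

def log2_fold_change(mean_first, mean_second):
    """log2 of the first category's mean over the second category's mean."""
    return math.log2(mean_first / mean_second)

# Made-up means: a 4x increase gives +2, a 4x decrease gives -2.
up = log2_fold_change(200.0, 50.0)
down = log2_fold_change(50.0, 200.0)
```

Swapping which category is "1st" flips the sign, so it is worth recording the order used for each comparison.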


Excel notes

left(A1, 18) - keep only the first 18 characters of A1 (truncates everything after character 18)


Vocabulary/Background for 'Trisomy 21 alters DNA Methylation in Parent-of-Origin-Dependent and Independent manners'

Goals
1. Find a high-certainty method of ascertaining parent-of-origin of HSA21
2. Explore parent-of-origin effects on gene expression
3. Evaluate parent-of-origin effects on methylation of RUNX1 (HSA21) and TMEM131 (HSA2), known to be differentially regulated in Trisomy 21.

INTRO
Since Down Syndrome is mainly caused by maternal nondisjunction during oogenesis, cases of paternal inheritance are rare and difficult to study. The authors are trying to determine what the parent-of-origin effect is.

Genomic imprinting - inherited gene silencing, mediated by DNA or histone methylation
Not sure why the authors bring up uniparental disomy (both copies of a chromosome from one parent)...shares some similarity with DS, but more likely to suffer from recessive disorders / total gene silencing than a dosage effect. Is this the androgenetic mole used as a control later?
Authors argue that imprinting may cause parent-of-origin-dependent effects of nondisjoined HSA21
CpG - possible methylation site
STR - short tandem repeats

RESULTS

First, note the methylation profile of the CGIs: CGIs 2, 3, and 5 are differentially methylated in male and female gametes.
Fig. 2 - exp. validation: no methylation for WRB CGI-1 and 3 when looking at the HhaI(+) readout. Contrast with WRB CGI-2, which was previously reported to be methylated in blood cells.
Pg 10: Looking at healthy disomic subjects, the SNP neighboring CGI-2 gives the known parental identity of the chromosome. Authors found that maternal HSA21 was consistently methylated at CGI-2, and paternal HSA21 unmethylated. This matches the known differential methylation in the gametes (imprinting?).
Fig. 3 - Looking at typical nuclear trios, the neighboring SNP gives parent-of-origin of the chromosome. The developed assay shows conclusively that methylation is 1:1 with parent-of-origin. HhaI digests unmethylated DNA, leaving behind the maternal allele. McrBC digests methylated DNA, leaving behind the paternal allele. Conclusion: imprinting on maternal chromosomes.
Fig. 6A-B: Unlike at the WRB CGI-2 DMR, RUNX1 and TMEM131 methylation changes in Trisomy 21 probands were NOT PARENT-DEPENDENT.
Pgs 14-16: no evidence that the WRB DMR imprinting affects gene expression (evidenced by biallelic mRNA reads for all neighboring SNPs that had available data).
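The Fig. 3 assay logic can be written out as a tiny sketch. It encodes only what the notes above say (HhaI cuts unmethylated DNA, McrBC cuts methylated DNA, maternal CGI-2 is methylated and paternal is not); the function and variable names are made up for illustration.

```python
def surviving_alleles(methylated_by_parent, enzyme):
    """Return which parental alleles remain intact after digestion.

    methylated_by_parent maps a parent label to True if that allele
    is methylated at the site of interest.
    """
    if enzyme == "HhaI":
        # HhaI digests unmethylated DNA, so methylated alleles survive.
        return [p for p, meth in methylated_by_parent.items() if meth]
    if enzyme == "McrBC":
        # McrBC digests methylated DNA, so unmethylated alleles survive.
        return [p for p, meth in methylated_by_parent.items() if not meth]
    raise ValueError(f"unknown enzyme: {enzyme}")

# Methylation state at WRB CGI-2 as described in the notes.
cgi2 = {"maternal": True, "paternal": False}
after_hhai = surviving_alleles(cgi2, "HhaI")
after_mcrbc = surviving_alleles(cgi2, "McrBC")
```

Genotyping the neighboring SNP on whatever survives each digest then tells you which parent contributed the methylated copy.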


Online tools

OMIM (https://www.omim.org/)
OMIM can be searched by gene name to find information on human genes and their associated phenotypes.

KEGG Disease database (http://www.genome.jp/kegg/disease/)
Search this database with protein names from OMIM to find associated diseases. The 'Pathway' field of the disease entry brings up helpful pathway diagrams.

UCSC Genome Browser (https://genome.ucsc.edu/cgi-bin/hgGateway)
Human reference genome(s). Consider using a single version for the duration of the project.

GSEA from the Broad Institute



References

Court F, Tayama C, Romanelli V, Martin-Trujillo A, Iglesias-Platas I, Okamura K, et al. Genome-wide parent-of-origin DNA methylation analysis reveals the intricacies of human imprinting and suggests a germline methylation-independent mechanism of establishment. Genome Res. 2014; 24(4):554–69. doi:10.1101/gr.164913.113. PMID: 24402520; PubMed Central PMCID: PMC3975056. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3975056/
Docherty LE, Rezwan FI, Poole RL, Jagoe H, Lake H, Lockett GA, et al. Genome-wide DNA methylation analysis of patients with imprinting disorders identifies differentially methylated regions associated with novel candidate imprinted genes. J Med Genet. 2014; 51(4):229–38. doi:10.1136/jmedgenet-2013-102116. PMID: 24501229; PubMed Central PMCID: PMC3963529. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3963529/