Difference between revisions of "Emilie Uffman"
Latest revision as of 14:46, 30 March 2017
Comparison of maternal trisomy and disomy (CB131, 126, 117, 125, 122, 121) with paternal trisomy and disomy (CB091, 089, 087, 093, 100, 103)
-the limited variation in paternal disomy and trisomy is conserved
-maternal still has variation
-there is no clear divide between sexes along PC1. What does this mean?
 
Sorting of paternal trisomy data (Emilie) and maternal trisomy data (Jon):
Used the Galaxy sort tool to sort by p-value in ascending order
Why do many rows have a base mean and all other statistics but no p-value?
ENSMUST00000000674.12 Base mean: 8.06384358972545 LogFC: 0.0460016853044759 Std Error: 0.0883691926491015 Wald-Stats: 0.520562471212569 P-value: NA P-adjusted: N
- See Jon's notes for information on question**
 
Filtered data on Galaxy to remove rows with 'NA' (skipping 0 header lines). Only 70.41% of the data was retained after filtering.
Copied data to Excel and used a shortcut to remove decimals without rounding
Copied data into the MGI database
http://www.informatics.jax.org/batch/summary#myDataTable=results%3D25%26startIndex%3D0
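The NA filter and the ascending p-value sort can also be sketched outside Galaxy. This is only an illustration in Python, not what Galaxy runs internally: the column names (`transcript`, `pvalue`) and the two `TX_EXAMPLE` rows are made up for the example, and only the first row comes from the notes above.

```python
def drop_na_rows(rows, p_col="pvalue"):
    """Keep rows whose p-value is not the literal string 'NA'.

    Returns the kept rows and the fraction of rows retained.
    """
    kept = [r for r in rows if r[p_col] != "NA"]
    fraction = len(kept) / len(rows) if rows else 0.0
    return kept, fraction


def sort_by_pvalue(rows, p_col="pvalue"):
    """Sort rows by p-value, smallest (most significant) first."""
    return sorted(rows, key=lambda r: float(r[p_col]))


# First row is the NA example from the notes; the others are hypothetical.
rows = [
    {"transcript": "ENSMUST00000000674.12", "pvalue": "NA"},
    {"transcript": "TX_EXAMPLE_A.1", "pvalue": "0.03"},
    {"transcript": "TX_EXAMPLE_B.2", "pvalue": "0.0004"},
]

kept, fraction = drop_na_rows(rows)
ranked = sort_by_pvalue(kept)
print(f"retained {fraction:.2%} of rows")
print([r["transcript"] for r in ranked])
```

The retained fraction plays the same role as the 70.41% reported by Galaxy, so it is a quick check that both routes drop the same rows.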
CENPA Gene (Centromere Protein A) located on Chromosome 5, 16.76
(ENSMUST000134372)
-up regulated
"Phenotype Overview: cardiovascular system, cellular, craniofacial, embryo, growth/size/body, mortality/aging, nervous system"
"A mutation in this gene can cause chromosomal missegregation, aneuploidy and apoptosis"
CENPA segmental haploidy of chr5 in mice is known to cause Wolf-Hirschhorn syndrome -conserved synteny to human 4p16.3
Symptoms:
-'Greek warrior helmet facial appearance'
-mental retardation
Naf D, Wilson LA, Bergstrom RA, Smith RS, Goodwin NC, Verkerk A, van Ommen GJ, Ackerman SL, Frankel WN, Schimenti JC, Mouse models for the Wolf-Hirschhorn deletion syndrome. Hum Mol Genet. 2001 Jan 15;10(2):91-8
https://www.ncbi.nlm.nih.gov/gene?cmd=Retrieve&dopt=full_report&list_uids=1058
CENP-A expression decreases in senescent cells
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4615257/
At the Crossroads of Chromosomes: Penn Study Reveals Structure of Cell Division's Key Molecule
https://www.pennmedicine.org/news/news-releases/2010/september/at-the-crossroads-of-chromoso
-Black Lab discoveries give insight into genetic inheritance
-CENP-A could be the 'key' epigenetic marker protein
-CENP-A changes the shape of the nucleosome
Going forward: want to investigate the CENP-A gene in maternal strains
ENSMUST000153398 (Brwd1)
-down regulated
-associated with Down Syndrome
-located on Chr16:95992449-96082526 bp within the DS region
-involved in chromatin remodeling
http://www.informatics.jax.org/reference/diseaseRelevantMarker/MGI:1890651
Sorted data by chromosome, focussing on chr 16 and 17
Plan: sort data to filter for chr 16 and 17 while conserving log fold change information so we can target specific genes to investigate
Methodology:
1. Filtered data on Galaxy to remove rows with 'NA' (skipping 0 header lines). Only 70.41% of the data was retained after filtering.
2. Copied data to Excel and used a shortcut to remove decimals without rounding
3. Copied data into the MGI database
http://www.informatics.jax.org/batch/summary#myDataTable=results%3D25%26startIndex%3D0
4. Used R Studio with R Script Jon created to merge the Galaxy data and data from MGI in order to be able to look at both logFC and gene name together
5. Selected genes with the most significant p-values (max E-4) to investigate first, then sorted by logFC
Key:
Several genes on Chr 16 (Gabpa, Cct8, Usp16, Sod1, 1110004E09Rik, Synj1, Gart, Son, Donson, Runx1, Paxb1, Ttc3, Dscr3, Psmg1, Zbtb21) have different transcript numbers but map to the same location. All, however, have similar logFC.
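Steps 4 and 5 (joining the Galaxy statistics to MGI gene names, then picking the most significant hits) can be sketched as below. This is not Jon's R script, just a Python illustration of the same idea. The field names (`transcript`, `input_id`, `symbol`), the toy IDs and values, and the choice to rank by absolute logFC are all assumptions.

```python
def merge_by_transcript(galaxy_rows, mgi_rows):
    """Attach the MGI gene symbol to each Galaxy row with a matching ID."""
    symbol_by_id = {m["input_id"]: m["symbol"] for m in mgi_rows}
    merged = []
    for g in galaxy_rows:
        symbol = symbol_by_id.get(g["transcript"])
        if symbol is not None:  # rows without an MGI match are dropped
            merged.append({**g, "symbol": symbol})
    return merged


def select_and_rank(merged, max_p=1e-4):
    """Keep rows with p-value <= max_p, largest |logFC| first (assumed order)."""
    hits = [r for r in merged if r["pvalue"] <= max_p]
    return sorted(hits, key=lambda r: abs(r["log2fc"]), reverse=True)


# Hypothetical toy data; IDs, symbols and numbers are not from the experiment.
galaxy_rows = [
    {"transcript": "TX1", "log2fc": 0.4, "pvalue": 1e-6},
    {"transcript": "TX2", "log2fc": -1.2, "pvalue": 5e-5},
    {"transcript": "TX3", "log2fc": 2.0, "pvalue": 0.03},
    {"transcript": "TX4", "log2fc": 3.0, "pvalue": 1e-7},
]
mgi_rows = [
    {"input_id": "TX1", "symbol": "GeneA"},
    {"input_id": "TX2", "symbol": "GeneB"},
    {"input_id": "TX3", "symbol": "GeneC"},
]

ranked = select_and_rank(merge_by_transcript(galaxy_rows, mgi_rows))
print([(r["symbol"], r["log2fc"]) for r in ranked])
```

Keeping logFC on every merged row is what lets several transcripts of the same gene be compared side by side, as in the Key above.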
Genes to investigate:
ETS_2 (https://www.ncbi.nlm.nih.gov/pubmed/2149958)
Hmgn1
Mis18a: MIS18A is required for recruitment of CENPA to centromeres and for normal chromosome segregation. If it is differentially expressed, could this affect CENPA expression?
http://string-db.org/cgi/network.pl?taskId=p1PAptAgQut1
http://www.sciencedirect.com/science/article/pii/S1097276512002286
Tiam1
http://www.informatics.jax.org/batch/summary
We are going to target genes that are triplicated and differentially expressed
Maternal: Atp5j, Cbr1, Dopey2, Pigp, Prdm15
http://www.informatics.jax.org/batch/summary
Paternal: 1110004E09Rik, App, C2cd2, Chaf1b, Dyrk1a, Ltn1, Mrpl39, N6amt1, Ripk4, Runx1, Slc5a3, Urb1, Wrb
Today we worked with Galaxy to filter the data further:
1. Compared parent-dependent trisomy and parent-dependent disomy
2. Filtered these 2 datasets to have a p-value less than 0.01
3. Merged these datasets to find entries that matched in both columns
4. Used the 'concatenate' tool to merge all
5. Looked at chr 16 (Gabpa, Chaf1b, Tiam1, Jam2, Cox17)
Redid these steps with the p-value cutoff limited to 0.001
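The filter-then-match steps above amount to intersecting two sets of significant IDs, which makes it easy to rerun at a stricter cutoff. A minimal Python sketch follows. It is one reading of the Galaxy steps rather than an export of them, and the dataset names, IDs and p-values are hypothetical.

```python
def significant_ids(rows, cutoff):
    """Return the set of IDs whose p-value is below the cutoff."""
    return {r["transcript"] for r in rows if r["pvalue"] < cutoff}


def shared_hits(dataset_a, dataset_b, cutoff=0.01):
    """IDs that pass the cutoff in both datasets, in sorted order."""
    return sorted(significant_ids(dataset_a, cutoff) & significant_ids(dataset_b, cutoff))


# Hypothetical toy data standing in for the two filtered Galaxy datasets.
trisomy = [
    {"transcript": "TX1", "pvalue": 0.0005},
    {"transcript": "TX2", "pvalue": 0.004},
    {"transcript": "TX3", "pvalue": 0.2},
]
disomy = [
    {"transcript": "TX1", "pvalue": 0.0002},
    {"transcript": "TX2", "pvalue": 0.008},
    {"transcript": "TX4", "pvalue": 0.0001},
]

print(shared_hits(trisomy, disomy, cutoff=0.01))
print(shared_hits(trisomy, disomy, cutoff=0.001))
```

Tightening the cutoff can only shrink the shared list, so the 0.001 result is always a subset of the 0.01 result.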
http://software.broadinstitute.org/gsea/register.jsp
Genes of interest after clarifying methods for p-val,0.5: Cacna1h, Son
http://www.informatics.jax.org/reference/phenotype/marker/MGI:98353?typeFilter=Literature
Genes we know to be triplicated for further investigation: Cct8, Bach1, Son, Cryzl1, Runx1, Cbr1, Dscr3, Dyrk1a
Excel tools:
Truncate numbers in Excel: =LEFT(A1, 18)
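For reference, the same truncation in Python. This assumes the formula is being used to drop the version suffix from transcript IDs such as ENSMUST00000000674.12 (the first 18 characters are the ID without ".12"). The function names are made up for this sketch.

```python
def truncate_fixed(transcript_id, width=18):
    """Mirror Excel's LEFT(A1, 18): keep the first `width` characters."""
    return transcript_id[:width]


def truncate_at_dot(transcript_id):
    """Drop everything from the first '.' onward, whatever the ID length."""
    return transcript_id.split(".", 1)[0]


example = "ENSMUST00000000674.12"  # ID taken from the notes above
print(truncate_fixed(example))
print(truncate_at_dot(example))
```

Both give the same result for an 18-character ID. Splitting at the dot does not depend on the ID length, which may be safer if IDs of other lengths turn up in the sheet.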
