Difference between revisions of "Emilie Uffman"
Revision as of 14:58, 14 February 2017
Comparison of maternal trisomy and disomy (CB131,126,117,125,122,121) with paternal trisomy and disomy (CB091,089,087,093,100,103)
-the limited variation in paternal disomy and trisomy is conserved
-maternal still has variation
-there is no clear divide between gender along PC1--what does this mean?
 
Sorting of paternal trisomy data (Emilie) and maternal trisomy data (Jon):
Use the Galaxy sort tool to sort by p-value in ascending order
Why do many rows have a base mean and all other statistics but no p-value? Example:
ENSMUST00000000674.12 Base mean: 8.06384358972545 LogFC: 0.0460016853044759 Std Error: 0.0883691926491015 Wald-Stats: 0.520562471212569 P-value: NA P-adjusted: N
- See Jon's notes for information on this question
 
Filtered data in Galaxy to remove rows containing 'NA' (skipping 0 header lines); only 70.41% of the data was retained after filtering
Copied data to Excel and used a shortcut to remove the decimals without rounding
Copied data into the MGI database
http://www.informatics.jax.org/batch/summary#myDataTable=results%3D25%26startIndex%3D0
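The NA filtering step above can be sketched in Python. This is only an illustration, not the Galaxy filter tool itself. It assumes a tab-separated export with no header line, the column order of the example row above (ID, base mean, logFC, std error, Wald stat, p-value, p-adjusted), and that any 'NA' field disqualifies a row. The IDs, values, and function names are made up.

```python
import csv
import io


def drop_na_rows(rows, na_token="NA"):
    """Keep only rows in which no field equals the NA token."""
    return [row for row in rows if na_token not in row]


def retained_percent(kept, total):
    """Percentage of rows that survived the filter."""
    return 100.0 * kept / total if total else 0.0


# Made-up rows, tab-separated, no header line (hypothetical transcript IDs).
SAMPLE = (
    "TRANSCRIPT_A.1\t8.06\t0.046\t0.088\t0.52\tNA\tNA\n"
    "TRANSCRIPT_B.3\t120.4\t1.310\t0.210\t6.24\t0.0000001\t0.00001\n"
    "TRANSCRIPT_C.2\t55.2\t-0.820\t0.190\t-4.32\t0.00002\t0.0010\n"
    "TRANSCRIPT_D.5\t31.7\t0.150\t0.120\t1.25\t0.21\t0.48\n"
)

rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t"))
kept = drop_na_rows(rows)
print(f"retained {retained_percent(len(kept), len(rows)):.2f}% of rows")
```

Reporting the retained percentage alongside the filter makes it easy to compare against the figure Galaxy reports.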
CENPA Gene (Centromere Protein A) located on Chromosome 5, 16.76
(ENSMUST000134372)
-up regulated
"Phenotype Overview: cardiovascular system, cellular, craniofacial, embryo, growth/size/body, mortality/aging, nervous system"
"A mutation in this gene can cause chromosomal missegregation, aneuploidy and apoptosis"
CENPA: segmental haploidy of chr5 in mice is known to cause Wolf-Hirschhorn syndrome; conserved synteny with human 4p16.3
Symptoms:
-'Greek warrior helmet' facial appearance
-mental retardation
Naf D, Wilson LA, Bergstrom RA, Smith RS, Goodwin NC, Verkerk A, van Ommen GJ, Ackerman SL, Frankel WN, Schimenti JC, Mouse models for the Wolf-Hirschhorn deletion syndrome. Hum Mol Genet. 2001 Jan 15;10(2):91-8
https://www.ncbi.nlm.nih.gov/gene?cmd=Retrieve&dopt=full_report&list_uids=1058
CENP-A expression decreases in senescent cells
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4615257/
At the Crossroads of Chromosomes: Penn Study Reveals Structure of Cell Division's Key Molecule
https://www.pennmedicine.org/news/news-releases/2010/september/at-the-crossroads-of-chromoso
-Black Lab discoveries give insight into genetic inheritance
-CENP-A could be the 'key' epigenetic marker protein
-CENP-A changes shape of the nucleosome
Going forward: want to investigate the CENP-A gene in maternal strains
ENSMUST000153398 (Brwd1)
-down regulated
-associated with Down Syndrome
-located on Chr16:95992449-96082526 bp within the DS region
-involved in chromatin remodeling
http://www.informatics.jax.org/reference/diseaseRelevantMarker/MGI:1890651
Sorted data by chromosome, focusing on chr 16 and 17
Plan: sort data to filter for chr 16 and 17 while conserving log fold change information so we can target specific genes to investigate
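One way this plan could look in Python, as a rough sketch only: keep the records on chr 16 and 17 and carry the logFC along so the largest changes can be inspected first. The field names (`symbol`, `chr`, `logFC`), the gene names, and the values are placeholders, not real results.

```python
def select_chromosomes(records, chromosomes=("16", "17")):
    """Return the records located on one of the requested chromosomes."""
    wanted = set(chromosomes)
    return [rec for rec in records if rec["chr"] in wanted]


# Placeholder records; gene names and logFC values are invented.
records = [
    {"symbol": "GeneA", "chr": "16", "logFC": -0.40},
    {"symbol": "GeneB", "chr": "5", "logFC": 1.10},
    {"symbol": "GeneC", "chr": "17", "logFC": 0.95},
    {"symbol": "GeneD", "chr": "16", "logFC": -1.60},
]

selected = select_chromosomes(records)
# Largest absolute log fold change first; the logFC stays attached to each gene.
selected.sort(key=lambda rec: abs(rec["logFC"]), reverse=True)

for rec in selected:
    print(rec["chr"], rec["symbol"], rec["logFC"])
```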
Methodology:
1. Filtered data in Galaxy to remove rows containing 'NA' (skipping 0 header lines); only 70.41% of the data was retained after filtering
2. Copied data to Excel and used a shortcut to remove the decimals without rounding
3. Copied data into the MGI database
http://www.informatics.jax.org/batch/summary#myDataTable=results%3D25%26startIndex%3D0
4. Used RStudio with the R script Jon created to merge the Galaxy data with the MGI data, so that logFC and gene name can be viewed together
5. Selected genes with the most significant p-value to investigate first, then sorted by logFC
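Steps 4 and 5 can be illustrated with a short Python sketch. Jon's R script is not shown here, so this is only an assumption of what such a merge could look like: join on the transcript ID, then order by p-value (smallest first) with ties broken by larger absolute logFC. All field names, IDs, gene names, and numbers are hypothetical.

```python
def merge_by_transcript(galaxy_rows, mgi_rows):
    """Attach the MGI gene symbol to each Galaxy row with a matching transcript ID."""
    symbol_by_id = {row["transcript_id"]: row["symbol"] for row in mgi_rows}
    merged = []
    for row in galaxy_rows:
        symbol = symbol_by_id.get(row["transcript_id"])
        if symbol is not None:  # rows without an MGI match are dropped
            merged.append({**row, "symbol": symbol})
    return merged


def rank(merged):
    """Most significant p-value first; ties broken by larger |logFC|."""
    return sorted(merged, key=lambda r: (r["pvalue"], -abs(r["logFC"])))


# Invented example data.
galaxy_rows = [
    {"transcript_id": "T1", "logFC": 0.5, "pvalue": 0.001},
    {"transcript_id": "T2", "logFC": -1.2, "pvalue": 0.0001},
    {"transcript_id": "T3", "logFC": -2.0, "pvalue": 0.001},
    {"transcript_id": "T4", "logFC": 3.0, "pvalue": 0.00001},
]
mgi_rows = [
    {"transcript_id": "T1", "symbol": "GeneA"},
    {"transcript_id": "T2", "symbol": "GeneB"},
    {"transcript_id": "T3", "symbol": "GeneC"},
]

ranked = rank(merge_by_transcript(galaxy_rows, mgi_rows))
for row in ranked:
    print(row["symbol"], row["pvalue"], row["logFC"])
```

Dropping unmatched rows is a design choice of this sketch; keeping them with an empty symbol would be equally reasonable if unannotated transcripts are of interest.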
Excel tools:
Truncate values in Excel: =LEFT(A1, 18) (keeps the first 18 characters as text, so ENSMUST00000000674.12 becomes ENSMUST00000000674)
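For comparison, a small Python sketch of the same truncation. `left` mirrors what the Excel formula does. `strip_version` is an alternative that cuts at the first dot instead of at a fixed width, on the assumption that the decimals being removed are the version suffix of the transcript ID. Both function names are made up for illustration.

```python
def left(text, n):
    """Return the first n characters of text, like Excel's LEFT."""
    return text[:n]


def strip_version(transcript_id):
    """Drop everything from the first dot onward (assumed to be a version suffix)."""
    return transcript_id.split(".", 1)[0]


example = "ENSMUST00000000674.12"
print(left(example, 18))
print(strip_version(example))
```

Cutting at the dot does not depend on every ID having the same length, which a fixed width of 18 does.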
