Difference between revisions of "Talk:Jenna Reed"
Revision as of 18:50, 11 March 2018
2/8/18
Bio343 HTSeq Results
Male Mutant
25 and 24 = paired reads, Male mouse 2078, C201R
15 and 14 = paired reads, Male mouse 2073, C201R
3 and 2 = paired reads, Male mouse 2079, C201R
Male WT
21 and 20 = paired reads, Male mouse 2076, WT
19 and 18 = paired reads, Male mouse 2075, WT
17 and 16 = paired reads, Male mouse 2074, WT
Female Mutant
13 and 12 = paired reads, Female mouse 2072, C201R
11 and 10 = paired reads, Female mouse 2071, C201R
9 and 8 = paired reads, Female mouse 2070, C201R
Female WT
23 and 22 = paired reads, Female mouse 2077, WT
7 and 6 = paired reads, Female mouse 2081, WT
5 and 4 = paired reads, Female mouse 2080, WT
Resources: GeneCards, GeneWeaver, GOrilla, FNTM, STRING, NCBI
2/8/18 Loaded data from "Histories Shared with Me"
NefL >>> deltaCMT 1F 2E
NefH >>> deltaALS
Sphk2 >>> kinase
Comparisons: Male v Female, MaleWT v MaleCMT, FemaleWT v FemaleCMT
Gm4210
NefM (neurofilament medium): close with NefL, and Gm2410
Calca >>> calcium regulator
About this data: took the spinal cord of the mice and sent it off for RNASeq
2-fold change
Log2 fold change values:
-0.7 = 1.62-fold change
-0.8 = 1.74-fold change
-7.56 = 128-fold change, p=10^-14 (as recorded; 2^7 = 128, while 2^7.56 is about 189)
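The values above appear to be log2 fold changes, so the fold change is 2 raised to the absolute value. A minimal sketch of that conversion (the function name `log2fc_to_fold` is made up for illustration):

```python
def log2fc_to_fold(log2fc: float) -> float:
    """Convert a log2 fold change into the size of the change (always >= 1).

    The sign only gives the direction (negative = lower in the numerator group),
    so the absolute value is used for the magnitude.
    """
    return 2 ** abs(log2fc)


# Values taken from the notes above.
for value in (-0.7, -0.8, -7.56):
    print(f"log2FC {value:+.2f} -> {log2fc_to_fold(value):.2f}-fold")
```

A log2 fold change of -1 or +1 corresponds to the 2-fold change mentioned above.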
What's our fold-change cutoff?
JAX had a strict cutoff and got only 20 genes
Lack of signal because of CMT could cause repression of transcription (because usually ligand binding induces transcription)
RNASeq data will always have differential expression from natural variation
Is it enough differential expression that it could be caused by the disease?
2/13/18
Take mut/WT, then log2
500/1000 = 0.5, and log2(0.5) = -1
Any ratio less than 1 gives a negative log2 value
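A quick check of the mut/WT example, as a sketch (the function name `log2_ratio` is hypothetical, and the counts are the round numbers from the note, not real data):

```python
import math


def log2_ratio(mut: float, wt: float) -> float:
    """Return log2(mut / wt); negative when mut is lower than WT."""
    return math.log2(mut / wt)


print(log2_ratio(500, 1000))   # ratio 0.5 -> -1.0
print(log2_ratio(1000, 500))   # ratio 2.0 ->  1.0
print(log2_ratio(1000, 1000))  # ratio 1.0 ->  0.0
```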
DESeq2 >>> run htseq-count data with features
After a DESeq2 run, two files will appear in the history
First one is tabular. Download it, open it in TextWrangler, then copy and paste it into Excel
Second one is a PDF. Download and open it to see data visualizations
54 & 55 >>> DESeq2 for "Female_MUTvWT"
56 & 57 >>> DESeq2 for "Male_MUTvWT"
58 & 59 >>> DESeq2 for "MUT_MALEvFEMALE"
60 & 61 >>> DESeq2 for "WT_MALEvFEMALE"
62 & 63 >>> DESeq2 for "MUT_v_WT"
Research method for today: look at what genes are significant in each dataset and see where there is overlap between datasets
In Excel sheet:
DESeq2 data for all 5 comparisons
P-value < 0.05 highlighted in yellow
P-adjusted < 0.05 highlighted in green
List of just the p-adjusted < 0.05 genes for all 5 comparisons together
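The same p-value and p-adjusted filtering could also be done outside Excel. This is only a sketch: it assumes the DESeq2 tabular file is tab-separated with the columns in the order listed in `COLUMNS` (check this against the actual download), and the example rows are made up, not real results.

```python
import csv
import io

# Assumed column order of the DESeq2 tabular output; verify against the real file.
COLUMNS = ["gene", "base_mean", "log2fc", "stderr", "stat", "pvalue", "padj"]


def to_float(text):
    """Parse a number, returning None for 'NA' or any other non-numeric text."""
    try:
        return float(text)
    except ValueError:
        return None


def read_deseq2(handle):
    """Read tab-separated rows into dicts, converting the two p-value columns."""
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if len(record) != len(COLUMNS):
            continue  # skip blank or malformed lines
        row = dict(zip(COLUMNS, record))
        row["pvalue"] = to_float(row["pvalue"])
        row["padj"] = to_float(row["padj"])
        rows.append(row)
    return rows


def significant_genes(rows, column="padj", cutoff=0.05):
    """Return the set of gene names whose chosen column is below the cutoff."""
    return {
        row["gene"]
        for row in rows
        if row[column] is not None and row[column] < cutoff
    }


# Made-up rows standing in for a downloaded file.
example = io.StringIO(
    "GeneA\t100.0\t-1.2\t0.2\t-6.0\t0.0001\t0.01\n"
    "GeneB\t50.0\t0.3\t0.2\t1.5\t0.04\t0.30\n"
    "GeneC\t10.0\t0.1\t0.3\t0.3\t0.70\tNA\n"
)
rows = read_deseq2(example)
print(sorted(significant_genes(rows, "pvalue")))  # the "yellow" genes
print(sorted(significant_genes(rows, "padj")))    # the "green" genes
```

For a real file, `read_deseq2(open(path, newline=""))` would replace the `io.StringIO` example.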
General observations:
Number of genes with p-adj < 0.05
Female_MUTvWT: 72
Sphk2 >>> p-adj is less than 0.05 for both MUT_MALEvFEMALE and WT_MALEvFEMALE
Sphingosine Kinase 2
Paralog of Sphk1, which codes for the other enzyme that has the same function
Encodes an enzyme that catalyzes the phosphorylation of sphingosine into sphingosine 1-phosphate
Sphingosine 1-phosphate = important in cell migration, proliferation, apoptosis
Implicated in some cancers (breast cancer proliferation, chemoresistance)
Related pathways: Calcium signaling pathway (could be connection to CMT2D), Metabolism
Tsix >>> p-adj is less than 0.05 for both MUT_MALEvFEMALE and WT_MALEvFEMALE
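Finding genes like Sphk2 and Tsix that are significant in more than one comparison comes down to set intersections. A sketch, assuming each comparison's significant genes are already in a Python set; GeneX, GeneY, and GeneZ are placeholders, not real results:

```python
from itertools import combinations


def pairwise_overlap(gene_sets):
    """Return {(name_a, name_b): shared genes} for every pair of comparisons."""
    overlaps = {}
    for (name_a, set_a), (name_b, set_b) in combinations(gene_sets.items(), 2):
        overlaps[(name_a, name_b)] = set_a & set_b
    return overlaps


# Sphk2 and Tsix come from the notes; the other names are placeholders.
significant = {
    "MUT_MALEvFEMALE": {"Sphk2", "Tsix", "GeneX"},
    "WT_MALEvFEMALE": {"Sphk2", "Tsix", "GeneY"},
    "MUT_v_WT": {"GeneZ"},
}

for pair, shared in pairwise_overlap(significant).items():
    print(pair, sorted(shared))
```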
2/15/18: GeneWeaver Lecture
Why GeneWeaver? Integrative functional genomics
Goal is to integrate the genetic investigation of humans and animal models
Data repository has a number of different types of data
Microarrays
Published data
Annotations
GWAS
QTL
Tools
Can do set-set matches
Can match your set of genes with your other set of genes or with another set of genes in the database
ODE IDs are a reference to find the consilience among associations of biomolecular entities and related concepts
http://beta.geneweaver.org/ >>> new version
Types of functional genomics data
Mapping Data
QTL Positional candidates (mouse, rat)
GWAS Candidate (from original studies)
Expression Data
DRG (Drug Related Genes) >>> from literature
ABA (Allen Brain Atlas) >>> mice
CTD (Comparative Toxicogenomics Database) >>> from 9 different species
Functional Annotations
GO (Gene Ontology Annotations) > human, mouse
MP (Mammalian Phenotype Ontology)
HP (Human phenotype ontology)
OMIM (Online Mendelian Inheritance in Man)
MeSH (Medical Subject Headings)
Pathway Data
KEGG (Kyoto Encyclopedia of Genes and Genomes)
MSigDB (Molecular Signatures Database)
PC (Pathway Commons)
How do we use GeneWeaver?
Tier III = data curated from literature (gold standard)
Tier I and II = public resource data (II is pre-processed somehow)
Search
On results, "+" gives you more basic info (description, authors, etc.) on the gene set
Clicking on gene set name will allow you to view the gene set
Gene list on bottom of gene set entry
Gene Symbol >>> can be changed to show identifier numbers for datasets
Homology shows which species there is a homologous gene in
Linkouts can take you to entry in each database
Under a search, you can select genesets, then hit "Add Selected to Project" (at top of search results) and you can add it to an existing project or create a new project
Top Toolbar: Analyze GeneSet
Can select all genesets in a project, or can expand project (+) and select specific sets
Analyzing a GeneSet
HiSim Graph
Usually default parameters are fine
On resulting graph
Can zoom in or out
Right = 4 genesets
Hover mouse over it to get a little data info
Moving right to left, you'll encounter nodes (intersection between our genesets)
Hovering over the node will show you which genes are shared between the genesets
Furthest left nodes show the most connected genes (genes that are in the most genesets)
Left clicking on a node will make it disappear so that we only show what we do care about (left clicking on it again will make it reappear)
Right clicking (or shift+left click) on the node will make a more detailed node summary page appear
Visualization
Classic = more descriptive without clicking on things
Modern = easier to see connections
Contains my projects and projects that have been shared with me
GeneSet Graph
Shows us which genes are connected to which datasets
Left to right = less connected gene sets to most connected gene sets
Color lines match color of gene set boxes
Difference between this and HiSim Graph
HiSim = emphasizes the sets and the relationship between the sets
GeneSet = highlights the genes
Different ways of visualizing the same thing
Tool Options >>> MinDegree
Allows you to limit it so that the graph only shows genes that have X number of connections
e.g. if you change the MinDegree to 4 and hit "Re-Run Tool," the graph will only show genes that are in at least 4 of the data sets
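The MinDegree idea can be sketched as counting how many sets each gene appears in and keeping the genes at or above the cutoff. This is only an illustration of the concept, not GeneWeaver's own code; the set and gene names are placeholders:

```python
from collections import Counter


def genes_with_min_degree(gene_sets, min_degree):
    """Return genes that appear in at least `min_degree` of the given sets."""
    counts = Counter(gene for genes in gene_sets.values() for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_degree}


# Placeholder gene sets for illustration.
gene_sets = {
    "Set1": {"GeneA", "GeneB"},
    "Set2": {"GeneA", "GeneC"},
    "Set3": {"GeneA", "GeneB"},
    "Set4": {"GeneA"},
}

print(sorted(genes_with_min_degree(gene_sets, 4)))  # only genes in all 4 sets
print(sorted(genes_with_min_degree(gene_sets, 2)))  # genes in 2 or more sets
```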
Jaccard Similarity Status
Top Toolbar
Manage Genesets > Manage Projects
Can view/edit all your projects
Little arrow with dots on points = share project
Can share project with a group that you're part of
Top Toolbar >>> Manage GeneSets >>> Upload GeneSet
Give set a descriptive name
Figure Label = shorter version of GeneSet Name (something that will fit on an analysis label)
Score Type: no way to upload a gene set with two different score types
Can upload gene set twice, each one with a different score type, then threshold the gene sets
Choose the access restriction; you can pick which groups to share with
Select species (Mus musculus)
Gene Identifier: can't use a mix of gene identifiers, but some databases have conversion tools
Gene List: File Upload only takes plain text
If you copy and paste from excel, it should automatically format properly
Annotations should automatically be generated when you create your set
Once you've created your gene sets, you can share them with groups and begin analysis
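For the plain-text gene list, one list per score type could be generated from the results. This sketch assumes a "symbol, tab, score" line layout, which is a guess and should be checked against the GeneWeaver upload page; the function name and the values are placeholders:

```python
def format_gene_list(scores):
    """Return one 'symbol<TAB>score' line per gene, sorted by gene symbol."""
    return "".join(f"{gene}\t{score}\n" for gene, score in sorted(scores.items()))


# Placeholder values; the same genes are written once per score type.
p_values = {"GeneB": 0.002, "GeneA": 0.001}
effect_sizes = {"GeneB": -1.1, "GeneA": -1.5}

print(format_gene_list(p_values))      # paste or save as the p-value gene set
print(format_gene_list(effect_sizes))  # paste or save as the effect-size gene set
```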
2/22/18
Final Paper: 5-10 pages, including figures
Zotero style = Cell
Figures embedded in text
Have figure legends (not in a text box)
Suggested writing order
Title > 7
Authors (you first, partner second) > 6
Abstract (200-word limit) > 5
Intro > 4
Methods > 1
Results > 2 (figures first, then writing)
Discussion > 3
References > 8 (Zotero)
GOrilla:
MUTvWT (ranked by p-value)
Nefm (neurofilament, medium polypeptide)
Nefh (neurofilament, heavy polypeptide)
Nefl (neurofilament, light polypeptide)
Calca (calcitonin/calcitonin-related polypeptide, alpha)
Slc31a1 (solute carrier family 31, member 1)
Tnnt1 (troponin T1, skeletal, slow)
Atp1a1 (ATPase, Na+/K+ transporting, alpha 1 polypeptide)
Myh7 (myosin, heavy polypeptide 7, cardiac muscle, beta)
p = 10^-7 to 10^-9
When sorted by fold change: MUTvWT
intermediate filament-based process
Prph - peripherin
Genes of Interest:
Nefm
Nefh
Nefl
If I need them (in order of use):
Calca
Atp1a1
Prph
Tnnt1
Slc31a1 & Myh7
Nefl, Nefm, & Nefh:
Intermediate filament-based process
Intermediate filament cytoskeleton organization
Neurofilament cytoskeleton organization
Intermediate filament organization
Intermediate filament bundle assembly
Neurofilament bundle assembly
Axon development
Calca
Regulation of muscle contraction (not shown on figure)
Regulation of anatomical structure size (not shown on figure)
Atp1a1
Regulation of muscle contraction