Difference between revisions of "Talk:Jenna Reed"

From GcatWiki

Revision as of 18:50, 11 March 2018

2/8/18

Bio343 HTSeq Results

Male Mutant

   25 and 24 = paired reads, Male mouse 2078, C201R
   15 and 14 = paired reads, Male mouse 2073, C201R
   3 and 2 = paired reads, Male mouse 2079, C201R

Male WT

   21 and 20 = paired reads, Male mouse 2076, WT
   19 and 18 = paired reads, Male mouse 2075, WT
   17 and 16 = paired reads, Male mouse 2074, WT

Female Mutant

   13 and 12 = paired reads, Female mouse 2072, C201R
   11 and 10 = paired reads, Female mouse 2071, C201R
   9 and 8 = paired reads, Female mouse 2070, C201R

Female WT

   23 and 22 = paired reads, Female mouse 2077, WT
   7 and 6 = paired reads, Female mouse 2081, WT
   5 and 4 = paired reads, Female mouse 2080, WT


Gene Cards, GeneWeaver, GO-rilla, FNTM, String, NCBI


2/8/18 Loaded data from "Histories Shared with Me"


   NefL >>> deltaCMT 1F 2E
   NefH >>> deltaALS
   Sphk2 >>> kinase

   Male v Female
   MaleWT v MaleCMT
   FemaleWT v FemaleCMT

   Gm4210
   NefM (neurofilament medium) >>> close with NefL, and Gm2410
   Calca >>> calcium regulator

About this data: Took the spinal cord of the mice and sent it off for RNASeq

2-fold change; log2 fold change values and the fold change they correspond to:

   -0.7 = 1.62 fold change
   -0.8 = 1.74 fold change
   -7.56 = 128-fold change, p=10^-14 (as recorded; 2^7 = 128, while 2^7.56 is about 189)
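
A minimal sketch of the conversion used in the list above, assuming the left-hand values are log2 fold changes of mutant relative to wild type (the 1.62 and 1.74 figures are consistent with that). The function names are made up for illustration.

```python
def fold_change(log2_fc: float) -> float:
    """Return the fold-change magnitude for a log2 fold change value."""
    return 2 ** abs(log2_fc)


def direction(log2_fc: float) -> str:
    """Negative log2 values mean lower expression in the numerator group."""
    if log2_fc < 0:
        return "down"
    if log2_fc > 0:
        return "up"
    return "unchanged"


if __name__ == "__main__":
    # -7.0 is included only to show which log2 value gives exactly 128-fold.
    for value in (-0.7, -0.8, -7.0, -7.56):
        print(f"{value:>6}: {fold_change(value):.2f}-fold {direction(value)}")
```

Taking the absolute value reports the size of the change; the sign is kept separately so that a negative value still reads as reduced expression.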

What's our fold-change cutoff?

   JAX had a strict cutoff and got only 20 genes
   Lack of signal because of CMT could cause repression of transcription (because usually ligand binding induces transcription)
   RNASeq data will always have differential expression from natural variation
   Is it enough differential expression that it could be caused by the disease?


2/13/18

mut/WT: 500/1000 = 0.5, and log2(0.5) = -1. Any ratio less than 1 gives a negative log2 value.

DESeq2 >>> run htseq-count data with features

   After doing a DESeq2 run, two files will appear in the history
   First one is tabular. Download that, open in Text Wrangler, then copy and paste it into Excel
   Second one is a pdf. Download and open that to see data visualizations

   54 & 55 >>> DESeq2 for "Female_MUTvWT"
   56 & 57 >>> DESeq2 for "Male_MUTvWT"
   58 & 59 >>> DESeq2 for "MUT_MALEvFEMALE"
   60 & 61 >>> DESeq2 for "WT_MALEvFEMALE"
   62 & 63 >>> DESeq2 for "MUT_v_WT"

Research method for today: look at what genes are significant in each dataset and see where there is overlap between datasets

In Excel sheet:

   DESeq2 data for all 5 comparisons
   P-value < 0.05 highlighted in yellow
   P-adjusted < 0.05 highlighted in green
   List of just the p-adjusted < 0.05 genes for all 5 comparisons together

General observations:

Number of genes with p-adj < 0.05

   Female_MUTvWT: 72
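
The overlap method described above (keep genes with p-adjusted < 0.05 in each comparison, then see which genes appear in more than one comparison) can be sketched with plain sets. This is only an illustration: the p-adjusted values and the GeneA/GeneB/GeneC names are placeholders, not the real DESeq2 output, and the function names are made up.

```python
from itertools import combinations

ALPHA = 0.05  # p-adjusted cutoff used in the notes


def significant_genes(rows, alpha=ALPHA):
    """Return the set of genes whose adjusted p-value is below alpha.

    rows is an iterable of (gene, padj); padj may be None because
    DESeq2 reports NA for some genes.
    """
    return {gene for gene, padj in rows if padj is not None and padj < alpha}


def pairwise_overlap(sig_by_comparison):
    """Map each pair of comparison names to the genes significant in both."""
    names = sorted(sig_by_comparison)
    return {
        (a, b): sig_by_comparison[a] & sig_by_comparison[b]
        for a, b in combinations(names, 2)
    }


# Placeholder rows; real rows would come from the tabular DESeq2 files.
results = {
    "MUT_MALEvFEMALE": [("Sphk2", 0.001), ("Tsix", 0.0001), ("GeneA", 0.2)],
    "WT_MALEvFEMALE": [("Sphk2", 0.01), ("Tsix", 0.0002), ("GeneB", None)],
    "MUT_v_WT": [("GeneC", 0.003), ("Sphk2", 0.6)],
}

sig = {name: significant_genes(rows) for name, rows in results.items()}
overlaps = pairwise_overlap(sig)

if __name__ == "__main__":
    for pair, genes in overlaps.items():
        print(pair, sorted(genes))
```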

Sphk2 >>> p-adj is less than 0.05 for both MUT_MALEvFEMALE and WT_MALEvFEMALE

   Sphingosine Kinase 2
   Paralog of Sphk1, which codes for the other enzyme that has the same function
   Encodes an enzyme that catalyzes the phosphorylation of sphingosine into sphingosine 1-phosphate
   Sphingosine 1-phosphate = important in cell migration, proliferation, apoptosis
   Implicated in some cancers (breast cancer proliferation, chemoresistance)
   Related pathways: Calcium signaling pathway (could be connection to CMT2D), Metabolism

Tsix >>> p-adj is less than 0.05 for both MUT_MALEvFEMALE and WT_MALEvFEMALE


2/15/18: GeneWeaver Lecture

Why GeneWeaver? Integrative functional genomics

  Goal is to integrate the genetic investigation of humans and animal models
  Data repository has a number of different types of data
       Microarrays
       Published data
       Annotations
       GWAS
       QTL
  Tools
       Can do set-set matches
       Can match your set of genes with your other set of genes or with another set of genes in the database
  ODE IDs are a reference to find the consilience among associations of biomolecular entities and related concepts
  http://beta.geneweaver.org/  	>>> new version
  Types of functional genomics data
       Mapping Data
            QTL Positional candidates (mouse, rat)
            GWAS Candidate (from original studies)
       Expression Data
            DRG (Drug Related Genes) >>> from literature
            ABA (Allen Brain Atlas) >>> mice
            CTD (Comparative Toxicogenomics Database) >>> from 9 different species
       Functional Annotations
            GO (Gene Ontology Annotations) > human, mouse
            MP (Mammalian Phenotype Ontology)
            HP (Human phenotype ontology)
            OMIM (Online Mendelian Inheritance in Man)
            MeSH (Medical Subject Headings)
       Pathway Data
            KEGG (Kyoto Encyclopedia of Genes and Genomes)
            MSigDB (Molecular Signatures Database)
            PC (Pathway Commons)

How do we use GeneWeaver?

  Tier III = data curated from literature (gold standard)
  Tier I and II = public resource data (II is pre-processed somehow)
  Search
       On results, "+" gives you more basic info (description, authors, etc.) on a gene set
       Clicking on gene set name will allow you to view the gene set
  Gene list on bottom of gene set entry
       Gene Symbol >>> can be changed to show identifier numbers for datasets
       Homology shows which species there is a homologous gene in
       Linkouts can take you to entry in each database
  Under a search, you can select genesets, then hit "Add Selected to Project" (at top of search results) and you can add it to an existing project or create a new project
  Top Toolbar: Analyze GeneSet
       Can select all genesets in a project, or can expand project (+) and select specific sets

Analyzing a GeneSet

  HiSim Graph
       Usually default parameters are fine
       On resulting graph
            Can zoom in or out
            Right = 4 genesets
            Hover mouse over it to get a little data info
            Moving right to left, you'll encounter nodes (intersection between our genesets)
            Hovering over the node will show you what genes interact between the two genesets
            Furthest left nodes show the most connected genes (genes that are in the most genesets)
            Left clicking on a node will make it disappear so that we only show what we do care about (left clicking on it again will make it reappear)
            Right clicking (or shift+left click) on the node will make a more detailed node summary page appear
            Visualization
                  Classic = more descriptive without clicking on things
                  Modern = easier to see connections
  Contains my projects and projects that have been shared with me
  GeneSet Graph
            Shows us which genes are connected to which datasets
            Left to right = less connected gene sets to most connected gene sets
            Color lines match color of gene set boxes
            Difference between this and HiSim Graph
                  HiSim = emphasizes the sets and the relationship between the sets
                  GeneSet = highlights the genes
                  Different ways of visualizing the same thing
            Tool Options >>> MinDegree
                  Allows you to limit it so that the graph only shows genes that have X number of connections
                  e.g. if you change the MinDegree to 4 and hit "Re-Run Tool," the graph will only show genes that are in at least 4 of the data sets
            Jaccard Similarity Status
  Top Toolbar
            Manage Genesets > Manage Projects
            Can view/edit all your projects
            Little arrow with dots on points = share project
            Can share project with a group that you're part of
            Top Toolbar >>> Manage GeneSets >>> Upload GeneSet
            Give set a descriptive name
            Figure Label = shorter version of GeneSet Name (something that will fit on an analysis label)
            Score Type: no way to upload a gene set with two different score types
            Can upload gene set twice, each one with a different score type, then threshold the gene sets
            Choose access restriction and which groups you want to share with
            Select species (Mus musculus)
            Gene Identifier: can't use a mix of gene identifiers, but some databases have conversion tools
            Gene List: File Upload only takes plain text
                  If you copy and paste from excel, it should automatically format properly
            Annotations should automatically be generated when you create your set
            Once you've created your gene sets, you can share them with groups and begin analysis
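
The MinDegree option and the Jaccard similarity mentioned under GeneSet Graph can be illustrated with sets. This is a rough sketch of the general ideas only: Jaccard similarity is the size of the intersection divided by the size of the union, and the MinDegree filter keeps genes that appear in at least a given number of gene sets. GeneWeaver's own implementation may differ, and the gene set names and function names below are placeholders.

```python
from collections import Counter


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A intersect B| / |A union B| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def genes_with_min_degree(gene_sets: dict, min_degree: int) -> set:
    """Return genes that appear in at least min_degree of the gene sets."""
    counts = Counter(gene for genes in gene_sets.values() for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_degree}


# Placeholder gene sets, not real GeneWeaver data.
gene_sets = {
    "set_1": {"Nefl", "Nefm", "Nefh", "Calca"},
    "set_2": {"Nefl", "Nefm", "Prph"},
    "set_3": {"Nefl", "Atp1a1"},
}

if __name__ == "__main__":
    print(jaccard(gene_sets["set_1"], gene_sets["set_2"]))
    print(sorted(genes_with_min_degree(gene_sets, 2)))
```

Raising the minimum degree shrinks the result toward the most connected genes, which matches the note that the graph then only shows genes present in at least that many data sets.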


2/22/18

Final Paper: 5-10 pages, including figures

   Zotero style = Cell
   Figures embedded in text
   Have figure legends (not in a text box)

Suggested writing order

       Title > 7
       Authors (you first, partner second) > 6
       Abstract (200-word limit) > 5
       Intro > 4
       Methods > 1
       Results > 2 (figures first, then writing)
       Discussion > 3
       References > 8 (Zotero)

GO-rilla:

       MUTvWT (ranked by p-value)
       Nefm (medium polypeptide)
       Nefh (heavy polypeptide)
       Nefl (light polypeptide)
       Calca (calcitonin/calcitonin-related polypeptide, alpha)
       Slc31a1 (solute carrier family 31, member 1)
       Tnnt1 (troponin T1, skeletal, slow)
       Atp1a1 (ATPase, Na+/K+ transporting, alpha 1 polypeptide)
       Myh7 (myosin, heavy polypeptide 7, cardiac muscle, beta)

p = 10^-7 to 10^-9

When sorted by fold change: MUTvWT

       intermediate filament-based process
       Prph - peripherin


Genes of Interest:

       Nefm
       Nefh
       Nefl

If I need them (in order of use):

       Calca
       Atp1a1
       Prph
       Tnnt1
       Slc31a1 & Myh7


Nefl, Nefm, & Nefh:

       Intermediate filament-based process
       Intermediate filament cytoskeleton organization
       Neurofilament cytoskeleton organization
       Intermediate filament organization
       Intermediate filament bundle assembly
       Neurofilament bundle assembly
       Axon development

Calca

       Regulation of muscle contraction (not shown on figure)
       Regulation of anatomical structure size (not shown on figure)

Atp1a1

       Regulation of muscle contraction