Dahlquist:Notebook/Microarray Data Analysis/2008/10/02

From OpenWetWare

Kevin's Edge Analysis from 10/2/2008

  • Log in to mason with your Keck lab username (the machine names are on the lower-left corner of the login screens).
  • Right-click on the green tabula rasa.
  • Choose Terminal.
  • Type:
cd Desktop/edge_1.1.290
R
  • At this point, the R prompt shows up. Type:
source("edge.r")
edge()
  • The Edge GUI should now appear.
  • Create two tab-delimited text files for "genes" and "covariates".
    • Files in Desktop "Data analysis 2008-10-02"
    • Used gene file "wt-dCIN5_consolidated_Edge_genes-indexonly_20080715.txt"
    • Used covariate file "wt-dCIN5_consolidated_Edge_covariates_20080710.txt"
  • Load both into an Edge session.
  • Select "Impute Missing Data" from the menu. Calculate Percent Missing Data by clicking on the button. The results are:
    • Percent of genes missing data: 7.63%
    • Percent of arrays missing data: 95.35%
    • Overall percent of missing data: 3.15%
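The three percentages above measure different things, which is why nearly every array can have a gap while only a few percent of all values are missing. A minimal sketch of how such numbers can be computed, assuming genes are rows and arrays are columns; the toy matrix and the function name percent_missing are made up for illustration and this is not Edge's own code:

```python
# Hypothetical toy matrix: rows are genes, columns are arrays; None marks a missing value.
data = [
    [0.12, None, 0.33, 0.41],
    [0.25, 0.18, 0.07, 0.52],
    [0.91, 0.64, 0.38, 0.29],
    [0.45, 0.11, 0.73, None],
]

def percent_missing(matrix):
    """Return (percent of genes, percent of arrays, overall percent) with missing data."""
    n_genes = len(matrix)
    n_arrays = len(matrix[0])
    # A gene counts as "missing data" if any of its values is missing.
    genes_missing = sum(1 for row in matrix if any(v is None for v in row))
    # An array counts as "missing data" if any gene lacks a value on it.
    arrays_missing = sum(
        1 for j in range(n_arrays) if any(row[j] is None for row in matrix)
    )
    # Overall: missing cells divided by all cells.
    cells_missing = sum(1 for row in matrix for v in row if v is None)
    return (
        100.0 * genes_missing / n_genes,
        100.0 * arrays_missing / n_arrays,
        100.0 * cells_missing / (n_genes * n_arrays),
    )

genes_pct, arrays_pct, overall_pct = percent_missing(data)
print(f"genes: {genes_pct:.2f}%  arrays: {arrays_pct:.2f}%  overall: {overall_pct:.2f}%")
```

In the toy matrix, half of the genes and half of the arrays have a gap, but only 2 of 16 values are actually missing.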
  • For KNN Parameters, set:
    • Percent of missing values to tolerate in a gene: 100 (so all genes included)
    • Number of nearest neighbors to use (maximum of 15): 15
    • Click GO to impute the missing data.
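For intuition about what the imputation step does, here is a rough sketch of K-nearest-neighbor imputation: each missing value is replaced by the average of that array's values in the k most similar genes. This is an assumption-laden illustration, not Edge's implementation; the function knn_impute, the distance measure, and the toy data are hypothetical, and the "percent of missing values to tolerate" filter is left out.

```python
import math

def knn_impute(matrix, k=15):
    """Fill None entries with the mean of that column over the k nearest genes."""
    imputed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        missing_cols = [j for j, v in enumerate(row) if v is None]
        if not missing_cols:
            continue
        # Distance to every other gene, using only columns observed in both genes.
        neighbors = []
        for m, other in enumerate(matrix):
            if m == i:
                continue
            shared = [(a, b) for a, b in zip(row, other)
                      if a is not None and b is not None]
            if not shared:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
            neighbors.append((dist, m))
        neighbors.sort()
        for j in missing_cols:
            # Use the k closest genes that have an observed value in column j.
            values = [matrix[m][j] for _, m in neighbors
                      if matrix[m][j] is not None][:k]
            if values:
                imputed[i][j] = sum(values) / len(values)
    return imputed

# Hypothetical toy data: the first gene is missing its third value.
toy = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(toy, k=2)
print(filled[0])
```

With k=2 the two closest genes (the second and third rows) supply the value, so the distant fourth gene does not influence the result.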
  • Select "Identify Differentially Expressed Genes"
    • Note: this is to compare between the wt and dCIN5 strains. Different parameters and gene/covariate files will need to be used to analyze individual strains.
    • Class Variable is: Strain
    • Differential Expression Type is: Time Course
    • Number of null iterations: set to 1000
    • Choose a seed for reproducible results: set to 47
    • Choose Time Course Settings
    • Covariate giving time points is: Timepoint
    • Covariate corresponding to individuals is: Flask
    • Choose spline type: accepted the default, Natural Cubic Spline with dimension 4
    • Click "Apply" and then click "Go"
    • The 1000 permutations look like they will take about 10 minutes.
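The seed and the number of null iterations matter because the null distribution is built by random resampling: the same seed reproduces the same result, and more iterations give a finer-grained p-value. The sketch below illustrates only that idea with a simple two-group permutation test. It is not the spline-based time course statistic that Edge computes, and the function name and the numbers are invented for the example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_iter=1000, seed=47):
    """Permutation p-value for the absolute difference in group means."""
    rng = random.Random(seed)  # a fixed seed makes the resampling reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # randomly reassign the group labels
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (hits + 1) / (n_iter + 1)

# Made-up expression values for two groups (not real data).
group_one = [0.10, 0.25, 0.18, 0.30]
group_two = [0.95, 1.10, 0.88, 1.02]

p_first = permutation_p_value(group_one, group_two, n_iter=1000, seed=47)
p_second = permutation_p_value(group_one, group_two, n_iter=1000, seed=47)
print(p_first, p_first == p_second)
```

Running the function twice with seed 47 gives identical p-values; changing the seed can shift the value slightly.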
  • Save results:
    • Set the Q-value cutoff to 1 and recalculate, so that the total list of genes is returned.
    • Saved total list of genes as: "20081002_wt_dCIN5_comparison_results_genelist" in "Data analysis 2008-10-02"
    • Edge can also cluster the significant genes; this was not done.
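Setting the cutoff to 1 works as a "keep everything" setting because a q-value can never exceed 1. A small sketch of that filtering logic, assuming a gene is kept when its q-value is less than or equal to the cutoff (whether Edge uses a strict or non-strict comparison is not stated here); the gene names and q-values are placeholders:

```python
# Placeholder results: (gene name, q-value) pairs, not taken from the real analysis.
results = [("geneA", 0.003), ("geneB", 0.04), ("geneC", 0.27), ("geneD", 0.98)]

def genes_below_cutoff(results, cutoff):
    """Return the names of genes whose q-value is at or below the cutoff."""
    return [gene for gene, q in results if q <= cutoff]

all_genes = genes_below_cutoff(results, cutoff=1)      # cutoff of 1 keeps every gene
significant = genes_below_cutoff(results, cutoff=0.05)  # a stricter, commonly used cutoff
print(len(all_genes), significant)
```

Saving the full list this way leaves the choice of a significance threshold for later, since the saved file can be filtered at any cutoff afterwards.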