Dahlquist:Microarray Data Analysis Workflow

From OpenWetWare
=== Running the Normalization Scripts ===
* Create a folder on your Desktop to store your files for the microarray analysis procedure.
* Download the [https://lionshare.lmu.edu/Users/kdahlqui/SURP%202015/wt-dCIN5-dGLN3-dHAP1-dHMO1-dSWI4-dZAP1-Spar_gpr-files.zip zipped file] that contains the <code>.gpr</code> files and move it to this folder.
** Unzip this file using 7-Zip. Right-click on the file and select the menu item "7-Zip > Extract Here".
* Download the [https://lionshare.lmu.edu/Users/kdahlqui/SURP%202015/GCAT_Targets.csv GCAT_Targets.csv] and [https://lionshare.lmu.edu/Users/kdahlqui/SURP%202015/Ontario_Targets_wt-dCIN5-dGLN3-dHAP4-dHMO1-dSWI4-dZAP1-Spar_20150514.csv Ontario_Targets_wt-dCIN5-dGLN3-dHAP4-dHMO1-dSWI4-dZAP1-Spar_20150514.csv] files and move them to this folder.



Revision as of 14:18, 15 May 2015



This is the most current version of the data analysis protocol for the Dahlquist Lab microarray data. We will perform this analysis as a group during Week 1 of SURP 2015.

Summary of steps for microarray data analysis

  1. Quantitate the fluorescence signal in each spot (GenePix Pro)
  2. Calculate the ratio of red/green fluorescence (GenePix Pro)
  3. Log transform the ratios (GenePix Pro)
  4. Normalize the ratios on each microarray slide (within-chip normalization)
  5. Normalize the ratios for a set of slides in an experiment (between-chip normalization)
  6. Perform statistical analysis on the ratios
    • Within-strain ANOVA
    • Modified t test for each timepoint
    • Between-strain ANOVA
    • Benjamini & Hochberg and Bonferroni p value corrections for the above three tests
    • "Sanity Check" on above three tests
  7. Pattern finding algorithms (clustering with STEM)
  8. Gene Ontology term enrichment analysis (on clusters with STEM or on gene sets with MAPPFinder)
  9. Pathway analysis (GenMAPP)
  10. Determining candidate transcription factors and gene regulatory network (YEASTRACT)
  11. Dynamical modeling with GRNmap; visualization with GRNsight
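
The two p value corrections named in step 6 can be illustrated with a short Python sketch. This is only a minimal illustration of the standard Bonferroni and Benjamini & Hochberg adjustments, not the lab's actual analysis code; the function names and the example p values are hypothetical.

```python
def bonferroni(pvalues):
    """Bonferroni: multiply each p value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini & Hochberg: p * m / rank, made monotone from the largest p down."""
    m = len(pvalues)
    # Indices of the p values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values for five genes (not real data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
bonf = bonferroni(example_p)
bh = benjamini_hochberg(example_p)
```

Bonferroni is the more conservative of the two, so a gene that passes a 0.05 cutoff after the Benjamini & Hochberg correction may still fail it after the Bonferroni correction.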

Steps 1-3: Generating Log2 Ratios with GenePix Pro

  • The protocol for gridding and generating the intensity (log2 ratio) data with GenePix Pro 6.1 is found on this page.
  • This protocol will generate a *.gpr file for each chip which is then fed into the normalization protocol below.
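
GenePix Pro performs the ratio and log transformation itself, but the underlying arithmetic is simple. The sketch below shows it in Python under the assumption that each spot has a single red and a single green intensity value; the function name is hypothetical and the intensities are made up for illustration.

```python
import math


def log2_ratio(red, green):
    """Return log2(red / green), or None when either intensity is not positive."""
    if red <= 0 or green <= 0:
        # The log ratio is undefined for zero or negative intensities.
        return None
    return math.log2(red / green)


# Made-up intensities: a spot four times brighter in red than in green.
up = log2_ratio(2000.0, 500.0)
# The reverse case gives the same magnitude with the opposite sign.
down = log2_ratio(500.0, 2000.0)
```

The log transformation makes increases and decreases symmetric around zero: a four-fold increase and a four-fold decrease have the same magnitude with opposite signs.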

Steps 4-5: Within- and Between-chip Normalization

  • A more detailed protocol can be found on this page. An abbreviated protocol is summarized below.
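
To show the idea behind the two normalization steps, here is a deliberately simplified Python sketch. It is not the method used by the lab's R scripts or by limma: it swaps in median centering for the within-chip step and scaling to a common median absolute deviation (MAD) for the between-chip step. All function names are hypothetical, and the sketch assumes every chip has a nonzero MAD.

```python
from statistics import median


def center_chip(log_ratios):
    """Within-chip stand-in: subtract the chip's median log ratio."""
    m = median(log_ratios)
    return [x - m for x in log_ratios]


def mad(values):
    """Median absolute deviation from the median."""
    m = median(values)
    return median([abs(x - m) for x in values])


def scale_chips(chips):
    """Between-chip stand-in: rescale each chip to the mean MAD of all chips."""
    mads = [mad(chip) for chip in chips]
    target = sum(mads) / len(mads)
    # Assumes no chip has a MAD of zero.
    return [[x * target / s for x in chip] for chip, s in zip(chips, mads)]


# Two made-up chips with different centers and spreads.
raw = [[0.0, 1.0, 2.0], [1.0, 3.0, 5.0]]
centered = [center_chip(chip) for chip in raw]
normalized = scale_chips(centered)
```

After both steps the two example chips share the same center and the same spread, which is the goal of normalization: differences that remain between chips should reflect biology rather than slide-to-slide technical variation.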

Installing R 3.1.0 and the limma package

The following protocol was developed to normalize GCAT and Ontario DNA microarray chip data from the Dahlquist lab using the R Statistical Software and the limma package (part of the Bioconductor Project).

  • The normalization procedure has been verified to work with version 3.1.0 of R, released in April 2014 (link to download site), and version 3.20.1 of the limma package (direct link to download zipped file) on the Windows 7 platform.
    • Note that using other versions of R or the limma package might give different results.
    • Note also that the 32-bit and 64-bit versions of R 3.1.0 give normalization results that differ at about the 13th or 14th decimal place (on the order of 10^-13 to 10^-14). The Dahlquist Lab is standardizing on the 64-bit version of R.
  • To install R for the first time, download and run the installer from the link above, accepting the default installation.
  • To use the limma package, unzip the file and place the contents into a folder called "limma" in the library directory of your R installation. If you accepted the default installation location, that will be C:\Program Files\R\R-3.1.0\library (this will be different on the computers in S120 since you do not have administrator rights there).
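
Because the 32-bit and 64-bit versions of R can disagree in the last few decimal places, two sets of normalized values should be compared with a small tolerance rather than for exact equality. The Python sketch below shows one way to do that; the function name and the tolerance of 1e-12 are assumptions chosen for illustration and are not part of the lab protocol.

```python
import math


def outputs_agree(values_a, values_b, abs_tol=1e-12):
    """Return True when two equally long lists match within abs_tol."""
    if len(values_a) != len(values_b):
        return False
    return all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
        for a, b in zip(values_a, values_b)
    )


# Made-up values that differ only around the 13th or 14th decimal place.
run_64bit = [0.5, -1.25]
run_32bit = [0.5 + 1e-13, -1.25 - 1e-14]
same = outputs_agree(run_64bit, run_32bit)

# A difference in the 6th decimal place is a real discrepancy.
different = outputs_agree(run_64bit, [0.5 + 1e-6, -1.25])
```

A difference much larger than the tolerance suggests that something other than the 32-bit versus 64-bit build has changed, such as the version of R or of the limma package.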

Running the Normalization Scripts

Creating MA and box plots

Step 6: Statistical Analysis

Within-strain ANOVA

Modified t test for each timepoint

Between-strain ANOVA

Steps 7-8: Clustering and GO Term Enrichment with STEM

Step 9: GenMAPP & MAPPFinder

Step 10: YEASTRACT

Step 11: GRNmap and GRNsight