User:Matthew Whiteside/Notebook/Malaria Microarray/2009/10/21
Following normalization, I perform differential expression analysis using the R Bioconductor package LIMMA. The LIMMA package identifies statistically significant changes in expression by 1) fitting a linear model for each gene (to obtain the mean expression, variability estimates, etc.) and 2) using a moderated t-test to determine whether the gene expression changes are significant. The moderated t-test borrows variability information from other genes. This effectively increases the sample size (degrees of freedom) behind each variance estimate, which is important since each gene may only be measured a couple of times per condition. LIMMA requires replicates to perform the t-test.
This last requirement means excluding analyses m2 and m3 (see Microarray Experiments). In both experiments, the authors pooled RNA from several biological replicates and hybridized it to a single Affy array for each test condition. This removes any ability to estimate gene-level variation and therefore to perform a LIMMA statistical analysis.
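For reference, the moderated t-statistic can be written out as below. This is a sketch in the notation of the empirical Bayes formulation behind LIMMA; the symbols are not from this notebook and are introduced only for illustration.

```latex
\tilde{t}_g = \frac{\hat{\beta}_g}{\tilde{s}_g \sqrt{v_g}},
\qquad
\tilde{s}_g^2 = \frac{d_0 s_0^2 + d_g s_g^2}{d_0 + d_g}
```

Here \hat{\beta}_g is the estimated contrast for gene g, s_g^2 is that gene's residual variance with d_g degrees of freedom, v_g is the unscaled variance factor from the design, and s_0^2 and d_0 are the prior variance and prior degrees of freedom estimated from all genes. Under the null hypothesis the statistic follows a t-distribution with d_0 + d_g degrees of freedom, which is where the extra information from other genes enters. With no replicates, d_g = 0 and s_g^2 cannot be estimated.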
Analysis m1 is well formed. I will proceed by only analyzing m1.
Implementation of LIMMA analysis
Steps include probe annotation, selecting contrasts of interest, performing the LIMMA analysis with multiple-hypothesis correction, and some additional QC. This has been outlined before (see User:Matthew_Whiteside/Notebook/Malaria_Microarray/2009/05/29).
The contrast of interest that I will be using for m1 is CM - NCM. The CM and NCM groups are each composed of biological blocks sampled before and after infection. Within each block I subtracted Baseline from Infected to obtain a relative expression during infection. This yields a different kind of relative expression value than in the h2 study (which is a simple two-group comparison). This study also included a number of technical replicates. I averaged these, as was done in the m1 paper. This is not ideal, but was done for simplicity.
I also computed the other contrasts, CM - Baseline and NCM - Baseline, for reference. See User:Matthew_Whiteside/Notebook/Malaria_Microarray/2009/09/11 for the naming scheme used to identify the contrasts.
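The block-wise subtraction and technical-replicate averaging described above can be sketched as follows. This is a minimal Python illustration for a single gene, not the R code used for the analysis; the data layout, the values, and the function name relative_expression are all hypothetical, and the values are assumed to be on a log scale.

```python
from statistics import mean

# Hypothetical log-scale expression values for ONE gene (invented numbers).
# Each biological block has technical replicates at baseline and during infection.
blocks = {
    "CM": [
        {"baseline": [7.0, 7.2], "infected": [6.1, 6.3]},
        {"baseline": [7.4, 7.4], "infected": [6.0, 6.4]},
    ],
    "NCM": [
        {"baseline": [7.1, 7.3], "infected": [7.0, 7.0]},
        {"baseline": [7.0, 7.0], "infected": [7.3, 7.1]},
    ],
}


def relative_expression(block):
    """Average technical replicates, then subtract Baseline from Infected."""
    return mean(block["infected"]) - mean(block["baseline"])


# One relative-expression value per biological block, grouped by CM / NCM.
relative = {
    group: [relative_expression(b) for b in group_blocks]
    for group, group_blocks in blocks.items()
}

# The CM - NCM contrast is the difference of the group means of those values.
contrast = mean(relative["CM"]) - mean(relative["NCM"])
print(relative)
print(contrast)
```

In the real analysis the contrast and its significance come from the LIMMA model fit; this sketch only shows how the per-block values that enter the model are formed.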
For the CM - NCM contrast (m1.1), 620 genes were DE (p-value < 0.05 after Benjamini-Hochberg multiple-hypothesis correction). Below is the volcano plot. You can see that the majority of DE genes are down-regulated in CM compared to NCM.
Below is the heatmap. Each array sample shown is one biological block, summarized by subtracting Baseline from Infected and averaging technical replicates. Samples cluster as expected.
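To make the multiple-hypothesis correction concrete, here is a minimal Python sketch of the Benjamini-Hochberg adjustment. It is an illustration only: the analysis itself was done in R with LIMMA, and the function name benjamini_hochberg and the p-values are made up for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (invented for illustration).
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(pvals)

# A gene is called DE when its adjusted p-value is below 0.05.
de = [p < 0.05 for p in adj]
print(adj)
print(de)
```

In this toy example four of the five raw p-values are below 0.05, but only two genes remain DE after adjustment, which is why the DE count is reported on corrected p-values.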