Dahlquist:Notebook/Microarray Data Analysis/2008/10/21: Difference between revisions
From OpenWetWare
== Today's Workflow ==
'''The results generated on 10/14/2008 were downloaded and placed on the Desktop in "Edge Analysis" in Kevin's profile. Significant gene results were saved as tab-delimited files, and the p-value histograms and Q-plots were saved into a PowerPoint file and printed.'''

'''The previous run (10/14/2008) on the dCIN5-only dataset gave interesting results: while the wt-only dataset produced about 1000 significant genes, the dCIN5-only dataset gave only about 150. To verify this result:'''
* First, the covariates and genelist files were uploaded to lion share. They will be opened with Excel and checked for errors.
* Then, as an additional test, the difference between dCIN5 and wt at an individual timepoint was tested:
** Files are in the Desktop folder "Data analysis 2008-10-02"
** Used gene file "wt-dCIN5_consolidated_Edge_genes-indexonly_20080715.txt"
** Used covariate file "wt-dCIN5_consolidated_Edge_covariates_20080710.txt"
* Load both into an Edge session.
* Select "Impute Missing Data" from the menu. Click the button to calculate the percent missing data. The results are:
** Percent of genes missing data: 7.63%
** Percent of arrays missing data: 95.35%
** Overall percent of missing data: 3.15%
* For KNN Parameters, set:
** Percent of missing values to tolerate in a gene: 100 (so all genes are included)
** Number of nearest neighbors to use (maximum of 15): 15
** Click GO to impute missing data.
* Select "Identify Differentially Expressed Genes"
** Note: this compares the wt and dCIN5 strains. Different parameters and gene/covariate files will need to be used to analyze individual strains.
** Class Variable is: Strain
** Differential Expression Type is: Time Course
** Number of null iterations: set to 1000
** Choose a seed for reproducible results: set to 47
** Choose Time Course Settings
** Covariate giving time points is: Timepoint
** Covariate corresponding to individuals is: Flask
** Choose spline type: accepted the default of Natural Cubic Spline, dimension 4
** Click "Apply" and then click "Go"
** 1000 permutations look like they will take about 10 minutes.
* Results: (Saved in 2008-10-14 Results)
** No significant genes under these settings.
** Set the Q-value cutoff to 1 and recalculate
*** Saved total list of genes as: "GeneList_20081014_wt-vs-dCIN5"
** To save the plots, run the following command in the R console window:
savePlot(filename = "PvalHistogram_wt-vs-dCIN5", type = c("png"), device = dev.cur())
* This saves the active plot window under a file name you choose. The file is saved in the folder "edge_1.1.290".
** Saved Q-Plot as "QPlot_20081014_wt-vs-dCIN5"
** Saved Histograms as "PvalHistogram_20081014_wt-vs-dCIN5"
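
The three missing-data percentages reported in the "Impute Missing Data" step can look inconsistent at first (95.35% of arrays but only 3.15% overall). The R sketch below shows one way such figures can arise together. It assumes that a gene or an array counts as "missing data" when it has at least one missing value, and that the overall figure is the share of missing cells. These definitions are an assumption for illustration, not taken from the Edge documentation. The toy matrix <code>expr</code> and all variable names are hypothetical.

```r
# Toy expression matrix: 10 genes (rows) x 4 arrays (columns).
# The values are made up; only the positions of the NAs matter here.
expr <- matrix(1, nrow = 10, ncol = 4)
expr[1, 1] <- NA
expr[1, 2] <- NA
expr[2, 3] <- NA

# TRUE wherever a measurement is missing
missing <- is.na(expr)

# Assumed definitions (not confirmed against Edge):
pct_genes   <- 100 * mean(rowSums(missing) > 0)  # genes with at least one missing value
pct_arrays  <- 100 * mean(colSums(missing) > 0)  # arrays with at least one missing value
pct_overall <- 100 * mean(missing)               # missing cells out of all cells

print(c(genes = pct_genes, arrays = pct_arrays, overall = pct_overall))
```

In this toy case only three cells are missing, yet three of the four arrays are affected. A few missing spots scattered across many arrays can therefore give a high per-array percentage alongside a low overall percentage.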
Revision as of 12:40, 21 October 2008