Dahlquist:Notebook/Microarray Data Analysis/2008/10/21: Difference between revisions

Revision as of 12:40, 21 October 2008


Microarray Data Analysis


Today's Workflow

The results generated on 10/14/2008 were downloaded and placed on the Desktop in "Edge Analysis" in Kevin's profile. Significant gene results were saved as tab-delimited files, and the p-value histograms and Q-plots were saved into a PowerPoint file and printed.

The previous run (10/14/2008) on the dCIN5-only dataset gave interesting results: while the wt-only dataset produced about 1000 significant genes, the dCIN5-only dataset gave only about 150. To verify this result:

First, the covariates and gene list files were uploaded to LionShare. They will be opened with Excel and checked for errors.

Then for an additional test, the difference between dCIN5 and wt at an individual timepoint was tested:
- Files in Desktop "Data analysis 2008-10-02"
- Used gene file "wt-dCIN5_consolidated_Edge_genes-indexonly_20080715.txt"
- Used covariate file "wt-dCIN5_consolidated_Edge_covariates_20080710.txt"
Load both into an Edge session.
Select "Impute Missing Data" from the menu. Calculate Percent Missing Data by clicking on the button. The results are:
- Percent of genes missing data: 7.63%
- Percent of arrays missing data: 95.35%
- Overall percent of missing data: 3.15%
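The three percentages can differ widely because they count different things. A minimal sketch, assuming the percentages are defined as the share of genes (rows) with at least one missing value, the share of arrays (columns) with at least one missing value, and the share of missing cells overall. That definition is an assumption and is not taken from the Edge documentation; the function name and toy values are made up for illustration.

```python
def missing_data_summary(matrix):
    """Return (% genes with any gap, % arrays with any gap, % missing cells).

    matrix is a list of rows (genes); each row is a list of values (arrays),
    with None marking a missing measurement.
    """
    n_genes = len(matrix)
    n_arrays = len(matrix[0])
    genes_missing = sum(1 for row in matrix if any(v is None for v in row))
    arrays_missing = sum(
        1 for j in range(n_arrays) if any(row[j] is None for row in matrix)
    )
    cells_missing = sum(1 for row in matrix for v in row if v is None)
    return (
        100.0 * genes_missing / n_genes,
        100.0 * arrays_missing / n_arrays,
        100.0 * cells_missing / (n_genes * n_arrays),
    )


# Made-up toy data: 5 genes x 4 arrays, not from the wt/dCIN5 dataset.
toy = [
    [0.1, None, 0.3, 0.2],
    [0.5, 0.4, 0.6, 0.7],
    [1.2, 1.1, 0.9, 1.0],
    [0.8, 0.7, None, None],
    [0.3, 0.2, 0.1, 0.4],
]
pct_genes, pct_arrays, pct_overall = missing_data_summary(toy)
print(f"{pct_genes:.2f}% of genes, {pct_arrays:.2f}% of arrays, {pct_overall:.2f}% overall")
```

In the toy data only 3 of 20 cells are missing, yet 3 of 4 arrays contain a gap. This is the same pattern as the high per-array figure and low overall figure reported above: a few missing spots spread across many arrays.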
For KNN Parameters, set:
- Percent of missing values to tolerate in a gene: 100 (so all genes included)
- Number of nearest neighbors to use (maximum of 15): 15
- Clicked GO to impute missing data.
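To illustrate what K-nearest-neighbor imputation does, here is a rough sketch. It is not Edge's implementation: the distance measure, the unweighted averaging, and the name `knn_impute` are all assumptions chosen for simplicity, and the "percent of missing values to tolerate" setting is not modeled.

```python
import math


def knn_impute(matrix, k=15):
    """Fill None entries gene by gene from the most similar genes.

    Illustrative only. Distance is the root-mean-square difference over the
    arrays where both genes have values. Each gap is filled with the
    unweighted mean of the k nearest genes that have a value in that array.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        gaps = [j for j, v in enumerate(row) if v is None]
        if not gaps:
            continue
        # Rank all other genes by distance to this one, using the original values.
        ranked = sorted(
            (idx for idx in range(len(matrix)) if idx != i),
            key=lambda idx: distance(row, matrix[idx]),
        )
        for j in gaps:
            donors = [matrix[idx][j] for idx in ranked if matrix[idx][j] is not None][:k]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


# Made-up toy data: the first gene is missing its third measurement.
toy_genes = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
]
filled = knn_impute(toy_genes, k=2)
print(round(filled[0][2], 2))
```

With `k=2` the gap is filled from the two genes whose profiles are closest to the first gene, and the dissimilar fourth gene is ignored.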
Selected "Identify Differentially Expressed Genes"
- Note: this is to compare between the wt and dCIN5 strains. Different parameters and gene/covariate files will need to be used to analyze individual strains.
- Class Variable is: Strain
- Differential Expression Type is: Time Course
- Number of null iterations, set to 1000
- Choose a seed for reproducible results, set to 47
- Choose Time Course Settings
- Covariate giving time points is: Timepoint
- Covariate corresponding to individuals is: Flask
- Choose spline type, accepted default of Natural Cubic Spline, dimension 4
- Click "Apply" and then click "Go"
- 1000 permutations look like they will take about 10 minutes.
Results: (Saved in 2008-10-14 Results)
- No significant genes under these settings.
- Choose Q-Value cutoff as 1, recalculate
  - Saved total list of genes as: "GeneList_20081014_wt-vs-dCIN5"
- To save the plots, run the following command in the R console window.

savePlot(filename = "PvalHistogram_wt-vs-dCIN5", type = c("png"), device = dev.cur())

This saves the active plot window under the file name you choose, in the folder "edge_1.1.290".
- Saved Q-Plot as "QPlot_20081014_wt-vs-dCIN5"
- Saved Histograms as "PvalHistogram_20081014_wt-vs-dCIN5"
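
Setting the Q-value cutoff to 1 in the results step returns the full gene list because a q-value can never exceed 1. The sketch below illustrates this with Benjamini-Hochberg adjusted p-values, used here as a stand-in. Edge may estimate q-values differently, and the function name and p-values are made up for illustration.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


pvals = [0.20, 0.45, 0.08, 0.91, 0.33, 0.60]  # made-up p-values
qvals = bh_qvalues(pvals)
significant = [i for i, q in enumerate(qvals) if q <= 0.05]
all_genes = [i for i, q in enumerate(qvals) if q <= 1.0]
print(len(significant), len(all_genes))
```

In this toy case no gene passes a 0.05 cutoff, while a cutoff of 1 keeps every gene. That matches the workflow above, where the cutoff was raised to 1 to export the complete list.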

