GRNmap Testing Report: Strain Run Comparisons 2015-05-27: Difference between revisions

From OpenWetWare
(→‎wt vs. dCIN5: Included information and files)
(→‎Discussion: Added PowerPoint for poor-fitting T60 genes)
 
(53 intermediate revisions by 2 users not shown)
Code Version: "current" version from Dr. Fitzpatrick 2014-09-18
{{TOC right}}
*[[Media:Williams Matlab.v.1.0.zip| MATLAB code]]
*[[Media:NW Input 21 Gene Network Sigmoid Model Estimate All Strains.xlsx| Example Input Sheet for Estimation]]


MATLAB Version: 2014a
== Purpose ==


Computer on which the model was run: SEA120-03
* The purpose of this test is to analyze how the model behaves when running the same network with several different combinations of strain data.
*'''Last two test categories (wt+dCIN5+dZAP1 and wt+dCIN5+dZAP1+dGLN3) cannot be run on SEA120-03 because the computer only has MATLAB 2014b. Error displayed when Fitzpatrick's code version is run:
* Issue #10 on GitHub: [https://github.com/kdahlquist/GRNmap/issues/10]
 
==Test Conditions==
* Date: 2015-05-27
* Test Performed by: [[User:Katherine Grace Johnson]], [[Katherine Grace Johnson Electronic Lab Notebook]] and [[User:Natalie Williams]], [[Natalie Williams: Electronic Notebook]]
* Code Version: from the class [[BIOL398-04/S15]] (previously, code given by Fitzpatrick in September 2014. See below.)
* MATLAB Version: 2014b (previously, 2014a. See below)
*Computer on which the model was run: SEA120-03
**'''The last two test categories (wt+dCIN5+dZAP1 and wt+dCIN5+dZAP1+dGLN3) cannot be run with the earlier code on SEA120-03 because the computer only has MATLAB 2014b. The following error is displayed when Fitzpatrick's code version is run:'''


  Undefined function 'max' for input arguments of type 'matlab.graphics.GraphicsPlaceholder'.
  Error in GRNmodel (line 15)
  nfig        = max(figHandles);
 
**'''For the first ten categories, we will upload data obtained from September 2014. To see if we can run the last two categories on 2014b without any glaring differences, we will first test wt alone on both 2014a (with Fitzpatrick's fall version of the code) and 2014b (with the code from the class [[BIOL398-04/S15]]). If the differences in estimated parameters are negligible, we could move on to run the last two categories (wt+dCIN5+dZAP1 and wt+dCIN5+dZAP1+dGLN3).'''
***Note: for these older versions of the code, the input file must be in the same folder as the code itself.
**New code version: from BIO398 (Note: file zipped with code was used to test wt)
**MATLAB version: 2014b
**Computer on which the model was run: SEA201-03




  GRNstruct.GRNOutput.reg_out


==wt alone==
Except for the wt, the individual deletion strains had been run through MATLAB with iestimate set to 0.00E00, which means that no parameters were estimated.
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model wt NW.xls| Input Sheet]]
*Because of this observation, we had to first compare the wt from MATLAB versions 2014a vs. 2014b.
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model wt estimation output NW.xls| Output Sheet]]
*Next, we had to analyze the threshold values and the optimized weights in order to see if the differences between the outputs were negligible
*[[Media:WT figures NW.zip| Figures]]
**If they were negligible, we would proceed to run estimations of the individual strains
*analysis.xlsx containing bar graphs
*The strain comparison runs were already run with estimation, so they do not have to be re-run in MATLAB.
 
 
We have decided to standardize on the code from BIOL398-04/S15 running in MATLAB 2014b. All runs below use this configuration (except wt alone, 2014a). Although the difference between the two configurations was negligible, mixing them could confound comparisons between strain combinations whose estimated parameters also differ only slightly.
*'''Note: when using this version, ensure that "fix_b" is set to 0 (i.e. estimate b) and create a simtime row on the optimization_parameters worksheet.'''
 
= Results, Individual Strains =
==wt alone, 2014a==
*Input sheet: [[Media:2014.10.23.Input 21 Gene Network Sigmoid Model wt NW.xls]]
*Output sheet: [[Media:2014.10.23.Input 21 Gene Network Sigmoid Model wt estimation output NW.xls]]
*Figures: [[Media:WT figures NW.zip]]
*LSE: 7.0809
*Penalty term: 0.0814
==wt alone, 2014b==
*Input sheet: [[Media:GJ2 Input 21 Gene Network Sigmoid Model wt alone 2014b.xlsx]]
*Output sheet: [[Media:GJ2 Input 21 Gene Network Sigmoid Model wt alone 2014b estimation output.xlsx]]
*Figures: [[Media:Images wt alone 2014b.zip]]
*LSE: 6.8824
*Penalty term: 0.1794
*Bar graphs comparing estimated weights and estimated b's for wt alone runs on 2014a vs. 2014b. (Note: estimated production rates not included because the earlier version of the code did not estimate production rates). There appears to be negligible difference between the two runs.
**[[Media:Wt2014a vs wt2015b.xlsx]]


==dCIN5 alone==
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model dCIN5 hard0 NW.xls| Input Sheet]]
*Input sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model dCIN5 hard0 NW.xls]]
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model dCIN5 hard0 estimation output NW.xls| Output Sheet]]
*Output sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model dCIN5 hard0 estimation output NW.xls]]
*[[Media:DCIN5 figures NW.zip| Figures]]
*Figures: [[Media:DCIN5 figures NW.zip]]
*analysis.xlsx containing bar graphs
*LSE: 10.6965
*LSE: 7.4496
*Penalty term: 0.1794
*Penalty term: 0.2174
*GRNSight
[[Image:NW GRNsight dCIN5.jpg|thumb|center|400px| dCIN5 visualized, normalized GRNSight network]]


==dGLN3 alone==
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model GLN3 forward NW.xls| Input Sheet]]
*Input sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model GLN3 estimation NW.xls]]
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model GLN3 forward estimation output NW.xls| Output Sheet]]
*Output sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model GLN3 estimation output NW.xls]]
*[[Media:DGLN3 figures NW.zip| Figures]]
*Figures: [[Media:DGLN3 figures NW.zip]]
*analysis.xlsx containing bar graphs
*LSE: 15.4498
*LSE: 9.5367
*Penalty term: 0.0814
*Penalty term: 0.3792
*GRNSight Network
[[Image:NW GRNsight dGLN3.jpg| thumb| center| 400px| dGLN3 visualized, normalized GRNSight network]]


==dHMO1 alone==
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model dHMO1 forward NW.xls| Input Sheet]]
*Input sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model dHMO1 estimation NW.xls]]
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model dHMO1 forward estimation output NW.xls| Output Sheet]]
*Output sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model dHMO1 estimation output NW.xls]]
*[[Media:DHMO1 figures NW.zip| Figures]]
*Figures: [[Media:DHMO1 figures NW.zip]]
*analysis.xlsx containing bar graphs
*LSE: 10.7760
*LSE: 6.9139
*Penalty term: 0.0814
*Penalty term: 0.1292
*GRNSight
[[Image:NW GRNsight dHMO1.jpg|thumb|center|400px| dHMO1 visualized, normalized GRNSight network]]


==dZAP1 alone==
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model dZAP1 forward NW.xls| Input Sheet]]
*Input sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model dZAP1 estimation NW.xls]]
*[[Media:2014.10.23.Input 21 Gene Network Sigmoid Model dZAP1 forward estimation output NW.xls| Output Sheet]]
*Output sheet: [[Media:2015.05.28.Input 21 Gene Network Sigmoid Model dZAP1 estimation output NW.xls]]
*[[Media:DZAP1 figures NW.zip| Figures]]
*Figures: [[Media:DZAP1 figures NW.zip]]
*analysis.xlsx containing bar graphs
*LSE: 9.5471
*LSE: 6.9793
*Penalty term: 0.0814
*Penalty term: 0.3216
*GRNSight
[[Image:NW GRNsight dZAP1.jpg| thumb|center|400px| dZAP1 visualized, normalized GRNSight network]]


= Results, Multiple Strains =
==All Strains==
*[[Media:Input 21 Gene Network Sigmoid Model.forward all NW.xls| Input Sheet]]
*Input sheet: [[Media:GJ input 21 Gene Network Sigmoid Model Estimate all strains.xlsx]]
*[[Media:Input 21 Gene Network Sigmoid Model.forward all estimation output NW.xls| Output Sheet]]
*Output sheet: [[Media:GJ input 21 Gene Network Sigmoid Model Estimate all strains estimation output.xlsx]]
*[[Media:Estimation All figures NW.zip| Figures]]
*Figures: [[Media:GJ all strains Figures.zip]]
*analysis.xlsx containing bar graphs
*LSE: 70.0277
*LSE: 45.7010
*Penalty term:  0.1794
*Penalty term:  0.7328


==wt vs. dCIN5==
*[[Media:Input 21 Gene Network Sigmoid Model Estimate WTCIN5 NW.xlsx| Input Sheet]]
*Input sheet: [[Media:Input 21 Gene Network Sigmoid Model Estimate WTCIN5 NW.xlsx]]
*[[Media:Input 21 Gene Network Sigmoid Model Estimate WTCIN5 estimation output NW.xls| Output Sheet]]
*Output sheet: [[Media:Input 21 Gene Network Sigmoid Model Estimate WTCIN5 estimation output NW.xls]]
*[[Media:New WTCIN5 figures NW.zip| Figures]]
*Figures: [[Media:Wt dCIN5 figures NW.zip]]
*analysis.xlsx containing bar graphs
*LSE: 15.8220
*LSE: 15.1196
*Penalty term: 0.1011
*Penalty term: 0.4994
*Weighted Network visualized
[[Image:GRNSight wt dCIN5 NW.jpg| thumb| center| 400px| Visualized Network with wt and dCIN5 data]]


==wt vs. dGLN3==
*input.xlsx
*Input sheet: [[Media:Input 21 Gene Network Sigmoid Model Estimate WTvdGLN3 NW.xlsx]]
*output.xlsx
*Output sheet: [[Media:Input 21 Gene Network Sigmoid Model Estimate WTvdGLN3 estimation output NW.xls]]
*zipped figures
*Figures: [[Media:Wt dGLN3 figures NW.zip]]
*analysis.xlsx containing bar graphs
*LSE:
*LSE: 18.1196
*Penalty term:
*Penalty term: 0.3529
*Weighted Network
[[Image:GRNSight wt dGLN3 NW.jpg| thumb| center| 400px| Visualized network with wt and dGLN3 data]]


==wt vs. dHMO1==
*input.xlsx
*Input sheet: [[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dHMO1.xlsx]]
*output.xlsx
*Output sheet: [[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dHMO1 estimation output.xlsx]]
*zipped figures
*Figures: [[Media:GJ wt dHMO1 Figures.zip]]
*analysis.xlsx containing bar graphs
*LSE:
*LSE: 15.5341
*Penalty term:
*Penalty term: 0.1893
*Weighted Network
[[Image:GRNSight wt dHMO1 NW.jpg| thumb| center| 400px| Visualized network with wt and dHMO1 data]]


==wt vs. dZAP1==
*input.xlsx
*Input sheet: [[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dZAP1s.xlsx]]
*output.xlsx
*Output sheet: [[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dZAP1s estimation output.xlsx]]
*zipped figures
*Figures: [[Media:GJ wt dZAP1 Figures.zip]]
*analysis.xlsx containing bar graphs
*LSE:
*LSE: 18.1215
*Penalty term:
*Penalty term: 0.1377
*Weighted Network
[[Image:GRNSight wt dZAP1 NW.jpg| thumb| center| 400px| Weighted visualized network with wt and dZAP1 data]]


==wt + dCIN5 + dZAP1==
*input.xlsx
*[[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1s.xlsx]]
*output.xlsx
*[[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1s estimation output.xlsx]]
*zipped figures
*[[Media:Wt dCIN5 dZAP1.zip]]
*analysis.xlsx containing bar graphs
*LSE:
*LSE: 26.6846
*Penalty term:
*Penalty term: 0.1682
*Weighted Network
[[Image:GRNSight wt dCIN5 dZAP1.jpg| thumb| center| 400px| Visualized network with wt, dCIN5, and dZAP1 data]]


==wt + dCIN5 + dZAP1 + dGLN3==
*input.xlsx
*[[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1 dGLN3.xlsx]]
*output.xlsx
*[[Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1 dGLN3 estimation output.xlsx]]
*zipped figures
*[[Media:GJ wt dCIN5 dZAP1 dGLN3 Figures.zip]]
*analysis.xlsx containing bar graphs
*LSE:
*LSE: 38.0868
*Penalty term:
*Penalty term: 0.2310
*GRNsight figure of weighted network:
[[Image:GJ wt dCIN5 dZAP1 dGLN3 network.jpg |thumb|center| 400px]]


=Results and Discussion=
=Discussion=
*Excel sheet comparing output weights for CIN5, FHL1, PHD1 and SKN7 regulators, estimated b values, and estimated production rates for all the above strain combinations: [[Media:GJ Estimated weight output comparison all combinations.xlsx]]
*Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Which genes have the worst fit between the model and actual data? Why do you think that is? (Hint: how many inputs do these genes have?) How does this help you to interpret the microarray data?
*Which genes showed the largest dynamics over the timecourse? In other words, which genes had a log fold change that is different from zero at one or more timepoints? The p values from the Week 11 ANOVA analysis are informative here. Does this seem to have an effect on the goodness of fit (see question above)?
*Which genes showed differences in dynamics between the wild type and the other strain your group is using? Does the model adequately capture these differences? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
*Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
*What other questions should be answered to help us further analyze the data?
**Production rate vs. degradation rate. How do these combine?
**ANOVA p-value for within strain
**#Magnitude (large dynamics)?
**#Variance (spread of the data points)?
**#Some combination of the two?
**Fit of the model vs. parameter value stability
***To view the analysis for self-regulating genes, please view the following document: [[Media:2015.06.08.AutoReg Investigation NW.docx]]
*Ppt analyzing genes with no inputs and genes that only regulate themselves: [[Media:GRNmap Testing Analysis.pptx]]
*Powerpoint containing some genes that have poor T60 fits to the provided data: [[Media:Poor Fitting T60 Model to Data Points.pptx]]
[[Category:GRNmap]]
[[Category:Dahlquist Lab]]

Latest revision as of 09:49, 10 June 2015

Purpose

  • The purpose of this test is to analyze how the model behaves when running the same network with several different combinations of strain data.
  • Issue #10 on GitHub: [1]

Test Conditions

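  • Date: 2015-05-27
  • Test Performed by: User:Katherine Grace Johnson, Katherine Grace Johnson Electronic Lab Notebook and User:Natalie Williams, Natalie Williams: Electronic Notebook
  • Code Version: from the class BIOL398-04/S15 (previously, code given by Fitzpatrick in September 2014. See below.)
  • MATLAB Version: 2014b (previously, 2014a. See below)
  • Computer on which the model was run: SEA120-03
    • The last two test categories (wt+dCIN5+dZAP1 and wt+dCIN5+dZAP1+dGLN3) cannot be run with the earlier code on SEA120-03 because the computer only has MATLAB 2014b. The following error is displayed when Fitzpatrick's code version is run:

Undefined function 'max' for input arguments of type 'matlab.graphics.GraphicsPlaceholder'.
Error in GRNmodel (line 15)
nfig        = max(figHandles);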
    • For the first ten categories, we will upload data obtained from September 2014. To see if we can run the last two categories on 2014b without any glaring differences, we will first test wt alone on both 2014a (with Fitzpatrick's fall version of the code) and 2014b (with the code from the class BIOL398-04/S15). If the differences in estimated parameters are negligible, we could move on to run the last two categories (wt+dCIN5+dZAP1 and wt+dCIN5+dZAP1+dGLN3).
      • Note: for these older versions of the code, the input file must be in the same folder as the code itself.
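The error above is consistent with the graphics change introduced in MATLAB R2014b, where figure handles became objects rather than numbers and an empty handle query returns a GraphicsPlaceholder array that max cannot process. Below is a minimal sketch of a version-tolerant replacement for the failing line. How figHandles is obtained inside GRNmodel is not shown on this page, so the findobj call is an assumption, not the actual GRNmodel code.

```matlab
% Sketch only: the way GRNmodel builds figHandles is assumed, not confirmed.
figHandles = findobj('Type', 'figure');   % assumed source of figHandles

if isempty(figHandles)
    nfig = 0;                             % no figures are open yet
elseif isnumeric(figHandles)
    nfig = max(figHandles);               % R2014a and earlier: numeric handles
else
    nfig = max([figHandles.Number]);      % R2014b and later: figure objects
end
```

The isempty check comes first because the empty case is the one that triggers the reported error in 2014b.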


To get the LSE and the penalty term, type the following in MATLAB:

Code for LSE:
  GRNstruct.GRNOutput.lse_out

Code for penalty term:
  GRNstruct.GRNOutput.reg_out
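The two expressions above can also be printed together after a run. This is a minimal sketch that assumes GRNstruct is still in the workspace and that both fields hold scalar values. The variable names lse and penalty are introduced here for illustration only.

```matlab
% Sketch: print the LSE and penalty term after a completed run.
% Assumes GRNstruct is in the workspace and both fields are scalars.
if exist('GRNstruct', 'var') && isfield(GRNstruct, 'GRNOutput')
    lse     = GRNstruct.GRNOutput.lse_out;   % LSE reported by the run
    penalty = GRNstruct.GRNOutput.reg_out;   % penalty term reported by the run
    fprintf('LSE: %.4f\nPenalty term: %.4f\n', lse, penalty);
else
    disp('GRNstruct not found in the workspace; run the model first.');
end
```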

Except for the wt, the individual deletion strains had been run through MATLAB with iestimate set to 0.00E00, which means that no parameters were estimated.

  • Because of this observation, we first had to compare the wt runs from MATLAB versions 2014a and 2014b.
  • Next, we had to analyze the threshold values and the optimized weights to see whether the differences between the outputs were negligible.
    • If they were negligible, we would proceed to run estimations of the individual strains.
  • The strain comparison runs were already run with estimation, so they do not have to be re-run in MATLAB.
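The page does not define what counts as a negligible difference, so the check has to be made explicit before comparing runs. The sketch below shows one possible way to do that. The helper names maxAbsDiff and isNegligible and the 5% tolerance are assumptions made for illustration and are not part of GRNmap. The example uses the two wt-alone LSE values reported on this page.

```matlab
% Hypothetical helpers; the names and the tolerance are not from GRNmap.
maxAbsDiff   = @(a, b) max(abs(a(:) - b(:)));          % largest element-wise gap
isNegligible = @(a, b, tol) maxAbsDiff(a, b) <= tol;   % true if within tolerance

% Example with the two wt-alone LSE values reported on this page.
lse2014a = 7.0809;
lse2014b = 6.8824;
tol      = 0.05 * lse2014a;   % assumed tolerance: 5% of the 2014a value

fprintf('Absolute difference: %.4f\n', maxAbsDiff(lse2014a, lse2014b));
fprintf('Within tolerance: %d\n', isNegligible(lse2014a, lse2014b, tol));
```

The same helpers accept vectors, so estimated weights or thresholds copied from the two output sheets could be compared in the same way.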


We have decided to standardize on the code from BIOL398-04/S15 running in MATLAB 2014b. All runs below use this configuration (except wt alone, 2014a). Although the difference between the two configurations was negligible, mixing them could confound comparisons between strain combinations whose estimated parameters also differ only slightly.

  • Note: when using this version, ensure that "fix_b" is set to 0 (i.e. estimate b) and create a simtime row on the optimization_parameters worksheet.
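The note above can be checked before a run. This is a minimal sketch that assumes the optimization_parameters worksheet lists parameter names in its first column and values in its second. That layout and the file name input.xlsx are assumptions, so adjust both to match the actual workbook.

```matlab
% Sketch: check the optimization_parameters worksheet before a run.
% Assumed layout: parameter names in column 1, values in column 2.
inputFile = 'input.xlsx';   % hypothetical file name

if exist(inputFile, 'file')
    [~, ~, raw] = xlsread(inputFile, 'optimization_parameters');
    names = raw(:, 1);
    names(~cellfun(@ischar, names)) = {''};   % blank out non-text cells

    fixRow = find(strcmp(names, 'fix_b'), 1);
    assert(~isempty(fixRow), 'No fix_b row found.');
    assert(isequal(raw{fixRow, 2}, 0), 'fix_b should be 0 so that b is estimated.');
    assert(any(strcmp(names, 'simtime')), 'Add a simtime row before running.');
    disp('optimization_parameters looks ready for this code version.');
else
    disp('Input workbook not found; set inputFile first.');
end
```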

Results, Individual Strains

wt alone, 2014a

wt alone, 2014b

dCIN5 alone

dCIN5 visualized, normalized GRNSight network

dGLN3 alone

dGLN3 visualized, normalized GRNSight network

dHMO1 alone

dHMO1 visualized, normalized GRNSight network

dZAP1 alone

dZAP1 visualized, normalized GRNSight network

Results, Multiple Strains

All Strains

wt vs. dCIN5

Visualized Network with wt and dCIN5 data

wt vs. dGLN3

Visualized network with wt and dGLN3 data

wt vs. dHMO1

Visualized network with wt and dHMO1 data

wt vs. dZAP1

Weighted visualized network with wt and dZAP1 data

wt + dCIN5 + dZAP1

Visualized network with wt, dCIN5, and dZAP1 data

wt + dCIN5 + dZAP1 + dGLN3

Discussion

  • Excel sheet comparing output weights for CIN5, FHL1, PHD1 and SKN7 regulators, estimated b values, and estimated production rates for all the above strain combinations: Media:GJ Estimated weight output comparison all combinations.xlsx
  • Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Which genes have the worst fit between the model and actual data? Why do you think that is? (Hint: how many inputs do these genes have?) How does this help you to interpret the microarray data?
  • Which genes showed the largest dynamics over the timecourse? In other words, which genes had a log fold change that is different from zero at one or more timepoints? The p values from the Week 11 ANOVA analysis are informative here. Does this seem to have an effect on the goodness of fit (see question above)?
  • Which genes showed differences in dynamics between the wild type and the other strain your group is using? Does the model adequately capture these differences? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
  • Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
  • What other questions should be answered to help us further analyze the data?
    • Production rate vs. degradation rate. How do these combine?
    • ANOVA p-value for within strain
      1. Magnitude (large dynamics)?
      2. Variance (spread of the data points)?
      3. Some combination of the two?
    • Fit of the model vs. parameter value stability
  • Ppt analyzing genes with no inputs and genes that only regulate themselves: Media:GRNmap Testing Analysis.pptx
  • Powerpoint containing some genes that have poor T60 fits to the provided data: Media:Poor Fitting T60 Model to Data Points.pptx
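
On the question above of how production and degradation rates combine, a worked sketch may help. It assumes the sigmoid model has the commonly used form in which gene i is produced at a maximal rate P_i, scaled by a sigmoid of its weighted inputs (weights w_ij, threshold b_i), and degraded at rate d_i. The exact form used by this code version is not stated on this page, so treat the equation as an assumption.

```latex
\frac{dx_i}{dt} = \frac{P_i}{1 + \exp\!\left(-\left(\sum_j w_{ij} x_j - b_i\right)\right)} - d_i x_i
\qquad\Longrightarrow\qquad
x_i^{*} = \frac{P_i}{d_i} \cdot \frac{1}{1 + \exp\!\left(-\left(\sum_j w_{ij} x_j^{*} - b_i\right)\right)}
```

Under this assumption, the steady-state level x_i* depends on the ratio P_i/d_i, while d_i alone sets how quickly a gene approaches that level for a fixed input.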