# GRNmap Testing Report: Strain Run Comparisons 2015-05-27

From OpenWetWare

## Purpose

- The purpose of this test is to analyze how the model behaves when running the same network with several different combinations of strain data.
- Issue #10 on GitHub: [1]

## Test Conditions

- Date: 2015-05-27
- Test Performed by: Katherine Grace Johnson (Electronic Lab Notebook) and Natalie Williams (Electronic Notebook)
- Code Version: from the class BIOL398-04/S15 (previously, code given by Fitzpatrick in September 2014. See below.)
- MATLAB Version: 2014b (previously, 2014a. See below)
- Computer on which the model was run: SEA120-03

**The last two test categories (wt+dCIN5+dZAP1 and wt+dCIN5+dZAP1+dGLN3) cannot be run with the earlier code on SEA120-03 because the computer only has MATLAB 2014b. Running Fitzpatrick's code version displays this error:**

Undefined function 'max' for input arguments of type 'matlab.graphics.GraphicsPlaceholder'.

Error in GRNmodel (line 15) nfig = max(figHandles);

**For the first ten categories, we will upload data obtained from September 2014. To see if we can run the last two categories on 2014b without any glaring differences, we will first test wt alone on both 2014a (with Fitzpatrick's fall version of the code) and 2014b (with the code from the class BIOL398-04/S15). If the differences in estimated parameters are negligible, we could move on to run the last two categories (wt+dCIN5+dZAP1 and wt+dCIN5+dZAP1+dGLN3).**

- Note: for these older versions of the code, the input file must be in the same folder as the code itself.

To get the LSE & the penalty term, type the following:

- Code for LSE: GRNstruct.GRNOutput.lse_out
- Code for penalty: GRNstruct.GRNOutput.reg_out

Excluding the wt, the individual deletion strains had been run through MATLAB with iestimate set to 0.00E00. With that value, the parameters are not estimated.

- Because of this observation, we had to first compare the wt from MATLAB versions 2014a vs. 2014b.
- Next, we had to analyze the threshold values and the optimized weights in order to see if the differences between the outputs were negligible
- If they were negligible, we would proceed to run estimations of the individual strains

- The runs comparing individual strains had already been estimated, so those do not have to be re-run in MATLAB.

We have decided to standardize everything on the code from BIOL398-04/S15 with the 2014b version of MATLAB. All runs below use this version (excluding the wt alone, 2014a). We are standardizing because, although the difference between the two versions was negligible, it could still confound our results if the differences in estimated parameters between strain runs are also small.

**Note: when using this version, ensure that "fix_b" is set to 0 (i.e. estimate b) and create a simtime row on the optimization_parameters worksheet.**

# Results, Individual Strains

## wt alone, 2014a

- Input sheet: Media:2014.10.23.Input 21 Gene Network Sigmoid Model wt NW.xls
- Output sheet: Media:2014.10.23.Input 21 Gene Network Sigmoid Model wt estimation output NW.xls
- Figures: Media:WT figures NW.zip
- LSE: 7.0809
- Penalty term: 0.0814

## wt alone, 2014b

- Input sheet: Media:GJ2 Input 21 Gene Network Sigmoid Model wt alone 2014b.xlsx
- Output sheet: Media:GJ2 Input 21 Gene Network Sigmoid Model wt alone 2014b estimation output.xlsx
- Figures: Media:Images wt alone 2014b.zip
- LSE: 6.8824
- Penalty term: 0.1794
- Bar graphs comparing estimated weights and estimated b's for wt alone runs on 2014a vs. 2014b. (Note: estimated production rates not included because the earlier version of the code did not estimate production rates). There appears to be negligible difference between the two runs.
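To put the two wt-alone runs side by side numerically, the relative change in the two scalar outputs can be computed from the LSE and penalty values reported above. This is only a minimal sketch: the helper name `relative_change` is made up for illustration, and it summarizes just these two numbers, not the estimated weights and b values shown in the bar graphs.

```python
# LSE and penalty values as reported above for the two wt-alone runs.
runs = {
    "2014a": {"lse": 7.0809, "penalty": 0.0814},
    "2014b": {"lse": 6.8824, "penalty": 0.1794},
}


def relative_change(old, new):
    """Fractional change going from the old value to the new value."""
    return (new - old) / old


for key in ("lse", "penalty"):
    change = relative_change(runs["2014a"][key], runs["2014b"][key])
    print(f"{key}: {change:+.1%}")
```

With these inputs the LSE changes by about -2.8% and the penalty term by about +120% between the 2014a and 2014b runs.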

## dCIN5 alone

- Input sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model dCIN5 hard0 NW.xls
- Output sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model dCIN5 hard0 estimation output NW.xls
- Figures: Media:DCIN5 figures NW.zip
- analysis.xlsx containing bar graphs
- LSE: 7.4496
- Penalty term: 0.2174
- GRNsight

## dGLN3 alone

- Input sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model GLN3 estimation NW.xls
- Output sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model GLN3 estimation output NW.xls
- Figures: Media:DGLN3 figures NW.zip
- analysis.xlsx containing bar graphs
- LSE: 9.5367
- Penalty term: 0.3792
- GRNsight Network

## dHMO1 alone

- Input sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model dHMO1 estimation NW.xls
- Output sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model dHMO1 estimation output NW.xls
- Figures: Media:DHMO1 figures NW.zip
- analysis.xlsx containing bar graphs
- LSE: 6.9139
- Penalty term: 0.1292
- GRNsight

## dZAP1 alone

- Input sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model dZAP1 estimation NW.xls
- Output sheet: Media:2015.05.28.Input 21 Gene Network Sigmoid Model dZAP1 estimation output NW.xls
- Figures: Media:DZAP1 figures NW.zip
- analysis.xlsx containing bar graphs
- LSE: 6.9793
- Penalty term: 0.3216
- GRNsight

# Results, Multiple Strains

## All Strains

- Input sheet: Media:GJ input 21 Gene Network Sigmoid Model Estimate all strains.xlsx
- Output sheet: Media:GJ input 21 Gene Network Sigmoid Model Estimate all strains estimation output.xlsx
- Figures: Media:GJ all strains Figures.zip
- analysis.xlsx containing bar graphs
- LSE: 45.7010
- Penalty term: 0.7328

## wt vs. dCIN5

- Input sheet: Media:Input 21 Gene Network Sigmoid Model Estimate WTCIN5 NW.xlsx
- Output sheet: Media:Input 21 Gene Network Sigmoid Model Estimate WTCIN5 estimation output NW.xls
- Figures: Media:Wt dCIN5 figures NW.zip
- analysis.xlsx containing bar graphs
- LSE: 15.1196
- Penalty term: 0.4994
- Weighted Network visualized

## wt vs. dGLN3

- Input sheet: Media:Input 21 Gene Network Sigmoid Model Estimate WTvdGLN3 NW.xlsx
- Output sheet: Media:Input 21 Gene Network Sigmoid Model Estimate WTvdGLN3 estimation output NW.xls
- Figures: Media:Wt dGLN3 figures NW.zip
- analysis.xlsx containing bar graphs
- LSE: 18.1196
- Penalty term: 0.3529
- Weighted Network

## wt vs. dHMO1

- Input sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dHMO1.xlsx
- Output sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dHMO1 estimation output.xlsx
- Figures: Media:GJ wt dHMO1 Figures.zip
- analysis.xlsx containing bar graphs
- LSE: 15.5341
- Penalty term: 0.1893
- Weighted Network

## wt vs. dZAP1

- Input sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dZAP1s.xlsx
- Output sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dZAP1s estimation output.xlsx
- Figures: Media:GJ wt dZAP1 Figures.zip
- analysis.xlsx containing bar graphs
- LSE: 18.1215
- Penalty term: 0.1377
- Weighted Network

## wt + dCIN5 + dZAP1

- Input sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1s.xlsx
- Output sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1s estimation output.xlsx
- Figures: Media:Wt dCIN5 dZAP1.zip
- analysis.xlsx containing bar graphs
- LSE: 26.6846
- Penalty term: 0.1682
- Weighted Network

## wt + dCIN5 + dZAP1 + dGLN3

- Input sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1 dGLN3.xlsx
- Output sheet: Media:GJ Input 21 Gene Network Sigmoid Model Estimate wt dCIN5 dZAP1 dGLN3 estimation output.xlsx
- Figures: Media:GJ wt dCIN5 dZAP1 dGLN3 Figures.zip
- analysis.xlsx containing bar graphs
- LSE: 38.0868
- Penalty term: 0.2310
- GRNsight figure of weighted network:

# Discussion

- Excel sheet comparing output weights for CIN5, FHL1, PHD1 and SKN7 regulators, estimated b values, and estimated production rates for all the above strain combinations: Media:GJ Estimated weight output comparison all combinations.xlsx
- Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Which genes have the worst fit between the model and actual data? Why do you think that is? (Hint: how many inputs do these genes have?) How does this help you to interpret the microarray data?
- Which genes showed the largest dynamics over the timecourse? In other words, which genes had a log fold change that is different from zero at one or more timepoints? The p values from the Week 11 ANOVA analysis are informative here. Does this seem to have an effect on the goodness of fit (see question above)?
- Which genes showed differences in dynamics between the wild type and the other strain your group is using? Does the model adequately capture these differences? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- What other questions should be answered to help us further analyze the data?
  - Production rate vs. degradation rate. How do these combine?
  - ANOVA p-value for within strain
    - Magnitude (large dynamics)?
    - Variance (spread of the data points)?
    - Some combination of the two?

- Fit of the model vs. parameter value stability
- To view the analysis for self-regulating genes, please view the following document: Media:2015.06.08.AutoReg Investigation NW.docx

- PowerPoint analyzing genes with no inputs and genes that only regulate themselves: Media:GRNmap Testing Analysis.pptx
- PowerPoint containing some genes that have poor T60 fits to the provided data: Media:Poor Fitting T60 Model to Data Points.pptx
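
As one possible starting point for the comparison questions above, the LSE of each multi-strain run can be set against the sum of the LSEs from the corresponding single-strain runs. The sketch below only does arithmetic on the values reported in the results sections. It rests on two assumptions that this report does not confirm: that "All Strains" means the five strains listed individually, and that LSE values from separate runs can be meaningfully added. The function names are hypothetical.

```python
# LSE values copied from the results sections of this report.
INDIVIDUAL_LSE = {
    "wt": 6.8824,  # wt alone, 2014b run
    "dCIN5": 7.4496,
    "dGLN3": 9.5367,
    "dHMO1": 6.9139,
    "dZAP1": 6.9793,
}

COMBINED_LSE = {
    ("wt", "dCIN5"): 15.1196,
    ("wt", "dGLN3"): 18.1196,
    ("wt", "dHMO1"): 15.5341,
    ("wt", "dZAP1"): 18.1215,
    ("wt", "dCIN5", "dZAP1"): 26.6846,
    ("wt", "dCIN5", "dZAP1", "dGLN3"): 38.0868,
    # Assumption: "All Strains" refers to the five strains listed above.
    ("wt", "dCIN5", "dGLN3", "dHMO1", "dZAP1"): 45.7010,
}


def summed_individual_lse(strains):
    """Sum of the single-strain LSE values for the given strains."""
    return sum(INDIVIDUAL_LSE[s] for s in strains)


def excess_lse(strains):
    """Combined-run LSE minus the summed single-strain LSE values."""
    return COMBINED_LSE[tuple(strains)] - summed_individual_lse(strains)


for strains, combined in COMBINED_LSE.items():
    label = " + ".join(strains)
    print(
        f"{label:36s} combined={combined:8.4f} "
        f"sum={summed_individual_lse(strains):8.4f} "
        f"excess={excess_lse(strains):+.4f}"
    )
```

With the reported values, the combined LSE is larger than the summed individual LSEs for every combination, from about 0.79 (wt + dCIN5) to about 7.94 (all strains). Whether that gap is meaningful depends on how GRNmap normalizes the LSE, which this report does not state.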