GRNmap Testing Report, Testing WT with new format and degradation rates, 2016-10-26, NEW

== Test Conditions ==

* Date: 2016 Oct. 26
* Test Performed by: [[User:Natalie Williams|Natalie Williams]], [[Natalie Williams: Electronic Notebook]]
* Code Version: 1.4.4
* MATLAB Version: R2014b
* Computer on which the model was run: MXL626123M (boulardii 2)


== Purpose ==

* We are doing an initial run of our database-derived wild type (wt) network. Degradation rates differ between these runs and previous runs, and the most up-to-date gene connections may also be used, causing some variance in the network itself. If a specific gene is in the network and the expression data from its deletion strain are available, that strain is run through the model (for reference, the model form is sketched below).
* Link to Issue # on GRNmap @ GitHub [https://github.com/kdahlquist/GRNmap/issues/265 #265].
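For reference, my understanding of the sigmoidal model that GRNmap fits is roughly the following production-degradation equation, where <math>P_i</math> is the production rate, <math>w_{ij}</math> are the regulatory weights, <math>b_i</math> is the threshold, and <math>\lambda_i</math> is the fixed degradation rate that differs between these runs and the previous ones:

<math>\frac{dx_i}{dt} = \frac{P_i}{1 + e^{-\left(\sum_j w_{ij} x_j + b_i\right)}} - \lambda_i x_i</math>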


== Results ==

* Input sheet: [[Media:16-genes_36-edges_NW-wt-fam_Sigmoid_estimation_all_strains.xlsx]]
* Output sheet: [[Media:NEW-16-genes_36-edges_NW-wt-fam_Sigmoid_estimation_output-20161030.xlsx]]
* Output .mat file (zipped): [[Media:NEW-16-genes_31-edges_NW-wt-fam_Sigmoid_estimation_output_20161026.zip]]
** LSE: 0.819447
** Penalty term: 2.592287
** Number of iterations (counter): 109,718
* Figures (all expression graphs .jpg files zipped together): [[Media:NEW-wt_figures_20161030.zip]]
* Progress figure containing the counts (saved manually): [[Media:NEW-optimization diagnostic-wt-20161030.jpg]]
* analysis.xlsx containing bar graphs: [[Media:]]
* GRNsight figure of unweighted network: [[Image:|thumb|center|400px]]
* GRNsight figure of weighted network: [[Image:NEW weighted wt network-20161030.png|thumb|center|400px]]
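For later comparison across strains, I am assuming that the quantity the optimizer minimizes is the LSE plus the penalty term (with the penalty acting as a regularization term on the estimated parameters); under that assumption, the total objective for this run is roughly

<math>0.819447 + 2.592287 \approx 3.41</math>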


== Discussion ==
Running the WT network through the model is the preliminary test and analysis of the GRNmap code and its output for this semester. We hope to get consistent results and comparable LSE/minLSE values for all strains before moving on to testing the random networks that will be generated by Brandon's R script. Those random networks will then be compared to our database-derived network (a possible way to summarize that comparison is sketched below). For that reason, we hope to get consistent results across the multiple runs we will do for each database-derived strain network.
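A hypothetical MATLAB sketch of how that comparison might eventually be summarized once the random-network runs exist; the variable names and values are placeholders, and the empirical-fraction approach is my assumption rather than an agreed protocol.

<pre>
% Hypothetical sketch (MATLAB): compare the LSE of the database-derived network
% against the LSEs of random networks generated by the R script.
% All values below are placeholders, not real results.
lse_db     = 0.819447;                    % LSE from this db-derived wt run
lse_random = [1.02 0.95 1.10 0.88];       % placeholder LSEs from random-network runs
fracBetter = mean(lse_random <= lse_db);  % fraction of random nets that fit as well or better
fprintf('%.0f%% of random networks fit as well as or better than the db-derived network\n', ...
        100*fracBetter);
</pre>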


* Discuss the results of the test with regards to the stated purpose.  Additionally, answer the relevant questions below:
** <b>Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Which genes have the worst fit between the model and actual data? Why do you think that is? (Hint: how many inputs do these genes have?) How does this help you to interpret the microarray data?</b>
**Overall, the model appears to capture the dynamics of each gene better than before. By this I mean that instead of seeing just two curves - where three strains are modeled by one curve and two by the other - each strain's model now appears independent of the others and has its own dynamics. The genes with the best fit are ASF1, ASH1 (except for the dCIN5 model), GCN4, GLN3, HAP4 (except for the dGLN3 model), and YHP1. The genes with the worst fit are YOX1 overall and the wt strain for ZAP1. I would not say that the other genes have clearly good or poor fits because, looking at their data points, there are large differences among some of the expression values at specific time points; I believe the model was doing its best to navigate through all of the points.
** <b>Which genes showed the largest dynamics over the timecourse? In other words, which genes had a log fold change that is different than zero at one or more timepoints. Does this seem to have an effect on the goodness of fit (see question above)?</b>
** The following genes showed dynamics greater than or equal to one log fold change (a quick way to flag such genes is sketched after this discussion list):
***ABF1 goes from 0 to -1 within the 60 minutes. However, the fit to the data is not the best, which is due to the wide range of data points for each individual strain type (wt vs. dCIN5 vs. dGLN3, etc.). The model overlays all of the strains under the dZAP1 model curve.
***AFT2 goes from 0 to 1. There is a difference between two identifiable strains (dHMO1 and dZAP1). These dynamics are for dZAP1 and most likely wt. The data points for dCIN5 and dHMO1 are not as widely spread, but also show a general decrease. The data points for dHAP4 show high log fold expression changes at 15 and 30 minutes, but decrease at 60 minutes.
***ASH1 goes from 0 to -1 for the dHAP4 model. The model fits the data points for dHAP4 well and follows the general trends. The model for dCIN5 is overlaid with another strain's model, and its data points suggest that there is a drop in expression compared to all other strains; yet there is no distinction between the other models (wt and dGLN3) and the dCIN5 model. Further, the dZAP1 model does not fit the 15-minute time point.
***CIN5 goes from 0 to 2 for the dZAP1 model, and it fits the dZAP1 log expression data points well. The dHAP4 model fits the data well at 30 and 60 minutes, but fails to accurately depict the expression measurements at t15.
***GCN4 goes from 0 to approximately -1.3 for the dGLN3 model. Overall, the fits of these models are better than in other figures. The model for dCIN5, however, does not represent the microarray data well at the given time points.
***HAP4 goes from 0 to -1.4 for the dCIN5 and dHMO1 models. Both of these models fit the provided data points at the given times well. While dGLN3 seems to be overlaid with dZAP1, its modeling of the network does not fit the given data points well. Further, wt gene expression is not modeled well for HAP4: instead of showing the large drop in log expression, the wt model is non-distinctive, whereas the data points show a clear difference in expression compared to the other strains.
***HMO1 goes from 0 to 1.5 in the dZAP1 model. It appears that all other models, except dHMO1, show the same trends. I would say that the model fits the dZAP1 data points well, while the other strains, which follow similar trends, are not fit as well.
**<b> Which genes showed differences in dynamics between the wild type and the other strain your group is using? Does the model adequately capture these differences? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?</b>
**For clarification, I am only commenting on the genes where there is a distinct difference between the wt model and the other models. While there are differences in the runs between specific genes, other models may be overlaid on top of the wt model, so it is difficult to discern which model (e.g., dCIN5 vs. dHAP4) the wt modeling is grouped with.
**Only two genes had distinguishable wt models versus the other models:
***GCN4:
***GLN3:
** Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
** What other questions should be answered to help us further analyze the data?
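As referenced above, here is a hypothetical MATLAB sketch of how genes with the "largest dynamics" could be flagged from the log fold change data, using the greater-than-or-equal-to-one criterion stated in the discussion; the matrix values and variable names are placeholders, not the actual expression data or GRNmap's data structures.

<pre>
% Hypothetical sketch (MATLAB): flag genes whose log2 fold change reaches a
% magnitude of at least 1 at any timepoint (the criterion used above).
% Placeholder values only; the real values come from the expression sheets
% in the input workbook.
geneNames = {'ABF1'; 'AFT2'; 'ASH1'};
log2fc    = [ -0.2 -0.6 -1.0;    % rows are genes,
               0.4  0.8  1.0;    % columns are the 15, 30, and 60 minute timepoints
              -0.3 -0.7 -1.0 ];
isDynamic    = any(abs(log2fc) >= 1, 2);   % true if |log2 FC| >= 1 at any timepoint
dynamicGenes = geneNames(isDynamic)        % genes showing the largest dynamics
</pre>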
Back to Natalie's Electronic Notebook [[Natalie Williams: Electronic Notebook|here]]


[[Category:GRNmap]]
[[Category:Dahlquist Lab]]
