Natalie Williams Week 14
Running the GRNmodel on MATLAB
April 22, 2015
I ran a version of the model before coming to class. To view the outputs and figures, download the file at the following link:
Scer vs. Spar
April 23, 2015
In class:
This file output was incorrect due to the optimization parameters. iestimate was set to 0, when it should have been equal to 1. The alpha input was also set to 1.00E-10 when it should have been equal to 0.01.
These changes were made, and the model was rerun during class.
The following figures are from the MATLAB estimation outputs.
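To illustrate why the alpha value matters, here is a minimal sketch. It assumes the estimation minimizes a least-squares error plus an alpha-weighted penalty on the size of the parameters; that general form is an assumption for illustration, not a reproduction of the model's actual objective. The function name and all numbers other than the two alpha values are hypothetical.

```python
def penalized_error(residuals, params, alpha):
    """Mean squared error plus alpha times the sum of squared parameters.

    Assumed form of a penalized least-squares objective (illustration only).
    """
    lse = sum(r * r for r in residuals) / len(residuals)
    penalty = alpha * sum(p * p for p in params)
    return lse + penalty

# Hypothetical residuals (model minus data) and parameter values.
residuals = [0.1, -0.2, 0.05]
params = [2.0, -4.0, 1.0]

# The two alpha values mentioned in the notebook entry.
weak = penalized_error(residuals, params, alpha=1.00e-10)
strong = penalized_error(residuals, params, alpha=0.01)

print(weak, strong)
```

With alpha at 1.00E-10 the penalty term is effectively zero, so large parameter values cost nothing; with alpha at 0.01 they contribute noticeably to the error being minimized.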
Out of Class
To visualize the first run, where the parameters and production rates were optimized and "guessed", I had to delete the out_network_weights sheet from the workbook. I then renamed the out_network_optimized_weights sheet to network_optimized_weights. When that workbook was uploaded to GRNsight, the colors and the thicknesses of the lines were displayed.
Analyzing the Results of the GRNmodel
Graphs
Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Why do you think that is? How does this help you to interpret the microarray data?
Fixed_b
- STB5: seems to fit the data well; the wt and spar data points are dispersed evenly along the curve
- YOX1: the model fits the data points pretty well
- CYC8: these data points are distributed evenly on the curve of the model and do not stray far from the model's curve - no strong outliers
- YLR278C: seems to fit the curve well; the wt and spar are dispersed randomly for the time points
- SWI3: data points are close to the curve
- MGA2: looks even on the curve
- ZAP1: looks good
- CIN5: looks good; wt sits above the model's curve while spar sits below it
Estimated_b
- YOX1: data fits the model pretty well; for 30 and 60 see that spar sits above the curve while wt is below
- CYC8: pretty evenly dispersed on the curve
- SWI3: the model fits the data well; the data points lie close to the curve
- MGA2: model lies in the middle of the data points
- ZAP1: even though the data points are spaced out, the model fits the general trend and falls within the middle of the points
- CIN5: the data points are spaced out, yet the model fits the trend that the data points lay out
I believe these genes fit the model best because of the number of inputs going into them. In the normalized weight values, the arrows or blunt-ended lines into these genes are thin, meaning each regulator has only moderate control (shared with other transcription factors) over the target gene. Because of this moderation, the optimized values could fit the data points better and fall in the middle of their respective time points. Also, genes whose data points lie closer together will have a better overall fit because the model is not trying to accommodate outliers in the data.
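The point about outliers can be made concrete with a small sketch. It scores a gene's fit as the sum of squared differences between the model curve and the replicate data points at each time point. The function name, the time points, and every value below are hypothetical and are not taken from the output workbooks.

```python
def sum_squared_error(model, data):
    """Sum of squared differences between model values and replicate data.

    model: dict mapping time point -> model value
    data:  dict mapping time point -> list of replicate measurements
    """
    return sum((x - model[t]) ** 2 for t, reps in data.items() for x in reps)

# Hypothetical model curve and replicate data (log fold changes).
model_curve = {15: 0.2, 30: 0.5, 60: 0.8}
tight_data = {15: [0.1, 0.3], 30: [0.4, 0.6], 60: [0.7, 0.9]}
outlier_data = {15: [0.1, 0.3], 30: [0.4, 2.0], 60: [0.7, 0.9]}

tight = sum_squared_error(model_curve, tight_data)
with_outlier = sum_squared_error(model_curve, outlier_data)

print(tight, with_outlier)
```

A single distant replicate dominates the error for that gene, which is consistent with the observation that genes with tightly clustered data points appear to fit best.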
Between Strains
Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
Overall, the dynamics of the model for the wild types of S. cerevisiae and S. paradoxus were fairly linear. However, for some genes, the curve fit one species better than the other. Generally, the model curve seems to accommodate the S. paradoxus data points more closely, even though the data points of both species varied considerably.
Fixed_b
- STB5: down regulated initially and remained down regulated; expression level was -1
- YHP1: seems initially down regulated before it becomes slightly up-regulated
- MIG2: seems to accommodate for the two high wt data points at 30 minutes
- MCM1: up-regulated continuously until its expression was equal to 1
- ZAP1: up-regulated as well; saw expression levels almost equal to 1
- CIN5: up-regulated; almost looks sinusoidal
Estimated_b
- STB5: negative trend towards expression level -1
- YHP1: initially greatly down regulated before its expression level heads towards +1
- FKH2: it is down regulated toward expression levels of -1
- YLR278C: down regulated; expression level at -1
- MIG2: seems to accommodate for time 30 before it is strongly down regulated to an expression level below 0
- MCM1: up-regulated
- ZAP1: up regulated to expression level 1
- CIN5: up-regulated with a steady rate of change toward expression level 1
Frequencies, Weights, and Production Rates
Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
Because we are not looking at a deletion strain, the results are going to be similar. No transcription factor is being deleted, so no other factor in the network has to compensate for a deletion. Therefore, the slight differences in the weights and production rates come from the estimation of the threshold value. Instead of the threshold being set equal to 0, MATLAB has to predict the point at which each gene in the network switches between expression and repression. Because the threshold values are now optimized, the weights and production rates also shift as the model solves for the best "b" values.
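To show how a threshold value can shift the result, here is a minimal sketch. It assumes a sigmoidal production function of the form p / (1 + exp(-(input - b))), where the input is the weighted sum of regulator expression; this form, the function name, and all numbers are assumptions for illustration rather than the model's actual code.

```python
import math

def production(weighted_input, b, p=1.0):
    """Assumed sigmoidal production: p / (1 + exp(-(weighted_input - b)))."""
    return p / (1.0 + math.exp(-(weighted_input - b)))

# Same hypothetical regulatory input, two threshold settings.
fixed = production(0.5, b=0.0)       # threshold fixed at 0
estimated = production(0.5, b=0.5)   # hypothetical estimated threshold

print(fixed, estimated)
```

Under this assumption, moving b changes the production term for the same regulatory input, so the optimizer has to adjust the weights and production rates to keep the curve close to the data.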
Optimized Weights
There were small differences among the optimized weights when the threshold value was fixed vs. estimated. The general trend was that the estimated b runs resulted in slightly smaller values. In the comparison chart, the greatest difference was for CIN5 as the controller targeting MIG2 (-4.95 vs. -4.01).
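A comparison like this can also be done programmatically rather than by reading the bar chart. The sketch below finds the edge with the largest absolute difference between two sets of weights. Only the CIN5 to MIG2 pair of values comes from this entry (assuming -4.95 is the fixed-b weight and -4.01 the estimated-b weight); the other edge, its values, and the function name are hypothetical placeholders.

```python
def largest_difference(fixed, estimated):
    """Return the (controller, target) edge whose weight differs most."""
    return max(fixed, key=lambda edge: abs(fixed[edge] - estimated[edge]))

# Keys are (controller, target) pairs.
fixed_b = {
    ("CIN5", "MIG2"): -4.95,       # value reported above (assumed fixed b)
    ("GENE_A", "GENE_B"): 1.20,    # hypothetical placeholder edge
}
estimated_b = {
    ("CIN5", "MIG2"): -4.01,       # value reported above (assumed estimated b)
    ("GENE_A", "GENE_B"): 1.10,    # hypothetical placeholder edge
}

edge = largest_difference(fixed_b, estimated_b)
gap = abs(fixed_b[edge] - estimated_b[edge])

print(edge, round(gap, 2))
```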
Production Rates
Small differences were seen between the estimated and fixed b value outputs. Roughly equal numbers of genes had a larger production rate with estimated b as with fixed b. The highest production rate was around 2.
Documents
To view the documents for Week 14:
Input workbook with fixed b values: here
Output workbook with fixed b values: here
Input workbook for estimated b values: here
Output workbook for estimated b values: here
Updated Powerpoint: here
Single Runs
I ran both S. cerevisiae and S. paradoxus by themselves on MATLAB. I then plotted their output graphs on top of each other.
Input for Scer ONLY: here
Input for Spar ONLY: here
Back to User Page: User:Natalie Williams
To view the Course and Assignments: BIOL398-04/S15