The purpose of this test is to determine how the model outputs for the same network are affected by different initial weight guesses (other than 1, the standard value used for all previous model runs). By comparing LSE values and estimated parameters, we hope to discover whether there is any bias toward certain initial weight guesses.
Both the LSE and penalty term were the same as in the control (w=1) run; the number of iterations may have differed, but I did not save the counter image for this run. I will do so for future runs.
Initial quick comparisons show that the estimated weights, b values, and production rates differed slightly (at the third decimal place) between the w=1 and w=0 runs.
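To make this comparison systematic rather than by eye, the two output workbooks can be diffed with a short script. A minimal sketch in Python, assuming the two runs were saved as wt_w1_output.xlsx and wt_w0_output.xlsx and that the estimated weights live on a sheet named network_optimized_weights (substitute your actual file and sheet names):

    # Sketch: compare estimated weight matrices between two runs.
    # File names and the sheet name are assumptions; substitute your own.
    import pandas as pd

    control = pd.read_excel("wt_w1_output.xlsx",
                            sheet_name="network_optimized_weights", index_col=0)
    test = pd.read_excel("wt_w0_output.xlsx",
                         sheet_name="network_optimized_weights", index_col=0)

    diff = (control - test).abs()  # element-wise absolute differences

    # Largest discrepancy, and whether any entry differs beyond the third decimal
    print("max |difference| =", diff.to_numpy().max())
    print("differs beyond 0.001:", (diff > 1e-3).any().any())

The same comparison can be repeated for the b-value and production rate sheets.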
Wt Alone with various initial weights
wt alone, with random positive and negative values replacing the weights originally assigned 1
To create this sheet, the following formula was entered into the network_weights tab:
Next, the resulting adjacency matrix was copied and its values were pasted into a new worksheet. For values greater than one, one was subtracted; the formula used for that was:
=1.xyz - 1, where .xyz is a random decimal (each of x, y, z a digit from 0-9). The resulting value in the cell was 0.xyz.
The workbook was then saved under another name with a .xlsx extension.
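For reproducibility, the same randomization can be scripted instead of being done cell by cell in the spreadsheet. A minimal sketch in Python that mirrors the steps above; it assumes the network_weights sheet name from the procedure, and the gene names and output file name are placeholders:

    # Sketch: generate a random adjacency matrix of initial weight guesses
    # in [-1, 1) and save it to a new .xlsx workbook, mirroring the manual
    # spreadsheet steps above. Gene names and file name are placeholders.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng()
    genes = ["gene1", "gene2", "gene3"]  # replace with your network's genes

    # Random positive and negative initial weights, one per possible edge
    weights = rng.uniform(-1.0, 1.0, size=(len(genes), len(genes)))
    matrix = pd.DataFrame(weights, index=genes, columns=genes)

    # Save under a new name so the original input workbook is untouched
    with pd.ExcelWriter("wt_random_weights.xlsx") as writer:
        matrix.to_excel(writer, sheet_name="network_weights")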
Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
In comparing the all strains output sheets, we noticed a trend in the production rates as well as the threshold (b) values: the runs with initial weights of 1 and 3 had more values in common with each other, while the all-0 run and the run with weights randomly distributed between -1 and 1 were similar to each other.
In analyzing the LSE, it appears that a slightly larger value is obtained when the initial weight guesses are biased toward positive numbers.
For wt alone, the estimated values differed only at the third decimal place (the thousandths). The LSEs for all of the runs (initial weights set at 1, 0, and 3, and then randomly distributed between -1 and 1) were identical at 6.8824.
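As a reminder of what is being compared: the LSE reported by the model is a least squares (sum of squared residuals) error between the modeled and observed expression values. A minimal sketch of that computation; the arrays here are illustrative placeholders (genes x timepoints), not the model's actual data or variable names, and any penalty-term weighting is omitted:

    # Sketch: least squares error between modeled and observed log fold changes.
    # Arrays are illustrative placeholders (genes x timepoints).
    import numpy as np

    observed = np.array([[0.10, 0.35, -0.20],
                         [-0.05, 0.12, 0.30]])  # microarray data
    modeled = np.array([[0.08, 0.33, -0.18],
                        [-0.02, 0.15, 0.28]])   # model output, same timepoints

    lse = np.sum((modeled - observed) ** 2)     # sum of squared residuals
    print(f"LSE = {lse:.4f}")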
Discuss the results of the test with regard to the stated purpose. Additionally, answer the relevant questions below:
Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Which genes have the worst fit between the model and actual data? Why do you think that is? (Hint: how many inputs do these genes have?) How does this help you to interpret the microarray data?
Which genes showed the largest dynamics over the timecourse? In other words, which genes had a log fold change significantly different from zero at one or more timepoints? The p values from the Week 11 ANOVA analysis are informative here. Does this seem to have an effect on the goodness of fit (see the question above)?
Which genes showed differences in dynamics between the wild type and the other strain your group is using? Does the model adequately capture these differences? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
What other questions should be answered to help us further analyze the data?