GRNmap Testing Report: Non-1 Initial Weight Guesses 2015-05-28

Purpose

  • The purpose of this test is to see how model outputs for the same network are affected by different initial weight guesses (other than 1, the standard on which all previous models have been run). By looking at LSE values and estimated parameters, we hope to discover if there is any bias toward certain initial weight guesses.
  • GitHub Issue #95: [1]

Test Conditions

Results

Unweighted network used in the following comparisons

Two strain Comparison

wt and dHMO1, initial weight guess: 0

Individual Strains

wt alone, all initial weights 0

dCIN5, all initial weights 0

dGLN3, all initial weights 0

dHMO1, all initial weights 0

dZAP1, all initial weights 0

Wt Alone with various initial weights

wt alone, Random Positive and Negative values for weights assigned 1

To create this sheet, the following formula was entered into the network_weights tab:

=IF(network!D16=1,(RANDBETWEEN(-1,1)+ROUND((RAND()),3)),0)

Next, the resulting weight matrix was copied, and its values (not the formulas) were pasted into a new worksheet. Because the formula can return values between -1 and 2, one was subtracted from every value greater than one. The formula used for that was:

=1.xyz - 1, where x, y, and z are the three decimal digits (each from 0-9) of the value in the cell. The resulting number in the cell was 0.xyz

The workbook was then saved under another name with a .xlsx extension.
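
The spreadsheet steps above can be sketched in Python as a rough illustration. This is not part of GRNmap or the original workflow; the function name `random_initial_weights`, the `seed` argument, and the small example adjacency matrix are all hypothetical and only mirror the Excel formula and the manual "subtract one" step.

```python
import random


def random_initial_weights(adjacency, seed=None):
    """Return a matrix of random initial weights in [-1, 1].

    Hypothetical helper mirroring the worksheet procedure:
    cells equal to 1 in the unweighted network get a random value,
    all other cells stay 0.
    """
    rng = random.Random(seed)
    weights = []
    for row in adjacency:
        new_row = []
        for cell in row:
            if cell == 1:
                # Equivalent of RANDBETWEEN(-1,1) + ROUND(RAND(),3):
                # an integer in {-1, 0, 1} plus a 3-decimal fraction.
                value = rng.randint(-1, 1) + round(rng.random(), 3)
                # Manual step from the worksheet: 1.xyz - 1 = 0.xyz
                if value > 1:
                    value -= 1
                # Round again to remove floating-point noise.
                new_row.append(round(value, 3))
            else:
                new_row.append(0.0)
        weights.append(new_row)
    return weights


# Example with a made-up 3-gene unweighted network (assumption).
example_network = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
]
example_weights = random_initial_weights(example_network, seed=42)
```

Passing a fixed `seed` makes a run reproducible, which the spreadsheet achieves by pasting values instead of formulas.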

wt alone, all initial weights 3

All strains with various initial weights

all strains, all initial weights 0

all strains, initial weights -1

all strains, Random distribution of weights (-1,1) assigned 1: Run 1

all strains, Random distribution of weights (-1,1) assigned 1: Run 2

all strains, Random distribution of weights (-1,1) assigned 1: Run 3

all strains, all initial weights 3

all strains, all initial weights -3

all strains, all initial weights 10

all strains, initial weights distributed between -3 and 3

Discussion

  • Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
    • In comparing the all-strains output sheets, we noticed a trend in the production rates as well as the threshold values: the runs with initial weights of 1 and 3 had more values in common with each other, while the all-0 run and the run with weights randomly distributed between -1 and 1 were similar to each other.
    • In analyzing the LSE, it appears that a slightly larger value is obtained when the initial weights are biased toward positive numbers.
    • To view the comparison of all strains: Media:NW All strains 0 1 3 rand.xlsx
    • For wt alone, the estimated values agreed across runs up to the third decimal place (the thousandths place), where they began to differ. The LSEs for all the runs (initial weights set to 1, 0, or 3, and then randomly distributed between -1 and 1) were identical at 6.8824.
    • To view the comparison of wt alone: Media:2015.06.01.Comparing wt 0 1 3 random NW.xlsx
    • To view the comparison of all strains that included the new weight values: -1, -3, 10, and the 3 additional random distribution of weights, click Media:2015.06.02.Comp all strains 0 pn1 pn3 10 4rand.xlsx
  • What other questions should be answered to help us further analyze the data?

Summary ppt for 2015-06-03 meeting: Media:GRNmap Testing.pptx

Preliminary Analysis of Resulting Data from Strain Run Comparisons and Non-One Initial Weight Estimations:
Media:2015.06.03.Strain Run Comparison Weight Estimation KG NW.docx

  • Discuss the results of the test with regard to the stated purpose. Additionally, answer the relevant questions below:
    • Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Which genes have the worst fit between the model and actual data? Why do you think that is? (Hint: how many inputs do these genes have?) How does this help you to interpret the microarray data?
    • Which genes showed the largest dynamics over the timecourse? In other words, which genes had a log fold change that is different from zero at one or more timepoints? The p values from the Week 11 ANOVA analysis are informative here. Does this seem to have an effect on the goodness of fit (see question above)?
    • Which genes showed differences in dynamics between the wild type and the other strain your group is using? Does the model adequately capture these differences? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
    • What other questions should be answered to help us further analyze the data?