Electronic Notebook

Assignment for 4/21/15

Instructed in class to run the GRNmodel MATLAB code, insert our network when prompted, and then upload all resulting figures as JPGs to OpenWetWare.
I was able to run my network, and it produced 20 figures. However, the run then terminated with the errors shown in the following picture (I apologize for it being upside down; no matter what I tried, I was unable to flip the image):
I am unsure whether the code ran to completion, as I have never seen these errors before and the run did not reach the point we got to in class, where two figures showed changing plots. Dr. Fitzpatrick said those might take a while to run through, but my program never got that far, likely because of the errors above.

Pictures of resulting graphs

Changed long list of figures to zip file
Media: HorstmannWeek14_Run1.zip

Notes and Errors

After comparing Kara's and my MATLAB figures, we realized we had different numbers of data points for time point 30. We went back to our dzap1 and wt sheets in the Excel workbook and found that I had mislabeled the first time point 60 as a fifth data point for 30 in dzap1 and had also added too many data points for 30 in wt. This is why our figures looked slightly different.
This error was corrected, and the corrected figures are in the following zip file:
- ZIP FILE
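A quick way to catch this kind of labeling slip is to count how many columns carry each time-point label and compare the counts between two sheets. The sketch below is only an illustration: the header lists are made up, and the four-replicates-per-time-point layout is an assumption chosen to match the "fifth data point for 30" described above.

```python
from collections import Counter


def replicate_counts(headers):
    """Count how many data columns carry each time-point label."""
    return Counter(headers)


def count_mismatches(counts_a, counts_b):
    """Return {time_point: (count_in_a, count_in_b)} where the two sheets disagree."""
    all_points = sorted(set(counts_a) | set(counts_b))
    return {
        t: (counts_a.get(t, 0), counts_b.get(t, 0))
        for t in all_points
        if counts_a.get(t, 0) != counts_b.get(t, 0)
    }


# Hypothetical header rows, one entry per replicate column.
correct = [15] * 4 + [30] * 4 + [60] * 4
mislabeled = [15] * 4 + [30] * 5 + [60] * 3  # first 60 column labeled as 30

print(count_mismatches(replicate_counts(correct), replicate_counts(mislabeled)))
# {30: (4, 5), 60: (4, 3)}
```

Any non-empty result points to the time points whose column labels should be rechecked by hand.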

Assignment for 4/23/15

In the optimization_parameters sheet of the old input workbook [[Media:]], we set fix_b to 1, fix_P to 0, iestimate to 1, and alpha to 0.01
Ran GRNmodel through MATLAB and uploaded input workbook when prompted
Saved all graphs as jpegs
Created a new worksheet called estimated_weights in the estimation output xlsx workbook
- Created a column of labels of the form ControllerGeneA -> TargetGeneB, replacing the generic names with the standard gene names for each regulatory pair in the network. The controller gene comes from the top row of the weight matrix, which names the columns; the target gene comes from the first column, which names the rows. For example, a non-zero number in the cell in the "ACE2" column and the "YOX1" row gets the regulatory pair label ACE2->YOX1
Extracted non-zero optimized weights from worksheet and put them in a single column next to the corresponding ControllerGeneA -> TargetGeneB label. This column was labeled "weights fixed-b"
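The labeling and extraction steps above were done by hand in Excel, but the same logic can be sketched in a few lines of Python. This is only an illustration: the gene list and weight values below are made up, and the matrix is assumed to be laid out as described above (columns are controllers, rows are targets).

```python
def labeled_weights(genes, matrix):
    """List (label, weight) for every non-zero cell.

    matrix[i][j] is the weight in row genes[i] (the target) and
    column genes[j] (the controller).
    """
    pairs = []
    for i, target in enumerate(genes):
        for j, controller in enumerate(genes):
            weight = matrix[i][j]
            if weight != 0:
                pairs.append((f"{controller} -> {target}", weight))
    return pairs


# Hypothetical 3-gene example; the values are illustrative only.
genes = ["ACE2", "YOX1", "ZAP1"]
weights = [
    [0.0, 0.0, -0.8],  # ACE2 row
    [0.5, 0.0, 0.0],   # YOX1 row
    [0.0, 0.0, 0.3],   # ZAP1 row
]

for label, weight in labeled_weights(genes, weights):
    print(label, weight)
# ZAP1 -> ACE2 -0.8
# ACE2 -> YOX1 0.5
# ZAP1 -> ZAP1 0.3
```

The non-zero cell in the ACE2 column and YOX1 row comes out as "ACE2 -> YOX1", matching the labeling rule used for the worksheet.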
Saved the input workbook as a new file and changed the input parameter fix_b to 0. Changing fix_b to 0 means that the b values are now estimated rather than held fixed. Reran GRNmodel with this new input sheet.
Repeated the weight-extraction step with the new output and copied the new weight information into the column directly next to the "weights fixed-b" column. Labeled the new column "weights estimated-b"
Through excel, created bar chart to compare the "fixed b" and "estimated b" weights
Repeated copying, labeling, and bar-chart process with the production rates of the fixed and estimated-b outputs
Copied the two bar charts into powerpoint with meaningful slide titles
Used GRNsight to visualize the output of each of my models
Arranged genes in the same order used to display them in Week 12 assignment for both output runs
Took a screenshot of each of the results and pasted it into PPT presentation with clear labels
- Note: GRNsight displays results differently after weights have been estimated
- Sign of Weight
  - If weight is positive, the edge will be a regular (pointy) arrowhead to indicate activation
  - If negative, the edge will be a blunt arrowhead (line segment perpendicular to the edge direction) to indicate repression
- Thickness of the edge
  - determined by absolute value of the weight
  - specifically, GRNsight divides every weight by the largest absolute weight in the matrix, so the normalized magnitudes fall between 0 and 1
  - GRNsight then sets the edge thickness from this 0-to-1 scale
    - Minimum thickness: normalized magnitudes near 0 (smaller magnitude)
    - Maximum thickness: normalized magnitudes near 1 (larger magnitude)
- Color of the edge
  - Magenta: edges with positive normalized weight values from 0.05 to 1
  - Cyan: edges with negative normalized weight values from -0.05 to
    - These are near 0, and thus, have a weak influence on the target gene
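The sign and thickness rules noted above can be sketched as follows. This is not GRNsight's actual code, only my reading of the rules: the function name and the example weights are hypothetical, and at least one non-zero weight is assumed.

```python
def describe_edges(weights):
    """Map each edge label to (normalized magnitude, arrowhead style).

    weights: dict of label -> optimized weight; at least one must be non-zero.
    """
    max_abs = max(abs(w) for w in weights.values())
    if max_abs == 0:
        raise ValueError("all weights are zero, so nothing can be normalized")
    described = {}
    for label, w in weights.items():
        normalized = abs(w) / max_abs  # 0 = thinnest edge, 1 = thickest edge
        arrowhead = "pointed (activation)" if w > 0 else "blunt (repression)"
        described[label] = (normalized, arrowhead)
    return described


# Hypothetical weights, for illustration only.
example = {"CIN5 -> MIG2": -2.0, "ACE2 -> YOX1": 0.5}

for label, info in describe_edges(example).items():
    print(label, info)
# CIN5 -> MIG2 (1.0, 'blunt (repression)')
# ACE2 -> YOX1 (0.25, 'pointed (activation)')
```

The edge with the largest absolute weight always gets a normalized magnitude of 1, so thickness is relative to each network rather than comparable across runs.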
Uploaded PPT, two input workbooks, and two output workbooks to OpenWetWare and linked to them in my individual journal. Also uploaded two zipped files with the jpg Matlab results and an excel sheet with the weights and production rates used to create bar graph
Interpreted results
- Examined the various graphs, paying special attention to which genes in the model have the closest fit between the model data and actual data
  - Suggested reasons that may account for this
  - Tried to draw a connection for how this helps me interpret the microarray data
- Looked for the gene(s) that showed the largest dynamics over the time course
- Looked for the gene(s) that showed the biggest difference in dynamics between wild type and dzap1
  - Attempted to connect this to my results from the GRNsight visualization
- Examined bar charts that compared the weights and production rates between the two runs.
  - Noted any major differences between the two runs, and tried to offer an explanation for why this was
  - Attempted to connect this to my results from the GRNsight visualization

Results

fix_b to 1

Input:
- Media:Input_4_gene_forward_correct_params_KMH_take2_corrected.xlsx
Output:
- Media:Input_4_gene_forward_correct_params_KMH_take2_corrected_estimation_output.xlsx
Zip File of all figures:
- Media:JPGS_of_non-estimated_b_Horstmann.zip

fix_b to 0

Input:
- Media:Input_4_gene_forward_correct_params_KMH_take2_estimate-b.xlsx
Output:
- Media:Input_4_gene_forward_correct_params_KMH_take2_estimate-b_estimation_output.xlsx
Zip File of all figures:
- Media:JPGS_of_estimated_b_Horstmann.zip

Overall Documents

Powerpoint with all requested results and figures including bar charts:
- Media:KristenHorstmann_Week14_figuresandcharts.ppt
Excel sheet with estimated weights, production rates, and bar charts:
- Media:Horstmann_Output_Estimated_Weights_and_Production_Rates.xlsx

Interpretation of Results

Examine the graphs that were output by each of the runs.

Which genes in the model have the closest fit between the model data and actual data? Why do you think that is? How does this help you to interpret the microarray data?
- It seems like CYC8 and SWI5 have the closest respective fits between model data and actual data. This indicates that the experimental errors were rather small and the data that was collected fits well with the model, possibly due to increased precision in the data for these factors. This helps me interpret the microarray data as it helps show how the data fits to times and how any outliers affect precision.
Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- PDR1, RIF1, and YLR278C showed the largest dynamics over time
- ACE2 showed the biggest difference in dynamics between the wild type and dzap1, as its graphs did not match each other between runs. This makes a lot of sense, as the network pretty clearly shows that ZAP1 is self-controlling and controls only one other gene, ACE2. It would therefore make sense that all the other graphs (except ACE2) match, because those genes are unaffected by dzap1.
Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- I did not notice any major differences between graphs. There seem to be minor changes for some of the transcription factors (such as the slope and where the model passes through the data points), most notably in CIN5, CYC8, HMO1, MSN2, and YLR278C. These changes were very small and are best seen by comparing the steepest slopes of the two graphs and where the model line meets the last data point.
- I cannot compare these connections to the GRNsight network as I was unable to get GRNsight to work (see "Notes and Difficulties")

Notes and Difficulties

Both Kara's and my first runs resulted in output weight values of only "1"
- After discussing this with Dr. Dahlquist, we noticed that the values in the parameters sheet had changed; specifically, alpha was set to 1x10^-10, not 0.01.
- This is still a mystery to both of us, as we had distinctly remembered changing these and found it odd that both of our alpha values were changed to such a dramatic difference
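In hindsight, a check like the one below before each run would have caught the changed alpha. It is only a sketch: the expected values are the ones we set in the optimization_parameters sheet, while the dictionary standing in for the sheet and the function name are hypothetical.

```python
# Values we intended to set in the optimization_parameters sheet.
EXPECTED = {"fix_b": 1, "fix_P": 0, "iestimate": 1, "alpha": 0.01}


def unexpected_parameters(actual, expected=EXPECTED):
    """Return {name: (found, wanted)} for every parameter that is off."""
    return {
        name: (actual.get(name), wanted)
        for name, wanted in expected.items()
        if actual.get(name) != wanted
    }


# Hypothetical sheet contents, entered by hand for this illustration.
sheet = {"fix_b": 1, "fix_P": 0, "iestimate": 1, "alpha": 1e-10}

print(unexpected_parameters(sheet))
# {'alpha': (1e-10, 0.01)}
```

An empty result means every listed parameter has the intended value.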
Once we had solved this, we moved on to running the inputs again.
- Both of Kara's ran correctly the first time around
- Mine resulted in outputting the graphs and figures, but not the excel sheet. After much troubleshooting, and trying again multiple times for both b-values, I eventually got it to work.
- I believe the issue was that I had pulled the sub-file "math388GRNmap" out of the original file "Horstmann" that I sent to myself, expecting it to still work since everything was kept in the "math388" file. Once I started working from the "Horstmann" file again, everything ran smoothly.
After fixing the Matlab issues and creating the excel, we moved on to the GRNsight aspect, but encountered problems here as well.
- We were unable to get the Excel files to upload in GRNsight in such a way as to generate networks that help display the varying weight values noted in the directions
- After careful interpretation of the directions, consulting other classmates, and attempting several times to change the output network sheets to something more readable by GRNsight, we were unable to create the symbolically different GRNsight plot
- I decided not to add the GRNsight map to my Powerpoint because it is exactly the same as the map created in Week 12, which is in the last 3 slides of my Week 12 powerpoint. I plan on redoing Week 12's GRNsight assignment when I have time, putting the transcription factors into an alphabetical grid so the map is easier to read, both for future work and for the audience of my final presentation.
  - Also, Kara, who reached this point before me, included her plots in her powerpoint before she realized they were not what the directions wanted. Her powerpoint with the regular-looking GRNsight plots can be found here. These maps should be the same as mine, since our data is the same.
  - I plan on discussing this with Dr. Fitzpatrick, Dr. Dahlquist, or more of my classmates, in order to figure out what is the correct way to format the output sheet in order to create the transcription map
- Once the issue is solved, I will add the correct network into my powerpoint and reupload it to this week's assignment.
- Decided to keep the GRNsight steps in the methods section of my journal, since they were part of the expected procedure even though they did not produce the expected result. I did follow the directions for uploading the output Excel file into GRNsight, so I felt it was pertinent to keep them in.

Edits for 4/30

As Kara and I were comparing graphs and data, we noticed that one graph, MIG2, was extremely different between our results (one of hers was sloping and the other was spiked, while both of mine were spiked) and that our "model" lines had very slightly different slopes. After consulting Dr. Dahlquist, we realized that we might still have an issue in our spreadsheets, even though we thought we had already checked them.
- We exchanged data and ran both sets again, hoping to find a solution for the MIG2 graph. This was an easy fix, as we both got the sloping curve that Kara had before. We are attributing this to a saving error on my part, as I may have taken the estimated-b graph and overwritten the fixed-b graph with it. I replaced the image in my powerpoint, which can be found in this section.
- We figured the difference in the slopes would take a little longer to track down. We downloaded each other's input Excel sheets (Kara's can be found here for non-estimated and here for estimated) with the intention of subtracting every cell in one network from the matching cell in the other, hoping that all the differences would be zero, meaning every number was the same. We quickly found very small differences (on the order of 10^-10) in the ASG1 row. However, while examining this, we realized that only I had ASG1; Kara had NDT80. We went back through our earlier weeks and realized that when Kara encountered pruning errors in Week 11, she had accidentally deleted the row and column of ASG1 and never caught the error when she went back. This may have been because NDT80 and ASG1 were next to each other in the original spreadsheet, or perhaps because ASG1's column was all zeros (meaning it wasn't controlling anything, only being controlled), so it may have been deleted by accident for that reason.
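The cell-by-cell subtraction described above can be sketched as follows, with one addition we did not think of at first: checking that both sheets list the same genes before comparing values. The function name, gene lists, and 0/1 matrices are hypothetical, and the layout is assumed to be rows as targets and columns as controllers.

```python
def compare_networks(genes_a, matrix_a, genes_b, matrix_b, tol=0.0):
    """Compare two network matrices gene by gene.

    Returns (genes only in A, genes only in B, differing cells), where each
    differing cell is (controller, target, value_a - value_b).
    """
    only_a = sorted(set(genes_a) - set(genes_b))
    only_b = sorted(set(genes_b) - set(genes_a))
    in_b = set(genes_b)
    shared = [g for g in genes_a if g in in_b]
    idx_a = {g: i for i, g in enumerate(genes_a)}
    idx_b = {g: i for i, g in enumerate(genes_b)}
    diffs = []
    for target in shared:
        for controller in shared:
            a = matrix_a[idx_a[target]][idx_a[controller]]
            b = matrix_b[idx_b[target]][idx_b[controller]]
            if abs(a - b) > tol:
                diffs.append((controller, target, a - b))
    return only_a, only_b, diffs


# Hypothetical 3-gene networks, for illustration only.
genes_mine = ["ACE2", "ASG1", "ZAP1"]
mine = [[0, 0, 1], [1, 0, 0], [0, 0, 1]]
genes_kara = ["ACE2", "NDT80", "ZAP1"]
kara = [[0, 0, 1], [0, 0, 0], [0, 0, 1]]

print(compare_networks(genes_mine, mine, genes_kara, kara))
# (['ASG1'], ['NDT80'], [])
```

Comparing by gene name rather than by cell position means a swapped or missing gene shows up directly instead of as small unexplained numeric differences in one row.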
We also fixed the errors for using GRNsight
- Reminder: The graphs we were getting (as noted above) were all black, not colored and weighted as they were supposed to be. We suspected an error in the naming of the sheets, so after discussing the issue, we renamed the sheet in the output from "out_network_optimized_weights" to "network_optimized_weights", and this immediately fixed the problem. Screenshots of the two GRNsight maps (fixed b and estimated b) were uploaded to the new powerpoint.
We had not realized that the output images for ZAP1 were not being saved automatically like the other images. We went back and ran the two MATLAB codes, saved the ZAP1 images, and uploaded them in the appropriate order to the new powerpoint.
I realized I had accidentally uploaded one of my inputs in place of an output link. I fixed this in class on Tuesday, 4/28; the correction is in the earlier "Results" section, where I replaced the old input with the file I had originally intended to upload.

Week 14 Powerpoint edits

- Wasn't sure if I should replace the old powerpoint or upload the new one. Decided to keep the older one in results and upload the new one here.
- Media:Horstmann_Week14_ppt_redo.ppt

Week 12 GRNsight redo

Decided to reupload my data from the Week 12 assignment so I could rearrange the genes and place them in an alphabetical grid. This way, if I use them in the final, it will be easier to compare Week 12's graph with Week 14's. I should have done this alphabetical arrangement in the first place, but did not think about it.
- The original powerpoint can be found on my Week 12 page but also can be found here
- I only made changes to the last three slides of the original powerpoint; the revised version can be seen here.

Expanded answers to interpretation questions

Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- PDR1, RIF1, and YLR278C showed the largest dynamics over time
- ACE2 showed the biggest difference in dynamics between the wild type and dzap1, as its graphs did not match each other between runs. This makes a lot of sense, as the network pretty clearly shows that ZAP1 controls only one other gene, ACE2. It would therefore make sense that all the other graphs (except ACE2) match, because those genes are unaffected by dzap1.
Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- There does not seem to be a massive difference between charts. MIG2 and MSN2 of course stand out because they are the largest, but they are the largest for both production rates and weights. There are some (like HMO1, YHP1, and ZAP1) that show a high weight but a relatively lower production rate.
- This makes some sense when examining the GRNsight networks, as the weights change drastically between the estimated-b and fixed-b networks. The only heavily weighted pair in the estimated-b network is CIN5 repressing MIG2. In the fixed-b network, there are very heavy weights connected to CIN5, MSN2, YHP1, and MIG2 and somewhat heavy weights connected to ZAP1 and HMO1. So yes, the difference in the weights and production rates is going to be dramatic between the two charts.

Template

Back to User: User: Kristen M. Horstmann

BIOL398-04/S15

Kristen M. Horstmann Week 14 Journal
