This journal entry is due on Thursday, April 30 at midnight PDT (Wednesday night/Thursday morning). NOTE that the server records the time as Eastern Daylight Time (EDT). Therefore, midnight will register as 03:00.
Individual Journal Assignment
- Store this journal entry as "username Week 14" (i.e., this is the text to place between the square brackets when you link to this page).
- Create the following set of links. (HINT: These links should all be in your personal template that you created for the Week 1 Assignment; you should then simply invoke your template on each new journal entry.)
- Link to your journal entry from your user page.
- Link back from your journal entry to your user page.
- Link to this assignment from your journal entry.
- Don't forget to add the "BIOL398-04/S15" category to the end of your wiki page.
For your assignment this week, you will keep an electronic laboratory notebook on your individual wiki page that records all the manipulations you perform on the data and the answers to the questions throughout the protocol. We will be working on the protocols in class this week. Whatever you do not finish in class will be homework to be completed by the Week 14 journal deadline.
REMINDER: you should "turn on" the file extensions using the instructions found on the Help page before beginning today's work.
Introduction to GRNmap and Gene Regulatory Network Modeling
For this week's assignment, you will finally run the GRNmap model on the input workbook you created for the Week 13 Assignment. You will run the optimization twice: once where the threshold parameters, b, are not estimated and once where the threshold parameters are estimated. You will then compare the estimated weight and production rate parameters output by the two runs with each other.
- In the optimization_parameters sheet of your input workbook, set:
- fix_b to 1
- fix_P to 0
- iestimate to 1
- alpha to 0.01
- kk_max to 1
- MaxIter and MaxFunEval to 1e08 (one hundred million in plain English)
- TolFun and TolX to 1e-6
- Sigmoid to 1
- igraph to 1
- simtime to 0, 5, 10, and so on in steps of 5 up to 60, with each number in a different cell.
- Strain, Sheet, and Deletion depend on how your data is organized in the workbook and what problem you are solving. You should have two strains, one of which is wt (wild type). Sheet denotes the number of the tab, reading left to right along the tabs at the bottom, where the strain data is positioned in the workbook. Deletion should be 0 for the wild type, 0 for paradoxus (looking at you, Natalie and Karina), and, for your deletion strain, the position of the deleted gene in your list of genes. For example, if you have dCIN5, and CIN5 is the 7th gene in your list (as organized in your degradation_rates, production_rates, network, etc. sheets), then put a 7 for dCIN5 in the Deletion parameter row.
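As a quick check on the simtime and Deletion rows, here is a minimal Python sketch. It is not part of GRNmap: the function names and the placeholder gene list are hypothetical. The only facts taken from the protocol are the time points from 0 to 60 in steps of 5 and the rule that Deletion is 0 when no gene is deleted, or the 1-based position of the deleted gene otherwise.

```python
def simtime_values(start=0, stop=60, step=5):
    """Return the time points to type into the simtime row, one per cell."""
    return list(range(start, stop + 1, step))


def deletion_index(genes, deleted_gene=None):
    """Return 0 for a strain with no deletion, else the 1-based gene position."""
    if deleted_gene is None:
        return 0
    # Compare case-insensitively so "dCIN5" style capitalization does not matter.
    upper_names = [name.upper() for name in genes]
    return upper_names.index(deleted_gene.upper()) + 1


# Hypothetical gene list; replace it with the order used in your own
# degradation_rates, production_rates, and network sheets.
genes = ["GENE1", "GENE2", "GENE3", "GENE4", "GENE5", "GENE6", "CIN5", "GENE8"]

print(simtime_values())               # values for the simtime row
print(deletion_index(genes))          # wild type
print(deletion_index(genes, "CIN5"))  # dCIN5 strain
```

If the deleted gene is not in the list, `list.index` raises a `ValueError`, which is a useful signal that the gene name in the workbook is misspelled.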
- After prepping this sheet, run GRNmodel.
- Save all your graphs as JPEG files and paste them into a PowerPoint file. Please label things clearly, placing an appropriate number of graphs on each slide for a readable visual. Take some care to make sure that the graphs are the same size and that the aspect ratio has not been changed.
- Note that GRNmap currently has a bug: the last graph is not saved automatically in your folder like the other graphs. You will need to save this one manually once GRNmap completes its run.
- Create a new workbook for analyzing the weight data. In this workbook, create a new sheet: call it estimated_weights. In this new worksheet, create a column of labels of the form ControllerGeneA -> TargetGeneB, replacing these generic names with the standard gene names for each regulatory pair in your network. Remember that columns represent Controllers and rows represent Targets in your network and network_weights sheets.
- Extract the non-zero optimized weights from their worksheet and put them in a single column next to the corresponding ControllerGeneA -> TargetGeneB label.
- Save your input workbook as a new file with a meaningful name (e.g. append "estimate-b" to the previous filename), and change fix_b to 0 in the "optimization_parameters" worksheet, so that the thresholds will be estimated. Rerun GRNmodel with the new input sheet.
- Repeat Parts (2) through (4) with the new output.
- Create an empty Excel workbook, and copy both sets of weights into a worksheet.
- Create a bar chart in order to compare the "fixed b" and "estimated b" weights.
- Repeat (7) and (8) with the production rates.
- Copy the two bar charts into your PowerPoint.
- Visualize the output of each of your model runs with GRNsight.
- In order for this to work, you need to alter your output workbook slightly. You need to change the name of the sheet called "out_network_optimized_weights" to "network_optimized_weights"; i.e., delete the "out_" from that sheet name.
- Arrange the genes in the same order you used to display them in your Week 12 assignment for both of your model output runs. Take a screenshot of each of the results and paste it into your PowerPoint presentation. Clearly label which screenshot belongs to which run.
- Note that GRNsight displays the network differently now that you have estimated the weights. For positive weights, the edge is given a regular (pointy) arrowhead to indicate an activation relationship between the two nodes. For negative weights, the edge is given a blunt arrowhead (a line segment perpendicular to the edge direction) to indicate a repression relationship between the two nodes. The thickness of the edge varies with the absolute value of the weight: larger magnitudes have thicker edges and smaller magnitudes have thinner edges. To determine the edge thickness, GRNsight divides all weight values by the absolute value of the maximum weight in the matrix, which normalizes all the values to between zero and 1 in magnitude. GRNsight then adjusts the thickness of the lines to vary continuously from the minimum thickness (for normalized weights near zero) to the maximum thickness (for normalized weights of 1). The color of the edge also conveys information about the regulatory relationship. Edges with positive normalized weight values from 0.05 to 1 are colored magenta; edges with negative normalized weight values from -0.05 to -1 are colored cyan. Edges with normalized weight values between -0.05 and 0.05 are colored grey to emphasize that their normalized magnitude is near zero and that they have a weak influence on the target gene.
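The edge-styling rules described above can be sketched in Python. This illustrates the description only and is not GRNsight's actual code. Several details are assumptions: the function name, the minimum and maximum widths, the choice to skip zero weights, the treatment of a normalized weight of exactly 0.05 or -0.05 as colored rather than grey, and the reading of "the absolute value of the maximum weight" as the largest absolute weight in the matrix.

```python
def edge_styles(weights, min_width=1.0, max_width=10.0, grey_cutoff=0.05):
    """Map each (controller, target) weight to an arrowhead, color, and width.

    min_width and max_width are hypothetical drawing units.
    """
    # Assumption: normalize by the largest absolute weight in the network.
    max_abs = max((abs(w) for w in weights.values()), default=0.0)
    styles = {}
    for edge, weight in weights.items():
        if weight == 0:
            continue  # assumption: a zero weight means there is no edge to draw
        normalized = weight / max_abs
        magnitude = abs(normalized)
        if magnitude < grey_cutoff:
            color = "grey"      # weak influence on the target gene
        elif normalized > 0:
            color = "magenta"   # activation
        else:
            color = "cyan"      # repression
        styles[edge] = {
            "arrowhead": "pointed" if weight > 0 else "blunt",
            "color": color,
            # Width varies continuously with the normalized magnitude.
            "width": min_width + (max_width - min_width) * magnitude,
        }
    return styles


# Made-up weights for illustration only; these are not model results.
example = {("CIN5", "GLN3"): 2.0, ("HMO1", "CIN5"): -1.0, ("ZAP1", "HMO1"): 0.05}
for (controller, target), style in edge_styles(example).items():
    print(f"{controller} -> {target}: {style}")
```

Because the weights are normalized within each network, the same raw weight can be drawn with a different thickness or color in the "fixed b" and "estimated b" screenshots, which is worth keeping in mind when comparing them.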
- Upload your PowerPoint, your two input workbooks, and your two output workbooks and link to them in your individual journal. Also upload the workbook where you made the bar charts comparing the weights from both runs.
- Interpreting your results.
- Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Why do you think that is? How does this help you to interpret the microarray data?
- Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
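For the weight-extraction and comparison steps in the protocol above, the following Python sketch shows one way to build the ControllerGeneA -> TargetGeneB labels and to spot weights that change sign between the two runs. It assumes the weight matrix has already been copied out of the workbook into a list of rows. The function names, gene order, and all numbers are made up for illustration and are not GRNmap output. The only convention taken from the protocol is that columns are Controllers and rows are Targets.

```python
def labeled_weights(genes, matrix):
    """Return {"Controller -> Target": weight} for every non-zero weight.

    matrix[row][col] is the weight of controller genes[col] on target genes[row].
    """
    labels = {}
    for row, target in enumerate(genes):
        for col, controller in enumerate(genes):
            weight = matrix[row][col]
            if weight != 0:
                labels[f"{controller} -> {target}"] = weight
    return labels


def sign_flips(fixed_b, estimated_b):
    """List the labels whose weight is positive in one run and negative in the other."""
    return [
        label
        for label, weight in fixed_b.items()
        if label in estimated_b and weight * estimated_b[label] < 0
    ]


# Placeholder data: three genes and invented weights for the two runs.
genes = ["CIN5", "GLN3", "HMO1"]
fixed_matrix = [[0, 0.5, 0], [-1.2, 0, 0], [0.3, 0, 0.8]]
estimated_matrix = [[0, 0.4, 0], [-0.9, 0, 0], [-0.2, 0, 1.1]]

fixed = labeled_weights(genes, fixed_matrix)
estimated = labeled_weights(genes, estimated_matrix)
for label in fixed:
    print(label, fixed[label], estimated.get(label))
print("Sign changes:", sign_flips(fixed, estimated))
```

A label that shows up in the sign-change list switches between activation and repression across the runs, so it is a natural place to start when answering the interpretation questions.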
- This part of the assignment is due on Tuesday, May 5 at midnight PDT (Monday night/Tuesday morning).
- Store your shared journal entry in the shared Class Journal Week 14 page. If this page does not exist yet, go ahead and create it (congratulations on getting in first :) )
- Link to your journal entry from your user page.
- Link back from the journal entry to your user page.
- Sign your portion of the journal with the standard wiki signature shortcut (~~~~).
- Add the "BIOL398-04/S15" category to the end of the wiki page (if someone has not already done so).
Look at the uploads of at least one other group in the class. State which group you compared yours to. Is there any overlap between the genes in your group's network and theirs (there should be, at least for CIN5, GLN3, HMO1, and ZAP1)? If so, how do the weights compare? Are there any weights that indicate activation in one network and repression in another? If so, what do you make of it?
Reflect back on your learning for this project and for the entire semester and answer the following:
- What is the value of combining biological and mathematical approaches to scientific questions?
- Looking back on your reflections on the Janovy and Steward readings from the Week 1 Class Journal, do you have any further insights to share? Have your answers to those original reflection questions changed? Why or why not?