BIOL388/S19:Week 7
This journal entry is due on Thursday, March 7 at midnight PST (Wednesday night/Thursday morning).
Individual Journal Assignment
- Store this journal entry as "username Week 7" (i.e., this is the text to place between the square brackets when you link to this page).
- Create the following set of links. (HINT: These links should all be in your personal template that you created for the Week 1 Assignment; you should then simply invoke your template on each new journal entry.)
- Link to your journal entry from your user page.
- Link back from your journal entry to your user page.
- Link to this assignment from your journal entry.
- Don't forget to add the "BIOL388/S19" category to the end of your wiki page.
Homework Partners
Please meet with your partner (either face-to-face or virtually) at least once when preparing this assignment. Even though you may work together to understand the assignment, your journal assignment must be completed individually. It is not acceptable to do a joint assignment and copy it over to each other's journal page.
- Wild type: Angela and Sahil
- Δgln3: Austin and Leanne
- Δhap4: Desiree, Ava, Brianna, Fatimah
- Δzap1: Alison and Edward
Electronic Lab Notebook
Complete your electronic notebook that gives the details of what you did for the assignment this week. Your notebook entry should contain:
- The purpose: what was the scientific purpose of your investigations?
- Note that this is different than the learning purpose.
- Your workflow or methods: what did you actually do? Give a step by step account.
- There should be enough detail provided so that you or another person could re-do it based solely on your notebook.
- You may copy protocol instructions to your page and modify them as to what you actually did, as long as you provide appropriate attribution in the acknowledgments and references section.
- Take advantage of the electronic nature of the notebook by providing screenshots, links to web pages, etc.
- Your results: the answers to the questions in the protocol, plus any other results you gathered. Your results will include some or all of the following: images, plots, data, and files.
- Note that files left on the Desktop or My Documents or Downloads folders on the Seaver 120 computers will be deleted upon restart of the computers. Files stored on the T: drive will be saved. However, it is not a good idea to trust that they will be there when you next use the computer.
- Thus, it is a critical skill for data and computer literacy to back up your data and files in at least two ways:
- Upload the files to this wiki.
- Upload the files to Box.
- Back them up on your personal flash drive.
- References to data and files should be made within the methods and results section of your notebook, listed above.
- In addition to these inline links, create a Data and Files section of your notebook to make a list of the files generated in this exercise.
- A scientific conclusion: what was your main finding for today's project? Did you fulfill the purpose? Why or why not?
- The Acknowledgments section.
- You must acknowledge your homework partner or team members with whom you worked, giving details of the nature of the collaboration. You should include when and how you met and what content you worked on together. An appropriate statement could be (but is not limited to) the following:
- I worked with my homework partner (give name and link name to their user page) in class. We met face-to-face one time outside of class. We texted/e-mailed/chatted online three times. We worked on the <details> portion of the assignment together.
- Acknowledge anyone else you worked with who was not your assigned partner. This could be Dr. Dahlquist or Dr. Fitzpatrick (for example, via office hours), the TA, other students in the class, or even other students or faculty outside of the class.
- If you copied wiki syntax or a particular style from another wiki page, acknowledge that here. Provide the user name of the original page, if possible, and provide a link to the page from which you copied the syntax or style.
- If you need to reference content, include the formal citation in your References section (see below).
- You must also include this statement:
- "Except for what is noted above, this individual journal entry was completed by me and not copied from another source."
- Sign your Acknowledgments section with your wiki signature.
- The References section. In this section, you need to provide properly formatted citations to any content that was not entirely of your own devising. This includes, but is not limited to:
- methods
- data
- facts
- images
- documents, including the scientific literature
- Do not include citations/references to sources that you did not use.
- You should include a reference to this week's assignment page.
- The references should be formatted according to the APA guidelines.
Analyzing Results of First Model Run (Due for the Week 7 deadline, midnight March 7)
Here is what you need to consider when analyzing the results of your model.
- What is the overall least squares error (LSE) for your model?
- You will find this on the "optimization_diagnostics" worksheet of your output workbook.
- Since the input data are noisy, the model can only minimize the error so far. It is more "fair" to look at the ratio of the least squares error to the minimum theoretical least squares error that the model could have achieved given the data. We call this the LSE:minLSE ratio. You should be able to compute it with the values given on the "optimization_diagnostics" worksheet.
- We will compare the LSE:minLSE ratios for the ten models run by everyone in the class.
- You need to look at the individual fits for each of the genes in your model. Which genes are modeled well? Which genes are not modeled well?
- Look at the individual expression plots to see if the line that represents the simulated model data is a good fit to the individual data points.
- Upload your output Excel spreadsheet to GRNsight. Use the dropdown menu on the left to choose the data you will display on the nodes (boxes). Compare the actual data for a strain with the simulated data from the same strain. If the model fits the data well, the color heatmap superimposed on the node will match top and bottom. If the fit is less good, the colors will not match.
- What explains the goodness of fit to the model?
- How many arrows are incoming to the node?
- What is the ANOVA Benjamini & Hochberg corrected p value for the gene?
- Is the gene changing its expression a lot or is the log2 fold change mostly near zero?
- Make bar charts for the b and P parameters.
- Is there something about these parameters that explains the goodness of fit for the individual genes?
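The LSE:minLSE ratio described above is a simple quotient, and it can help to compute it the same way for every run before comparing models. The following is a minimal sketch, assuming you have copied the LSE and minimum LSE values by hand from the "optimization_diagnostics" worksheet. The function name, the run labels, and the numbers are hypothetical placeholders, not real results.

```python
def lse_to_min_lse_ratio(lse: float, min_lse: float) -> float:
    """Return LSE divided by the minimum theoretical LSE.

    A ratio close to 1.0 means the model fit is about as good as the
    noise in the data allows; larger ratios indicate a poorer fit.
    """
    if min_lse <= 0:
        raise ValueError("minLSE must be a positive number")
    if lse < 0:
        raise ValueError("LSE cannot be negative")
    return lse / min_lse


# Placeholder values only (assumption): replace them with the numbers
# read from your own output workbook.
runs = {
    "run_A": (0.75, 0.5),  # (LSE, minLSE)
    "run_B": (0.6, 0.5),
}

ratios = {name: lse_to_min_lse_ratio(lse, min_lse)
          for name, (lse, min_lse) in runs.items()}

# List the runs from best (lowest ratio) to worst.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: LSE:minLSE = {ratio:.2f}")
```

Dividing by the minimum LSE puts runs that use different networks or datasets on a comparable scale, which is the reason the ratio is preferred over the raw LSE for the class comparison.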
Tweaking the Model and Analyzing the Results
For the Week 7 deadline (midnight, March 7) state which of these "tweaks" you would like to try and explain why. You don't have to have done it yet, but you need to pick one and explain why you chose it. It is a good idea to choose the same "tweak" as a group so that you can help each other compare results.
You will carry out an additional in silico experiment with your model. Let Dr. Dahlquist or Dr. Fitzpatrick know what you are planning to do to get approval and suggestions on how to do it. You will report out your results in a research presentation in Week 9. Some ideas are:
- For our initial runs, we estimated all three parameters w, P, and b.
- How do the modeling results change if P is instead fixed and w and b are estimated?
- How do the modeling results change if b is fixed and w and P are estimated?
- How do the modeling results change if P and b are fixed, and only w is estimated?
- For our initial runs, we included all three microarray datasets, wt, Δgln3, and Δhap4.
- What happens to the results if we base the estimation on just two strains (wt + one deletion strain)?
- What happens to the results if we base the estimation on just the wt strain data?
- When viewing the modeling results in GRNsight, you may determine that one or more genes in the network do not appear to be doing much.
- What happens to the modeling results if you delete this gene from the network and re-run the model? (Remember that you will have to delete references to this gene in all worksheets of the input file.)
- You also might think that a particular edge (regulatory relationship) is not needed. What happens if you delete that edge?
- What happens if you include the t90 and t120 expression data?
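If you choose the gene-deletion or edge-deletion tweak, the change to the network amounts to a small edit of the adjacency matrix. The sketch below illustrates the idea on a toy matrix held in plain Python lists. It assumes that rows are target genes and columns are regulators; check this orientation against your own input workbook. The gene labels (TF_A, TF_B, TF_C), the edges, and the function names are hypothetical, and the sketch does not read or write the Excel file, so the other worksheets would still need to be edited to match.

```python
def remove_gene(genes, matrix, gene):
    """Return copies of the gene list and adjacency matrix without `gene`.

    Assumption: matrix[i][j] == 1 means the gene in column j regulates
    the gene in row i. Both the row and the column of the deleted gene
    are dropped so the matrix stays square.
    """
    if gene not in genes:
        raise KeyError(f"{gene} is not in the network")
    k = genes.index(gene)
    new_genes = genes[:k] + genes[k + 1:]
    new_matrix = [row[:k] + row[k + 1:]
                  for i, row in enumerate(matrix) if i != k]
    return new_genes, new_matrix


def remove_edge(genes, matrix, regulator, target):
    """Return a copy of the matrix with the regulator -> target edge set to 0."""
    new_matrix = [row[:] for row in matrix]
    new_matrix[genes.index(target)][genes.index(regulator)] = 0
    return new_matrix


# Toy three-gene network (hypothetical labels and edges).
genes = ["TF_A", "TF_B", "TF_C"]
matrix = [
    [0, 1, 0],  # TF_A is regulated by TF_B
    [1, 0, 1],  # TF_B is regulated by TF_A and TF_C
    [0, 1, 1],  # TF_C is regulated by TF_B and by itself
]

smaller_genes, smaller_matrix = remove_gene(genes, matrix, "TF_B")
pruned_matrix = remove_edge(genes, matrix, regulator="TF_C", target="TF_B")
```

Both functions return new lists and leave the originals untouched, so the original network stays available for comparison with the tweaked run.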
Final Research Presentation Due Thursday, March 21
This section is due at midnight, Thursday, March 21.
- You and your partner(s) together will prepare a 10-12 minute PowerPoint presentation that will present the results of your final project. Please follow these guidelines when creating your presentation. You will need approximately 10-12 slides (1 slide per minute) for your presentation. You will be graded according to this rubric.
- Upload your slides to Box by midnight on March 21. Each partner should have a link to the same PowerPoint file on their individual journal page. You may make changes to your slides in advance of your presentation, but you will be graded on what you upload by the journal deadline.
- Your presentation will include the following:
- Title slide that gives the main take-home message as the title of your presentation, the authors, date, and venue (course number and title).
- Outline slide that is a summary of take-home messages of your talk (should mirror your conclusion slide)
- The body of your talk (more details below)
- Conclusion slide that mirrors your outline
- Future directions
- Acknowledgments
- References
Introduction & Background Slide
The introduction gives the background information necessary to understand the motivation for your project and your research results. The introduction should be in the form of a logical argument that "funnels" from broad to narrow. Include the following:
- States importance of the problem
- Why are we studying gene regulation and cold shock?
- States what is known about the problem
- Introduce the DNA microarray experiment that was performed.
- States what is unknown about the problem
- Little is known about which transcription factors regulate the early response to cold shock
- States clues that suggest how to approach the unknown
- Each of the journal club articles that you all presented has a piece of the puzzle that motivates this project
- States the question the project is trying to address
- Using the model to estimate the relative contribution of each transcription factor to the regulation of gene expression
Methods Slide
- Describe the entire workflow of this project using a flow chart diagram.
- Experimental design of the microarray experiment
- Statistical analysis of the microarray data
- Clustering and GO term analysis
- Finding candidate transcription factors with YEASTRACT
- Generating and paring down the adjacency matrix
- The differential equation and least squares equation that were used for performing the initial estimation
- Creating the input workbook and how that relates to those equations
- Analyzing the modeling results
- Additional in silico modeling experiments
Body of the Talk (Results & Discussion)
- Table of ANOVA results from the Week 4/5 Assignment, discussing the interpretation of the p values.
- From the STEM analysis, include as figures the overall results (the screenshot showing all of the clusters) and then focus on the ones you interpreted for your journal assignment.
- Include a table showing the GO results for that cluster (just the narrowed down list of terms that you have interpreted).
- Discuss the biological interpretation of your GO terms.
- Include a figure of the unweighted networks (black and white) visualized with GRNsight.
- Describe how and why you and your partner chose these transcription factors for your networks.
- Modeling results (from Week 6), including the LSE:minLSE ratio for each model run. Include the following parameter bar charts:
- Optimized weight parameters (w)
- Optimized production rates (P)
- Optimized threshold b parameters
- Show the individual expression plots for each transcription factor for one of the initial runs, with the MSE and ANOVA values for each gene superimposed. You will want to organize these so that they can be compared easily. It may not be possible to show every plot on the slide; choose at least 4 that are interesting. For the subsequent runs, compare plots for "interesting" genes with each other.
- Show the GRNsight visualization of the weighted networks, making sure that the genes are placed in the same relative location as each other and as the unweighted network figure (using the "block" layout makes this easy).
- When you interpret the results of the model simulation:
- Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Which genes have the worst fit between the model and actual data? Why do you think that is? (Hint: how many inputs do these genes have?) How does this help you to interpret the microarray data?
- Which genes showed the largest dynamics over the timecourse? In other words, which genes had a log fold change that is different from zero at one or more timepoints? The p values from the Week 4/5 ANOVA analysis are informative here. Does this seem to have an effect on the goodness of fit (see question above)?
- Which genes showed differences in dynamics between the wild type and the other strain your group is using? Does the model adequately capture these differences? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
- Finally, based on the results of your entire project, which transcription factors are most likely to regulate the cold shock response and why?
- How do you interpret the results of the additional in silico experiments you performed with the model in light of the above?
- What future directions would you take if you were to continue this project?
- Store your shared journal entry in the shared Class Journal Week 7 page. If this page does not exist yet, go ahead and create it (congratulations on getting in first :) )
- Link to your journal entry from your user page.
- Link back from the journal entry to your user page.
- Sign your portion of the journal with the standard wiki signature shortcut (~~~~).
- Add the "BIOL388/S19" category to the end of the wiki page (if someone has not already done so).
Reflection
- If you had to do this project over again, what would you do differently? What would you keep the same?
- What future directions would you take if you were to continue this project as it stands now?