Brandon J. Klein Electronic Lab Notebook
From OpenWetWare
Spring 2017
Protocols
- Analyzing GRNmap Output Workbooks Using Multiple Regression and SPSS
- Generating Distribution Charts and Cumulative Plots for GRNmap Weight Values in SPSS
Fall 2016
Week 1: August 29--September 5
Trajectory for the Semester:
- Goal: Work towards validating the current literature network or finding the "true" network regulating the response to cold shock in S. cerevisiae.
- Assess models of small "real networks" and compare to "random networks".
Tasks:
- Read GRNsight paper.
- Automate the creation of random networks in R.
Progress:
- Reviewed GRNsight paper.
- Investigated ways to code for the creation of random networks in R.
Week 2: September 5--September 12
Tasks:
- Read and consider adding the Contributor Covenant Code of Conduct to GRNmap.
- Develop an R script for the creation of random networks for input into GRNmap.
- Review the GRNmap paper published in the Bulletin of Mathematical Biology.
Progress:
- Read through and approved the Contributor Covenant Code of Conduct.
- Developed a functional script for creation of random matrices and posted it to GitHub.
- Reviewed GRNmap paper.
Week 3: September 12--September 19
Tasks:
- Make changes to the random network generator script:
- Change terminology within the script to match GRNmap nomenclature (e.g. change "connections" to "edges").
- Make note of the fact that GRNmap only accepts symmetric adjacency matrices.
- Prompt users to set their working directory before outputting random matrices.
- Refine comments within the code.
- Add sample headers where gene names can be inserted.
- Work through chapter 1 of Dr. Dahlquist's recommended R tutorial.
- Begin work on TRACE documentation for GRNmap.
Progress:
- Refined and expanded the functionality of my random network generator.
- For notes on the changes that were made, see my GitHub update.
- Worked through chapter 1 of the R tutorial.
- Chose sections of the TRACE documentation to work on.
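The random network generator described above can be illustrated with a minimal Python sketch. This is not the R script posted to GitHub; the function names, the placeholder gene names, the "gene" corner label, and the choice to allow self-edges are all assumptions made for illustration.

```python
import random


def random_network(genes, n_edges, seed=None):
    """Return a square 0/1 adjacency matrix with n_edges edges placed at random.

    Assumption: self-edges are allowed and every cell is equally likely.
    """
    n = len(genes)
    if not 0 <= n_edges <= n * n:
        raise ValueError("n_edges must be between 0 and len(genes) ** 2")
    rng = random.Random(seed)
    # Every (row, column) cell is a candidate position for an edge.
    cells = [(row, col) for row in range(n) for col in range(n)]
    matrix = [[0] * n for _ in range(n)]
    for row, col in rng.sample(cells, n_edges):
        matrix[row][col] = 1
    return matrix


def to_tsv(genes, matrix, corner="gene"):
    """Format the matrix as tab-separated text with gene names as headers."""
    lines = ["\t".join([corner] + list(genes))]
    for gene, row in zip(genes, matrix):
        lines.append("\t".join([gene] + [str(value) for value in row]))
    return "\n".join(lines)


# Placeholder gene names; replace with the genes in the network of interest.
genes = ["GENE1", "GENE2", "GENE3", "GENE4"]
network = random_network(genes, n_edges=6, seed=2016)
print(to_tsv(genes, network))
```

Fixing the seed makes a given random network reproducible, which may help when the same matrix needs to be regenerated later.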
Week 4: September 19--September 26
Trajectory Update:
- We will focus on the 5 different "real 15-gene networks" generated last year. We will go through these input sheets and double-check rates as well as formatting to make them as accurate as possible before working with them.
- We can start doing production runs. We will do this with the 5 real networks as well as random networks.
Tasks:
- Read the bar chart module in chapter 2 of the R Tutorial.
- Begin writing a degree distribution generator script.
- Work on TRACE documentation.
Progress:
- Completed chapter 2 of the R Tutorial.
- Wrote a preliminary version of the degree distribution generator to act upon outputs from the random network generator.
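As a rough illustration of what a degree distribution generator computes, here is a minimal Python sketch (the notebook's own script was written in R and is not reproduced here). It assumes, hypothetically, that rows of the adjacency matrix are targets and columns are regulators, so row counts give in-degrees and column counts give out-degrees; the function names are invented for this example.

```python
from collections import Counter


def degree_distribution(matrix):
    """Return (in_degree_counts, out_degree_counts) for a square adjacency matrix.

    Assumption: rows are targets and columns are regulators.
    """
    n = len(matrix)
    in_degrees = [sum(1 for value in row if value != 0) for row in matrix]
    out_degrees = [sum(1 for row in matrix if row[col] != 0) for col in range(n)]
    return Counter(in_degrees), Counter(out_degrees)


def text_bar_chart(distribution, label):
    """Render a degree distribution as a plain-text bar chart."""
    lines = [label]
    for degree in range(max(distribution) + 1):
        lines.append(f"{degree:>2} | " + "#" * distribution.get(degree, 0))
    return "\n".join(lines)


# Small made-up network used only to exercise the functions.
example = [
    [0, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
]
in_counts, out_counts = degree_distribution(example)
print(text_bar_chart(in_counts, "in-degree"))
print(text_bar_chart(out_counts, "out-degree"))
```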
Week 5: September 26--October 3
Tasks:
- Create and upload a fully functional version of the degree distribution generator to GitHub
- Generate and vet input workbooks for 5 database-derived networks (GitHub Issue)
Progress:
- Created fully functional version of the degree distribution generator
- Will return to this later, when time permits
- Began generating and vetting input workbooks for 5 database-derived networks (GitHub Issue)
- See protocol here: OWW Instructions
- Worked on the dHAP4, dGLN3, and dZAP1 families of networks input sheets
- Current versions were uploaded to the Dahlquist Lab repository
Week 6: October 3--October 10
Tasks:
- Generate and vet input workbooks for 5 database-derived networks (GitHub Issue)
Progress:
- Completed input sheets for dGLN3, dHAP4, and dZAP1. These are now available on the Dahlquist Lab Repository.
- Some lingering concerns to be addressed:
- "optimization_parameters" sheet values (test files vs. OWW protocol)
- misaligned time points between strains
- preferred font (if any)
Week 7: October 10--October 17
Tasks:
- Vet Natalie's input workbooks for the wild type and dCIN5 families of networks
- Update OWW Microarray Data Analysis Workflow and GRNmap Wiki to reflect "optimization_parameters" input sheet adjustment (include 2 column headers)
- Update the GRNmap sixteen_tests files to reflect current input workbook conventions
- If time, do the degree-distribution charts for the "real" networks (store in PowerPoint)
- If time, generate some (around 10) random networks for each of the "real" ones and store them in a workbook by themselves, and make the degree distribution charts for them (don't do dZAP1 until it has been vetted)
Progress:
- Vetted Natalie's input workbooks, offering feedback as needed. During this process, missing expression data was identified in the input sheets. Thus far, missing expression data has been added to the input sheets for dHAP4 and dZAP1 (a new network for the latter was generated this week by Dr. Dahlquist).
- The remaining tasks from this week will carry over to Week 8.
Weeks 8-9: October 17--October 31
- Vetted Natalie's updated input workbooks for the wild type and dCIN5 families of networks.
- Updated OWW Microarray Data Analysis Workflow and GRNmap Wiki to reflect "optimization_parameters" input sheet adjustment (include 2 column headers) and proper rounding protocol for production & degradation rate data.
- Updated the GRNmap sixteen_tests files to reflect current input workbook conventions.
- Generated degree-distribution charts for the "real" networks and visualized these networks using a uniform grid in GRNsight.
Week 10: October 31--November 7
- Revised the PowerPoint presentation containing the modeling results from the 5 "15-gene" networks based on feedback received during our lab meeting.
- Corrected a formatting issue present in all files within the sixteen_tests folder on GitHub.
Week 11: November 7--November 14
- Created new issues on GitHub to guide our analysis of the modeling results from the 5 "15-gene" networks.
- Created a bar chart analysis of the modeling results from the 5 "15-gene" networks.
Week 12: November 14--November 21
- Compiled within-strain ANOVA data from each strain and added this information to the modeling results PowerPoint.
Week 14: November 28--December 5
- Updated the bar chart analysis PowerPoint presentation with corrected threshold_b values.
- Ran all sixteen updated input sheets from the sixteen_tests folder through the beta version of GRNmap and posted the resulting output sheets to GitHub for review.
Week 15: December 5--December 12
- Corrected an issue with the updated version of the bar chart analysis PowerPoint presentation.
- Backed up all of my files from the semester onto a CD and included a table of contents.
- Updated OWW to reflect all work done this semester.
Spring 2016
Week 1: January 15--22
Research Meeting Notes
- Early Milestones
- Branch Clean-Up
- Organize Test Files
- Adjust Automated L-Curve Analysis Code
- Address Bugs in the Code
- Assignments
- Set-Up OpenWetware User Page and Electronic Lab Notebook
- Alphabetize Genes in the Test Files
- On the network sheets, use the following method: alphabetize column, transpose data, alphabetize new column, transpose back.
- Ensure that All Expression Data is Complete in the Test Files
- Include Bell Data on expression and degradation rates.
- Address Missing Values in the Test Files
- Highlight these cells in yellow and paste in averages.
Note: For assignments 3-4 I am to shadow Tessa and Kristen.
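The alphabetize, transpose, alphabetize, transpose-back method for the network sheets amounts to applying one sort order to both the rows and the columns. A minimal Python sketch of that idea follows; it assumes the same gene list labels the rows and the columns in the same order, and the function name and example values are hypothetical.

```python
def alphabetize_network(genes, matrix):
    """Reorder rows and columns of a gene-by-gene matrix alphabetically.

    Assumption: genes[i] labels both row i and column i of the matrix.
    """
    # Case-insensitive alphabetical order of the gene names.
    order = sorted(range(len(genes)), key=lambda i: genes[i].upper())
    sorted_genes = [genes[i] for i in order]
    # Applying the same order to rows and columns keeps each edge attached
    # to the same pair of genes.
    sorted_matrix = [[matrix[r][c] for c in order] for r in order]
    return sorted_genes, sorted_matrix


# Made-up values chosen so that every cell can be traced after sorting.
genes = ["ZAP1", "CIN5", "HAP4"]
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sorted_genes, sorted_matrix = alphabetize_network(genes, matrix)
print(sorted_genes)
for row in sorted_matrix:
    print(row)
```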
Progress
- Assignment 1: Set-Up OpenWetware User Page and Electronic Lab Notebook
- I updated my User Page and created my Electronic Lab Notebook:
- Assignments 2-4: Shadow Tessa & Kristen as they Update the Test Files
- I met with Tessa & Kristen on Wednesday, January 20th.
- I watched and asked questions as they made the following edits to the test files: gene names were alphabetized, expression data was completed, and missing values were addressed using the designated fix.
- I contributed as well by introducing an Excel method that finds & highlights missing values using Conditional Formatting.
- Method for highlighting missing values using Conditional Formatting (adapted from this forum):
- Select the data you would like to edit
- Go to Home > Conditional Formatting > New rule
- Click on “Format only cells that contain”
- Change “Cell Value” option to “Blanks”
- Set up the formatting you want by clicking on the Format button
- In this case we introduce a yellow fill.
- Click OK.
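For data that has been exported from Excel, the same check for blank cells can be sketched in Python. This is only an illustration of the idea behind the Conditional Formatting rule, not part of the lab's protocol; the function name and the sample rows are hypothetical, and whitespace-only cells are treated as blank here by assumption.

```python
def find_blank_cells(rows):
    """Return (row_index, column_index) pairs for cells that are empty.

    Assumption: None, "" and whitespace-only strings all count as blank.
    """
    blanks = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is None or str(value).strip() == "":
                blanks.append((r, c))
    return blanks


# Made-up sheet contents: first column is a gene label, the rest are values.
rows = [
    ["GENE1", "0.5", ""],
    ["GENE2", None, "1.2"],
    ["GENE3", " ", "0.0"],
]
blank_cells = find_blank_cells(rows)
print(blank_cells)
```

The returned coordinates are zero-based, so they need to be shifted by one to match Excel's row numbers.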
Week 2: January 22--29
Research Meeting Notes
- Coding Updates
- In approx. 2 weeks, the data analysis team will use the Master branch on GitHub to access GRNmap code
- Current input-sheet format will be the same as presently used for the Beta branch
- Goal: be able to run models by next Friday
- Assignments
- Download the Beta branch and try to run a newly formatted input-sheet (has to be on a PC)
- This can be used to identify any errors present prior to the upcoming update to the code
- Process:
- Go to code in GitHub
- Go to the Beta branch
- Extract the code as a .zip file
- Open code in MATLAB
- Open Input Sheet in MATLAB
- Read ecological modelling standards paper
- Go to modelling standards web page and contribute based on your reading of the paper
- Do research to gain a better understanding of how GRNmap works
- Update work that was done on GitHub
Progress
- A newly formatted input-sheet was successfully run using the code from the Beta branch
- Read the TRACE paper on documenting model formation
- Reviewed information from several sources to further understand the project:
Week 3: January 29--February 5
Research Meeting Notes
- Assignments
- Help Kristen catch up with formatting her input sheets
- Add new formatting changes
- See GitHub to reference 3 necessary changes to input sheets
- Generate L-curve Analyses
- 4 Total
- Largest network + deletion strain
- Largest network - deletion strain
- Smallest network + deletion strain
- Smallest network - deletion strain
- Graphs will have to be plotted manually in Excel
- LSE vs. penalty with each point's alpha value labelled
- To bypass a temporary bug, make_graphs may have to be turned off
- Once this function has been fixed, we will be able to do mass data generation.
- Use the Beta branch for one more week.
Progress
- Meeting with Tessa on February 1, 2016.
- The 4 designated input sheets for L-curve analysis were properly formatted using the updated guidelines. Instances in which errors were triggered in GRNmap were resolved by searching for and correcting formatting errors in the input sheets.
- All 4 L-curve analyses were started and left running on the machines in Seaver 120 (with notes not to disturb these processes).
- Tessa and I went over the format of GRNmap research and the overall systems biology workflow of the Dahlquist Lab.
- Meeting with Tessa and Kristen on February 3, 2016.
- 4/10 L-curve analyses were complete
- I created a command sequence in R that can be used to output labelled graphs of the L-curves.
- A tutorial and samples can be found here: Graphing L-Curves in R
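A labelled L-curve of the kind described above (LSE vs. penalty, with each point's alpha value labelled) could be drawn along the following lines in Python. This is a sketch only, not the R command sequence from the tutorial: the function names are hypothetical, the numbers are made-up placeholders rather than GRNmap output, and the plotting function assumes that matplotlib is installed.

```python
def lcurve_points(runs):
    """Sort (alpha, LSE, penalty) tuples by alpha, largest first."""
    return sorted(runs, key=lambda run: run[0], reverse=True)


def plot_lcurve(points, title, out_path):
    """Plot LSE (y axis) against penalty (x axis), labelling each point with alpha.

    Requires matplotlib; imported here so the rest of the sketch runs without it.
    """
    import matplotlib
    matplotlib.use("Agg")  # write to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot([p[2] for p in points], [p[1] for p in points], marker="o")
    for alpha, lse, penalty in points:
        ax.annotate(f"{alpha:g}", (penalty, lse),
                    textcoords="offset points", xytext=(5, 5))
    ax.set_xlabel("Penalty")
    ax.set_ylabel("LSE")
    ax.set_title(title)
    fig.savefig(out_path)
    plt.close(fig)


# Placeholder values for illustration only; real values come from model runs.
example_runs = [
    (0.01, 0.9, 2.0),
    (0.5, 1.6, 0.4),
    (0.1, 1.1, 1.0),
]
points = lcurve_points(example_runs)
print(points)
# plot_lcurve(points, "Example L-curve", "lcurve.png")  # needs matplotlib
```

Sorting by alpha before plotting keeps the connecting line from doubling back on itself when the runs were recorded in a different order.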
Week 4: February 5--February 12
Research Meeting Notes
- Goal: Complete L-curve analyses to enable selection of alpha values
- Several runs remain incomplete, but they have completed enough iterations to produce usable data
- Data from all runs will be compiled and formatted for R
- The above data will be used to generate a complete set of L-curves for the 4 runs from all tested families of networks
Progress
- Graphing L-curves
- I helped members of the research team format their data for R and showed those with knowledge of R how to execute the command sequence to generate their L-curves
- I personally graphed many of the L-curves
- I compiled a PowerPoint presentation of all L-curves generated this week: Media:LcurveAnalyses 20160205.pptx
- This presentation was sent to Dr. Dahlquist for review.
Week 5: February 12--February 19
Research Meeting Notes
- Goals
- Begin the Dahlquist:Microarray Data Analysis Workflow (focus: dCIN5 and WT)
- Check normalization against Tessa's normalization data
- Set up a meeting with Maggie to go through this workflow in parallel
- Account for missing values during ANOVA testing if time permits
- Create a script for generating parameter plots in R (for different alpha values) if time permits
Progress
- Began the Microarray Data Analysis Workflow
- Thursday, February 11: Began the Dahlquist:Microarray Data Analysis Workflow with Dr. Dahlquist
- Completed the normalization
- The results of this normalization were cross-checked using Tessa's normalization from SURP 2015
- Wednesday, February 17: Walked Maggie through the normalization and continued the Dahlquist:Microarray Data Analysis Workflow
- Completed the within-strain ANOVA for wt & dCIN5
- Sanity check results: Media:Wt-dCIN5 StatsResults BK20160217.pptx
- Updated the R command sequence used to generate L-curves
- This sequence was scripted to automatically generate L-curves
- The script is interactive, prompting users to input the pathway to the data and provide a name for the graph
- An updated description can be found here: Graphing L-Curves in R
Week 6: February 19--February 26
Research Meeting Notes
- Next Meeting: Friday, February 26 (2:30-3:30)
- Goals:
- Continue the Dahlquist:Microarray Data Analysis Workflow with Maggie and cross-check results
- Write a script in R to convert adjacency matrices to edge lists
- If time permits...
- Account for missing values during ANOVA testing
- Create a script for generating parameter plots in R
Progress
- Wrote a script to convert adjacency matrices to edge lists using the package igraph
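The conversion itself is simple enough to sketch without a graph library. The Python example below is an illustration only and is not the igraph-based R script mentioned above; it assumes, hypothetically, that columns of the adjacency matrix are regulators and rows are targets, and the function name and sample network are invented for the example.

```python
def adjacency_to_edges(genes, matrix):
    """Convert a gene-by-gene adjacency matrix to a list of edges.

    Assumption: columns are regulators and rows are targets, so a non-zero
    value at (row, col) is an edge from genes[col] to genes[row].
    Each edge is returned as (regulator, target, value).
    """
    edges = []
    for r, target in enumerate(genes):
        for c, regulator in enumerate(genes):
            if matrix[r][c] != 0:
                edges.append((regulator, target, matrix[r][c]))
    return edges


# Made-up three-gene network used only to exercise the function.
genes = ["CIN5", "HAP4", "ZAP1"]
matrix = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
edges = adjacency_to_edges(genes, matrix)
for regulator, target, value in edges:
    print(f"{regulator}\t{target}\t{value}")
```

If the matrix uses the opposite orientation, swapping `regulator` and `target` in the appended tuple is the only change needed.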