# Moneil5 Dahlquist Lab Notebook

From OpenWetWare

## Contents

- 1 Spring 2016
- 2 Fall 2016
- 2.1 August 30, 2016
- 2.2 September 6, 2016
- 2.3 September 12, 2016
- 2.4 September 13, 2016
- 2.5 September 19, 2016
- 2.6 September 20, 2016
- 2.7 October 3, 2016
- 2.8 October 17, 2016
- 2.9 October 18, 2016
- 2.10 October 24, 2016
- 2.11 October 25, 2016
- 2.12 October 31, 2016
- 2.13 November 1, 2016
- 2.14 November 7, 2016
- 2.15 November 8, 2016
- 2.16 November 28, 2016
- 2.17 November 29, 2016
- 2.18 December 5, 2016
- 2.19 December 6, 2016

- 3 Spring 2017
- 4 Fall 2017
- 5 Spring 2018

# Spring 2016

### January 15, 2016

- branch
- date time downloaded
- name file link to download
- bug, functionality, priorities
- priority level
- 0-greatest priority
- 0.5- next up to work on
- 1- …
- 2- least priority

- data analysis- data not code
- question- asking people questions
- Don't close issues on your own: write a comment “resolved because…” and add the label review requested
- purely website ones
- assignments- assign issues to people (sparingly assign things to him)
- give updates when working in between meetings
- make electronic lab notebook that describes what was done each day- use as repository for files and such
- go through wiki checklist and edit user page to skills
- assignment for databases class
- format like resume
- alphabetize the genes - gonna take some time

### January 22, 2016

- p= production rate
- w=weight
- b=“threshold”
- Can control any of these parameters
- Production and threshold for every gene in network
- weight for every edge in network
- Number of timepoints vs number of parameters is out of whack
- trying to find overall set of values to closest set of values- might never converge on an answer
- LSE vs. penalty being plotted
- “sweet spot” for alpha value found in “elbow” of curve
- questions trying to answer:
- ex. what happens if ….?
- estimate w,p,b
- estimate just p
- estimate just w
- estimate just b

- compare sigmoid to MM (Michaelis-Menten)
- want to look at just wild type or wild type plus mutant
- strain influence/ strain #
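A minimal sketch of how p, w, and b could combine in a sigmoidal production term, assuming the common logistic form. The function name and the example values are hypothetical and are not taken from the GRNmap code; degradation is left out because it is not covered in these notes.

```python
import math

def sigmoid_production(p, weights, b, regulator_levels):
    """Production term for one gene under an assumed logistic (sigmoid) form.

    p: production rate, weights: one weight per incoming edge,
    b: threshold, regulator_levels: expression of each regulator.
    """
    weighted_input = sum(w * x for w, x in zip(weights, regulator_levels))
    return p / (1.0 + math.exp(-(weighted_input - b)))

# Hypothetical gene with one activator (w = 1.0) and one repressor (w = -0.5).
rate_at_threshold = sigmoid_production(
    p=2.0, weights=[1.0, -0.5], b=0.5, regulator_levels=[1.0, 1.0]
)
rate_strongly_on = sigmoid_production(
    p=2.0, weights=[1.0, -0.5], b=0.5, regulator_levels=[20.0, 0.0]
)
print(rate_at_threshold, rate_strongly_on)
```

When the weighted input equals the threshold b, the production term is half of p; with strong activation it approaches p.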

### January 29, 2016

- abstracts for undergrad symposium due by the 12th
- honors research grant also due by the 12th
- working with Grace on poster for symposium
- read trace paper
- not separating transcription and translation
- implementation verification
- output: tough because that's where we’d be making predictions
- change model to production_function in excel
- l-curve function call it 0
- put between production function and estimate_params
- run L-curve analysis this week
- Do 4 runs this week- do largest and smallest networks
- +/- deletion strains
- generates 4 L-curves
- LSE on y-axis
- penalty on x-axis
- Should look like an L (put labels for alpha values)

- make_graphs=0
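To make the L-curve idea concrete, here is a toy sketch that traces LSE against penalty for a handful of alpha values. It assumes a one-weight ridge-style fit on made-up data; the function name, the data, and the penalty definition (the squared weight) are illustrative assumptions and not the penalty term GRNmap uses.

```python
def ridge_fit_1d(xs, ys, alpha):
    """Closed-form minimizer of sum((y - w*x)^2) + alpha * w^2 for one weight w.

    Returns (LSE, penalty) at the fitted weight.
    """
    w = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)
    lse = sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    penalty = w ** 2
    return lse, penalty

# Made-up data and example alpha values (assumptions for illustration only).
xs = [0.25, 0.5, 1.0, 1.5, 2.0]
ys = [0.3, 0.4, 1.1, 1.4, 2.1]
alphas = [0.001, 0.002, 0.005, 0.008, 0.01]

# Each entry is (alpha, LSE, penalty): penalty goes on the x-axis, LSE on the y-axis.
l_curve = [(alpha, *ridge_fit_1d(xs, ys, alpha)) for alpha in alphas]
for alpha, lse, penalty in l_curve:
    print(f"alpha={alpha}: penalty={penalty:.6f}, LSE={lse:.6f}")
```

Larger alpha shrinks the penalty and raises the LSE, so plotting the pairs with each point labeled by its alpha gives the curve whose elbow is the candidate "sweet spot".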

### February 5, 2016

- figure out how to run multi-core processor
- name of file- remove “dahlquist data” and put in initials of person running it instead
- make sure everyone deleted the same strains
- will be working with wild type data from beginning to understand process

### February 11, 2016

- meet up with Brandon in Dahlquist’s lab to work on project on some day next week
- meeting next week is at 3:15

- plot data from LSE runs
- by next week- alpha selected, data collected
- replace 41998 #VALUE!
- 23 is correct # of data points wt
- t15=4
- t30=5
- t60=4
- t90=5
- t120=5
- total=23

- 20 is correct # data points dcin5
- t15=4
- t30=4
- t60=4
- t90=4
- t120=4
- total=20

### February 17, 2016

- Quantitate the fluorescence signal in each spot (GenePix Pro)
- Calculate the ratio of red/green fluorescence (GenePix Pro)
- Log transform the ratios (GenePix Pro)
- Normalize the ratios on each microarray slide (within-chip normalization)
- Normalize the ratios for a set of slides in an experiment (between-chip normalization)
- Perform statistical analysis on the ratios
- Within-strain ANOVA
- Modified t test for each timepoint
- Between-strain ANOVA
- Benjamini & Hochberg and Bonferroni p value corrections for the above three tests

- "Sanity Check" on above three tests
- Determining candidate transcription factors and gene regulatory network (YEASTRACT)
- Dynamical modeling with GRNmap; visualization with GRNsight
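The two p value corrections named in the list above can be sketched as follows. This is a generic textbook version with hypothetical function names and made-up p values; it is not the spreadsheet procedure the lab actually used.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def benjamini_hochberg(p_values):
    """Benjamini & Hochberg adjusted p values (step-up procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]  # made-up p values, one per gene
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
print(bonf)
print(bh)
```

Bonferroni is the stricter of the two: every adjusted value it returns is at least as large as the corresponding Benjamini & Hochberg value.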

### February 19, 2016

- Grace to finish honors ambassadorial grant for Experimental Biology Conference in April
- Output parameter comparison for largest network with added strains for alpha values:
- 0.01
- 0.008
- 0.005
- 0.002
- 0.001

- To complete for the poster (two weeks after spring break)
- Stick with subfamily with strains_added
- "Production runs:" Evaluate (with graphs) the networks of 15, 34, 25, 20, and 30 genes.
- Help Grace run these networks

### February 26, 2016

- Mistakes in a couple input sheets, need to look into helping Grace fix those
- Explanation behind this error may be connected to the weird weight parameters output in the file on 2/23

- Look into helping Grace redo run for largest network with alpha value of 0.002

### March 10, 2016

- Helped Grace put graph outputs of interesting genes on the poster for the spring symposium. Also helped add MSE/ANOVA values and parameter comparisons, as well as in-degree and out-degree figures.
- Helped re-run some of the networks to remedy an error in the first run.

### March 27, 2016

- Looked at/worked on editing the abstract for ASBMB conference with Grace
- Was told how the random networks might work; now better understand the principles behind these networks

### April 15, 2016

- Only a few weeks left, will be working to help Grace compile a powerpoint containing the graphs, figures, and tables for the dHAP4 network analysis done for the semester.
- Focusing on in-degree and out-degree distribution, small networks and whatever else Grace needs help on

# Fall 2016

### August 30, 2016

Mathematically modeling networks - how will graph theory help?

- Mostly focusing on using gene regulatory networks in relation to graph theory
- Are there papers that suggest you can determine what is happening in a system based on outputs of graph theory-based statistics?
- What do strict numbers from stats tell you about what is happening in a system? Or is it a mostly visually based interpretation that’s needed?

Are the feed forward loops AND or OR type loops? (i.e. A and B needed for C activation, alternatively A or B is needed for C activation) How do suppression and activation play a role in these feed forward loops?

Cursory searches:

### September 6, 2016

- Searched “graph theory and yeast”
- Network properties in “using graph theory to analyze biological networks” - don’t pay attention to clustering, probably doesn’t relate to what we’re doing
- Pay attention to paragraph on gene regulatory networks
- TRACE documentation of the model, see dry lab protocols
- Look at papers referenced in the model
- Issue #170 - goal is to get words on a page to describe GRNmap

### September 12, 2016

- Code of conduct:
- Re-read, agree, post to issue saying read and agree

- Need way to check calculations
- Look for pre-packaged ways to compute betweenness centrality and shortest path
- Start getting into the mode of what we can get done in MATLAB
- Start googling MATLAB documentation and implementation in MATLAB, and use it as an independent check for GRNsight team
- Systems biology package for MATLAB
- Values computed for weighted and unweighted networks
- Look for code to do analysis with, continue literature search
- Looking for a way to do degree distribution quickly and easily
- Projection: mma deg rates, good random network -> proceed to run simulations

### September 13, 2016

- Look up and complete MATLAB tutorials
- Do write-up of data
- Find articles that focus on betweenness centrality and shortest distance models for graph theory
- Worked with Kristen to make powerpoint with quick summations of the articles we read
- Can be found here

- Looked into using a systems biology toolbox for MATLAB, which can be found online

### September 19, 2016

- R in degree out degree generator- random networks only?
- Use bibliographic software to format references: Zotero (web and standalone). Can type in a DOI and get the fields filled in with everything. Export to whatever format (AP etc.); best thing to do
- Find betweenness centrality program in MATLAB
- First 4 points of TRACE documentation [2]
- Double check everything about 5 15 gene networks and do production runs, generate random networks and collect data
- Test same network on same computer twice, different computers, and other control experiments

### September 20, 2016

- Spent most of the time getting familiar with basic MATLAB functions using the MATLAB tutorial found at this link: [3]
- Reviewed MATLAB basic tutorials and looked into tutorials for systems biology package and graph theory. Nothing concrete found yet, but will look more into it at the next research session

### October 3, 2016

- Keep working on TRACE documentation
- Get past admin block to download SBEToolbox on all of the Dahlquist Lab computers
- Figure out formatting for toolbox, use networks of different sizes, see what works with the program and how its interface works
- Kristen going to contact authors to ask for how we should format everything

### October 17, 2016

- Keep working with Kristen on exploring SBEToolbox capabilities and its application to the graph statistics of interest
- Create a small graph first to test with
- 4 nodes, 6 edges might be a good size to look at

- Update GitHub more often with results
- TRACE documentation can be found on the TRACE wiki

### October 18, 2016

- Worked on figuring how SBEToolbox works with Kristen
- Runs as a program in MATLAB called SBEGUI
- Once running, offers different organisms to choose from. Kristen selected S. cerevisiae and notes that when MATLAB and SBEGUI are restarted, you are not asked about organisms again

- Interface is pretty user-friendly for SBEGUI pop up
- Some interesting features include:
- Creation of random networks (small world, Erdos-Renyi, and Ring Lattice)
- Can upload own networks in .txt (tab delimited) format
- Allows you to select nodes and the program tells you their function
- Lots of graph statistics that can be run, will look into each further moving forward

### October 24, 2016

- Convert to SIF files using GRNsight website
- SIF instructions are on GRNsight website; GRNsight is also able to convert from Excel to SIF format

- Create our own documentation of the package tomorrow. Make note of everything that works and everything that doesn’t work

### October 25, 2016

- Worked with Kristen running a 4 node network through all programs/functions included in the package
- Will run 21-gene network next week on more specific stats now that we know what works and what doesn’t
- Powerpoint of our work can be found here Media:HorstmannOneil_SBEToolbox_Tests.pptx

### October 31, 2016

- Focus on running the more informative stats for our own networks, HAP4, GLN3 and ZAP1 and run visualization
- Compile for a powerpoint to present next week

### November 1, 2016

- While working on the PowerPoint, realized that the betweenness centrality and shortest path statistics are being run without taking the directionality of the edges into account
- Look into program features to figure out if there is a way to change this to view the networks as directed
- Only ran unweighted networks because can’t find where the weighted networks live in GitHub
- Powerpoint of findings can be found here: Media: SBEToolbox_TEST.ppt

### November 7, 2016

- Brandon and Natalie to try different types of motifs when creating random networks. This includes regulatory, feed-forward, etc.
- Drop out grey connections- rescale and it might show some of the stronger connections actually become less important
- SBEToolbox confirmed to use assumptions of undirected networks. Does every program do this?
- Might be looking at math rather than comp sci packages
- Start doing google searches of shortest path/betweenness centrality of directed networks
- Look into other programs (yEd, Gephi, etc.) and email Dahlquist about other programs
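A small sketch of why the directed versus undirected assumption matters for shortest paths. The 4-node cycle below is a hypothetical toy network, not one of the lab's networks, and the helper names are made up for illustration.

```python
from collections import deque

def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop counts from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

def as_undirected(adjacency):
    """Add the reverse of every edge, which is what ignoring direction amounts to."""
    undirected = {}
    for node, targets in adjacency.items():
        undirected.setdefault(node, set())
        for target in targets:
            undirected[node].add(target)
            undirected.setdefault(target, set()).add(node)
    return undirected

# Hypothetical regulatory cycle: A -> B -> C -> D -> A
directed = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
directed_from_a = shortest_path_lengths(directed, "A")
undirected_from_a = shortest_path_lengths(as_undirected(directed), "A")
print(directed_from_a["D"], undirected_from_a["D"])
```

Following edge direction, D is three steps from A; treating the same edges as undirected makes it one step, so any statistic built on shortest paths changes with that assumption.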

### November 8, 2016

- In looking at other programs, seems like Gephi is the top choice
- yEd seems like it might just be a figure generator; can’t find anywhere on their website where stats might be done

### November 28, 2016

- After exploring and getting familiar with Gephi, now going to run all 6 graphs/directed networks through Gephi
- Run weighted and unweighted and directly compare results
- Look up definitions and calculations of each statistic (my primary task)
- Add GRNsight visualization for each network
- Think about analysis and conclusions that can be drawn from this semester for next semester’s presentations and conferences. UCI systems biology conference might be a possibility, going to be coming up soon, start thinking about it.

### November 29, 2016

- Worked on compiling the definitions and equations used by Gephi in their graph statistics
- Results can be found here Media:MO_Statistics_Models_Used_in_Gephi.pptx
- Kristen ran the weighted and unweighted graphs through Gephi

### December 5, 2016

- In reviewing Gephi work from last week, most interesting stats seem to be strong component, closeness centrality, betweenness centrality and eigenvalues
- Look into plotting closeness and harmonic against each other to see if there’s a major relationship between the two
- Find strong component and why some numbers are so much greater than others
- Work on table of contents and CD’s of work to wrap up the semester

### December 6, 2016

- Made CD with table of contents of work and reviewed Gephi stats
- Turned CD in

# Spring 2017

### January 12, 2017

- Attending the UCI Systems Biology conference on January 28. Abstracts due next week by the 20th
- Moving forward only one computer will be used for running models with GRNmap to reduce variables

### January 19, 2017

- Worked with Kristen on writing the abstract for the conference
- Focusing on old HAP4 results from last spring and what other information the statistics from Gephi can tell us about the smallest HAP4 network

### January 24, 2017

- Worked on finding old HAP4 input sheets and poster documents
- Kristen uploaded Gephi results and HAP4 outputs

### January 26, 2017

- Worked with Kristen to get poster completed
- Discussed poster during the meeting and what is important to keep/not keep.
- Worked on and completed the poster after research meeting and uploaded it to repository

### February 2, 2017

- Ended up presenting on my own at the conference due to unforeseen circumstances for Kristen. No problem at all, very interesting conference to attend.
- Worked on symposium abstract
- MSE relationship with ANOVA
- Do p < 0.05 genes have better fits?
- Don't
- No relationship

- Are genes with no inputs modeled worse?
- Compare list of genes with no inputs to list of genes with inputs: which is modeled better?
- How does b value play into MSE
- Number of inputs

- Genes decreasing - expression worse fit?
- How do centrality measures connect to MSE?

### February 9, 2017

- Continued writing abstract for undergrad symposium
- Initially my project was the poster version of Kristen’s talk, but it has now changed to comparing all 6 dbs in Gephi and figuring out what those stats might be saying about the networks, and specifically about the nodes. Can be found in GRNmap GitHub repository. Got help and editing from Dahlquist and Fitzpatrick

### February 16, 2017

- Working on compiling Gephi statistics for all dbs
- Start thinking about comparing nodes across networks and what this might mean
- Compile into single document similar to Brandon’s - make all node based

### February 23, 2017

- Working on compiling Gephi statistics based on new naming scheme for the different families of networks
- db1 - wt
- db2 - dCIN5 14 nodes
- db3 - dCIN5 17 nodes
- db4 - dGLN3
- db5 - dHAP4
- db6 - dZAP1

- Talk to Brandon about stats that might be used

### February 27, 2017

- All raw files for Gephi outputs (db2, db3, and uploaded and completed to Dahlquist Repository in [4]
- If time available between midterms this week, will try my best to upload the in degree and out degree totals (Issue 328) before Thursday but not looking likely

### March 2, 2017

- Downloaded all 6 db sheets with comment "uploaded output sheets from first round to modeling" to use for GitHub Issue #329
- Created a compiled Excel workbook for the network outputs with sums of rows and columns in each matrix/sheet for all dbs
- Ran average, median, min and max of all rows and columns sums

**Meeting**

- Take column summation (out degree) and row summation (in degree) and compute average in degree for each gene in each network, excluding zeros in the average calculation
- Use COUNTIF for this etc.

- For next week try and have most of poster complete
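The row and column summations from the meeting notes, with zeros excluded from the average (the COUNTIF step), could look like the sketch below. It assumes rows are target genes and columns are regulators, and it reads "excluding zeros" as averaging only the nonzero entries in each row; both are assumptions, and the 3-gene matrix is made up.

```python
def nonzero_mean(values):
    """Average of the nonzero entries, like SUM / COUNTIF(range, "<>0")."""
    nonzero = [v for v in values if v != 0]
    return sum(nonzero) / len(nonzero) if nonzero else 0.0

genes = ["A", "B", "C"]
# Hypothetical weight matrix: weights[i][j] is the edge from regulator j to target i.
weights = [
    [0.0, 0.5, 0.0],   # A is regulated by B
    [1.0, 0.0, -0.5],  # B is regulated by A and C
    [0.0, 0.0, 0.0],   # C has no inputs
]

row_sums = {g: sum(row) for g, row in zip(genes, weights)}                    # in degree (weighted)
col_sums = {g: sum(row[j] for row in weights) for j, g in enumerate(genes)}   # out degree (weighted)
row_nonzero_means = {g: nonzero_mean(row) for g, row in zip(genes, weights)}
print(row_sums, col_sums, row_nonzero_means)
```

Swapping the matrix entries for 0/1 values gives the unweighted in-degree and out-degree counts from the same code.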

### March 30, 2017

- Worked on symposium for the past couple weeks, completed poster can be found here
- Created a file detailing how to upload files to Gephi, view graph statistics, and export the file to excel. A Word document version can be found here and a PDF version of the document can be found here

**Meeting**

- Symposium good, make research for rest of the semester more robust.
- 6 dbs
- 30 random networks
- LSE/minLSE ratios- bar chart (Natalie's presentation)
- Degree distributions
- Unweighted - R script done DB1-6, need to do random
- Weighted - bar chart, cumulative plot done for db1-6. SPSS

- Gephi - tables on in and out degree for both weighted and unweighted, and total degree; all db-derived or random
- MSE/minMSE for db 5, Natalie will set up excel spreadsheet to facilitate
- Gephi stats
- I have db1-6
- Kristen has random 20

- Convert so each sheet is a statistic rather than just 1 sheet
- Look into how Gephi is computing each statistic (both weighted and unweighted) so we know for sure how the statistic is being calculated
- By-hand calculation for each statistic, come back to eccentricity later because it's hard to tell what's going on

- Develop group report that sums up our work for the semester along with tables, graphs, etc.
- Both as word doc and powerpoint, describe everything like technical report that points to specific files

### April 6, 2017

- Worked on Issue #290 on GitHub, compiling Gephi outputs for weighted and unweighted dbs1-6.
- Successfully uploaded the weighted sheet to GitHub
- In progress of compiling and formatting the unweighted data sheet

**Meeting**

- Upload raw input and output files for Gephi, weighted and unweighted
- Use Boulardii 2 input and output files up for Gephi
- Upload existing files to Gephi
- Would rather see us do hand calculations to compare against the existing statistics than generate other statistics
- Run calculations/compute stats by hand instead of doing unweighted stats

### April 27, 2017

- Last week calculated the Gephi stats by hand for closeness centrality. Results and notes can be found here
- Need to add Gephi Input files
- Need output csv files to gephi
- ^Completed the above already
- Started computing the betweenness centrality stat by hand, will upload documentation when completed
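As an independent check on a by-hand closeness calculation, here is a minimal sketch that uses one common definition: the number of reachable nodes divided by the sum of shortest-path distances to them. Gephi's exact normalization and direction conventions are not recorded in these notes, so treat the definition, the function names, and the 3-node network as assumptions.

```python
from collections import deque

def hop_distances(adjacency, source):
    """Breadth-first search over outgoing edges, counting hops from source."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances

def closeness(adjacency, node):
    """Inverse of the average shortest-path distance to the reachable nodes."""
    distances = hop_distances(adjacency, node)
    reachable = [d for target, d in distances.items() if target != node]
    if not reachable:
        return 0.0
    return len(reachable) / sum(reachable)

# Hypothetical directed network: A -> B, A -> C, B -> C, C -> A
toy = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
scores = {node: closeness(toy, node) for node in toy}
print(scores)
```

A reaches both other nodes in one step and scores 1.0; B and C each need two steps to reach one of the others and score 2/3.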

**Meeting**

- Discussed GitHub repository documents
- Discussed spoke and wheel 13-node, 16-edge graph from class. Planning on running random network (x), deleting ACE2 (x), and then going through and breaking feed forward motifs without cutting anything out of the network (o)
- Look at #328, #290, and by-hand stuff for Issues. By-hand not a priority but to be done if time left in the semester

# Fall 2017

### August 30, 2017

Lab Meeting

- Going to receive syllabus for capstone/thesis sometime this semester to look ahead at the project
- Research times decided, need OneCard access for some rooms on campus
- Dahlquist - in process of working on sorting out GRNmap
- L-curve work, shows some might be weird for random networks
- Error fixed where GRNmap runs the same on different computers now
- This academic year - can trust the code, last year more exploration. Now at a point where we can be focused on specific questions and answering them. Validating the model now. In this exploration, determining what's impacting the outcome of the model. "Does it do better if we fix certain things? Effect of more or less data? Effect of size of the network?"
- Served 2 purposes - generate L-curve and analyze model results. Pick different alpha, etc. Now have a body of data to analyze
- Aim for THIS year - work on publication worthy stuff. Now really running experiments.

- Brandon - going to work on R code to make L-curve plot more user friendly
- Will discuss my L-curve task tomorrow

### August 31, 2017

Research

- Went over lab safety protocol and got forms filled out for key access
- Instead of looking at L-curve, worked on looking at how to calculate graph statistics by hand, in order to understand what Gephi is doing to calculate the different statistics.
- Worked on weighted betweenness centrality statistics, but there was not enough time in the research period to complete the calculation. Will be posted shortly

### September 13, 2017

Meeting

- OpenWetWare was not working last week; will add last week's notes after today's meeting
- Work with much simpler networks for calculations, fully connect 6 nodes, all nodes are equivalent, compute that, match Gephi. Take as given can compute larger networks.
- Will read paper from Gephi detailing eccentricity algorithm
- Need to have honors thesis abstract complete by 9.20.17 for review by Dr. Dahlquist and Dr. Fitzpatrick. Will edit in person with Dr. Dahlquist and Dr. Fitzpatrick

### September 14, 2017

Research

- Created and ran a 6-node network with 30 edges and discovered that the betweenness centrality of every node was being calculated as 0. With every node connected directly to every other node, no shortest path needs to pass through an intermediate node.
- Broke the edge going B -> A, resulting in a 6-node, 29-edge network that did have some nonzero values for betweenness centrality, as B now has to use C, D, E, or F in order to reach A.
- Calculations for betweenness centrality can be found here
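The 6-node experiment above can be reproduced with a short unweighted, unnormalized betweenness sketch. The pair-counting approach and function names are illustrative choices; Gephi may scale its values differently, so only the pattern of zero and nonzero scores should be compared.

```python
from collections import deque
from itertools import permutations

def bfs_counts(adjacency, source):
    """Return (distance, number of shortest paths) from source to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                sigma[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                sigma[nxt] += sigma[node]
    return dist, sigma

def betweenness(adjacency):
    """For each node v, sum over ordered pairs (s, t) the fraction of shortest s-t paths through v."""
    nodes = list(adjacency)
    info = {s: bfs_counts(adjacency, s) for s in nodes}
    score = {v: 0.0 for v in nodes}
    for s, t in permutations(nodes, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue
        for v in nodes:
            if v in (s, t) or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score

nodes = "ABCDEF"
complete = {u: [v for v in nodes if v != u] for u in nodes}
without_b_to_a = {
    u: [v for v in targets if (u, v) != ("B", "A")] for u, targets in complete.items()
}
edge_counts = (sum(map(len, complete.values())), sum(map(len, without_b_to_a.values())))
full_scores = betweenness(complete)
broken_scores = betweenness(without_b_to_a)
print(edge_counts, full_scores, broken_scores)
```

In the fully connected network every score is 0. After removing B -> A, the four two-step routes from B to A each pass through one of C, D, E, or F, so each of those nodes scores 1/4 while A and B stay at 0.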

# Spring 2018

### January 12, 2018

Lab meeting of Data Analysis and GRNmap:

- John to work on GRNmap and GRNsight website improvements
- Lauren to apprentice with both me and Brandon on what we're working on, in order to carry the torch forward
- LMU symposium on March 24th, everyone to present, abstracts due by mid-February
- Outside conference: Tri Beta conference on March 17th at Concordia University in Irvine (most likely)
- Overview of GitHub nomenclature and tagging for new people
- Review of last semester and directions for this semester:
- Last semester focused primarily on thorough investigation of graph statistics, and determining what graph statistics tell us about the network when combined
- Do experiment to delete each of 28 edges in db5 systematically, and compare LSE:minLSE ratios to intact network and look at graph statistics for each

### January 16, 2018

Research lab time from 1pm-6pm on Tuesday

- Determined the specific steps for the project moving forward:
- Working with Lauren to compile data for the systematic edge deletion of db5, deleting each edge at a time
- Run db5 on the latest version of GRNmap with missing and no missing values as a control for the experiment
- Generate and run the 28 different single-edge-deletion input workbooks through GRNmap
- Generate and run the output workbooks from the previous point through Gephi to get the graph statistics
- Compile the production rates and LSE:minLSE ratios for each workbook
- Compile graph statistics for all networks in a single excel file and run paired t-tests for each to determine the significance of the deleted edges

- Today, going to start with point 1, but first need to update the MATLAB licenses on the Dahlquist lab computers
- The input files can be found here for the missing and not missing db5 files
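The systematic edge deletion step described above can be sketched as a generator that zeroes one edge at a time. The 3-gene matrix and the function name are hypothetical; the real input workbooks are Excel files and would need their own reading and writing code, which is not shown here.

```python
import copy

def single_edge_deletions(weights):
    """Yield ((row, col), matrix_copy) with exactly one nonzero edge set to 0."""
    for i, row in enumerate(weights):
        for j, value in enumerate(row):
            if value != 0:
                variant = copy.deepcopy(weights)
                variant[i][j] = 0
                yield (i, j), variant

def count_edges(weights):
    return sum(1 for row in weights for value in row if value != 0)

# Hypothetical 3-gene network with 4 edges (1 marks an edge).
network = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
variants = list(single_edge_deletions(network))
print(len(variants), [position for position, _ in variants])
```

Applied to a 28-edge network, the same loop would yield 28 variants, one per deleted edge, while the intact matrix is left unchanged as the control.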

### January 19, 2018

Lab meeting

- GitHub issues created to post work for the semester
- Moving forward use the beta code to run the models, so that minLSE is calculated
- Save beta code that we have right now, as of January 19, 2018, to a flash drive that's designated for the beta version of GRNmap
- Dr. Fitzpatrick to walk us through how to use multi-core computers
- This weekend/week to generate the input workbooks, and then run the 28 models

### January 26, 2018

- Lab meeting:
- Take MSE:minMSE spreadsheet from Brandon to continue doing analysis
- LMU symposium deadline for abstracts on February 9th, have a draft of an abstract done by next Friday's meeting
- Might look into adding/looking at the networks generated by the Biomathematical Modeling and Biological Databases courses, totaling around 10 networks; would be interested in looking at a network that has genes we have not investigated yet.
- Haven't systematically looked at how things are estimated, might want to cut things out, and alter software given that decisions were made at the time of the creation of the model and software. A decision hardcoded in when it might've been more appropriate to make it flexible
- If we were to run a 2-edge deletion, it would be appropriate to use random networks with the same number of edges and nodes as the comparison

### February 2, 2018

Lab meeting

- The divide-by-zero error that is popping up is due to the data and results being identical (no variation in the data); using a paired t-test for this is fine
- Next week to do minMSE calculations, and do further analysis of the significance of each edge in the network
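The divide-by-zero case can be seen directly in the paired t statistic: when the paired values are identical, every difference is 0, the standard error is 0, and the ratio is undefined. The sketch below uses made-up numbers and a hypothetical function name; it returns None instead of raising in that case.

```python
import math

def paired_t_statistic(before, after):
    """Paired t statistic: mean difference divided by its standard error."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    mean_diff = sum(differences) / n
    variance = sum((d - mean_diff) ** 2 for d in differences) / (n - 1)
    standard_error = math.sqrt(variance / n)
    if standard_error == 0:
        return None  # no variation in the differences: t is undefined
    return mean_diff / standard_error

intact = [1.2, 0.8, 1.5, 1.1]      # made-up values for an intact network
identical = list(intact)           # deletion that changed nothing
changed = [1.3, 0.8, 1.7, 1.2]     # deletion that shifted some values

t_identical = paired_t_statistic(intact, identical)
t_changed = paired_t_statistic(intact, changed)
print(t_identical, t_changed)
```

A None result here corresponds to the spreadsheet error: it signals that the deletion left the values unchanged, not that the test was set up incorrectly.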