Alice Finton Online Lab Notebook

From OpenWetWare

Summer 2019

Microarray data for each of the strains used in the analysis:

  1. dCIN5: dCIN5_one_strain_ANOVA_out_data.xls
  2. dGLN3: dGLN3_one_strain_ANOVA_out_data.xls
  3. dHAP4: dHAP4_one_strain_ANOVA_out_data.xls
  4. dHMO1: dHMO1_one_strain_ANOVA_out_data.xls
  5. dZAP1: dZAP1_one_strain_ANOVA_out_data.xls
  6. Wild-Type: wt_one_strain_ANOVA_out_data.xls

Week 2: May 28 - 30

  • Goals:
  1. Perform an exploratory search for software that can be used for comparing gene ontologies of the wild-type, dCIN5, dGLN3, dHAP4, dHMO1, and dZAP1 strains of Saccharomyces cerevisiae.
  2. Organize STEM profile results for all strains to allow comparison of the profile types.
  3. Present a Journal Club on "Parameter Estimation for Gene Regulatory Networks from Microarray Data: Cold Shock Response in Saccharomyces cerevisiae"
  • Progress
  1. Alice and Mihir May 28 Journal Club Powerpoint
  2. Comparative gene ontology web search
    1. Software reviewed:
      • Revigo- Allows for the creation of scatterplots from a single list of gene ontology IDs, with the option of including p-values. The resulting graphs colorize the functional groups according to the p-value.
        • Does not allow for a comparison between GO lists. Therefore, it can be used only for a single strain.
      • WEGO- Creates comparative plots, but supports narrow file types.
        • Need to determine how to change file format of GO list from STEM software to work on WEGO.
      • Panther Classification System- Creates bar graphs or pie charts of the gene ontology terms from the gene list data from STEM.
        • There is no option to restrict the GO terms based on p-value significance.
        • You have to use the gene list, not the GO list from STEM.
      • ClueGO- Creates networks from the gene ontology terms and allows for a comparison between two lists of data.
        • Requires Cytoscape, as it is a plug-in for the software.
          • On May 30, Dr. Dahlquist requested access to the plug-in. (On June 2, the license was granted to Dr. Dahlquist.)
      • CompGO
        • Requires R software to run. Need to figure out how to use CompGO in the R software.
      • Comparative GO- "A webserver for comparative gene ontology, gene ontology network and gene ontology based gene selection"
        • Limited species inclusion: Bacteria, Virus, Zebrafish, Human, Rice. Unsure if it supports S. cerevisiae genes.
      • BiNGO- Cytoscape plug-in that creates networks of the gene ontologies.
        • Determines which gene ontology categories are overrepresented in the gene list.
        • Supports a wide range of organisms.
      • GOTaxExplorer- Allows for the comparison between gene sets.
        • "It is possible to compare arbitrarily selected organisms or groups of organisms from the taxonomic tree on the basis of the functionality of their genes" (GO Tools: Visualization).
        • Unable to download the software. May need to request access to use the software.
      • Panther Compare Lists- Allows for a comparison between gene ID lists. Statistical overrepresentation test.
        • Produces p-values from the analysis. (Gives the option of using Bonferroni corrected p-values)
        • Allows for the visualization of the gene ontology groups with a bar chart, multiple pie diagram, overlaid area chart of difference, and bar chart of difference.
    2. Simple comparison of GO lists
      • Venn Diagrams
        • Pangloss- Creates a Venn diagram from only two lists of data, but states which terms are overlapping and which are unique.
        • InteractiVenn- Allows for the creation of a Venn diagram for up to six sets of data.
          • Does not indicate which terms are overlapping or unique.
      • Microsoft Excel to Compare IDs
        • Allows for the comparison of gene ontology terms to determine which are overlapping and which are unique between the strains.
      • Compare Two Lists
        • Allows you to input sets of data and compare them. It offers information about which inputs are unique to either list and those that are overlapping.
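The overlapping/unique comparison described above (Excel, Compare Two Lists, or a Venn diagram tool) could also be sketched in a few lines of Python. This is only a minimal sketch, not the method used in the lab: the function name `compare_go_lists` is hypothetical, and the GO IDs in the example are made-up placeholders rather than STEM results.

```python
from itertools import combinations


def compare_go_lists(go_lists):
    """Compare GO ID lists from several strains.

    go_lists maps a strain name to an iterable of GO IDs.
    Returns (shared_by_all, unique_per_strain, pairwise_overlap).
    """
    sets = {strain: set(ids) for strain, ids in go_lists.items()}
    # Terms present in every strain's list.
    shared_by_all = set.intersection(*sets.values()) if sets else set()
    # Terms found in exactly one strain's list.
    unique = {}
    for strain, ids in sets.items():
        others = set().union(*(s for name, s in sets.items() if name != strain))
        unique[strain] = ids - others
    # Overlap for each pair of strains (what a two-set Venn diagram shows).
    pairwise = {(a, b): sets[a] & sets[b] for a, b in combinations(sorted(sets), 2)}
    return shared_by_all, unique, pairwise


# Hypothetical GO IDs for illustration only.
example = {
    "wt": ["GO:0000001", "GO:0000002", "GO:0000003"],
    "dCIN5": ["GO:0000002", "GO:0000003", "GO:0000004"],
    "dGLN3": ["GO:0000003", "GO:0000005"],
}
shared, unique, pairwise = compare_go_lists(example)
print("Shared by all strains:", sorted(shared))
for strain, ids in unique.items():
    print(f"Unique to {strain}:", sorted(ids))
```

Unlike InteractiVenn, this reports which terms are overlapping or unique, and unlike Pangloss it is not limited to two lists.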

Week 3: June 3 - 6

  • Goals:
  1. Run ClueGO with Cytoscape and create a PowerPoint of the various functions it can perform.
  2. Try to figure out CompGO and how to use the R software
  3. Present a Journal Club on "Physiological and Transcriptional Responses of Anaerobic Chemostat Cultures of Saccharomyces cerevisiae Subjected to Diurnal Temperature Cycles"
  • Progress:
  1. Alice June 3 Journal Club Powerpoint
  2. ClueGO
    1. ClueGO does not allow for unique gene ontology groups to be included in the network. Only the categories that are common to all sets of genes are included in the network.
    2. There is the option of using gene list IDs or gene ontology terms for the analysis. Gene IDs produce a network, but the gene ontology terms produce only separate nodes with no edges.
    3. There is a ClueGO plugin that gives the option to make the nodes of the network pie charts, giving information about the percentage of genes in each cluster that are part of the specific functional group. I downloaded the plugin and used it to determine what percentages of wt and dCIN5 genes were part of the functional categories in the network, but each pie chart showed the same percentages. Therefore, the number of genes belonging to the specific functional category could be the same for each node.
    4. When I created the network comparing wild-type profile 45 and dCIN5 profile 45 gene IDs, every functional category was more highly overrepresented in the wild-type strain than the dCIN5 strain. Therefore, each node in the network was colorized red (indicating wild-type) based on the color settings in ClueGO.
    5. I am currently working on creating a PowerPoint with the various functions of ClueGO.
      • Run ClueGO analysis on the clusters, comparing different strains in each cluster that would be useful for analysis.
    6. When using ClueGO to analyze the GO terms output by STEM, the column headers for the GO ID and p-value must match the names that ClueGO expects.
      • In order to run an analysis with GO terms, you need to select "Preselected Functions" instead of "Functional Analysis" at the top of the window.
      • Initially when I pasted the columns into the box, I wrote "GOID" for the gene ontology ID column, and "p value" for the column listing the respective p-values. When I ran the test, the result was not in the form of a network, rather just a grid of functional categories.
      • I ran another test without putting a header for the columns. Initially, I began getting networks for the results, with nodes and edges. But after restarting Cytoscape, running the software with no headers resulted in grids of functional categories rather than the network.
      • I looked back through the ClueGO documentation to see if there was any information about what to include in the headers. I found that when running GO terms, the header of the column containing the GO IDs should be labelled "GOID:PathwayID" and the column containing the p-values should be labelled "p value" (Fig. 1). After correcting the format, the results of the tests were networks.
    7. When I tried to save the networks that I had created on June 5, Cytoscape crashed because too much memory was being used. Therefore, it is important to save your work as you go.
      • Additionally, do not save only as a Cytoscape file. Save a ClueGO file as well; otherwise only the networks can be seen, not the analysis tables.
      • You should save your work as a ClueGO file and also as a Cytoscape session. It takes a long time to load a large ClueGO file. (I tried loading the file from the work I had done on June 6, and it got stuck and crashed Cytoscape. I will attempt to open that session later.)
      • On June 5, I ran an analysis on all of the profiles and the gene list IDs and took screenshots of the resulting networks. The actual ClueGO files were lost because Cytoscape crashed. (I will have to redo the analysis on the gene lists.)
  3. How to run ClueGO:
    1. File Formatting:
      • Choose either the Gene ID list or GO term list from the STEM analysis. Open the txt file and put in an excel sheet.
      • For a gene list run:
        • Open an excel workbook and create a sheet for each of the strains included in the data. Include the profile number for each of the new sheets. For example, a sheet can be named "wt Gene List 45" or "dCIN5 Gene List 45", where '45' refers to the STEM profile.
        • Copy and paste the contents of the Gene list txt file from the STEM software into the excel sheet. You should then have columns A-J filled.
          • Delete columns A-B and D-J, leaving only one column in the sheet. Delete the "gene symbol" column header, leaving only a column of genes.
        • Repeat for every strain in the STEM analysis for each profile.
      • For a GO term run:
        • Open an excel workbook and create a sheet for each strain included in the STEM analysis as done for the gene list analysis. For example, a sheet can be named "wt GO 45" or "dCIN5 GO 45", where '45' refers to the STEM profile.
        • Open the GO list txt output file generated by the STEM software.
          • Copy and paste the txt file into excel. You should then have columns A-I filled.
          • Delete columns B-F and H-I, so that only two columns remain, one with the GO IDs and one with the p-values.
        • Repeat for the rest of the strains in the profile.
      • Example of an excel workbook with Gene lists and GO terms.
    2. Open Cytoscape and run the ClueGO plugin
      • For a gene list run:
        • Analysis Mode: keep "Functional Analysis" selected.
        • When first using ClueGO, Saccharomyces cerevisiae has to be downloaded from ClueGO for use.
          • Click the button next to the box that reads "Homo Sapiens [9606]". Then search for Saccharomyces cerevisiae in the list and download. It will now be in the drop-down menu for the organism selection.
        • For a single strain run:
          • In the empty box in the "Load Marker List(s)" section, copy and paste the list of genes from the excel workbook. Do not include any column header.
          • In the "ClueGO Settings" section under the Ontologies/Pathways, select the GO term analysis you want.
          • In addition, you can change the network specificity and select whether to show pathways with p value restriction.
          • Once you have selected all of the settings, run the analysis by clicking the "Start" button under the "ClueGO Functional Analysis" section.
          • A network will be shown on the right side and a table will be shown below the network.
        • For a comparison:
          • Copy and paste the list of genes into the empty box in the "Load Marker List(s)" section. For example, the gene list from "wt Gene List 45".
          • Click the "+" button below the box to add more strain data.
            • A new box will appear. Copy and paste a gene list from another strain into this box. For example, the gene list from "dCIN5 Gene List 45".
          • The automated coloring for each input is red for the first list and blue for the second list. This can be changed by selecting the box with the colored line that appears to the right of the input box.
          • Select the settings the same way as for a single run and click "Start".
          • Visualize the network in different ways using the "Visual Style" section. You can visualize the network by the functional groups, clusters (or strain), or significance.
      • For a GO term run:
        • Analysis Mode: select 'Preselected Functions'
          Figure 1: ClueGO Headers for GO Term Analysis
        • For a single strain run:
          • Copy and paste the list of GO terms and p-values from the excel sheet into the empty input box in the "Load Marker List(s)" section. You should have two columns in the ClueGO input box. Scroll to the top of the lists in the box and rename the columns as follows: (Fig 1)
            • For the GO term list: "GOID:PathwayID", hit tab
            • For the p-values: "p value"
          • Select the ClueGO settings as you would for a run with the gene IDs and then click "start".
        • For a strain comparison run:
          • Copy and paste the list of GO IDs into the empty box in the "Load Marker List(s)" section. For example, the GO ID list from "wt GO 45". Make sure to name the columns "GOID:PathwayID" and "p value".
          • Click the "+" button to add more data. A new box will appear. Copy and paste a GO ID list from another strain into this box. For example, the GO ID list from "dCIN5 GO 45". Rename the columns.
          • Select the ClueGO settings as you would for a single strain run. Click "Start".
          • Visualize the network in different ways using the "Visual Style" section.
  4. CompGO
    1. CompGO Vignettes, CompGO paper, and CompGO user manual
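The file-formatting steps under "How to run ClueGO" above could also be done with a short script instead of by hand in Excel. The sketch below is only an illustration under stated assumptions: it assumes the STEM output files are tab-delimited with a single header row, that the gene symbols sit in column C of the gene list, and that the GO IDs and p-values sit in columns A and G of the GO list (inferred from which columns the steps above say to delete). The function names and the sample text are hypothetical.

```python
import csv
import io

GENE_COLUMN = 2     # column C (0-based index), the one kept in the gene list steps
GO_ID_COLUMN = 0    # column A, kept in the GO term steps
P_VALUE_COLUMN = 6  # column G, kept in the GO term steps


def gene_list_for_cluego(stem_text):
    """Return the gene symbols only, without the column header."""
    rows = list(csv.reader(io.StringIO(stem_text), delimiter="\t"))
    return [row[GENE_COLUMN] for row in rows[1:] if len(row) > GENE_COLUMN]


def go_table_for_cluego(stem_text):
    """Return tab-separated GO IDs and p-values under the headers noted in Fig. 1."""
    rows = list(csv.reader(io.StringIO(stem_text), delimiter="\t"))
    lines = ["GOID:PathwayID\tp value"]
    for row in rows[1:]:
        if len(row) > P_VALUE_COLUMN:
            lines.append(f"{row[GO_ID_COLUMN]}\t{row[P_VALUE_COLUMN]}")
    return "\n".join(lines)


# Made-up stand-ins for STEM output (10 and 9 tab-separated columns).
gene_text = "\n".join([
    "\t".join(["A", "B", "gene symbol"] + list("DEFGHIJ")),
    "\t".join(["x", "x", "YFG1"] + ["x"] * 7),
    "\t".join(["x", "x", "YFG2"] + ["x"] * 7),
])
go_text = "\n".join([
    "\t".join(list("ABCDEFGHI")),
    "\t".join(["GO:0000001", "x", "x", "x", "x", "x", "0.001", "x", "x"]),
])
print(gene_list_for_cluego(gene_text))
print(go_table_for_cluego(go_text))
```

The printed text can then be pasted into the "Load Marker List(s)" box. If the real STEM files have extra header lines or a different column order, the three column constants and the `rows[1:]` slice would need to change.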

Week 4 : June 10 - 13

  • Goals:
  1. Present a journal club on the ClueGO paper and the progress that I have made on ClueGO so far.
  2. Begin modeling experiments
    • Variable inclusion of strain data for db5
      • First make a list of experiments in Excel to keep track.
      1. wt-only
      2. wt + each strain individually
      3. wt + two strains
      4. wt + three strains
      5. wt + four of the five deletion strains (in other words, leaving one deletion strain out)
        Figure 2: Example of ClueGO GO Term Comparison - Profile 45 wt vs. dCIN5
    • looking at production rates
  • Progress:
  1. Week 4 ClueGO Journal Club
    • Includes all of the networks that have been created with ClueGO so far.
  2. ClueGO
    • Run a ClueGO analysis on the rest of the strains for the GO terms. Run a comparison analysis between wild-type and deletion strain (not including dHMO1) for profile 45, 9, 22, and 48. Figure 2 shows an example of a ClueGO comparison.
  3. Modeling experiments:
    • Models will be run on the strains (wt, dCIN5, dGLN3, dHAP4, dHMO1, and dZAP1) by deleting the data for one or more strains entirely. For example, a model will run with all but one strain (e.g., wt-dCIN5-dGLN3-dHAP4-dHMO1). I have created an Excel sheet that shows all of the models that will be run. Performing multiple runs on the same computer with MATLAB GRNmap
    • On June 11, I started running 26 models on GRNmap MATLAB version 1.10.
      • There was an issue with the CPU affinity selections for the trials. I would select an affinity for the MATLAB run, but when I would check the CPU again, all of the processors would be selected. After that, I did not know which CPU corresponded to which deletion run. For instance, CPU 0 was chosen for the all-strain model, but when I went back through the affinities, I could not find a MATLAB.exe with CPU 0 selected.
        • In order for one processor to be used, the CPU has to be chosen after the model has started to run, not when the MATLAB command window pops up. Once the model ("Figure 1") window pops up, the CPUs return to being all checked. Therefore, in order to keep each model restricted to one CPU, it has to be selected after the file has been chosen and the "Figure 1" window pops up.
    • On June 12, I started to run the rest of the models. In total, there are 32 models.
Times and computer used for all of the model runs
Cerevisiae Computer June 11, 2019                           Paradoxus Computer June 11, 2019                     Paradoxus Computer June 12, 2019       
   Time              Strains Included                          Time               Strains Included                  CPU         Time           Strains Included   
1:20 - 4:43      all-strain                                 2:38 - 6:30      wt - dCIN5 - dHMO1 - dZAP1              0      11:02 - 12:15        wt-only
1:27 - 7:27      wt - dCIN5 - dGLN3 - dHAP4 - dHMO1         2:41 - 7:08      wt - dGLN3 - dHAP4 - dHMO1              1      11:03 - 12:14       wt - dCIN5
1:33 - 7:31      wt - dCIN5 - dGLN3 - dHAP4 - dZAP1         2:44 - 8:07      wt - dGLN3 - dHAP4 - dZAP1              2      11:06 - 11:51       wt - dGLN3
1:37 - 8:42      wt - dCIN5 - dGLN3 - dHMO1 - dZAP1         2:47 - 7:00      wt - dGLN3 - dHMO1 - dZAP1              3      11:09 - 12:30       wt - dHAP4
1:40 - 6:16      wt - dCIN5 - dHAP4 - dHMO1 - dZAP1         2:52 - 6:17      wt - dHAP4 - dHMO1 - dZAP1              4      11:12 - 2:22        wt - dHMO1
1:41 - 3:27      wt - dGLN3 - dHAP4 - dHMO1 - dZAP1         2:55 - 6:42      wt - dCIN5 - dGLN3                      5      11:14 - 1:47        wt - dZAP1
1:47 - 3:14      wt - dCIN5 - dGLN3 - dHAP4                 2:57 - 6:11      wt - dCIN5 - dHAP4
1:49 - 6:33      wt - dCIN5 - dGLN3 - dHMO1                 3:00 - 7:01      wt - dCIN5 - dHMO1
1:53 - 5:15      wt - dCIN5 - dGLN3 - dZAP1                 3:02 - 7:11      wt - dCIN5 - dZAP1
1:55 - 4:12      wt - dCIN5 - dHAP4 - dHMO1                 3:04 - 6:55      wt - dGLN3 - dHAP4
2:01 - 3:36      wt - dCIN5 - dHAP4 - dZAP1                 3:07 - 7:01      wt - dGLN3 - dHMO1
                                                            3:09 - 8:57      wt - dGLN3 - dZAP1
                                                            3:11 - 5:59      wt - dHAP4 - dHMO1
                                                            3:14 - 7:23      wt - dHAP4 - dZAP1
                                                            3:16 - 7:25      wt - dHMO1 - dZAP1
    • After all of the models have run, they create an output file that includes optimized parameters.
      • I have created an Excel workbook for the optimized production rates, threshold (b), and weights for each of the strain deletions. In addition, I have compared the LSE, minLSE, and LSE:minLSE ratios for each of the strain deletions and have created bar graphs for each. Excel workbooks
    • The output files were run through GRNsight, and the weighted SIF files were downloaded and used in the creation of heat maps. Using previous data by Lauren Kelly, I was able to normalize the data and create the heat maps for visualization of the activation and repression of the genes.
    • Heat maps were created for the strain deletions. They were sorted in four ways: in the order in which I listed the runs, and by increasing LSE:minLSE ratio, minLSE, and LSE.
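The list of variable-inclusion experiments from this week's goals (wt-only, wt plus one, two, three, or four deletion strains, and the all-strain model) can be enumerated programmatically as a check on the total count. This is a minimal sketch; the function name `enumerate_models` is hypothetical and the output is not the Excel tracking sheet itself.

```python
from itertools import combinations

DELETION_STRAINS = ["dCIN5", "dGLN3", "dHAP4", "dHMO1", "dZAP1"]


def enumerate_models(deletion_strains):
    """List every strain combination that always includes wild-type (wt)."""
    models = []
    for size in range(len(deletion_strains) + 1):
        for subset in combinations(deletion_strains, size):
            models.append(("wt",) + subset)
    return models


models = enumerate_models(DELETION_STRAINS)
for model in models:
    print(" - ".join(model))
print(len(models), "models in total")
```

With five deletion strains there are 2^5 = 32 combinations, which matches the 26 runs started on June 11 plus the 6 started on June 12.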

Week 5: June 17 - 20

  • Goals:
  1. Clustering
  2. Analysis of variable inclusion of strain data runs
    • Bar charts for P's and b's
    • Look at expression plots and GRNsight networks. Maybe the LSE:minLSE ratios are getting bigger for the dGLN3 and dZAP1 data because the expression is more divergent from the other strains and the model has to balance matching all datasets. In this view, an increasing LSE:minLSE ratio is not necessarily bad.
    • Look at MA plots, box plots, and QA reports for strains and chips to see if there is a relationship between data quality and these results.
    • Investigate why removing strain data leads to a smaller minLSE.
  3. Look at production rates from Neymotin, B., Athanasiadou, R., & Gresham, D. (2014). Determination of in vivo RNA kinetics using RATE-seq. Rna, 20(10), 1645-1652.
  4. Compile another set of modeling runs, this time using db1-db7 (28 runs).
    • estimate P, b, w
    • estimate P, w, fix b
    • estimate b, w, fix P
    • estimate w only, fix P and b
      • note that for the fix P runs, we can either choose P like we currently set the initial guess, or choose from published work like Neymotin et al. (2014).
  5. Dr. Dahlquist will provide between-strain ANOVA stats for GO analysis
  6. Upload the output analysis files to GitHub under an analysis folder.
  7. Present on the progress made throughout the week.
  8. Bar graphs for threshold (b) and production rates, pnew/pall, bnew/ball, log2 of b and p.
  • Progress:
  1. Lab Meeting Week 5 Presentation
  2. I have continued the analysis of the estimated parameters from GRNmap (specifically, the optimized threshold (b) and production rates). I created bar charts for each parameter for each gene. I plotted the optimized threshold (b) and production rate values as they were in the output file, and then plotted the ratio of each value to the corresponding all-strain value. Then I plotted the log2 of that ratio for both the threshold and production rates. For each of the bar charts generated from this analysis, I sorted by increasing value.
  3. I have run each of the 32 networks through GRNsight and have compiled the resulting GRNs from the runs.
  4. I have compiled the expression networks from the GRNmap runs for the strain deletions.
    • The addition of dHMO1 as a strain increases the number of divergences in the expression plots (76.3% of expression plots that contain dHMO1 in the strain deletion were divergent).
    • There was never divergence in expression for GCR2, HMO1, and ZAP1.
    • The inclusion of dGLN3 or dHAP4 in the strain deletion did not cause any divergence in the expression plots. However, the inclusion of dCIN5, dHMO1, or dZAP1 caused divergence to occur.
      • dHMO1 caused the most divergence in expression, then dCIN5, then dZAP1.
Gene            The expression diverges if:                 Gene             The expression diverges if:
ACE2             dZAP1                                      MSN2              dHMO1
ASH1             dCIN5, dHMO1, dZAP1                        SFP1              dCIN5, dHMO1
CIN5             dHMO1                                      STB5              dCIN5, dHMO1
GCR2             never                                      SWI4              dHMO1
GLN3             dCIN5, dHMO1, dZAP1                        SWI5              dCIN5, dHMO1
HAP4             dCIN5, dHMO1                               YHP1              dCIN5, dHMO1, dZAP1
HMO1             never                                      YOX1              dHMO1
                                                            ZAP1              never
* The expression diverges if dCIN5, dHMO1, or dZAP1 is included.
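As a check on the ranking of strains by how much divergence they cause, the table above can be transcribed into a dictionary and tallied. This is a small sketch for illustration; the variable names are my own, and the data are copied from the table as written.

```python
from collections import Counter

# "The expression diverges if" table, transcribed from this notebook entry.
DIVERGES_IF = {
    "ACE2": ["dZAP1"],
    "ASH1": ["dCIN5", "dHMO1", "dZAP1"],
    "CIN5": ["dHMO1"],
    "GCR2": [],
    "GLN3": ["dCIN5", "dHMO1", "dZAP1"],
    "HAP4": ["dCIN5", "dHMO1"],
    "HMO1": [],
    "MSN2": ["dHMO1"],
    "SFP1": ["dCIN5", "dHMO1"],
    "STB5": ["dCIN5", "dHMO1"],
    "SWI4": ["dHMO1"],
    "SWI5": ["dCIN5", "dHMO1"],
    "YHP1": ["dCIN5", "dHMO1", "dZAP1"],
    "YOX1": ["dHMO1"],
    "ZAP1": [],
}

# Count how many genes diverge when each strain's data are included.
counts = Counter(strain for strains in DIVERGES_IF.values() for strain in strains)
never = sorted(gene for gene, strains in DIVERGES_IF.items() if not strains)
for strain, n in counts.most_common():
    print(f"{strain}: {n} of {len(DIVERGES_IF)} genes")
print("Never diverged:", ", ".join(never))
```

The tally gives dHMO1 for 11 genes, dCIN5 for 7, and dZAP1 for 4, consistent with the ordering noted above.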
  1. Determination of in vivo RNA kinetics using RATE-seq - Neymotin et al., 2014
    • Looking for production rates
      • Estimation of in vivo rates of RNA synthesis or degradation
      • "Many transcripts in budding yeast have similar steady-state levels but differ greatly in their rates of production and degradation" (Neymotin et al., 2014).
      • Degradation is proportional to the abundance of RNA. The rate of change in RNA abundance is d[RNA]/dt = k - α[RNA], where d[RNA]/dt is the rate of change in abundance, k is the constant rate of synthesis, [RNA] is the RNA abundance, and α is the rate constant for loss of RNA.
      • α = α(RNA) + α(growth), where α(RNA) is the degradation rate constant and α(growth) is the cell's division rate constant.
        • The α(RNA) degradation rate constant was determined using a nonlinear weighted regression on the data set.
      • At steady state, the abundance of RNA remains constant (d[RNA]/dt = 0), so transcript synthesis and degradation are related by k = α[RNA], where k is the rate of transcript synthesis, [RNA] is the RNA abundance, and α is the degradation rate constant plus the cell's division rate constant.
        • "The rate of transcript production can be estimated using the degradation rate constant and the steady-state abundance of the transcript" (Neymotin et al., 2014).
        • They published estimates for the rates of mRNA synthesis in steady-state conditions.
    • Genes under the same Gene Ontology (GO) terms tended to have similar degradation rates.
    • They calculated their production rates by using k = α[RNA].
      • They have an excel spreadsheet where they include the α values, half life values, steady-state abundance, and their calculated synthesis values.
      • In order to understand how they obtained their production rates, I attempted to do the same calculation. I found that they multiplied the alpha value and the ss.abundance value together. They included the α low and α high values, along with the synthesis rate low and synthesis rate high values.
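The steady-state calculation described above (synthesis rate = α × steady-state abundance) can be written out as a short sketch. The function names are hypothetical, the numbers are invented for illustration and are not taken from the Neymotin et al. spreadsheet, and pairing the α low and α high values with the same abundance to get the low and high synthesis rates is my assumption about how those columns were derived.

```python
def synthesis_rate(alpha, ss_abundance):
    """Steady-state synthesis rate k = alpha * [RNA]."""
    if alpha < 0 or ss_abundance < 0:
        raise ValueError("alpha and abundance must be non-negative")
    return alpha * ss_abundance


def synthesis_rate_bounds(alpha_low, alpha_high, ss_abundance):
    """Low and high synthesis rates from the low and high alpha estimates."""
    return (synthesis_rate(alpha_low, ss_abundance),
            synthesis_rate(alpha_high, ss_abundance))


# Hypothetical values, not taken from the Neymotin et al. spreadsheet.
alpha, alpha_low, alpha_high, abundance = 0.05, 0.04, 0.06, 10.0
k = synthesis_rate(alpha, abundance)
k_low, k_high = synthesis_rate_bounds(alpha_low, alpha_high, abundance)
print(f"k = {k:.3f} (range {k_low:.3f} to {k_high:.3f})")
```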
Production Rates from Neymotin et al., 2014 compared to Dahlquist Lab Estimated Production Rates and Initial Guesses
 Gene         Neymotin Production Rate        Estimated Production Rate         Initial Guess Production Rate
 ACE2                  0.18                          0.2017                               0.2236
 ASH1                  1.037                         1.6767                               0.4332
 CIN5                  0.063                         0.6556                               0.2009
 GCR2                  0.327                         0.2315                               0.1925
 GLN3                  0.365                         0.3021                               0.3224
 HAP4                  1.827                         1.3023                               0.2718
 HMO1                  1.406                         0.3067                               0.0990
 MSN2                  0.487                         2.5565                               0.4077
 SFP1                  1.199                         1.5545                               0.6931
 STB5                  0.08                          0.1197                               0.1400
 SWI4                  0.157                         0.3157                               0.2829
 SWI5                  0.34                          1.9213                               0.3224
 YHP1                  0.283                         0.2080                               0.1733
 YOX1                  1.028                         1.3911                               0.7296
 ZAP1                  0.082                         0.1282                               0.1042
* The Neymotin production rates are given in the table above alongside the estimated production rates for the all-strain model run.
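To see at a glance which genes differ most between the Neymotin et al. rates and the all-strain estimates, the table above can be reduced to log2 ratios. This is a sketch for illustration only; the values are copied from the table, and the choice of log2(estimated / Neymotin) as the comparison is mine rather than something taken from the paper.

```python
import math

# (Neymotin, estimated all-strain, initial guess) production rates from the table above.
RATES = {
    "ACE2": (0.18, 0.2017, 0.2236),
    "ASH1": (1.037, 1.6767, 0.4332),
    "CIN5": (0.063, 0.6556, 0.2009),
    "GCR2": (0.327, 0.2315, 0.1925),
    "GLN3": (0.365, 0.3021, 0.3224),
    "HAP4": (1.827, 1.3023, 0.2718),
    "HMO1": (1.406, 0.3067, 0.0990),
    "MSN2": (0.487, 2.5565, 0.4077),
    "SFP1": (1.199, 1.5545, 0.6931),
    "STB5": (0.08, 0.1197, 0.1400),
    "SWI4": (0.157, 0.3157, 0.2829),
    "SWI5": (0.34, 1.9213, 0.3224),
    "YHP1": (0.283, 0.2080, 0.1733),
    "YOX1": (1.028, 1.3911, 0.7296),
    "ZAP1": (0.082, 0.1282, 0.1042),
}


def log2_fold(value, reference):
    """log2 of value / reference; positive means value is larger."""
    return math.log2(value / reference)


# Sort genes by increasing log2 ratio, as was done for the bar charts.
rows = sorted(
    ((gene, log2_fold(est, ney)) for gene, (ney, est, _guess) in RATES.items()),
    key=lambda item: item[1],
)
for gene, lf in rows:
    print(f"{gene}: log2(estimated / Neymotin) = {lf:+.2f}")
```

Sorted this way, HMO1 is at the low end (the estimate is well below the Neymotin rate) and CIN5 is at the high end (the estimate is roughly ten times the Neymotin rate).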
  1. GRNsight was used to create GRNs for the strain deletions.
    • GRNsight provides a conversion of the weight matrix to a vertical excel sheet providing the optimized weights for each of the edges in the network. For db5, there are 28 edges and 15 nodes.
db5 Gene Regulatory Network
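The conversion of a weight matrix into a vertical list of edge weights, which GRNsight provides, can be illustrated with a small sketch. This is not GRNsight's code. It assumes that columns are regulators and rows are targets, that a zero entry means no edge, and that positive weights indicate activation and negative weights repression; the function name and the three-gene matrix are hypothetical.

```python
def matrix_to_edges(genes, weights):
    """Flatten a square weight matrix into (regulator, target, weight) rows.

    Assumes weights[i][j] is the effect of regulator genes[j] on target
    genes[i]; zero entries are treated as "no edge".
    """
    edges = []
    for i, target in enumerate(genes):
        for j, regulator in enumerate(genes):
            w = weights[i][j]
            if w != 0:
                edges.append((regulator, target, w))
    return edges


# Hypothetical three-gene network, not the db5 weights.
genes = ["A", "B", "C"]
weights = [
    [0.0, 0.5, 0.0],
    [-1.2, 0.0, 0.0],
    [0.0, 0.3, 0.7],
]
edges = matrix_to_edges(genes, weights)
for regulator, target, w in edges:
    kind = "activation" if w > 0 else "repression"
    print(f"{regulator} -> {target}: {w:+.2f} ({kind})")
```

Applied to the db5 weight matrix, the same flattening would be expected to yield one row per edge, 28 in all.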

Week 6: June 24 - 27

  • Goals:
  1. Clean up the slides for the research presentation and add annotations and labels to everything.
    • Add the MSE values to the expression plots in the slides. These values may correlate with the results.
    • Make a 3 x 3 chart of all the expression plots for one gene that seemed to diverge across the strain deletions. (Add the MSE to those plots as well and add a title to explain what it is)
    • Make sure that all of the sentences have periods and you don't switch between sentence and fragments on the same slide. You can vary between slides, but not within slides.
    • Add to the p-value slide and explain which ones were used in the STEM and further analyses.
  2. Make sure that the online notebook has everything annotated and has links to the tables and graphs created.
  3. Upload everything that needs to be uploaded to github.
  • Progress:
    • Summer 2019 Research Presentation
    • Expression plots
      • With the inclusion of dCIN5, dHMO1, and dZAP1 data, the simulated model data diverged for the strains, indicating a better fit. However, when dCIN5, dHMO1, and dZAP1 data were not included, the simulated model data did not diverge for the strains, suggesting a worse fit.

Fall 2019

  • Where we left off:
    • Over the summer, we ran a gene ontology analysis on the different profiles from the STEM analysis, determining the over-represented functional categories. In addition, using GRNmap, a parameter re-estimation was conducted for trials in which strain data were deleted. It was determined that dCIN5, dHMO1, and dZAP1 showed a better fit when they were included in the data, while the inclusion of dGLN3 and dHAP4 data showed a worse fit. In a future study, another GRNmap analysis will be run, this time on db1-db7, with the following models:
      • estimate P, b, w
      • estimate P, w, fix b
      • estimate b, w, fix P
      • estimate w only, fix P and b
      • *P being the production rate, b the threshold, and w the weight.

Week 1: September 3-4

  • Modelling Experiment
    • Modified the Excel input workbooks for db1-db7 for the new runs (28 total).
    • Production rate values for the new runs:
      • For the "Fixed P" runs, the production rates that we use could be the initial guesses, like the ones we used in the previous model, or they could be taken from literature, like the Neymotin et al. (2014) paper.
        • I reviewed the Neymotin et al. paper, from which the degradation rates were taken for the previous model runs, and I determined that they calculated their rates using k = α[RNA]. (More information about the Neymotin et al. paper is in Summer 2019: Week 5.)
        • The production rates in the table below were determined using the excel spreadsheet provided in the paper's supplemental material.
Comparison of the production rates from the Neymotin et al. (2014) paper and the runs conducted in the Dahlquist lab
Production Rates from Neymotin et al., 2014 compared to Dahlquist Lab Estimated Production Rates and Initial Guesses
 Gene         Neymotin Production Rate        Estimated Production Rate*        Initial Guess Production Rate
 ACE2                  0.18                          0.2017                               0.2236
 ASH1                  1.037                         1.6767                               0.4332
 CIN5                  0.063                         0.6556                               0.2009
 GCR2                  0.327                         0.2315                               0.1925
 GLN3                  0.365                         0.3021                               0.3224
 HAP4                  1.827                         1.3023                               0.2718
 HMO1                  1.406                         0.3067                               0.0990
 MSN2                  0.487                         2.5565                               0.4077
 SFP1                  1.199                         1.5545                               0.6931
 STB5                  0.08                          0.1197                               0.1400
 SWI4                  0.157                         0.3157                               0.2829
 SWI5                  0.34                          1.9213                               0.3224
 YHP1                  0.283                         0.2080                               0.1733
 YOX1                  1.028                         1.3911                               0.7296
 ZAP1                  0.082                         0.1282                               0.1042
* The Neymotin et al. production rates are given in the table above alongside the estimated production rates for the all-strain model run and the initial guesses for the model runs.

Week 2: September 10-11

  • The variance may be due to the sensitivity of the model.
  • Add two columns to the production rate bar graphs: the Neymotin et al. production rates and the initial guesses for the production rate.
  • Look at other papers to see the production rates and see how they compare.
  • Run the forward model with the estimated production rates.
  • Run the db5 all strain model a few more times.
  • Running Models
    • On 9/10/19, the models for "Estimate P, w, b"; "Estimate P, w; Fix b"; and db1 and db2 for "Estimate w; Fix P, b" were run.
    • On 9/11/19, the models for the "Estimate b, w; Fix P" and db3 - db7 for "Estimate w; Fix P, b" were run.

Week 3: September 17-18

  • The optimized production rates from the data deletion runs for db5 were compared to the initial guesses and the Neymotin et al. (2014) calculated production rates.
  • Look at clustering and sensitivity
  • 28 model runs on db1-db7:
    • Estimate P, b, w
    • Estimate w, Fix P, b
    • Estimate P, w, Fix b
    • Estimate b, w, Fix P
  • Compilation of production rates, thresholds, weights, and LSE:minLSE ratio for the four runs for each of db1-db7.
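The 28 model runs (seven networks times four estimation settings) can be laid out as a run list for tracking. This is a minimal sketch: the dictionary keys and the True/False flags are a hypothetical bookkeeping layout of my own, not GRNmap input parameters.

```python
from itertools import product

DATABASES = [f"db{i}" for i in range(1, 8)]  # db1 through db7

# Which parameters are estimated (True) or held fixed (False) in each setting.
# The weights (w) are estimated in every setting.
MODES = {
    "Estimate P, b, w": {"P": True, "b": True, "w": True},
    "Estimate P, w; Fix b": {"P": True, "b": False, "w": True},
    "Estimate b, w; Fix P": {"P": False, "b": True, "w": True},
    "Estimate w; Fix P, b": {"P": False, "b": False, "w": True},
}

runs = [
    {"database": db, "mode": mode, "estimate": flags}
    for db, (mode, flags) in product(DATABASES, MODES.items())
]
for run in runs:
    print(run["database"], "|", run["mode"])
print(len(runs), "runs in total")
```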


Dahlquist Lab

Personal User Page