Kara M Dismuke Week 13 Journal


Corrections to Week 12 Individual Assignment

NOTE (4/16/2015): Upon further analysis, I realized that I had deleted a gene's column or row whenever it contained only 0s, whereas I was supposed to delete a gene's column and row only if both the column AND the row contained all 0s for that gene.
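
The corrected rule can be sketched in a few lines of Python. This is only an illustration, not the procedure I actually followed in Excel: the function names and the three-gene toy matrix are hypothetical, and the sketch assumes a square regulation matrix with the genes in the same order along the rows and the columns.

```python
def removable_genes(names, matrix):
    """Return the genes whose row AND column contain only zeros."""
    n = len(names)
    removable = []
    for i, name in enumerate(names):
        row_is_empty = all(matrix[i][j] == 0 for j in range(n))
        column_is_empty = all(matrix[j][i] == 0 for j in range(n))
        if row_is_empty and column_is_empty:
            removable.append(name)
    return removable


def drop_genes(names, matrix, to_drop):
    """Remove both the row and the column of every gene in to_drop."""
    keep = [i for i, name in enumerate(names) if name not in to_drop]
    kept_names = [names[i] for i in keep]
    kept_matrix = [[matrix[i][j] for j in keep] for i in keep]
    return kept_names, kept_matrix


# Hypothetical toy network (placeholder gene names, not my real data).
names = ["A", "B", "C"]
matrix = [
    [0, 1, 0],  # row A has a nonzero entry, so A stays
    [0, 0, 0],  # row B is all zeros, but column B is not, so B stays
    [0, 0, 0],  # row C and column C are all zeros, so C can be removed
]

to_drop = removable_genes(names, matrix)
kept_names, kept_matrix = drop_genes(names, matrix, to_drop)
```

Gene B is the case I originally got wrong: its row is all 0s, but its column is not, so it has to stay in the network.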

Week 13 Electronic Lab Notebook

Create the Input Excel Workbook for the Model

  1. Downloaded the file "Input_4_gene_forward_correct_params.xlsx" from LionShare, changed the name, and edited it to include my data
    • Gave the file a meaningful filename that includes my last name: "Dismuke_wk13_ONLY"
  2. Determined the transcription factors that I am including in my network, using the "transposed" Regulation Matrix that I generated from YEASTRACT in the Week 12 Assignment
    • Copied transposed matrix from "network" sheet and pasted it into the worksheets: "network" and "network_weights"
    • Kept transcription factor names in the same order and same format across the top row and first column.
  3. Edited sheet: "degradation_rates"
    • Pasted the list of transcription factors from my "network" sheet into the column named "StandardName" and then looked up the "SystematicName" of my genes via YEASTRACT
    • Looked up the degradation rates for my list of transcription factors using this file and included them in my "degradation_rates" worksheet
      • Note: these rates have been calculated from protein half-life data from a paper by Belle et al. (2006)
    • For transcription factors not appearing in the file above, used the value "0.027182242" for the degradation rate.
      • These TFs were: CYC8 (YBR112C), RIF1 (YBR275C), SNF6 (YHL025W), and HMO1 (YDR174W)
  4. Edited sheet: "production_rates"
    • Pasted the "SystematicName" and "StandardName" columns from my "degradation_rates" sheet into "production_rates" sheet
    • Computed the production rates from my degradation rates and pasted the values into the column titled "ProductionRate"
      • Note: initial guesses for the production rates used for the model are two times the degradation rate
  5. Inputted expression data for the wild type strain and dzap1 strain and kept data in the same order as they appear in the other worksheets.
    • Put wild type data in sheet: "wt"
    • Put dzap1 data in sheet "dcin5" and then changed the name of this sheet to "dzap1"
    • Pasted the SystematicName and StandardName columns from one of my previous sheets into this one
    • Note: the data in this sheet are the log fold changes for each replicate and each time point from my Week 11 Assignment
    • Note: we only used the cold shock time points for the modeling, and thus, my column headings for the data were "15", "30", and "60" (4 columns per time point for both the wt and dzap1 data sets)
    • Copied and pasted the data from my Week 11 spreadsheet into this one, only including the data for the genes in my network, and kept genes in the same order
  6. Edited sheet: "optimization_parameters"
    • For parameter "time" (Cell A13), replaced what was in the sample file with our data's time points ( "15", "30", and "60")
    • For parameter "Strain" (Cell A14), replaced "dcin5" with "dzap1"
    • For parameter "Deletion", left the zero in cell B15, and in cell C15, put the number corresponding to the position of the deleted gene in the list of gene names
      • Put 20 in C15 for dzap1 because it was in position 20 out of our list of 20 transcription factors (it was the last one in the list)
    • For parameter "simtime", set the forward simulation of the expression to run in five-minute increments from 0 to 60 minutes. Thus, this row reads: "simtime", "0", "5", "10", ..., "60".
  7. Edited sheet: "network_b".
    • Pasted in list of standard names for my transcription factors from a previous sheet
      • Note: in this case, don't need column for systematic name
    • Changed Cell A1 to say "StandardName"
    • Set the "threshold" value for each gene to "0"
  8. After completing these modifications, I uploaded my file to LionShare and sent it to Dr. Dahlquist and Dr. Fitzpatrick via e-mail (with a link to the file in the email).
    • NOTE (4/21/15): Upon receiving the feedback from Dr. Dahlquist, I went back into this Excel file and made minor changes. These changes involved adding the row of ASG1 back in (it had for some reason been deleted) and deleting the column of NDT80 (the row had been deleted, but I forgot to delete the column too; both had all 0s). Then, Kristen and I tried to figure out which two edges were different in our two networks but were unable to do so after a closer look. So, we have sent an email to Dr. Dahlquist to hopefully get word back on what we need to do to proceed.
    • NOTE (4/29/15): I realized that there were some remaining inaccuracies resulting from my previous error that I had not caught beforehand. Thus, I created a new Excel file, which is now updated and correct.
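
The numeric entries from steps 3, 4, and 6 follow simple rules, which can be sketched in Python as a cross-check. This is a hedged illustration, not how the workbook was actually filled in (I did that by hand in Excel): all function names are hypothetical, "GENE_X" and its rate of 0.05 are made-up placeholders, and "TF1" through "TF19" stand in for the other 19 transcription factors in my list.

```python
# Default used in step 3 for transcription factors missing from the rate file.
DEFAULT_DEGRADATION_RATE = 0.027182242


def degradation_rate(gene, known_rates):
    """Look up a gene's degradation rate, falling back to the default."""
    return known_rates.get(gene, DEFAULT_DEGRADATION_RATE)


def production_rate(deg_rate):
    """Initial guess from step 4: two times the degradation rate."""
    return 2 * deg_rate


def deletion_index(gene_names, deleted_gene):
    """1-based position of the deleted gene in the gene list (step 6)."""
    return gene_names.index(deleted_gene) + 1


def simulation_times(start=0, stop=60, step=5):
    """Time points for the "simtime" row: start, start + step, ..., stop."""
    return list(range(start, stop + 1, step))


# Placeholder lookup table; the value for GENE_X is invented for illustration.
known_rates = {"GENE_X": 0.05}

cyc8_degradation = degradation_rate("CYC8", known_rates)  # not in table: default
cyc8_production = production_rate(cyc8_degradation)

# 19 placeholder names followed by ZAP1, mirroring its last position in my list.
gene_names = [f"TF{i}" for i in range(1, 20)] + ["ZAP1"]
zap1_position = deletion_index(gene_names, "ZAP1")  # 20

simtime = simulation_times()  # [0, 5, 10, ..., 60]
```

Because the deletion index is counted from 1 in the workbook, the sketch adds 1 to Python's 0-based list position.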


Appendix: Full explanation of the "optimization_parameters" sheet

  • alpha: Penalty term weighting (from an L-curve analysis)
  • kk_max: Number of times to re-run the optimization loop: in some cases re-starting the optimization loop can improve performance of the estimation.
  • MaxIter: Number of times MATLAB iterates through the optimization scheme. If this is set too low, MATLAB will stop before the parameters are optimized.
  • TolFun: How close two successive least squares cost evaluations should be before MATLAB determines that it is not making any improvement
  • MaxFunEval: Maximum number of times MATLAB will evaluate the least squares cost
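  • TolX: How close successive parameter estimates should be before MATLAB determines that it is not making any improvement (in MATLAB's optimization options, TolX is the termination tolerance on the parameters x, whereas TolFun is the tolerance on the cost value).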
  • Sigmoid: =1 if sigmoidal model, =0 if Michaelis-Menten model
  • iestimate: =1 if the user wants to estimate parameters; =0 if the user wants to do just one forward run
  • iGraphs: =1 to output graphs; =0 to not output graphs
  • fix_P: =1 if the user does not want to estimate the production rate parameter, P (the initial guess is used and never changed); =0 to estimate
  • fix_b: =1 if the user does not want to estimate the b parameter (the initial guess is used and never changed); =0 to estimate
  • time: A row containing a list of the time points when the data was collected experimentally. Should correspond to the timepoint column headers in the expression sheets.
  • Strain: A row containing a list of all of the strains for which there is expression data in the workbook. Should correspond to the names of the sheets for each strain.
  • Sheet: A row where each entry is the order number of the sheet (left to right) that corresponds to the list of strains above.
  • Deletion: Gives the index of the gene in the network sheet that has been deleted in each strain listed above. For example, if data has been provided for the CIN5 deletion strain, then give the index number from the network sheet corresponding to CIN5.
  • simtime: A list of times for which the forward simulation should be evaluated.
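
Several of these rows have to agree with other parts of the workbook: "time" with the column headers of the expression sheets, "Strain" with the sheet names, and "Deletion" with positions in the network sheet. The following Python sketch checks those three relationships. It is an assumption-laden illustration and not part of the MATLAB model: the function name and the dictionary layout are hypothetical, the "Sheet" row is not checked, and a Deletion value of 0 is assumed to mean that no gene was deleted (as for the wild type in cell B15).

```python
def check_optimization_parameters(params, sheets, n_genes):
    """Return a list of inconsistencies; an empty list means none were found.

    params  -- dict with the "time", "Strain", and "Deletion" rows
    sheets  -- dict mapping each expression sheet name to its column headers
    n_genes -- number of genes listed in the network sheet
    """
    problems = []
    strains = params["Strain"]
    deletions = params["Deletion"]

    if len(strains) != len(deletions):
        problems.append("Strain and Deletion rows have different lengths")

    for strain in strains:
        if strain not in sheets:
            problems.append(f"no expression sheet named '{strain}'")
        elif set(sheets[strain]) != set(params["time"]):
            problems.append(f"time points in sheet '{strain}' do not match the time row")

    for index in deletions:
        # 0 is assumed to mean "no deletion"; otherwise a 1-based gene position.
        if not 0 <= index <= n_genes:
            problems.append(f"Deletion index {index} is outside 0..{n_genes}")

    return problems


# Layout from this notebook: 4 replicate columns for each of 15, 30, and 60.
headers = [15] * 4 + [30] * 4 + [60] * 4
params = {"time": [15, 30, 60], "Strain": ["wt", "dzap1"], "Deletion": [0, 20]}
sheets = {"wt": headers, "dzap1": headers}

problems = check_optimization_parameters(params, sheets, n_genes=20)

# What happens if the Strain row still says "dcin5" from the sample file.
stale_params = dict(params, Strain=["wt", "dcin5"])
stale_problems = check_optimization_parameters(stale_params, sheets, n_genes=20)
```

With the values from this notebook the check finds no problems, while the stale "dcin5" entry is reported as a missing sheet.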
