Alyssa N Gomes Week 14 Journal: Difference between revisions

From OpenWetWare
(added mig2 error part)
 
(16 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{Template: Alyssa N Gomes}}
=== Introduction to GRNmap and Gene Regulatory Network Modeling ===


For this week's assignment, you will finally run the GRNmap model on the input workbook you created for the Week 13 Assignment. You will run the optimization twice: once where the threshold parameters, b, are '''not''' estimated and once where the threshold parameters '''are''' estimated. You will compare the estimated weight and production rate parameters output by these two runs with each other.
[[BIOL398-04/S15:Week 14|Assignment]]
# In the optimization_parameters sheet of your input workbook, set fix_b to 1, fix_P to 0, and iestimate to 1. Set alpha to 0.01. kk_max should be 1, MaxIter and MaxFunEval should be 1e08 (one hundred million in plain English), and TolFun and TolX should be 1e-6. Sigmoid should be 1. igraph should be 1. simtime should be 0 5 <...fill by steps of 5...> 60, each number in a different cell. Strain, Sheet, Deletion depend on how your data is organized in the workbook and what problem you are solving.  You should have two strains, one of which is wt (wild type). Sheet denotes the number of the tab, reading left to right along the tabs at the bottom, where the strain data is positioned in the workbook. Deletion should be 0 for the wild type, 0 for paradoxus (looking at you Natalie and Karina), and for your deletion strain, the position in your list of genes for that deleted gene. For example, if you have dCIN5, and CIN5 is the 7th gene in your list (as organized in your degradation_rates, production_rates, network, etc sheets), then put a 7 for dcin5 in the Deletion parameter row. After prepping this sheet, run GRNmodel.
This week's procedure involves running the GRNmap model on the input workbooks from last week with two different settings for the threshold parameters, b: once with b fixed and once with b estimated. We will compare the estimated weights and production rates from the two runs.
# Save all your graphs as jpegs and paste them into a powerpoint file. Please label things clearly, placing an appropriate number of graphs on each page for a readable visual.  Take some care to make sure that the graphs are the same size and the aspect ratio has not been changed.
 
# Create a new workbook for analyzing the weight data. In this workbook, create a new sheet: call it estimated_weights. In this new worksheet, create a column of labels of the form ControllerGeneA -> TargetGeneB, replacing these generic names with the standard gene names for each regulatory pair in your network. Remember that columns represent Controllers and rows represent Targets in your network and network_weights sheets.
# In the optimization_parameters sheet of the input workbook, set fix_b=1, fix_P=0, and iestimate=1. Set alpha to 0.01 and kk_max to 1. MaxIter and MaxFunEval should be 1e08 (one hundred million), and TolFun and TolX should be 1e-6. Sigmoid should be 1. igraph should be 1. simtime should be 0 5 10 15 20 25 30 35 40 45 50 55 60, each number in a different cell. After prepping this sheet, run GRNmodel: place all the files in the MATLAB folder, type 'GRNmodel', and select the input file.
# Save all your graphs as jpegs and paste them into a powerpoint file, labeling them clearly.
# Create a new workbook in Excel for analyzing the weight data and name the sheet estimated_weights. Create a column labeled "ControllerGeneA->TargetGeneB" and, in the rows below, enter the controller->target gene pairs given in your network sheet. In the next column, enter the non-zero optimized weights, then save the worksheet. Change fix_b to 0 in the optimization_parameters worksheet, rerun GRNmodel, and add the new weights to the sheet.
# Extract the non-zero optimized weights from their worksheet and put them in a single column next to the corresponding ControllerGeneA -> TargetGeneB label.
# Save your input workbook as a new file with a meaningful name (e.g. append "estimate-b" to the previous filename), and change fix_b to 0 in the "optimization_parameters" worksheet, so that the thresholds will be estimated. Rerun GRNmodel with the new input sheet.
# Create a new worksheet and copy both sets of weights into it. Create a chart that compares the two sets of weights.
# Repeat Parts (2) through (4) with the new output.
# Do the same with the production rates, and put the charts into the powerpoint.
# Create an empty excel workbook, and copy both sets of weights into a worksheet.
# Use GRNsight to visualize the output of each model run.
# Create a bar chart in order to compare the "fixed b" and "estimated b" weights.
# Put the GRNsight images of these genes into the powerpoint, arranging the genes in the same positions as in the previous GRNsight figures.
# Repeat (7) and (8) with the production rates.
# Update your journal with the new Excel sheets and GRNsight images.
# Copy the two bar charts into your powerpoint.
# Interpret the results by examining the graphs and answering the following questions.
# Visualize the output of each of your model runs with GRNsight.  Arrange the genes in the same order you used to display them in your Week 12 assignment for both of your model output runs.  Take a screenshot of each of the results and paste it into your PowerPoint presentation.  Clearly label which screenshot belongs to which run.
#* Note that GRNsight will display differently now that you have estimated the weights.  For positive weights > 0, the edge will be given a regular (pointy) arrowhead to indicate an activation relationship between the two nodes. For negative weights < 0, the edge will be given a blunt arrowhead (a line segment perpendicular to the edge direction) to indicate a repression relationship between the two nodes. The thickness of the edge will vary based on the magnitude of the absolute value of the weight. Larger magnitudes will have thicker edges and smaller magnitudes will have thinner edges. The way that GRNsight determines the edge thickness is as follows. GRNsight divides all weight values by the absolute value of the maximum weight in the matrix to normalize all the values to between zero and 1. GRNsight then adjusts the thickness of the lines to vary continuously from the minimum thickness (for normalized weights near zero) to maximum thickness (normalized weights of 1). The color of the edge also imparts information about the regulatory relationship. Edges with positive normalized weight values from 0.05 to 1 are colored magenta; edges with negative normalized weight values from -0.05 to -1 are colored cyan. Edges with normalized weight values between -0.05 and 0.05 are colored grey to emphasize that their normalized magnitude is near zero and that they have a weak influence on the target gene.  
# Upload your powerpoint, your two input workbooks, and your two output workbooks and link to them in your individual journal.
# Interpreting your results.
#* Examine the graphs that were output by each of the runs.  Which genes in the model have the closest fit between the model data and actual data?  Why do you think that is?  How does this help you to interpret the microarray data?   
#* Which genes showed the largest dynamics over the timecourse?  Which genes showed differences in dynamics between the wild type and the other strain your group is using?  Given the connections in your network (see the visualization in GRNsight), does this make sense?  Why or why not?
#* Examine the bar charts comparing the weights and production rates between the two runs.  Were there any major differences between the two runs?  Why do you think that was?  Given the connections in your network (see the visualization in GRNsight), does this make sense?  Why or why not?
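
The edge-styling rules described in the GRNsight note above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in the text, not GRNsight's actual code: the function names, the pixel thicknesses, the linear interpolation between minimum and maximum thickness, and the treatment of a normalized weight of exactly 0.05 as colored rather than grey are all assumptions.

```python
GREY_BAND = 0.05       # from the text: normalized weights between -0.05 and 0.05 are grey
MIN_THICKNESS = 1.0    # hypothetical minimum line width
MAX_THICKNESS = 10.0   # hypothetical maximum line width


def normalize_weights(weights):
    """Divide every weight by the largest absolute weight in the matrix."""
    max_abs = max(abs(w) for row in weights for w in row)
    if max_abs == 0:
        return [[0.0 for _ in row] for row in weights]
    return [[w / max_abs for w in row] for row in weights]


def edge_style(normalized_weight):
    """Return color, arrowhead and thickness for one edge, or None if there is no edge."""
    if normalized_weight == 0:
        return None  # a zero weight means no regulatory connection
    magnitude = abs(normalized_weight)
    if magnitude < GREY_BAND:
        color = "grey"      # weak influence on the target gene
    elif normalized_weight > 0:
        color = "magenta"   # activation
    else:
        color = "cyan"      # repression
    arrowhead = "pointy" if normalized_weight > 0 else "blunt"
    # Assumed linear interpolation between the minimum and maximum thickness.
    thickness = MIN_THICKNESS + (MAX_THICKNESS - MIN_THICKNESS) * magnitude
    return {"color": color, "arrowhead": arrowhead, "thickness": thickness}


# Made-up weights, used only to exercise the functions.
example_weights = [[0.0, 2.0], [-4.0, 0.1]]
normalized = normalize_weights(example_weights)
styles = [[edge_style(w) for w in row] for row in normalized]
```

In this made-up matrix the weight of -4.0 has the largest magnitude, so it normalizes to -1 and gets the thickest cyan, blunt-ended edge, while 0.1 normalizes to 0.025 and is drawn grey.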
==Data and Observations==
* [[Media:TM AG Model Figures RUN 2.zip| Figures]] with fix_b at 1, fix_P at 0, iestimate at 1, and alpha at 0.01
*The experiment was run by Alyssa and Tessa together (same excel documents)
<br>'''fix_b to 1'''
<br>[[Media:TM AG expression data params b 1.xlsx|Input]]
<br>[[Media:TM AG expression data params b 1 estimation output.xlsx|Output]]
<br>'''fix_b to 0'''
<br>[[Media:TM AG expression data params.xlsx|Input]]
<br>[[Media:TM AG expression data params b 0 estimation output.xlsx|Output]]
<br>'''Powerpoint'''
<br>'''Excel Summary'''
[[Media:TM AG Summary Workbook.xlsx]]
*Our GRNsight images are included in the powerpoint
*While doing this, Tessa and I ran into a couple of issues. For the first output file, we had forgotten to save the weights even though we had saved the figures. When we redid the run and looked at the weights, all of the output weights came out as either 0 or 1. We went back through the original files to look for errors and double-check our inputs.
*After re-running it in MATLAB on several different computers, we continued to get weights of 0 and 1, and certain files kept reverting to the original numbers and inputs from the cin5 template. We eventually decided to upload all images and output files anyway, so that we could seek help before our final presentation.
* Which genes showed the largest dynamics over the timecourse?  Which genes showed differences in dynamics between the wild type and the other strain your group is using?  Given the connections in your network (see the visualization in GRNsight), does this make sense?  Why or why not?
**We had a hard time interpreting the dynamics over the timecourse in this week's assignment. Our results were very scattered, and across repeated trials we found no consistent trend or reliable difference between wild type and dgln3.
* Examine the bar charts comparing the weights and production rates between the two runs.  Were there any major differences between the two runs?  Why do you think that was?  Given the connections in your network (see the visualization in GRNsight), does this make sense?  Why or why not?
**Again, because of the error in which all of the output rates came out as 0 or 1 when we set fix_b to 0, we saw discrepancies in the outputs. The connections in GRNsight also gave no insight toward a conclusion.
*Future plans: This experiment seemed to be going fine, and we expected to finish the assignment much earlier today, given that we had worked on it earlier in the week and had an idea of what we were doing. However, although we uploaded the images from the first trial where b=0, we cannot be sure that those images are correct or whether they reflect the error we discovered when we realized that we needed the weights again. We should also note that some of our graphs displayed only the wild type data.


____________________________________________________
Line 25: Line 47:


* Figures for the Updated Parameters[[Media:TM AG Model Figures RUN 2.zip| Figures]] with fix_b at 1, fix_P at 0, iestimate at 1, and alpha at 0.01
***PPT: [[Media:TMAGWeek11 and 12new.pptx|Tessa Morris and Alyssa Gomes PPT]]
==4/28==
*Upon getting to class Tuesday, we were told that we would be given an extension until Thursday due to numerous errors.
*After re-checking our input file for the b=0 estimate, we realized that the iestimate value was set to 0 rather than 1. We are currently re-running it and are hopeful it will work: this time the counter image came up, unlike before, because there was actually something to estimate.
*The experiment was run by Alyssa and Tessa together (same excel documents)
<br>'''fix_b to 1'''
<br>[[Media:TM AG expression data params RUN3.xlsx|Input]]
<br>[[Media:TM AG expression data params b 1 estimation output.xlsx|Output]]
<br>'''fix_b to 0'''
<br>[[Media:TM AG expression data params RUN4.xlsx|Input]]
<br>[[Media:TM AG expression data params RUN4 estimation output.xlsx|Output]]
<br>'''Powerpoint'''
<br>[[Media:TMAGWeek11 and 12new.pptx|Updated Powerpoint]]
<br>'''Excel Summary'''
<br>[[Media:TM AG Summary Workbook.xlsx|Summary Workbook]]
*Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
**MSN4 and ACE2 had the largest dynamics over the timecourse. Most genes showed dramatic differences between the two strains, all but MSN2, ACE2, YHP1 and YOX1. This makes sense for ACE2, MSN2 and MSN4 because they were highly connected in GRNsight.
*Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
**Some of the weights came out negative. The scale of the production rates is much lower than that of the weights, indicating a decreased amount of each product. MSN4 showed up much higher in the production rates than in the weights.


*Tessa and Alyssa Updated PPT: TMAGWeek11 and 12new.pptx
==4/30 In Class==
*Upon looking over our powerpoint, we realized that we had a typo in the network for our input: MIG1 had been substituted for MIG2 in the dgln3 sheet of the workbook. We went back, corrected this, and re-ran the model so that we would get the proper graph outputs.

Latest revision as of 10:04, 30 April 2015

Alyssa N Gomes

MATH 388-01: Survey of Biomathematics Loyola Marymount University


Introduction to GRNmap and Gene Regulatory Network Modeling

Assignment: This week's procedure involves running the GRNmap model on the input workbooks from last week with two different settings for the threshold parameters, b: once with b fixed and once with b estimated. We will compare the estimated weights and production rates from the two runs.

  1. In the optimization_parameters sheet of the input workbook, set fix_b=1, fix_P=0, and iestimate=1. Set alpha to 0.01 and kk_max to 1. MaxIter and MaxFunEval should be 1e08 (one hundred million), and TolFun and TolX should be 1e-6. Sigmoid should be 1. igraph should be 1. simtime should be 0 5 10 15 20 25 30 35 40 45 50 55 60, each number in a different cell. After prepping this sheet, run GRNmodel: place all the files in the MATLAB folder, type 'GRNmodel', and select the input file.
  2. Save all your graphs as jpegs and paste them into a powerpoint file, labeling them clearly.
  3. Create a new workbook in Excel for analyzing the weight data and name the sheet estimated_weights. Create a column labeled "ControllerGeneA->TargetGeneB" and, in the rows below, enter the controller->target gene pairs given in your network sheet. In the next column, enter the non-zero optimized weights, then save the worksheet. Change fix_b to 0 in the optimization_parameters worksheet, rerun GRNmodel, and add the new weights to the sheet.
  4. Extract the non-zero optimized weights from their worksheet and put them in a single column next to the corresponding ControllerGeneA -> TargetGeneB label.
  5. Create a new worksheet and copy both sets of weights into it. Create a chart that compares the two sets of weights.
  6. Do the same with the production rates, and put the charts into the powerpoint.
  7. Use GRNsight to visualize the output of each model run.
  8. Put the GRNsight images of these genes into the powerpoint, arranging the genes in the same positions as in the previous GRNsight figures.
  9. Update your journal with the new Excel sheets and GRNsight images.
  10. Interpret the results by examining the graphs and answering the following questions.
    • Examine the graphs that were output by each of the runs. Which genes in the model have the closest fit between the model data and actual data? Why do you think that is? How does this help you to interpret the microarray data?
    • Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
    • Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
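
Steps 3 through 5 (labeling the non-zero weights and lining up the two runs) can also be done programmatically. The following is a minimal Python sketch, assuming the weight matrix has already been read out of the output workbook into a list of lists in which columns are controllers and rows are targets. The function names are hypothetical and the numbers are made-up placeholders, not our results.

```python
def label_nonzero_weights(genes, weights):
    """Map 'Controller -> Target' labels to the non-zero weights.

    weights[t][c] is the weight for controller genes[c] acting on
    target genes[t] (columns are controllers, rows are targets).
    """
    labeled = {}
    for t, target in enumerate(genes):
        for c, controller in enumerate(genes):
            weight = weights[t][c]
            if weight != 0:
                labeled[f"{controller} -> {target}"] = weight
    return labeled


def compare_runs(fixed_b, estimated_b):
    """Line up the two runs as (label, fixed-b weight, estimated-b weight) rows."""
    labels = sorted(set(fixed_b) | set(estimated_b))
    return [(label, fixed_b.get(label, 0.0), estimated_b.get(label, 0.0))
            for label in labels]


# Placeholder gene list and weights for illustration only.
genes = ["ACE2", "MSN2", "MSN4"]
fixed_b_matrix = [[0.0, 0.0, 0.0],
                  [0.5, 0.0, 0.0],
                  [0.0, -1.2, 0.3]]
estimated_b_matrix = [[0.0, 0.0, 0.0],
                      [0.7, 0.0, 0.0],
                      [0.0, -0.9, 0.0]]

fixed_b = label_nonzero_weights(genes, fixed_b_matrix)
estimated_b = label_nonzero_weights(genes, estimated_b_matrix)
rows = compare_runs(fixed_b, estimated_b)
```

Each row of `rows` corresponds to one pair of bars in the comparison chart. A weight that is non-zero in only one run is shown as 0.0 in the other, which is a choice made for this sketch.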

Data and Observations

  • Figures with fix_b at 1, fix_P at 0, iestimate at 1, and alpha at 0.01
  • The experiment was run by Alyssa and Tessa together (same excel documents)


fix_b to 1
Input
Output
fix_b to 0
Input
Output
Powerpoint
Excel Summary Media:TM AG Summary Workbook.xlsx

  • Our GRNsight images are included in the powerpoint


  • While doing this, Tessa and I ran into a couple of issues. For the first output file, we had forgotten to save the weights even though we had saved the figures. When we redid the run and looked at the weights, all of the output weights came out as either 0 or 1. We went back through the original files to look for errors and double-check our inputs.
  • After re-running it in MATLAB on several different computers, we continued to get weights of 0 and 1, and certain files kept reverting to the original numbers and inputs from the cin5 template. We eventually decided to upload all images and output files anyway, so that we could seek help before our final presentation.
  • Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
    • We had a hard time interpreting the dynamics over the timecourse in this week's assignment. Our results were very scattered, and across repeated trials we found no consistent trend or reliable difference between wild type and dgln3.
  • Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
    • Again, because of the error in which all of the output rates came out as 0 or 1 when we set fix_b to 0, we saw discrepancies in the outputs. The connections in GRNsight also gave no insight toward a conclusion.
  • Future plans: This experiment seemed to be going fine, and we expected to finish the assignment much earlier today, given that we had worked on it earlier in the week and had an idea of what we were doing. However, although we uploaded the images from the first trial where b=0, we cannot be sure that those images are correct or whether they reflect the error we discovered when we realized that we needed the weights again. We should also note that some of our graphs displayed only the wild type data.

____________________________________________________

  • Figures for the Original Parameters: Figures



4/28

  • Upon getting to class Tuesday, we were told that we would be given an extension until Thursday due to numerous errors.
  • After re-checking our input file for the b=0 estimate, we realized that the iestimate value was set to 0 rather than 1. We are currently re-running it and are hopeful it will work: this time the counter image came up, unlike before, because there was actually something to estimate.
  • The experiment was run by Alyssa and Tessa together (same excel documents)
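
A mistake like the iestimate value above could be caught before a run by comparing the optimization parameters against the values listed in the procedure. The sketch below is in Python and assumes the sheet has already been read into a plain dictionary; how the workbook is read is left out, and `check_parameters` is a hypothetical helper, not part of GRNmap.

```python
# Values taken from step 1 of the procedure; fix_b is checked separately
# because it differs between the two runs.
EXPECTED = {
    "fix_P": 0,
    "iestimate": 1,
    "alpha": 0.01,
    "kk_max": 1,
    "MaxIter": 1e8,
    "MaxFunEval": 1e8,
    "TolFun": 1e-6,
    "TolX": 1e-6,
    "Sigmoid": 1,
    "igraph": 1,
}
EXPECTED_SIMTIME = list(range(0, 61, 5))  # 0, 5, 10, ..., 60


def check_parameters(params, fix_b):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    expected = dict(EXPECTED, fix_b=fix_b)
    for name, want in expected.items():
        got = params.get(name)
        if got != want:
            problems.append(f"{name}: expected {want}, found {got}")
    if list(params.get("simtime", [])) != EXPECTED_SIMTIME:
        problems.append("simtime: expected 0, 5, 10, ..., 60")
    return problems


# Example: the b=0 input with iestimate accidentally left at 0.
params = dict(EXPECTED, fix_b=0, iestimate=0, simtime=list(range(0, 61, 5)))
problems = check_parameters(params, fix_b=0)
```

Here `problems` contains a single entry pointing at iestimate, which is the setting that kept our estimation from running.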


fix_b to 1
Input
Output
fix_b to 0
Input
Output
Powerpoint
Updated Powerpoint
Excel Summary
Summary Workbook

  • Which genes showed the largest dynamics over the timecourse? Which genes showed differences in dynamics between the wild type and the other strain your group is using? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?
    • MSN4 and ACE2 had the largest dynamics over the timecourse. Most genes showed dramatic differences between the two strains, all but MSN2, ACE2, YHP1 and YOX1. This makes sense for ACE2, MSN2 and MSN4 because they were highly connected in GRNsight.

  • Examine the bar charts comparing the weights and production rates between the two runs. Were there any major differences between the two runs? Why do you think that was? Given the connections in your network (see the visualization in GRNsight), does this make sense? Why or why not?

    • Some of the weights came out negative. The scale of the production rates is much lower than that of the weights, indicating a decreased amount of each product. MSN4 showed up much higher in the production rates than in the weights.

4/30 In Class

  • Upon looking over our powerpoint, we realized that we had a typo in the network for our input: MIG1 had been substituted for MIG2 in the dgln3 sheet of the workbook. We went back, corrected this, and re-ran the model so that we would get the proper graph outputs.
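
A gene-name typo of this kind can be spotted by comparing the gene list on each sheet against a single reference list. The Python sketch below assumes the gene names from each sheet have already been collected into lists. The sheet names, the short gene list, and the `find_gene_mismatches` helper are hypothetical and only illustrate the check.

```python
def find_gene_mismatches(reference_genes, sheets):
    """Report, per sheet, genes missing from it and genes not in the reference list."""
    reference = set(reference_genes)
    report = {}
    for sheet_name, genes in sheets.items():
        found = set(genes)
        missing = sorted(reference - found)
        unexpected = sorted(found - reference)
        if missing or unexpected:
            report[sheet_name] = {"missing": missing, "unexpected": unexpected}
    return report


# Illustrative gene lists only; the real network has more genes.
reference_genes = ["GLN3", "MIG2", "MSN4"]
sheets = {
    "wt": ["GLN3", "MIG2", "MSN4"],
    "dgln3": ["GLN3", "MIG1", "MSN4"],  # one gene name swapped by mistake
}
report = find_gene_mismatches(reference_genes, sheets)
```

Only the sheet with the swapped name appears in `report`, listing the gene that is missing and the one that should not be there.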