Alondra Vega: Week 13

Steps to make worksheet to run code

 * 1) Start with log2_concentrations.  Fill in the data for the genes you've identified as being in your network.  This may be a tedious bit of cut-and-paste.  The data you need is the average log2 fold change for each gene of interest and each time point.  You may find it easiest to use your previous (large) workbook from Week 11, adding a copy of your final sheet and editing that down to the genes of interest.
 * 2) The log2_concentrations sheet has a header row. Columns A and B contain the gene names, the systematic name (e.g., YDR259C) followed by the standard name (e.g., YAP6). Entries in column C and beyond of that header row are the times of the data collection.  The format is that the rows represent genes while the columns represent the time course.  For the Schade data, use only the short term data (not the 12 hour and longer).  For the Dahlquist Lab data, use only the cold shock and not the recovery.
 * 3) Now, let's tackle the concentration_sigmas.  This sheet follows the same format of the log2_concentrations sheet, except that the data we enter are the standard deviations of the log2 fold changes for each gene at each time point.  We will extract this data from the Week 11 sheet as well.
 * 4) Next, move to the network sheet.  The rows are the affectees, while the columns are the affectors.  This format is the transpose of the output from your YEASTRACT data.  Copy the YEASTRACT network, but when you paste it into your input Excel workbook, use "paste transpose" (or "paste special" selecting "transpose", depending on your Excel version).  This sheet also requires a header row and a header column, in which we label the genes by their standard names (e.g., YAP6, GLN3).
 * 5) The network_weights sheet is an initial guess, as we discussed in class.  Thus, we put our best shot in there.  Lacking other information, we can make them all ones for starters.
 * 6) The network_thresholds sheet is also an initial guess, as we discussed in class.  Thus, we put our best shot in there.  Lacking other information, we can make them all ones for starters.
 * 7) Degradation rates are harder.  My lionshare account also includes an Excel workbook, Belle_PNAS_06_SuppDataSet_with_abs.xls.  This file contains degradation rates for a number of the proteins translated from the transcription factor genes.  The data is in the sheet named ranked-and-averaged.  Find your genes there, and convert half-life to rate (lambda = LN(2)/half_life).  Enter this data into the degradation_rates sheet, which also has two columns of gene identifiers, just like the log2_concentrations sheet, as well as a header row.
 * 8) The production_rates are actually determined inside the software.  You should delete the production_rates sheet if you have one in your input file.
 * 9) Save this workbook in the same folder as the MATLAB files.
 * 10) Share this file with your instructors on lionshare.  Due to the sensitive nature of the unpublished data, take care not to leave data files in public places such as openwetware or lab computers.
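
Two of the steps above are small calculations that can be checked outside of Excel: the transpose in step 4 and the half-life conversion in step 7. The following is only a minimal sketch; the 2x2 matrix and the half-life of 10.0 are made-up illustration values (not taken from YEASTRACT or the Belle workbook), and the function names are hypothetical.

```python
import math


def transpose(matrix):
    """Swap rows and columns, like Excel's "paste transpose"."""
    return [list(column) for column in zip(*matrix)]


def degradation_rate(half_life):
    """Convert a half-life to a rate: lambda = ln(2) / half_life."""
    if half_life <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / half_life


# Hypothetical YEASTRACT-style layout: rows are affectors, columns are affectees.
yeastract_style = [
    [0, 1],
    [0, 0],
]

# After transposing, rows are affectees and columns are affectors,
# which is the layout the network sheet expects.
network_sheet = transpose(yeastract_style)

# Hypothetical half-life; the rate comes out in inverse units of whatever
# time unit the half-life was given in.
rate = degradation_rate(10.0)
print(network_sheet, rate)
```

The rate only makes sense alongside the time units of the half-life, so it is worth checking that those match the time points in the log2_concentrations header row.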

There was no degradation rate recorded for YAP5, so I took the average of all 20 degradation rates and used that value as the degradation rate for both YAP5 and CIN5. The value was 0.05177.
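
The averaging fallback described above can be sketched as follows. The gene names GENE_A and GENE_B and their rates are placeholders chosen for illustration, not the actual values from my workbook, so the fallback printed here will not match 0.05177.

```python
from statistics import mean

# Hypothetical known degradation rates (placeholders, not real data).
known_rates = {"GENE_A": 0.04, "GENE_B": 0.06}

# Genes with no recorded rate get the mean of the known rates.
missing_genes = ["YAP5", "CIN5"]
fallback_rate = mean(known_rates.values())

rates = dict(known_rates)
for gene in missing_genes:
    rates[gene] = fallback_rate

print(rates)
```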


 * The powerpoint with all the figures was uploaded to lionshare and the professors have access.

Running the parameter code and what it means

 * Genes that activate: NRG1, YAP5, GLN3
 * Genes that repress: CUP9, GTS1, HSF1, MSN4, AFT1, YAP6
 * Genes that repress and activate: FHL1, RAP1, REB1, ROX1, CIN5, YAP1, STE12, SOK2, SKO1, SKN7
 * Are any strong or weak? There are values that are "strong", just not throughout the entire column of affecting genes.  For example, CIN5 activates itself and has a strong value, but CIN5's effect on other genes is weak.  This is the case for some of the genes.  I feel that for the most part the interactions between the genes are weak, since most of the weights are on the order of 10^-5 to 10^-8.
 * What does threshold mean? I believe that the threshold value tells us where on the sigmoidal curve the "jump" occurs. It is the input level at which the target gene's production switches between low and high; whether a regulator activates or represses is set by the sign of its weight.
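
The sorting of genes into activators, repressors, and both, along with the role of the threshold, can be sketched in a few lines. This is an illustration under stated assumptions, not the parameter code itself: the three genes and their weights are invented, the function names are hypothetical, and the sigmoid is assumed to have the common form 1 / (1 + exp(-(input - threshold))), which may differ from what the MATLAB code uses.

```python
import math


def classify_affectors(genes, weights):
    """Label each affector by the signs of the nonzero weights in its column.

    weights[i][j] is the weight of affector genes[j] on affectee genes[i],
    matching the network sheet layout (rows affectees, columns affectors).
    """
    labels = {}
    for j, gene in enumerate(genes):
        column = [row[j] for row in weights if row[j] != 0]
        if not column:
            labels[gene] = "no targets"
        elif all(w > 0 for w in column):
            labels[gene] = "activates"
        elif all(w < 0 for w in column):
            labels[gene] = "represses"
        else:
            labels[gene] = "both"
    return labels


def sigmoid_response(weighted_input, threshold):
    """Assumed sigmoid: half-maximal when the input equals the threshold."""
    return 1.0 / (1.0 + math.exp(-(weighted_input - threshold)))


# Hypothetical genes and optimized weights, for illustration only.
genes = ["A", "B", "C"]
weights = [
    [0.5, -0.2, 0.0],
    [0.1, -0.4, 0.3],
    [0.0, 0.0, -0.6],
]

labels = classify_affectors(genes, weights)
at_threshold = sigmoid_response(1.0, 1.0)
print(labels, at_threshold)
```

Under this assumed form, raising the threshold shifts the "jump" to the right, so a larger weighted input is needed before production turns on.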