Natalie Williams Week 10: Difference between revisions

From OpenWetWare

Revision as of 21:54, 21 March 2015

Outline of Nonlinear differential equation model for quantification of transcriptional regulation applied to microarray data of Saccharomyces cerevisiae

Introduction

  • Gene expression converts the genetic information in DNA sequences into proteins and/or functional RNAs, and gene regulation controls this process.
    • Promoter regions must be recognized by transcriptional regulatory proteins, which help RNA polymerase bind to the DNA strand.
  • Microarray developments have made it easier to follow the changes of the cell's gene expression over time.
    • By analyzing these microarray data, we can better understand the relationships between genes and their transcription factor regulators.
    • Because these relationships collectively form a network among the genes, it should be possible to construct networks by studying the results of microarray data.
  • Budding yeast, Saccharomyces cerevisiae, has been studied extensively in the lab.
    • There is a lot of knowledge about its genome.
    • Expression data were collected and analyzed to determine which genes were expressed at specific stages of the cell cycle.
    • Genes were grouped based on where their regulators bound to promoter regions.
  • Methods previously used to construct networks:
    • A generalized linear model was created to describe regulators and infer the pattern of regulators and their target genes.
    • A kinetic model with Bayesian networks was used to predict gene regulatory networks as well as the proteins that regulate gene expression.
    • Another method for predicting networks combined genomic information with gene expression data.
      • Other research extended this method by using promoter regions or the sigma factor.
  • An alternative method used in this paper:
    • A model based on nonlinear differential equations was used.
      • It calls for a pool of all potential regulators.
      • Genes from the pool of potential regulators are picked one at a time, and the model is applied to try to fit the gene expression profiles of the target genes.
      • This is done for all potential regulators.
  • In this model:
    • There were 40 target genes;
    • 184 possible regulators were identified;
    • The data were analyzed using a linear model; and,
    • Results from the linear model were compared with those of the nonlinear differential equation system to see how well each predicted the target genes' profiles.

Results

Dynamic model of transcriptional control

  • The model assumes that there are repeated interactions between regulators and target genes over time.
    • The model also assumes there is combinatorial control among the regulators for target genes.

Equation 1

  • yj: expression level of regulator j
  • wj: regulatory weight of regulator j
  • g: regulatory effect on a specific gene
  • j =1,2,...m, where m is number of regulators controlling a gene
  • b: parameter for transcription initiation delay/unspecific bias caused by regulator effects associated with gene expression
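
The equation image did not survive on this page. Going only by the symbol definitions above, Equation 1 is presumably a weighted sum of the regulator expression levels plus the bias term. This is a reconstruction from those definitions, not a copy of the published form:

```latex
g = \sum_{j=1}^{m} w_j \, y_j + b
```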

The rate of expression of the target gene (dz/dt) is given by the regulatory effect of other genes ρ and the effect of degradation x.
Equation 2

  • Degradation is shown with a first order chemical reaction --> x = k*z
  • ρ = regulatory effect g of the regulators, transformed by a sigmoidal transfer function

The entire model for control of target gene expression z:
Equation 3

  • k2: rate of degradation of target gene product
  • k1: rate of expression
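
As a rough illustration of how the pieces above fit together, the sketch below assumes Equation 3 has the form dz/dt = k1 / (1 + exp(-(Σ wj·yj + b))) - k2·z, which is inferred from the definitions on this page rather than copied from the paper. The function names and the fixed-step Runge-Kutta integrator are choices made for this example only.

```python
import math


def sigmoid(g):
    """Sigmoidal transfer function applied to the regulatory effect g."""
    return 1.0 / (1.0 + math.exp(-g))


def target_rate(z, regulators, weights, b, k1, k2):
    """Assumed Equation 3: production via the sigmoid minus first-order degradation."""
    g = sum(w * y for w, y in zip(weights, regulators)) + b
    return k1 * sigmoid(g) - k2 * z


def simulate_target(regulator_fn, weights, b, k1, k2, z0, t_end, steps):
    """Integrate dz/dt from t = 0 to t_end with classical fixed-step RK4.

    regulator_fn(t) is a hypothetical callable returning the list of
    regulator expression levels y_1..y_m at time t.
    """
    h = t_end / steps
    z, t = z0, 0.0
    profile = [z]

    def f(tt, zz):
        return target_rate(zz, regulator_fn(tt), weights, b, k1, k2)

    for _ in range(steps):
        s1 = f(t, z)
        s2 = f(t + h / 2, z + h * s1 / 2)
        s3 = f(t + h / 2, z + h * s2 / 2)
        s4 = f(t + h, z + h * s3)
        z += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        t += h
        profile.append(z)
    return profile
```

With constant regulator levels the simulated profile should settle at k1·sigmoid(g)/k2, which is a quick sanity check on the signs of the two terms.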

However, Equation 3 can be simplified to Equation 4
Equation 4

  • y is approximated with a polynomial of degree n

Approximation of y

  • Coefficients were taken from experimental gene expression data using a least squares approximation.
  • It was assumed that the error weights were the same for all points.
  • The simplified version - Equation 4 - was used to identify regulators of the target genes.
  • n has to be chosen to capture the extent of change in gene expression in each individual experiment.
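
The least squares approximation described above can be sketched as follows, assuming equal weights for all points. The function names are hypothetical, and the normal-equation solver is a plain-Python stand-in for a library routine, not the authors' implementation.

```python
def fit_polynomial(times, values, degree):
    """Unweighted least-squares polynomial fit.

    Returns coefficients c such that y(t) ~ c[0] + c[1]*t + ... + c[degree]*t**degree.
    """
    n = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde matrix A[i][p] = t_i**p.
    ata = [[sum(t ** (r + c) for t in times) for c in range(n)] for r in range(n)]
    aty = [sum((t ** r) * v for t, v in zip(times, values)) for r in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aug[r][n] - tail) / aug[r][r]
    return coeffs


def evaluate_polynomial(coeffs, t):
    """Evaluate the fitted polynomial at time t using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result
```

Once fitted, `evaluate_polynomial` gives a regulator level at any time between measurements, which is what a continuous-time model needs as input. Normal equations become ill-conditioned at high degree, so a larger n would call for a more robust solver.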

The expression profiles Z = {z(t)} for the target gene and Y = {y(t)} for the regulating genes, measured at time points t = 1, 2, ..., Q, were used to fit the model by minimizing the mean square error.
Equation 6

  • {z^c(t)}: profile of z(t) computed from Equation 4 at time points t = 1, 2, ..., Q
  • Q: number of data points
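
The equation image is missing here as well. Reading "average square error" literally, with z^c(t) as the model-computed profile and z(t) as the measured one, Equation 6 presumably has the form below. The symbol E is a label introduced here for the error, and the exact published notation may differ:

```latex
E = \frac{1}{Q} \sum_{t=1}^{Q} \left( z^{c}(t) - z(t) \right)^{2}
```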

The problem then becomes finding the parameters that give the best fit with the least error.
The linear model was then compared to the nonlinear model.

  • The parameters (d) were obtained by minimizing the error in Equation 6.

Computational algorithm

  • Regulators are chosen from the pool of 184 potential regulators to predict the profiles of the target genes
    • Equations 4 and 6 were used
  • Missing experimental data points are filled in using the polynomial of degree n, with n chosen according to the number of data points and the level of expression change

The algorithm used is as follows:

  1. Fit regulators with Equation 5
  2. Select a target gene
  3. A potential regulatory gene is chosen
  4. The least squares minimization is applied to the target and regulator genes
  5. Steps 3 and 4 are repeated for all potential regulators
  6. The regulators that best fit the target profile are selected
  7. Step 2 and all following steps are repeated until every target gene has been processed
  • The algorithm above was run 100 times for each pair of regulator and target gene.
  • Optimization was based on the Levenberg-Marquardt method, and Equation 4 was solved with ode45 in MATLAB.
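
The loop over candidate regulators can be sketched in Python as below. This is a simplified illustration under several assumptions: Equation 4 is taken to be the single-regulator form dz/dt = k1 / (1 + exp(-(w·y + b))) - k2·z, a fixed-step Runge-Kutta integrator stands in for ode45, and a coarse parameter grid search stands in for Levenberg-Marquardt. All function names and the grid are hypothetical.

```python
import itertools
import math


def simulate(y_fn, w, b, k1, k2, z0, times, substeps=20):
    """Integrate the assumed single-regulator model with RK4; return z at each time."""
    def rate(t, z):
        return k1 / (1.0 + math.exp(-(w * y_fn(t) + b))) - k2 * z

    z, profile = z0, [z0]
    for t0, t1 in zip(times, times[1:]):
        h = (t1 - t0) / substeps
        for i in range(substeps):
            t = t0 + i * h
            s1 = rate(t, z)
            s2 = rate(t + h / 2, z + h * s1 / 2)
            s3 = rate(t + h / 2, z + h * s2 / 2)
            s4 = rate(t + h, z + h * s3)
            z += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        profile.append(z)
    return profile


def mean_square_error(computed, measured):
    """Average squared difference between computed and measured profiles."""
    return sum((c - m) ** 2 for c, m in zip(computed, measured)) / len(measured)


def fit_pair(y_fn, z_measured, times, grid):
    """Steps 3-4: find the grid point with the lowest error for one regulator."""
    best_error, best_params = float("inf"), None
    for params in itertools.product(grid["w"], grid["b"], grid["k1"], grid["k2"]):
        computed = simulate(y_fn, *params, z_measured[0], times)
        error = mean_square_error(computed, z_measured)
        if error < best_error:
            best_error, best_params = error, params
    return best_error, best_params


def rank_regulators(regulator_fns, z_measured, times, grid):
    """Steps 5-6: score every candidate regulator and sort by fit error."""
    scores = [(fit_pair(fn, z_measured, times, grid)[0], name)
              for name, fn in regulator_fns.items()]
    return sorted(scores)
```

Calling `rank_regulators` once per target gene corresponds to step 7. A grid search only finds the best point on the grid, so a gradient-based optimizer with repeated random starts would be needed to approach the fitting quality described above.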

Dataset selection

To validate their model, Vu and Vohradsky compared their results to microarray data from Spellman.

  • 6178 open reading frames were on the chip.
  • The number of regulators influencing the cell cycle was smaller.

The 184 possible regulators were extracted from YEASTRACT and other published papers.
The 40 target genes were selected from Chen et al's work.

Inference of regulators

The data were in the form of log base 2 ratios of the measured mRNA level to the level of a standard.
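
For reference, a log base 2 ratio of this kind can be computed as in the short sketch below. The function and argument names are hypothetical, and both intensities are assumed to be positive.

```python
import math


def log2_ratio(sample_level, standard_level):
    """Log base 2 of the measured mRNA level divided by the standard's level."""
    return math.log2(sample_level / standard_level)
```

A value of 1 means the sample level is twice the standard, 0 means no change, and -1 means the sample level is half the standard.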


Back to User Page: User:Natalie Williams
To view the Course and Assignments:BIOL398-04/S15