User:Nuri Purswani/Network/Introduction/Literature

From OpenWetWare
Algorithms for Biological Network Reconstruction from data

Literature Review

This section will review a few methods that were not implemented in this project, but were very interesting for possible future comparisons. Here we provide a summary of the methods and the relevant references. For detailed descriptions of the methods that have been implemented (i.e. Beal et al. 2005, Stan, Gonçalves et al. 2008-2010) see the Methods section.

Other interesting network inference methods

Dynamic Bayesian Inference using ML estimates

References

  • C. Rangel, J. Angus, Z. Ghahramani, and D. Wild. "Modeling biological responses using gene expression profiling and state space models," Probabilistic Modelling in Medical Informatics and Bioinformatics, D. Husmeier, S. Roberts, and R. Dybowski, editors, Springer Verlag, 2005.
  • C. Rangel, J. Angus, F. Falciani, Z. Ghahramani, M. Lioumi, and D. Wild. "Modeling T-cell activation using gene expression profiling and state space models," Bioinformatics. 2004 Jun 12; 20(9):1362-72. Epub 2004 Feb 12.

Description

This method makes the same underlying assumptions about the data as Beal's method. It was implemented in the initial stages of the project, but the variational Bayesian approach was chosen over it because it is less prone to overfitting (Beal et al. 2007).
The algorithm assumes that the gene expression data can be modelled as a linear dynamical system, with Gaussian noise perturbing the hidden states and the observations at every time point. The implementation steps of the algorithm are the following:

  • Cross-validation
    • Estimates the optimal dimension "K" of the hidden state.
    • Estimates the parameters A, B, C, D and the noise covariances of the state space model for gene expression.
  • Bootstrap
    • Increases the confidence in the estimate. It typically outputs 100 candidate networks, from which a mean network is computed.
    • The bootstrapping compensates for situations where there are too few experimental replicates of the data.
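
The model and the bootstrap step above can be sketched as follows. This is a minimal illustration, not the implementation of Rangel et al.: it assumes the generic input-driven form x_{t+1} = A x_t + B u_t + w_t, y_t = C x_t + D u_t + v_t with isotropic Gaussian noise, and it substitutes a simple lagged least-squares fit (the hypothetical `lagged_least_squares`) for the maximum likelihood fit of the state space model. All function names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_lds(A, B, C, D, u, q_std, r_std, rng):
    """Simulate x_{t+1} = A x_t + B u_t + w_t and y_t = C x_t + D u_t + v_t."""
    n_steps = u.shape[0]
    k, p = A.shape[0], C.shape[0]
    x = np.zeros(k)                      # hidden state starts at rest
    ys = np.empty((n_steps, p))
    for t in range(n_steps):
        ys[t] = C @ x + D @ u[t] + r_std * rng.standard_normal(p)
        x = A @ x + B @ u[t] + q_std * rng.standard_normal(k)
    return ys

def lagged_least_squares(replicates):
    """Stand-in estimator (assumption): fit y_{t+1} ~ M y_t over all replicates."""
    past = np.vstack([r[:-1] for r in replicates])
    future = np.vstack([r[1:] for r in replicates])
    return np.linalg.lstsq(past, future, rcond=None)[0].T

def bootstrap_mean(replicates, estimator, n_boot, rng):
    """Average an estimator over bootstrap resamples of the replicates."""
    n = len(replicates)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample replicates with replacement
        estimates.append(estimator([replicates[i] for i in idx]))
    return np.mean(estimates, axis=0)

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.0], [0.1, 0.8]])              # hidden dynamics, K = 2
B = np.array([[1.0], [0.0]])                        # input enters state 1
C = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, 1.0]])  # 3 observed genes
D = np.zeros((3, 1))
u = np.ones((20, 1))                                # constant input
replicates = [simulate_lds(A, B, C, D, u, 0.05, 0.05, rng) for _ in range(4)]
mean_network = bootstrap_mean(replicates, lagged_least_squares, 100, rng)
print(mean_network.shape)
```

Resampling whole replicates, rather than individual time points, keeps the temporal ordering inside each time series intact.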

Estimation of parameters and hidden variables in non-linear state space models

References

  • F. Alche-Buc, M. Quach, N. Brunel. Estimating parameters and hidden variables in non-linear state-space models based on ODEs for biological network inference. Bioinformatics 23(23):3209-3216 (2007)

Description

This method is based on ODEs and makes different assumptions about the input data from the methods of Beal et al., Rangel et al. and Stan et al. Biological systems are modelled as non-linear state space models, instantiated with Hill kinetics and mass action kinetics. The method estimates the parameters of the non-linear model provided that the structure has been set beforehand, and thus requires knowledge of the Boolean structure of the network prior to the start of the estimation. The key points are summarised as follows:

  • The key point of this method is that it uses an unscented Kalman filter to estimate the non-linear evolution of a variable. In this way, the authors adapt the problem to conventional Bayesian estimation of parameters.
  • The examples they provide online infer parameters in the repressilator and the JAK-STAT pathway. The repressilator is an example that neither of the methods implemented in this project can cope with.
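
The core of an unscented Kalman filter is the unscented transform, which propagates a Gaussian belief through a non-linear function using a small set of deterministically chosen sigma points. The sketch below is a generic textbook version (with a single scaling parameter `kappa`), not the authors' code; the Hill-type repression function and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

def hill_repression(x, k=1.0, n=2.0):
    """Hill-type repression term 1 / (1 + (x / k)^n), applied elementwise."""
    return 1.0 / (1.0 + (x / k) ** n)

def unscented_transform(mean, cov, f, kappa=1.0):
    """Approximate the mean and covariance of f(x) for x ~ N(mean, cov)."""
    n = mean.size
    # Columns of L are the scaled square-root directions of the covariance.
    L = np.linalg.cholesky((n + kappa) * cov)
    # 2n + 1 sigma points: the mean, then mean +/- each column of L.
    sigma = np.vstack([mean, mean + L.T, mean - L.T])
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    fx = np.array([f(s) for s in sigma])
    out_mean = weights @ fx
    dev = fx - out_mean
    out_cov = (weights[:, None] * dev).T @ dev
    return out_mean, out_cov

mean = np.array([1.0, 2.0])          # e.g. two repressor concentrations
cov = np.diag([0.04, 0.09])

# Non-linear case: propagate the belief through the Hill function.
m_hill, P_hill = unscented_transform(mean, cov, hill_repression)

# Sanity check: for a linear map the transform is exact.
F = np.array([[1.0, 2.0], [0.0, 1.0]])
m_lin, P_lin = unscented_transform(mean, cov, lambda x: F @ x)
print(np.allclose(m_lin, F @ mean), np.allclose(P_lin, F @ cov @ F.T))
```

In a full filter this transform would be applied twice per time step, once for the state transition and once for the observation function.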

Using state space models and location analysis to infer time-delayed regulatory networks

References

  • C. Koh, F.X. Wu, G. Selvaraj, A. Kusalik. Using a State-Space Model and Location Analysis to Infer Time-Delayed Regulatory Networks. EURASIP Journal on Bioinformatics and Systems Biology. Article ID 484601 (2009)

Description

The state space models introduced so far, and the ones used for this project, do not take into account time delays in regulatory networks. When an input perturbation is applied to the system, delays due to transcription, translation and transport modify its effect on the system of interest. Replacing the conventional state space representation, this network inference algorithm therefore assumes that gene expression can be modelled as follows:

[math]\displaystyle{ x_{t+1}=Ax_t + Bu_{t-T} + w_t }[/math]

[math]\displaystyle{ y_{t}=Cx_t + v_t }[/math]

where [math]\displaystyle{ A, B, C, x_t }[/math] and [math]\displaystyle{ y_t }[/math] have the same meaning as the variables described in Synthetic Datasets and [math]\displaystyle{ T }[/math] is the delay in the input perturbation caused by the aforementioned processes. Interestingly, the authors also use the Akaike Information Criterion to rank network structures. This method is worth looking into, as it goes one step further than existing methods and applies these ideas to ChIP-chip data.
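
The effect of the delay can be illustrated with a short simulation of the two equations above. This is a minimal sketch, not the inference algorithm of Koh et al.: the function name, the matrices and the choice of treating the input as zero before time 0 are all assumptions made for illustration.

```python
import numpy as np

def simulate_delayed_ssm(A, B, C, u, delay, q_std=0.0, r_std=0.0, rng=None):
    """Simulate x_{t+1} = A x_t + B u_{t-T} + w_t and y_t = C x_t + v_t."""
    rng = np.random.default_rng() if rng is None else rng
    n_steps = u.shape[0]
    k, p = A.shape[0], C.shape[0]
    x = np.zeros(k)
    ys = np.empty((n_steps, p))
    for t in range(n_steps):
        ys[t] = C @ x + r_std * rng.standard_normal(p)
        # Assumption: the input is zero before the start of the experiment.
        u_delayed = u[t - delay] if t >= delay else np.zeros(u.shape[1])
        x = A @ x + B @ u_delayed + q_std * rng.standard_normal(k)
    return ys

A = np.array([[0.8, 0.0], [0.3, 0.5]])
B = np.array([[1.0], [0.0]])
C = np.eye(2)
u = np.ones((10, 1))                 # step input applied at t = 0
ys = simulate_delayed_ssm(A, B, C, u, delay=3)
print(ys[:6])
```

With the noise switched off, the output stays at zero until time T + 1, which is the signature a delay-aware model can capture and a conventional state space model cannot.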


Approximate Bayesian Computation

References

  • M. Secrier*, T. Toni*, M.P.H. Stumpf, The ABC of reverse engineering of biological signalling systems, Mol. Biosyst. 5:1925-1935 (2009).
  • T. Toni, D. Welch, N. Strelkowa, A. Ipsen, M.P.H. Stumpf, Approximate Bayesian Computation scheme for parameter inference and model selection in dynamical systems, J. Roy. Soc. Interface 6, 187-202 (2009).
  • T. Toni, M.P.H. Stumpf, Simulation-based model selection for dynamical systems in systems and population biology, Bioinformatics, 26:104-110 (2010).

Description

Like Beal's method, this approach estimates a posterior distribution over the parameters instead of a point estimate. Point estimates can be misleading and lead to the development of "sloppy" models in systems biology. This method cannot infer parameters without knowledge of the Boolean structure of the network, so it cannot cope with hidden variables. However, it takes an interesting perspective on parameter estimation and applies the idea to the MAPK signalling pathway as a biological example. What is interesting about this method is that it can cope with both linear and non-linear examples, such as the classical Lotka-Volterra predator-prey interactions. The Bayesian treatment is similar to Beal's, although this framework uses sequential Monte Carlo simulations to estimate the parameters instead of the expectation-maximization algorithm, which is more prone to getting stuck in local optima. The implementation steps can be summarised as:

  1. Initialize the parameters.
  2. Propose a candidate value of the parameter for the next iteration, according to a prior distribution.
  3. Simulate a dataset from that candidate.
  4. Compute the Euclidean distance [math]\displaystyle{ d(\text{simulated dataset}, \text{observed dataset}) }[/math].
  5. If the distance is not small enough, accept the candidate as the next estimate with a probability given by the ratio of the likelihood of the candidate to that of the previous estimate (analogous to simulated annealing).
  6. Continue iterating steps 2-5 until the distance is minimised.
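
The propose, simulate and compare loop in steps 2-4 can be sketched with the simplest member of the ABC family, plain rejection sampling. This is deliberately not the sequential Monte Carlo scheme of Toni et al.: candidates are drawn independently from the prior and kept only if the simulated data lie within a tolerance `eps` of the observations. The exponential decay model, the uniform prior on [0, 2], the tolerance and all names are assumptions chosen for illustration.

```python
import numpy as np

def simulate_decay(k, t):
    """Toy model (assumption): exponential decay with rate k."""
    return np.exp(-k * t)

def abc_rejection(observed, t, n_draws, eps, rng):
    """Keep prior draws whose simulated data lie within eps of the data."""
    accepted = []
    for _ in range(n_draws):
        k = rng.uniform(0.0, 2.0)                    # propose from the prior
        simulated = simulate_decay(k, t)             # simulate a dataset
        distance = np.linalg.norm(simulated - observed)  # Euclidean distance
        if distance < eps:
            accepted.append(k)
    return np.array(accepted)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 11)
true_k = 0.5
observed = simulate_decay(true_k, t) + 0.02 * rng.standard_normal(t.size)

posterior = abc_rejection(observed, t, n_draws=5000, eps=0.2, rng=rng)
print(posterior.size, posterior.mean(), posterior.std())
```

The accepted values approximate a posterior distribution rather than a single point estimate; the sequential Monte Carlo variant improves on this by shrinking the tolerance over successive populations of candidates instead of sampling blindly from the prior.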

Other Interesting Papers

Importance of Input Perturbations and Stochastic Gene expression in the reverse engineering of gene regulatory networks

Reference

  • Zak DE, Gonye GE, Schwaber JS, Doyle FJ, 3rd. Importance of input perturbations and stochastic gene expression in the reverse engineering of genetic regulatory networks: insights from an identifiability analysis of an in silico network. Genome Res (2003);13:2396-2405

Description

This paper performs an identifiability analysis of an in silico gene regulatory network that takes into account the stochastic effects of gene expression. The authors quantify the accuracy with which network parameters can be estimated as a function of the input perturbation, and show that for the network to be identifiable, prior knowledge of the mRNA degradation constants is required. In addition, they note that complex perturbations (such as a step) are more favourable for identifying network parameters than simpler ones (such as a pulse). What is most thought-provoking about this paper is that they stress the necessity of the perturbation, and that "reconstruction is otherwise not possible" without extra information. This can be related to the method from Stan et al. In this project, the results of simulations from that in silico model were used as inputs to Beal's variational Bayesian method (Beal et al. 2005). An interesting point of comparison would be to run the robust control algorithm on this in silico network, but this was not possible in the time available.

Reverse engineering by building synthetic networks

References

  • Cantone et al. (2009). A Yeast Synthetic Network for In Vivo Assessment of Reverse-Engineering and Modeling Approaches. Cell 137(1), pp. 172-181.
  • Collins et al. (2009). Systems Biology Strikes Gold. Cell 137, pp. 24-25. DOI 10.1016/j.cell.2009.03.032

Description

While systems biology aims to study and understand existing biological systems, synthetic biology takes a different approach by building them "de novo". In this paper, the authors created a synthetic network in yeast and measured time series and steady state expression after multiple perturbations of the system. They then tested the ability of several in silico methods to reverse engineer the underlying network structure. The algorithms they tested included BANJO (Bayesian network inference), ARACNE and an ODE-based method. A possible future extension of the robust control method could be applied to the construction of synthetic gene networks, and has the potential to become a debugging tool for synthetic biology.