Algorithms for Biological Network Reconstruction from Data
Gene regulatory networks consist of interacting species that influence each other in complex ways. Collecting biological data from microarray experiments has proved to be a challenging task, given the large number of species that need to be measured. Even with recent advances and improved tools for genetic analysis, it is impossible to measure the contribution from every species, which makes biological networks harder to infer. The unmeasured contributions are commonly described as "hidden variables". They include the presence of unknown regulatory molecules, degradation of mRNA and protein, and influences from other genes that have not been measured in a given experiment. The aim of this project is to evaluate the performance of a selected subset of available in silico methods for biological network reconstruction from data. The analysis provides the user with an overview of the relative capabilities of some of the current methods, with specific emphasis on the type of input data they require and their performance under different simulated conditions.
Inputs and Methods
Simulations were implemented in MATLAB. The two methods compared were:
- The method of Gonçalves et al. (2008, 2009), here referred to as the robust control method
- The method of Beal et al. (2005), here referred to as the Bayesian inference method
The input datasets were generated from in silico network topologies. The two methods place different requirements on their input data:
- The robust control method takes in steady state measurements under these conditions:
  - If a network contains p measured species, the same number p of experiments must be performed.
  - Each experiment must independently control a measured species.
- The Bayesian inference method takes in time series measurements, and the conditions above should, in theory, not apply.
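The steady state experiment design described above can be illustrated with a minimal sketch. It assumes a fully measured linear network dx/dt = A x + u, which is an illustrative model chosen here and not the published algorithm or the MATLAB code used in this project. The matrix `A_TRUE` and all function names are hypothetical. Each of the p experiments applies a unit step input to one species, and the p x p matrix of steady states is then inverted to read off the direct interactions.

```python
# Sketch only: an illustrative linear network, not the authors' implementation.

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment with the identity: [M | I]
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def steady_state_experiments(A):
    """Simulate p experiments on dx/dt = A x + u, one unit step per species.

    Column k of the result is the steady state x = -inv(A) e_k reached
    when only species k is perturbed.
    """
    return [[-v for v in row] for row in invert(A)]


def boolean_structure(G, tol=1e-9):
    """Recover the direct-interaction pattern from the p x p response matrix G.

    For this fully measured linear model inv(G) = -A, so a nonzero
    off-diagonal entry (i, j) marks a direct edge from species j to species i.
    """
    H = invert(G)
    p = len(H)
    return [[1 if i != j and abs(H[i][j]) > tol else 0 for j in range(p)]
            for i in range(p)]


# Hypothetical 3-species feedback loop: 1 -> 2 -> 3 -| 1.
# A_TRUE[i][j] is the effect of species j on species i.
A_TRUE = [[-1.0, 0.0, -0.5],
          [0.8, -1.0, 0.0],
          [0.0, 0.6, -1.0]]

G = steady_state_experiments(A_TRUE)  # one column per experiment
STRUCTURE = boolean_structure(G)      # Boolean adjacency matrix
```

In this example every entry of `G` is nonzero because each perturbation propagates around the loop, yet the inverse isolates the three direct edges. This is only possible because there are as many independent experiments as measured species, which is the condition listed above.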
The aim of this project is to compare the relative algorithmic performance of the Bayesian inference and robust control approaches. The criteria we used to assess algorithmic performance were:
- Ability to recover the correct network structure as a function of signal to noise ratio.
- Ability to recover the correct network structure in the presence of non-linearities.
- Ability to recover sub-topologies in complex networks with feedback loops.
- Ability to recover network structure as a function of the input perturbation.
- Noise tolerance of both methods as a function of the number of experimental repeats.
- Comparison of the types of inputs that each algorithm requires.
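Recovery of the correct network structure is reported in this summary in terms of sensitivity and specificity. A minimal sketch of how these can be computed from a true and an inferred Boolean adjacency matrix is given here. The function name and the example matrices are hypothetical, and excluding self-loops from the count is an assumption of this sketch.

```python
def sensitivity_specificity(true_net, inferred_net):
    """Score an inferred Boolean network against the true one.

    Both arguments are p x p lists of 0/1, where entry (i, j) = 1 means
    species j directly affects species i. Diagonal entries (self-loops)
    are ignored, which is an assumption of this sketch.
    """
    tp = fp = tn = fn = 0
    p = len(true_net)
    for i in range(p):
        for j in range(p):
            if i == j:
                continue
            truth, guess = true_net[i][j], inferred_net[i][j]
            if truth and guess:
                tp += 1          # edge present and recovered
            elif truth and not guess:
                fn += 1          # edge present but missed
            elif guess:
                fp += 1          # edge absent but reported
            else:
                tn += 1          # edge absent and not reported
    # Sensitivity: fraction of true edges recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of absent edges correctly left out.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example: one missed edge and one spurious edge.
TRUE_NET = [[0, 0, 1],
            [1, 0, 0],
            [0, 1, 0]]
INFERRED_NET = [[0, 1, 1],
                [1, 0, 0],
                [0, 0, 0]]

SENS, SPEC = sensitivity_specificity(TRUE_NET, INFERRED_NET)
```

Here two of the three true edges are recovered and two of the three absent edges are correctly left out, so both scores are 2/3.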
The comparison should serve as a guide to users who wish to select a suitable method to analyse their data. The method by Gonçalves and Warnick requires the datasets to be in a particular input format (see Methods) that is not available experimentally. For this reason, we have based the comparisons on in silico implementations. The group plans to test the algorithm on the DREAM datasets, so this remains work in progress.
Results and Discussion
- The robust control method has higher sensitivity and specificity than the Bayesian inference method at low signal to noise ratios on the datasets we implemented.
- The robust control method has higher sensitivity, specificity and noise tolerance than the Bayesian inference method, and achieves this with fewer experimental repeats.
- Both methods are able to cope with non-linearities.
- The robust control method outperformed the Bayesian inference method at recovering sub-structures in networks with complex topologies.
- Neither method was able to recover network topology when the input perturbation was Gaussian noise. When the perturbation was a step input, the robust control method recovered network structure with higher sensitivity and specificity than the Bayesian inference method.
Future Work
- Explore more methods mentioned in the literature review.
- Investigate further the effects of input perturbations on the ability of both methods to recover network structure.
- Validate results on a larger number of datasets, including biologically relevant in silico networks.
- Explore the possibility of entering the DREAM competition.
More work is needed to truly assess the relative performance of both methods. Preliminary results from in silico experiments suggest that the robust control method with steady state measurements recovers Boolean network structure with higher sensitivity and specificity than the Bayesian inference method with time series measurements. In addition, we were unable to recover network interactions in the absence of a step disturbance at each individual node, and with Gaussian noise inputs neither method recovered the network topologies. This suggests that a step input at every node is necessary to reliably infer the full network structure. Future work is needed to explore this question further.
Recommendations for the user: At present, the robust control method needs further validation and cannot be used directly on real datasets, since it requires experiments to be performed in a specific way. Provided that these experimental criteria are met and the user has steady state measurements, the robust control method is a better option than the Bayesian inference method. For time series measurements, further simulations are required to validate the algorithms and our conclusions.
Thanks to Dr Guy-Bart Stan, Mr Taylor Southwick, Prof Michael Stumpf, Dr Jorge Gonçalves and Mr Ye Yuan.