Distributed Computing for Systems Biology using Taverna Workflows
Biological data is vast, heterogeneous and distributed. To perform experiments in large-scale, high-throughput areas of research such as proteomics, microarray analysis and systems biology, scientists must be able to access, analyse and consolidate this network of data and analysis resources.
myGrid is a suite of middleware components designed to support these in silico experiments, allowing access to distributed and heterogeneous resources from the scientist’s desktop. The myGrid workflow engine, the Taverna workbench, allows the design and execution of workflows, automating the chaining together of complex analyses over complex datasets.
Workflows exist in the wider context of scientific data management. Scientists need to analyse workflow results retrospectively and compare different workflow invocations with one another. myGrid therefore provides an environment and components that combine workflows with sophisticated methods of provenance collection, so that data carries with it a record of how and why it was produced.
This workshop introduces myGrid and gives an overview of implementing biological workflows for comprehensive, integrated data analysis. We will describe how the workflow approach can be applied to systems biology, drawing on current research from the community, and discuss the integration of the generic myGrid tools with specific systems biology resources, for example in the construction of SBML models and the visualisation of systems biology data.