IGEM:IMPERIAL/2008/Modelling

=Modelling the Bioprinter=

Introduction
In this three-part tutorial we study the design and subsequent analysis of experiments such as motility assays. The problem is very complex. We have little information on the statistical process involved; all we know is that a physiological process underpins bacterial motility. It is likely that we will not be able to test comprehensive motility models against the data we gather. On the positive side, we are not interested in building the best, most accurate model ever. What we really want is a model that is simple enough to be testable and that matches the data well. To obtain an acceptable model, we break the modelling process into three distinct steps, corresponding to the successive layers of a Bayesian data analysis routine.

Step 1: Generation of the Hypotheses
Through a literature review, we identify a set of hypotheses that we wish to compare with the data we have collected. In our case, a hypothesis consists of a model for the unhindered movement of bacteria. Each candidate model should depend on only a few parameters; the present exercise is complex enough as it is.

Step 2: Match Each Hypothesis to the Data
Before finding out whether a model is supported by the data, we seek to match it to the data we have. In practice this step amounts to finding the model's best-fitting parameters (see Tutorial 2).

Step 3: Hypothesis Testing
Once we have obtained, for each model, the best available match with the data, it is time to compare the various hypotheses. Again, several approaches exist (see Tutorial 2).
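Steps 2 and 3 can be illustrated with a minimal sketch in Python (standard library only). It assumes, purely for illustration, that the data are step lengths between frames and that the two candidate hypotheses are an exponential and a normal step-length model; neither is taken from the tutorials. Each model is fitted by maximum likelihood, and the models are then ranked with the Bayesian information criterion (BIC), which is only a rough large-sample stand-in for a full Bayesian comparison. All function names are hypothetical.

```python
import math
import random


def fit_exponential(steps):
    """Maximum-likelihood fit of an exponential model (one parameter)."""
    n = len(steps)
    total = sum(steps)
    rate = n / total
    log_lik = n * math.log(rate) - rate * total
    return {"rate": rate}, log_lik


def fit_normal(steps):
    """Maximum-likelihood fit of a normal model (two parameters)."""
    n = len(steps)
    mu = sum(steps) / n
    var = sum((s - mu) ** 2 for s in steps) / n
    # Log-likelihood of a normal model evaluated at its own MLE.
    log_lik = -0.5 * n * (math.log(2 * math.pi * var) + 1)
    return {"mu": mu, "sigma": math.sqrt(var)}, log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: lower means better supported."""
    return n_params * math.log(n_obs) - 2 * log_lik


def compare_models(steps):
    """Step 2: fit every hypothesis. Step 3: rank them by BIC."""
    n = len(steps)
    scores = {}
    for name, fit in (("exponential", fit_exponential), ("normal", fit_normal)):
        params, log_lik = fit(steps)
        scores[name] = {"params": params, "bic": bic(log_lik, len(params), n)}
    best = min(scores, key=lambda name: scores[name]["bic"])
    return best, scores


# Synthetic step lengths drawn from a normal model (assumed values).
rng = random.Random(0)
steps = [rng.gauss(5.0, 0.5) for _ in range(200)]
best, scores = compare_models(steps)
print(best, {name: round(s["bic"], 1) for name, s in scores.items()})
```

The BIC penalises each extra parameter by log(n), which reflects the requirement that candidate models depend on only a few parameters.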

Improving the Design of the Experiment
One of the most interesting properties of the Bayesian approach is that it provides intuitive methods for quantifying the accuracy of our predictions. As you will see, the quantity and quality of the available data are crucial factors in determining the quality of the whole process. Being able to quantify the accuracy of our predictions has very appealing consequences. Of particular interest to us is the possibility of designing our experiment so that the reliability and accuracy of the predictions improve. Of course the design will be better if we have data, and in particular data representative of the phenomenon we wish to study. But interesting results can still be obtained with synthetic data generated by the various candidate models.
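How the quantity of data drives the accuracy of a prediction can be sketched for the simplest conjugate case. The Python example below assumes, for illustration only, that we estimate a single parameter (say a mean speed) from measurements with known Gaussian noise and a Gaussian prior; under those assumptions the posterior standard deviation has a closed form. The function names and numerical values are hypothetical.

```python
import math


def posterior_sd(n_obs, noise_sd, prior_sd):
    """Posterior standard deviation of a mean after n_obs measurements.

    Assumes Gaussian measurement noise of known standard deviation
    noise_sd and a Gaussian prior of standard deviation prior_sd.
    """
    precision = 1.0 / prior_sd ** 2 + n_obs / noise_sd ** 2
    return 1.0 / math.sqrt(precision)


def observations_needed(target_sd, noise_sd, prior_sd, max_obs=10 ** 6):
    """Smallest number of measurements giving posterior_sd <= target_sd."""
    for n_obs in range(max_obs + 1):
        if posterior_sd(n_obs, noise_sd, prior_sd) <= target_sd:
            return n_obs
    raise ValueError("target not reachable within max_obs measurements")


# Uncertainty shrinks as more (assumed) measurements are collected.
for n in (0, 10, 100, 1000):
    print(n, round(posterior_sd(n, noise_sd=2.0, prior_sd=10.0), 3))
```

Once the data dominate the prior, the uncertainty falls roughly as one over the square root of the number of measurements, so halving it takes about four times as much data. A calculation of this kind is one way to decide how many tracks to record before going to the wet lab.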

In the first tutorial we focus on constructing a relevant statistical model for the movement of a bacterium such as B. subtilis, and on generating the synthetic data required to train our analysis routines and to design the wet-lab experiments. In the second part we assume that we have an unrealistic level of control over the data acquisition process. Under such ideal assumptions, we introduce the basics of Bayesian data analysis and show how the results can be used to design experiments. Finally, in the third tutorial, we study the far more complex, and more realistic, case where the data acquisition process gives us only imperfect access to the data. As you will see, the quality of the predictions that we can make is degraded, and from an experimental point of view we need to gather more data.

Lessons to be learnt from these tutorials:
 * Even when we know only a little about a phenomenon, we can still design experiments effectively and hope to model the phenomenon
 * Basis of Experiment Design 1: Increasing the confidence of the predictions
 * Basis of Experiment Design 2: Make sure we only collect relevant data
 * It is not easy but it can be done!
 * Bayesian inference is great

Recommendations:
 * Use MATLAB for the calculations (since it is the iGEM sponsor)
 * Try to solve the problem on paper first and then do simulations: you will learn a lot about the limitations of theory and the pitfalls of simulations.

Dry Lab Tutorial 1: Creating Synthetic Data
Useful files!
 * Random number generator. We will be using this to generate normally distributed and von Mises distributed random numbers for modelling bacterial motility! Save as randraw.m
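As a rough illustration of what such synthetic data might look like, here is a minimal sketch in Python, whose standard library offers normal and von Mises generators that stand in here for randraw.m. It assumes a correlated random walk in which, at every frame, the bacterium turns by a von Mises distributed angle and moves at a normally distributed speed. This model, the function name and all parameter values (including the 25 frames per second time step) are assumptions made for illustration, not the tutorial's model.

```python
import math
import random


def simulate_track(n_steps, dt=0.04, mean_speed=20.0, speed_sd=5.0,
                   kappa=4.0, seed=None):
    """Return a synthetic track as a list of (time, x, y) tuples.

    dt         -- time between frames in seconds (0.04 s = 25 frames/s)
    mean_speed -- mean speed, in arbitrary length units per second
    speed_sd   -- standard deviation of the speed
    kappa      -- von Mises concentration; larger means straighter paths
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)
    track = [(0.0, x, y)]
    for i in range(1, n_steps + 1):
        # Turning angle centred on 0, i.e. "keep going straight on average".
        turn = rng.vonmisesvariate(0.0, kappa)
        heading = (heading + turn) % (2.0 * math.pi)
        # Negative draws are clipped to zero: speeds cannot be negative.
        speed = max(0.0, rng.gauss(mean_speed, speed_sd))
        x += speed * dt * math.cos(heading)
        y += speed * dt * math.sin(heading)
        track.append((i * dt, x, y))
    return track


track = simulate_track(100, seed=1)
print(len(track), track[-1])
```

Fixing the seed makes a data set reproducible, which is convenient when the same synthetic tracks are reused to test several analysis routines.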

Video Methods
A video camera attached to the microscope can be used to capture images of moving cells. The system captures video images into memory at a frame rate of 25 or 30 Hz, after which algorithms that detect moving objects over a series of digitised images are applied.

In the initial frame, objects which satisfy the criteria for cell bodies are distinguished from the background and their positions stored as an array of x,y-coordinates with time. The boundaries of the cell are defined and the centroid of the cell is determined.
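The detection step can be sketched as follows in Python. This is a minimal illustration, not the method of any of the packages listed below: it assumes a greyscale frame stored as a nested list, bright cells on a dark background, a fixed intensity threshold and 4-connectivity, with a minimum area as the only criterion for a cell body. The function name and parameters are hypothetical.

```python
def find_centroids(image, threshold, min_area=1):
    """Return (x, y) centroids of connected bright regions in a frame.

    image     -- list of rows, each a list of grey levels
    threshold -- pixels at or above this level count as "cell"
    min_area  -- regions with fewer pixels are discarded as noise
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood fill collects every pixel of this object.
            stack = [(r, c)]
            seen[r][c] = True
            pixels = []
            while stack:
                pr, pc = stack.pop()
                pixels.append((pr, pc))
                for nr, nc in ((pr - 1, pc), (pr + 1, pc),
                               (pr, pc - 1), (pr, pc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and image[nr][nc] >= threshold):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            if len(pixels) >= min_area:
                # Centroid = mean column (x) and mean row (y) of the pixels.
                cx = sum(p[1] for p in pixels) / len(pixels)
                cy = sum(p[0] for p in pixels) / len(pixels)
                centroids.append((cx, cy))
    return centroids


frame = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 8],
]
print(find_centroids(frame, threshold=5))
print(find_centroids(frame, threshold=5, min_area=2))
```

Raising min_area is a crude way to reject single-pixel noise; real tracking software typically applies further size and shape criteria.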

In the next frame, we can assume that each cell has moved no more than a short, arbitrarily chosen maximum distance and so lies within a region around its original position. Either a search for the cell is carried out within this region, or all cells in the current frame are detected and those that satisfy certain matching criteria are assumed to be the same cells as in the previous frame. This allows the system to track the movement of cells from frame to frame in the captured video.
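The second strategy (detect every cell, then match against the previous frame) can be sketched in Python as a greedy nearest-neighbour assignment. This is an illustrative assumption about the matching criterion, which here is distance alone; the function name and the example coordinates are hypothetical.

```python
import math


def match_cells(previous, current, max_distance):
    """Match centroids between two frames.

    previous, current -- lists of (x, y) centroids
    max_distance      -- largest displacement allowed between frames
    Returns a dict mapping an index in `previous` to an index in `current`.
    Cells with no partner within max_distance are left unmatched.
    """
    candidates = []
    for i, (px, py) in enumerate(previous):
        for j, (cx, cy) in enumerate(current):
            distance = math.hypot(cx - px, cy - py)
            if distance <= max_distance:
                candidates.append((distance, i, j))
    # Accept the closest pairs first; each cell is used at most once.
    candidates.sort()
    matches = {}
    used_current = set()
    for distance, i, j in candidates:
        if i in matches or j in used_current:
            continue
        matches[i] = j
        used_current.add(j)
    return matches


previous = [(0.0, 0.0), (10.0, 10.0)]
current = [(10.5, 10.0), (0.5, 0.0), (50.0, 50.0)]
print(match_cells(previous, current, max_distance=2.0))
```

Entries of `current` that remain unmatched can be treated as cells that have newly entered the field of view, and unmatched entries of `previous` as cells that have left it. A greedy assignment can mislabel cells that pass close to each other, so crowded frames may need a stricter criterion.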

Scion Image
Runs on Windows. Scion Image manipulates, displays and analyses 2D images. Pixels are represented by 8-bit unsigned integers; 16-bit images can be scaled to 8 bits. It supports the organisation and manipulation of a series of 2D images as a 3D array called a stack, and supports the TIFF and BMP file formats. The software can be extended using a Pascal-like macro programming language that allows the user to customise and automate repetitive tasks. Scion Image requires a frame grabber card to digitise images from video. This paper, which studies thermotaxis in C. elegans, uses the above software with a Scion Image LG-3 capture board for video capture. Video images were captured at 1 fps.

The software is available on the college system, which suggests that there is some experience of working with it within college.

LabVIEW
LabVIEW supports data acquisition, instrument control and industrial automation. It uses G, a graphical dataflow programming language, so non-programmers can build simple algorithms by drag-and-drop. It supports a wide range of instrument hardware, which is represented as graphical nodes. It uses .m files for data analysis, which are compatible with MATLAB. However, the software is extremely expensive (1200 USD) and is unavailable on the college network. Nevertheless, it is widely used for the study of bacterial motility.

CellTrak
CellTrak allows quantitative tracking of cells using microscopy imaging techniques. It can process .avi files and extract measurements of movement. These measurements include centroid movement (i.e. x,y coordinates with time), orientation with time and greyscale information with time. However, a search shows that this software is not commonly used within college, nor in academia, for the analysis of bacterial motion.

Volocity
Captures 3D images in real time, allowing the user to find, track and measure objects. Algorithms are available to eliminate noise and blur in the data. Used at Imperial College by various researchers.

Zeiss LSM
Allows for 3D image analysis, used with a laser scanning microscope. Macros are available for download. 

ImageJ
Free to download; written in Java and developed at the NIH. ETH (in Switzerland) has an algorithm for tracking multiple cells.