IGEM:IMPERIAL/2008/Prototype/Drylab//Motility data collection/Data Extraction

=Data Extraction=

==Algorithm Error Analysis==
Synthetic data was used as input to assess the errors of the algorithm that extracts run velocity, run duration, tumbling angle and tumbling duration from two-dimensional coordinate data. The extracted data are then fitted with the models used to construct the bacterium's trajectory, and the errors in the fitted model parameters are determined.

The following assumptions were made:
 * During tumbling, the bacterium's displacement, and hence its velocity, is zero.
 * The run velocity is averaged over each run, i.e. over the interval between two consecutive tumbles.
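
The two assumptions above can be sketched in code as follows. This is only an illustration, not the team's actual extraction program: the function names, the frame interval `dt` and the threshold `eps` are hypothetical, and a frame is treated as tumbling whenever its displacement is (numerically) zero.

```python
import numpy as np

def extract_runs_and_tumbles(xy, dt, eps=1e-9):
    """Split a 2-D track into runs and tumbles.

    xy  : (N, 2) positions, one row per frame
    dt  : time between frames (hypothetical parameter)
    eps : hypothetical threshold below which a step counts as zero displacement
    """
    xy = np.asarray(xy, dtype=float)
    steps = np.diff(xy, axis=0)                  # per-frame displacement vectors
    dist = np.hypot(steps[:, 0], steps[:, 1])    # per-frame step lengths
    moving = dist > eps                          # assumption 1: tumbling means no displacement

    runs, tumbles = [], []
    start = 0
    for i in range(1, len(moving) + 1):
        # close the current segment when the state changes or the track ends
        if i == len(moving) or moving[i] != moving[start]:
            duration = (i - start) * dt
            if moving[start]:
                # assumption 2: a single averaged speed per run
                speed = dist[start:i].sum() / duration
                dx, dy = xy[i] - xy[start]
                runs.append((speed, duration, np.arctan2(dy, dx)))
            else:
                tumbles.append(duration)
            start = i
    return runs, tumbles

def tumble_angles(runs):
    """Change of heading between consecutive runs, wrapped to [-pi, pi)."""
    headings = np.array([h for _, _, h in runs])
    return (np.diff(headings) + np.pi) % (2 * np.pi) - np.pi
```

Each run is returned as a `(speed, duration, heading)` tuple and each tumble as a duration, so the four quantities named in this section can be read directly from the output.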

The synthetic trajectory was constructed using the following parameters:
 * Run Velocity: Normal Distribution, μ=55, σ=2
 * Run Duration: Exponential Distribution, $$\lambda=1$$
 * Tumbling Angle: von Mises Distribution, a=0, k=1
 * Tumbling Duration: Exponential Distribution, $$\lambda=10$$
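
A minimal sketch of how such a synthetic trajectory could be generated is given below. It is not the original generator. It assumes that $$\lambda$$ is the rate of the exponential distribution (so the mean duration is $$1/\lambda$$), that a and k are the mean direction and concentration of the von Mises distribution, and that positions are sampled at a hypothetical frame interval `dt`; none of these are stated in the text.

```python
import numpy as np

def synthetic_track(n_runs, dt=0.01, seed=0):
    """Generate a run-and-tumble track as an (N, 2) array of positions."""
    rng = np.random.default_rng(seed)
    pos = np.zeros(2)
    heading = rng.uniform(-np.pi, np.pi)          # arbitrary initial direction
    points = [pos.copy()]
    for _ in range(n_runs):
        speed = rng.normal(55.0, 2.0)             # run velocity: Normal(mu=55, sigma=2)
        run_t = rng.exponential(1.0 / 1.0)        # run duration: rate 1 (assumed), mean 1
        for _ in range(max(1, int(round(run_t / dt)))):
            pos = pos + speed * dt * np.array([np.cos(heading), np.sin(heading)])
            points.append(pos.copy())
        tumble_t = rng.exponential(1.0 / 10.0)    # tumbling duration: rate 10 (assumed), mean 0.1
        for _ in range(max(1, int(round(tumble_t / dt)))):
            points.append(pos.copy())             # no displacement while tumbling
        heading += rng.vonmises(0.0, 1.0)         # tumbling angle: von Mises(a=0, k=1)
    return np.array(points)

track = synthetic_track(n_runs=20)
```

Rounding each duration to a whole number of frames is itself a source of error in the recovered durations, so a smaller `dt` gives a closer match to the sampled values.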

===Run Velocity===
The posterior of the parameters μ and σ of the normal distribution used to construct the synthetic run velocity data is shown on the right. The following estimates were extracted from the synthetic coordinate data:
 * (μ,σ) at Maximum Posterior = (54.5879,3.2300)
 * Mean of μ = 54.5711, Std of μ = 0.3743
 * Mean of σ = 3.2950, Std of σ = 0.2695

The associated errors at maximum posterior, relative to the true parameter values, are: Δμ = 0.749%, Δσ = 61.5%
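
The quoted percentages are consistent with a relative error between the true parameter and its maximum-posterior estimate. The short check below assumes that definition; the function name is hypothetical.

```python
def relative_error_percent(true_value, estimate):
    """Relative error of an estimate with respect to the true value, in percent."""
    return abs(estimate - true_value) / abs(true_value) * 100.0

d_mu = relative_error_percent(55.0, 54.5879)     # true mu = 55
d_sigma = relative_error_percent(2.0, 3.2300)    # true sigma = 2
print(f"delta mu = {d_mu:.3f}%, delta sigma = {d_sigma:.1f}%")
```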

===Run Duration===
The posterior of the parameter $$\lambda$$ of the exponential distribution used to construct the synthetic run duration data is shown on the right. The following estimates were extracted from the synthetic coordinate data:

 * $$\lambda$$ at Maximum Posterior = 0.5460
 * Mean of $$\lambda$$ = 0.5539
 * Std of $$\lambda$$ = 0.0644

The associated error at maximum posterior, relative to the true value $$\lambda=1$$, is: Δ$$\lambda$$ = 45.4%
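
For illustration, the maximum-posterior value, mean and standard deviation of $$\lambda$$ can be obtained from a set of extracted durations by evaluating the posterior on a grid. The sketch below assumes a flat prior and treats $$\lambda$$ as the rate of the exponential distribution; the prior actually used is not stated here, and the function name and grid are hypothetical.

```python
import numpy as np

def rate_posterior(durations, grid):
    """Summarise the posterior of an exponential rate on a uniform grid (flat prior assumed)."""
    durations = np.asarray(durations, dtype=float)
    n, total = durations.size, durations.sum()
    log_post = n * np.log(grid) - grid * total    # log-likelihood of n exponential samples
    post = np.exp(log_post - log_post.max())      # subtract the maximum for numerical stability
    post /= post.sum()                            # normalise over the grid points
    mean = (grid * post).sum()
    std = np.sqrt(((grid - mean) ** 2 * post).sum())
    return grid[post.argmax()], mean, std

# Example with made-up durations drawn at rate 2 (not the data from this page)
rng = np.random.default_rng(0)
sample = rng.exponential(1.0 / 2.0, size=2000)
lam_map, lam_mean, lam_std = rate_posterior(sample, np.linspace(0.01, 5.0, 2000))
```

Under these assumptions the maximum-posterior value coincides with the reciprocal of the mean duration, so any bias in the extracted durations carries over directly to $$\lambda$$.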

===Tumbling Duration===
The posterior of the parameter $$\lambda$$ of the exponential distribution used to construct the synthetic tumbling duration data is shown on the right. The following estimates of tumbling duration were extracted from the synthetic coordinate data:

 * $$\lambda$$ at Maximum Posterior = 6.9800
 * Mean of $$\lambda$$ = 7.0791
 * Std of $$\lambda$$ = 0.8229

The associated error at maximum posterior, relative to the true value $$\lambda=10$$, is: Δ$$\lambda$$ = 30.2%

==Data Clustering==

 * By plotting velocity*exp(i*angle) in the complex plane for each frame, we hope to be able to segment our data into two clusters:
   * One defining the tumble phase
   * One defining the run phase
 * This segmentation would then be used by our analysis program to derive the following properties: run time, tumble time, run velocity, tumble angle.
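
One possible way to carry out this segmentation is a two-cluster k-means in the complex plane, sketched below. The text does not name a clustering method, so k-means is an assumption, as is the reading of "angle" as the change in heading between consecutive frames (which places run frames near the positive real axis and tumble frames near the origin). All function names are hypothetical.

```python
import numpy as np

def frame_phasors(xy, dt):
    """Per-frame velocity*exp(i*angle), with angle taken as the change in heading."""
    steps = np.diff(np.asarray(xy, dtype=float), axis=0)
    speed = np.hypot(steps[:, 0], steps[:, 1]) / dt
    heading = np.arctan2(steps[:, 1], steps[:, 0])
    turn = np.diff(heading, prepend=heading[0])   # first frame has no previous heading
    return speed * np.exp(1j * turn)

def two_means_complex(z, n_iter=50):
    """Two-cluster k-means on complex points; label 0 = slow (tumble), 1 = fast (run)."""
    z = np.asarray(z, dtype=complex)
    # deterministic start: the slowest and the fastest frame
    centres = np.array([z[np.abs(z).argmin()], z[np.abs(z).argmax()]])
    for _ in range(n_iter):
        labels = np.abs(z[:, None] - centres[None, :]).argmin(axis=1)
        new = np.array([z[labels == k].mean() if np.any(labels == k) else centres[k]
                        for k in range(2)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels, centres
```

Consecutive frames with the same label then form one run or one tumble, from which run time, tumble time, run velocity and tumble angle can be derived.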