Simulations of Metagenomic Data

From OpenWetWare

Major overhaul (15 April 2009): I reorganized the wiki pages on simulations of metagenomic data; all related pages should be linked to from this page.
-- Sam Riesenfeld.

Motivation

There appears to be no obvious best method for constructing protein family phylogenies from metagenomic data. (See the discussion on phylogenetic methods.) We hope to shed some light on this issue by creating simple simulated data sets and then testing different methods, both existing and under development, on them.

Discussion

As we set out to do these simulations, we discussed which parameters we would like to be able to vary, which software we might use, and related issues.

The pipeline

Sam implemented the full pipeline.

  • See the simulation pipeline for a high-level description of the steps in the pipeline. The page also has links to scripts and examples, in case you want to run it yourself.

Simulated data sets

See the simulation pipeline web page and my iSEEM page for more information on available simulated data.

To do (as of Nov. 2009)

See the list of action items for the simulations and analysis.