Dec08 Quarterly

From OpenWetWare

Quarterly Meeting - Dec. 2008 - Davis, CA


  • Jessica Green
  • Steve Kembel
  • James O'Dwyer
  • Katie Pollard
  • Josh Ladau
  • Samantha Riesenfeld
  • Jonathan Eisen
  • Martin Wu
  • Dongying Wu
  • Sourav Chatterji
  • Aaron Darling
  • Kelly Kryc
  • Srijak Bhatnagar

Group Discussions

Metagenomics Education

  • Summer workshop(s)
  • Develop course (JE will put outline on wiki)
  • Taking course (e.g. MBL microbial diversity, Stanford's Hopkins Marine Station)
  • Existing resources (e.g. Santa Fe Institute)
  • Tutorials/documentation


  • coordinate and see each other vs. diversify and cover more
  • visibility, networking, accessing information


Talk by Jenna Morgan about in vitro simulations of metagenomic sequencing. Media:MetaSim long.pdf

Discussion of simulators and simulated data sets that are already available.

We've discussed doing some sort of simulation study to compare the performance of different methods for measuring diversity, etc. in metagenomic datasets. This could include evolving genomes, simulating the sampling of these genomes, and/or simulating various community assembly or evolutionary processes and their effects.
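As one illustration of the "simulating the sampling of these genomes" step, here is a minimal sketch in Python. It is not MetaSim or any tool discussed at the meeting. The function name `simulate_reads`, the toy genomes, and the choice to draw reads in proportion to abundance times genome length are all assumptions made for the example, and sequencing error is ignored.

```python
import random

def simulate_reads(genomes, abundances, n_reads, read_len, seed=0):
    """Draw error-free reads from a mixed community (hypothetical helper).

    Assumes each genome is at least read_len long. A genome is chosen with
    probability proportional to abundance * genome length, then a read is
    cut from a uniformly chosen start position.
    """
    rng = random.Random(seed)
    names = list(genomes)
    weights = [abundances[name] * len(genomes[name]) for name in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        seq = genomes[name]
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append((name, seq[start:start + read_len]))
    return reads

# Toy community: two made-up genomes with made-up relative abundances.
genomes = {"A": "ACGT" * 50, "B": "GGCC" * 25}
abundances = {"A": 3.0, "B": 1.0}
reads = simulate_reads(genomes, abundances, n_reads=50, read_len=20, seed=1)
```

Because the source genome of every read is recorded, the true community is known, which is what makes it possible to score a diversity method against the simulated sample.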

  • Hypothesis testing
  • Null Models


  • Kepler actors and data stream
    • actors are independent units, but some have dependencies and limits
    • actors are wrappers for existing software
  • Other workflows (e.g. TAVERNA) use webservices
  • Why make workflows?
    • sharing our analysis pipelines (e.g. through CAMERA)
    • democratizing access to methods (e.g. for new lab members)
    • provenance
    • reproducing analysis pipelines repeatedly/iteratively

Data types

General discussion of data sets and data types we want to have in our shared bioinformatics resource

  • Pre-computes
    • in: genomes
    • out: profile HMMs, scores for each marker gene
  • Marker genes in metagenomes
    • in: metagenomic reads (DNA input is OK in the new HMMER)
    • out:
      • multiple sequence alignments (AMPHORA conversion)
      • counts of hits to families
      • trees (for one marker gene family at a time)
        • for the sequence alignment including metagenomic reads as tips, possibly with homologs from isolates
        • supertree methods are good, but no branch lengths
      • binning (possibly using multiple marker genes)
      • OTUs: clusters at different percent identity cutoffs (based on scores) or phylogenetically (eventually)
  • What metagenomic samples to use?
    • DeLong depth series
    • Jennasim
    • GOS
    • Heidelberg hot spring (not available?)
  • Metadata
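The "OTUs: clusters at different percent identity cutoffs" item above can be sketched as follows. This is only one possible approach, assuming reads are already aligned to a common marker-gene alignment and using a simple greedy, seed-based rule. The function names and the toy sequences are hypothetical, and the group's actual clustering (based on scores, or phylogenetic) may differ.

```python
def percent_identity(a, b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)

def greedy_otus(aligned, cutoff):
    """Put each sequence in the first OTU whose seed it matches at >= cutoff.

    aligned maps read name -> aligned sequence; the result depends on input order.
    """
    otus = []  # list of (seed name, member names)
    for name, seq in aligned.items():
        for seed, members in otus:
            if percent_identity(aligned[seed], seq) >= cutoff:
                members.append(name)
                break
        else:
            # No existing seed is close enough, so this read seeds a new OTU.
            otus.append((name, [name]))
    return [members for _, members in otus]

# Toy aligned reads (made up for illustration).
aligned = {
    "r1": "ACGTACGTAC",
    "r2": "ACGTACGTAA",
    "r3": "ACGTTTTTAA",
    "r4": "TTTTTTTTTT",
}
otus_90 = greedy_otus(aligned, 90)
otus_60 = greedy_otus(aligned, 60)
```

Running the same reads at several cutoffs, as here, gives the nested sets of OTUs mentioned in the list.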

Tree building, phylogenetic diversity, and selection discussion

  • Tree building
    • We will have a conference call next week Tuesday 10am to discuss different methods for building trees from a multiple sequence alignment consisting of reference (whole gene) alignments plus fragments from metagenomic data. People will consult their phylogenetic gurus for ideas before the call.
    • Potential approaches
      • Supertree
      • Supermatrix
      • Throw the whole alignment into your favorite tree inference method (likelihood, Bayesian, whatever)
  • Phylogenetic diversity
    • build tree across all reads/samples
    • measure phylogenetic diversity within/among samples: relative variation, variation along environmental/spatial gradients, correlations with metadata.
    • identify where/when on the tree major habitat transitions occurred
    • changes in diversification of reads/lineages through time
  • Selection
    • same tree plus multiple alignment is the data requirement
    • To study
      • rates
      • patterns
      • shifts in rate matrix
  • How will we compare diversity and selection across protein families?
    • One issue is that the leaves of the trees don't match across families. We discussed using a binning or OTU approach to get around this: bin reads for each protein family, then try to match up bins across protein families by taxonomy. But this will be difficult.
  • Start with DeLong data set - good metadata, manageable size
  • Can we look for shifts in birth/death or diversification rates through time? Can we look for clocklike gene families to test this? Another approach, for non-clocklike families, is to link to character states and test for correlations between traits and diversification (recent Hedges paper on terrestrial evolution).
  • What about population-level evolution? There's a section of the grant on population genomics. How do we link metagenomic data to island biogeography, etc.? How are neutral vs. beneficial alleles dispersed around the globe? Unexplored. Tough with microbes: gene flow between species, gene flow within species, and convergent evolution are all operating and difficult to distinguish.
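For the "measure phylogenetic diversity within/among samples" item above, a minimal sketch of one common measure, Faith's PD (the total branch length connecting a sample's tips to the root), is shown here. The tree encoding (child -> (parent, branch length)), the function names, and the toy tree are assumptions for illustration. Whether to include the path to the root is a choice the notes do not settle.

```python
def branches_to_root(tree, tips):
    """Return the nodes whose parent branch lies on a path from some tip to the root.

    tree maps child -> (parent, branch_length); the root has no entry.
    """
    used = set()
    for tip in tips:
        node = tip
        while node in tree and node not in used:
            used.add(node)
            node = tree[node][0]
    return used

def faith_pd(tree, tips):
    """Sum the branch lengths spanned by the tips, measured down to the root."""
    return sum(tree[node][1] for node in branches_to_root(tree, tips))

# Toy rooted tree: root -> X -> (a, b) and root -> Y -> c.
tree = {
    "X": ("root", 1.0),
    "Y": ("root", 2.0),
    "a": ("X", 0.5),
    "b": ("X", 0.5),
    "c": ("Y", 1.0),
}
pd_ab = faith_pd(tree, ["a", "b"])
pd_ac = faith_pd(tree, ["a", "c"])
```

In the metagenomic setting sketched in these notes, the tips would be reads (or OTUs) placed on a marker-gene tree, with one tip set per sample.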

PIs and Kelly

  • Interaction with CAMERA
  • Deliverables update
  • Annual report & budget reporting

Break-out sessions

Community Diversity

Within samples

  • Entropy

What is the right way to weight the leaves of a phylogenetic tree for a microbial community in order to be able to use something like Shannon entropy on the weighted abundances to measure the community's diversity? (And, is this idea too ad-hoc?) (Sam, Josh, Sourav?)

I would like to meet with this subgroup, if other people are also interested. I'd like to talk about how to characterize community diversity given the type of data we have, and how to relate this to predictions from ecological theory. -James

I'm also interested in discussing this. Perhaps this is two subgroups - one on the entropy idea, another on the general questions about doing ecology with metagenomic data? -Steve

  • Genetic Distance Distribution (James O.)
  • Phylogenetic Diversity (Jess G.)
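To make the entropy question above concrete, here is a minimal sketch of Shannon entropy computed from a set of leaf weights. How the weights should be chosen is exactly the open question in these notes, so the weights below are placeholders. The function name and the example values are assumptions.

```python
import math

def shannon_entropy(weights):
    """Shannon entropy (natural log) of non-negative weights, normalized to sum to 1."""
    total = sum(weights.values())
    probs = [w / total for w in weights.values() if w > 0]
    return -sum(p * math.log(p) for p in probs)

# Placeholder leaf weights: an even community and a skewed one.
even = {"leaf1": 1.0, "leaf2": 1.0, "leaf3": 1.0, "leaf4": 1.0}
skewed = {"leaf1": 97.0, "leaf2": 1.0, "leaf3": 1.0, "leaf4": 1.0}
h_even = shannon_entropy(even)
h_skewed = shannon_entropy(skewed)
```

With equal weights the entropy reaches its maximum, the log of the number of leaves. Any phylogenetic weighting scheme would change the weights passed in, not the formula.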

Between samples

  • PhyloSor (Jess G.)
  • Spatially Explicit Neutral Theory of Biodiversity (James O.)
  • Trait-based diversity metrics (Jess G.)
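For the PhyloSor item above, a minimal sketch follows. It assumes the commonly used Sørensen-style definition, shared branch length divided by the mean of the two samples' branch lengths, with branches counted down to the root. The tree encoding, function names, and toy data are hypothetical.

```python
def branches_to_root(tree, tips):
    """Nodes whose parent branch lies on a tip-to-root path (root has no entry)."""
    used = set()
    for tip in tips:
        node = tip
        while node in tree and node not in used:
            used.add(node)
            node = tree[node][0]
    return used

def phylosor(tree, tips_a, tips_b):
    """Shared branch length divided by the mean branch length of the two samples."""
    a = branches_to_root(tree, tips_a)
    b = branches_to_root(tree, tips_b)
    length = lambda nodes: sum(tree[n][1] for n in nodes)
    total = length(a) + length(b)
    if total == 0:
        return 0.0  # assumption: two empty samples share nothing
    return 2.0 * length(a & b) / total

# Toy rooted tree: child -> (parent, branch length).
tree = {
    "X": ("root", 1.0),
    "Y": ("root", 2.0),
    "a": ("X", 0.5),
    "b": ("X", 0.5),
    "c": ("Y", 1.0),
}
similarity = phylosor(tree, ["a", "b"], ["a", "c"])
```

The value runs from 0 (no shared branches) to 1 (identical tip sets), so it can be compared between pairs of samples along a gradient such as depth.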

Measuring selection in different environments

I have a half-baked idea about using phylogenies from metagenomic data to measure the strength of selection on different genes in different environments, analogous to studies of selection on different genes in different genomes. I'd like to discuss it briefly with someone who knows something about this (Steve, Katie, Sam).


Wednesday Dec. 10

  • Green lab arrives 10:40
  • 12-2 Lunch with Green Lab and Eisen Lab
  • 2-5 Previous research/introduction talks
    • Location: Tropical Room, Storer Hall, 2nd Floor
  • 6:30-9 Dinner

Thursday Dec. 11

Location: Tropical Room, Storer Hall, 2nd Floor

  • PI breakfast
  • Budget Discussion with PIs and Kelly
  • Deliverables discussion with PIs and Kelly
  • Breakouts
  • Coordination of topics
  • Outreach, conferences, etc