Moore Notes 4 27 10

From OpenWetWare

OTU Project Group Call

  • Update from Tom: Slides
    • OTU pipeline
    • Simulation process (two types, one with reads that run off the end of the gene)
    • Results comparing read vs. full-length source analyses
      • Phylogenetic distance matrices: Mantel tests show moderate (but significant) positive associations
      • Overlap of OTU clusters: Sensitivity and specificity are much better than random clusters, best for small MOTHUR cutoffs
      • Adjusting the MOTHUR cutoff for reads to maximize TNR (specificity) subject to TPR (sensitivity) > 80%
        • Works as long as source clustering is at a cutoff < 0.1
        • Max accuracy at cutoff = 0.15 for source cutoffs < 0.05
      • Number of clusters: also suggests increasing the cutoff
    • Results comparing phylogenetic to percent identity distance
      • OTU pipeline underestimates total biodiversity (no false negatives)
    • Results comparing phylogenetic clusters (full length sequences) to taxonomy clusters (species, genus, phylum)
      • Genus cutoff of 0.05 is OK
      • Species will be undercounted at 0.03 cutoff
    • Results about what factors might be contributing to error in the reads vs. source analyses
      • Position in alignment: no
      • Conservation across 16S: no
      • Info in alignment column (number of reads aligned): no
      • Length: yes; reads < 100 bp contribute much of the error, and slightly longer reads might as well
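For reference, the Mantel comparison of the read-based and full-length distance matrices can be sketched as below. This is a minimal illustration, not the code used in the pipeline: the function names (`upper`, `pearson`, `mantel`), the use of Pearson correlation, the one-sided test, and the 999 permutations are all assumptions made for the example.

```python
import random
from math import sqrt

def upper(m):
    """Flatten the upper triangle (excluding the diagonal) of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test (assumed setup): correlate two distance matrices,
    then permute the rows and columns of d2 together to build a null distribution."""
    rng = random.Random(seed)
    n = len(d1)
    flat1 = upper(d1)
    observed = pearson(flat1, upper(d2))
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)
        permuted = [[d2[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(flat1, upper(permuted)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    p_value = (hits + 1) / (permutations + 1)
    return observed, p_value
```

Both inputs are assumed to be symmetric matrices over the same sequences in the same order, with non-constant off-diagonal distances.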
  • Future work
    • Ideal adjustment of cutoff
    • How long do reads need to be? e.g. filter reads < 100bp, maybe longer
      • Relates to doing Illumina vs. 454, pyrotags vs. shotgun
    • Other methods:
      • James will explore clustering methods using alternative linkage methods in MOTHUR
      • Sam will explore RAxML (vs. FastTree)
    • Relative abundance?
    • Size of reference db?
    • Steve will work on downstream analyses of GOS/gut data
    • Tom will make an outline and folks will look for where they can contribute
    • Journal?
      • Nature Methods
      • PLoS Comp Bio
      • ISME
    • What papers do we need to cite? speak to? improve upon?
  • Things to emphasize
    • Phylogenetic distance handles non-overlapping reads, gets around PCR bias
    • Modular pipeline that is easy to use, high-throughput
    • Method for adjusting cutoffs
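The cutoff adjustment discussed above (maximize TNR subject to TPR > 80%) could look roughly like the following sketch. It assumes sensitivity and specificity are computed over pairs of sequences, where a "positive" is a pair that shares a cluster in the full-length source analysis; the notes do not specify this definition, and the names `pairwise_rates` and `pick_cutoff` are hypothetical.

```python
from itertools import combinations

def pairwise_rates(source, reads):
    """Return (TPR, TNR) for read clusters scored against source clusters.

    source, reads: dicts mapping sequence id -> cluster label (same keys).
    A pair counts as positive if the source clustering puts it in one cluster.
    """
    tp = fp = tn = fn = 0
    for a, b in combinations(sorted(source), 2):
        same_source = source[a] == source[b]
        same_reads = reads[a] == reads[b]
        if same_source and same_reads:
            tp += 1
        elif same_source:
            fn += 1
        elif same_reads:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return tpr, tnr

def pick_cutoff(rates_by_cutoff, min_tpr=0.8):
    """Pick the read cutoff with the highest TNR among those with TPR > min_tpr.

    rates_by_cutoff: dict mapping cutoff -> (TPR, TNR).
    Returns None if no cutoff satisfies the TPR constraint.
    """
    eligible = {c: r for c, r in rates_by_cutoff.items() if r[0] > min_tpr}
    if not eligible:
        return None
    return max(eligible, key=lambda c: eligible[c][1])
```

In use, `pairwise_rates` would be run once per candidate read cutoff against a fixed source clustering, and the resulting dictionary passed to `pick_cutoff`.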