Moore Notes 4 27 10

From OpenWetWare

OTU Project Group Call

Update from Tom: Slides
- OTU pipeline
- Simulation process (two types, one with reads that run off the end of the gene)
- Results comparing read vs. full-length source analyses
  - Phylogenetic distance matrices: Mantel tests show moderate (but significant) positive associations
  - Overlap of OTU clusters: Sensitivity and specificity are much better than random clusters, best for small MOTHUR cutoffs
  - Adjusting the MOTHUR cutoff for reads to maximize TNR (specificity) subject to TPR (sensitivity) > 80%
    - Works as long as source clustering is at a cutoff < 0.1
    - Max accuracy at cutoff = 0.15 for source cutoffs < 0.05
  - Number of clusters: also suggests increasing the cutoff
- Results comparing phylogenetic to percent identity distance
  - OTU pipeline underestimates total biodiversity (no false negatives)
- Results comparing phylogenetic clusters (full length sequences) to taxonomy clusters (species, genus, phylum)
  - Genus cutoff of 0.05 is OK
  - Species will be undercounted at 0.03 cutoff
- Results about what factors might be contributing to error in the reads vs. source analyses
  - Position in alignment: no
  - Conservation across 16S: no
  - Info in alignment column (number of reads aligned): no
  - Length: yes; reads < 100bp contribute a lot of the error, and slightly longer reads might as well
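The cutoff adjustment described above (maximize TNR subject to TPR > 80%) could be sketched roughly as follows. This is only an illustration, not the group's actual pipeline: it assumes sensitivity and specificity are computed over pairs of sequences, with "clustered together in the full-length source analysis" treated as the truth, and the names `pair_rates`, `pick_cutoff`, and `read_labels_by_cutoff` are hypothetical.

```python
from itertools import combinations


def pair_rates(source_labels, read_labels):
    """Return (TPR, TNR) over all sequence pairs.

    Assumption: a pair is "positive" when the two sequences share an OTU
    in the source (full-length) clustering, and "predicted positive" when
    their reads share an OTU in the read clustering.
    """
    tp = fp = tn = fn = 0
    for i, j in combinations(range(len(source_labels)), 2):
        same_source = source_labels[i] == source_labels[j]
        same_read = read_labels[i] == read_labels[j]
        if same_source and same_read:
            tp += 1
        elif same_source:
            fn += 1
        elif same_read:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr


def pick_cutoff(source_labels, read_labels_by_cutoff, min_tpr=0.8):
    """Pick the read cutoff with the highest TNR among those with TPR > min_tpr.

    read_labels_by_cutoff maps a clustering cutoff to the OTU labels it
    produced for the reads (hypothetical input, e.g. parsed from MOTHUR output).
    Returns (cutoff, tpr, tnr), or None if no cutoff meets the TPR constraint.
    """
    best = None
    for cutoff, labels in sorted(read_labels_by_cutoff.items()):
        tpr, tnr = pair_rates(source_labels, labels)
        if tpr > min_tpr and (best is None or tnr > best[2]):
            best = (cutoff, tpr, tnr)
    return best


# Toy example: four sequences, two true OTUs, three candidate read cutoffs.
source = [0, 0, 1, 1]
reads = {
    0.03: [0, 1, 2, 3],  # too strict: every read is its own OTU
    0.10: [0, 0, 1, 1],  # matches the source clustering
    0.20: [0, 0, 0, 0],  # too loose: everything merged
}
chosen = pick_cutoff(source, reads)
```

In the toy data the strict cutoff fails the sensitivity constraint and the loose cutoff has no specificity, so the middle cutoff is selected. The quadratic pair loop is fine for small simulations but would need a contingency-table formulation for large read sets.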

Future work
- Ideal adjustment of cutoff
- How long do reads need to be? e.g. filter reads < 100bp, maybe longer
  - Relates to doing Illumina vs. 454, pyrotags vs. shotgun
- Other methods:
  - James will explore clustering methods using alternative linkage methods in MOTHUR
  - Sam will explore RAxML (vs. FastTree)
- Relative abundance?
- Size of reference db?
- Steve will work on downstream analyses of GOS/gut data
- Tom will make an outline and folks will look for where they can contribute
- Journal?
  - Nature Methods
  - PLoS Comp Bio
  - ISME
- What papers do we need to cite? speak to? improve upon?
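The read-length question above could be explored with a simple filter applied before clustering. The sketch below is a hedged illustration only: the function name `filter_short_reads`, the dictionary input, and the choice to count length after stripping alignment gap characters are all assumptions, and the 100bp default is just the threshold mentioned in these notes.

```python
def filter_short_reads(reads, min_length=100):
    """Keep reads whose ungapped length is at least min_length.

    reads maps a read name to its sequence (possibly aligned, with '-' or
    '.' as gap characters). Counting ungapped bases is an assumption; the
    notes do not say how read length was measured.
    """
    kept = {}
    for name, seq in reads.items():
        ungapped = seq.replace("-", "").replace(".", "")
        if len(ungapped) >= min_length:
            kept[name] = seq
    return kept


# Toy example: one long read, one short read padded with gaps.
example = {
    "read_long": "A" * 120,
    "read_short": "ACGT" * 10 + "-" * 200,
}
filtered = filter_short_reads(example)
```

Rerunning the reads vs. source comparison at several values of `min_length` would be one way to see how long reads need to be.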

Things to emphasize
- Phylogenetic distance handles non-overlapping reads, gets around PCR bias
- Modular, high-throughput pipeline that is easy to use
- Method for adjusting cutoffs
