Moore Notes 4 27 10
From OpenWetWare
OTU Project Group Call
- Update from Tom: Slides
- OTU pipeline
- Simulation process (two types, one with reads that run off the end of the gene)
- Results comparing read vs. full-length source analyses
- Phylogenetic distance matrices: Mantel tests show moderate (but significant) positive associations
- Overlap of OTU clusters: Sensitivity and specificity are much better than random clusters, best for small MOTHUR cutoffs
- Adjusting the MOTHUR cutoff for reads to maximize TNR (specificity) subject to TPR (sensitivity) > 80%
- Works as long as source clustering is at a cutoff < 0.1
- Max accuracy at cutoff = 0.15 for source cutoffs < 0.05
- Number of clusters: also suggests increasing the cutoff
- Results comparing phylogenetic to percent identity distance
- OTU pipeline underestimates total biodiversity (no false negatives)
- Results comparing phylogenetic clusters (full length sequences) to taxonomy clusters (species, genus, phylum)
- Genus cutoff of 0.05 is OK
- Species will be undercounted at 0.03 cutoff
- Results about what factors might be contributing to error in the reads vs. source analyses
- Position in alignment: no
- Conservation across 16S: no
- Info in alignment column (number of reads aligned): no
- Length: yes; reads < 100bp contribute a lot of the error, and slightly longer reads might also contribute
- Future work
- Ideal adjustment of cutoff
- How long do reads need to be? e.g. filter reads < 100bp, maybe longer
- Relates to doing Illumina vs. 454, pyrotags vs. shotgun
- Other methods:
- James will explore clustering methods using alternative linkage methods in MOTHUR
- Sam will explore RAxML (vs. FastTree)
- Relative abundance?
- Size of reference db?
- Steve will work on downstream analyses of GOS/gut data
- Tom will make an outline and folks will look for where they can contribute
- Journal?
- Nature Methods
- PLoS Comp Bio
- ISME
- What papers do we need to cite? speak to? improve upon?
- Things to emphasize
- Phylogenetic distance handles non-overlapping reads, gets around PCR bias
- Modular pipeline that is easy to use and high-throughput
- Method for adjusting cutoffs
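
The cutoff adjustment discussed above (maximize TNR subject to TPR > 80%) could be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code. The notes do not say how sensitivity and specificity were computed, so the sketch assumes a pairwise definition: a pair of sequences counts as a positive when the full-length (source) clustering places both in the same OTU. The function names `pairwise_rates` and `pick_cutoff` and the input layout (a dict mapping each read-level cutoff to a dict of sequence ID to cluster label) are hypothetical.

```python
from itertools import combinations


def pairwise_rates(read_clusters, source_clusters):
    """Return (TPR, TNR) over all pairs of sequence IDs.

    Assumption: a pair is a 'positive' when the source clustering puts
    both sequences in the same OTU, and a 'negative' otherwise.
    """
    tp = fn = tn = fp = 0
    for a, b in combinations(sorted(source_clusters), 2):
        same_source = source_clusters[a] == source_clusters[b]
        same_read = read_clusters[a] == read_clusters[b]
        if same_source and same_read:
            tp += 1
        elif same_source:
            fn += 1
        elif same_read:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr


def pick_cutoff(read_clusterings, source_clusters, min_tpr=0.8):
    """Pick the read cutoff with the highest TNR among those with TPR > min_tpr.

    Returns (cutoff, tnr, tpr), or None if no cutoff meets the TPR constraint.
    Ties in TNR are resolved in favor of the smallest cutoff.
    """
    best = None
    for cutoff in sorted(read_clusterings):
        tpr, tnr = pairwise_rates(read_clusterings[cutoff], source_clusters)
        if tpr > min_tpr and (best is None or tnr > best[1]):
            best = (cutoff, tnr, tpr)
    return best
```

In use, `source_clusters` would come from clustering the full-length sequences at a fixed cutoff, and `read_clusterings` from clustering the simulated reads at each candidate cutoff; the selected cutoff is then the adjusted value for reads.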