Moore Notes 10 30 13

From OpenWetWare


Group Call

Participants: Katie, Jonathan, Tom, Patrick, Guillaume, Stephen, Ladan

PICRUSt analysis (Tom and Guillaume)
- English Channel metagenomes and 16S data, both annotated to KEGG Orthology groups
  - The PICRUSt run used the Greengenes 16S tree
  - JE: Which reference genomes were used? This is under the hood; probably the same set as in the PICRUSt manuscript, and not appropriate for marine samples
- Plots: http://edhar.genomecenter.ucdavis.edu/~gjospin/picrust_test/picrust_test/
  - Weak correlation between 16S/PICRUSt and shotgun abundance estimates
  - Many families have zero or low abundance in 16S/PICRUSt but high abundance in the shotgun metagenomes
  - Large Hamming distances (>0.7) on presence/absence estimates
- What is different from Jack Gilbert's analysis?
  - FIGfams vs. KEGG
  - Similar to, but not the same algorithm as, PICRUSt
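
Illustrative sketch of the two comparisons mentioned above (an abundance correlation and a presence/absence Hamming distance). The notes do not say which correlation coefficient or which distance normalization was used; Pearson correlation and a Hamming distance normalized by the number of families are assumptions here, and the function names and abundance values are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length abundance vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and the same length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov / sqrt(var_a * var_b)


def normalized_hamming(a, b, threshold=0.0):
    """Fraction of families whose presence/absence call differs.

    A family counts as present when its abundance exceeds `threshold`
    (an assumed cutoff, not one taken from the notes).
    """
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and the same length")
    mismatches = sum((x > threshold) != (y > threshold) for x, y in zip(a, b))
    return mismatches / len(a)


# Hypothetical abundances for five KEGG Orthology groups, in the same order.
predicted_16s = [0, 0, 5, 10, 0]   # 16S/PICRUSt prediction
shotgun = [8, 3, 4, 12, 0]         # shotgun metagenome

print(pearson(predicted_16s, shotgun))
print(normalized_hamming(predicted_16s, shotgun))
```

In this toy example two of the five families are absent from the prediction but present in the shotgun data, which is the pattern described in the bullets above.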

JE will ping Tara Oceans folks again re: data release

Stephen: Average genome size estimation project
- Taxon-specific? Maybe possible with taxon-specific markers
- JE: Compare to normalizing to the number of reads hitting a single-copy protein-coding gene (e.g., recA)
  - recA is not one of Dongying's markers used in this analysis, because it is too divergent between bacteria and archaea
- Applications:
  - Protein family abundance estimate normalization
  - Ecological differences in average genome size, e.g., IBD significantly different from healthy in MetaHIT
- Estimates between different libraries on the same sample don't agree as well as one would hope (different lanes of same prep and different simulation runs do agree)
  - KP: Could this be due to duplicate sequences? Probably not, but can check with FastQC
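
Rough sketch of the single-copy-gene normalization JE suggested as a comparison. This is not Stephen's estimation method, which the notes do not describe; it assumes uniform coverage, ignores read edge effects, and treats marker coverage as the number of genomes sampled. All function names and numbers are hypothetical.

```python
def genome_equivalents(marker_hits, marker_length_bp, read_length_bp):
    """Approximate coverage of a single-copy marker gene.

    With one copy per genome, the marker's coverage approximates the
    number of genomes sampled in the library.
    """
    return marker_hits * read_length_bp / marker_length_bp


def average_genome_size(total_reads, read_length_bp, marker_hits, marker_length_bp):
    """Total sequenced bases divided by the number of genomes sampled."""
    total_bp = total_reads * read_length_bp
    return total_bp / genome_equivalents(marker_hits, marker_length_bp, read_length_bp)


def copies_per_genome(family_hits, family_length_bp, read_length_bp, n_genomes):
    """Protein family abundance expressed as average copies per genome."""
    family_coverage = family_hits * read_length_bp / family_length_bp
    return family_coverage / n_genomes


# Hypothetical library: 1M reads of 100 bp, 300 reads hitting a 1 kb marker.
n_genomes = genome_equivalents(marker_hits=300, marker_length_bp=1000, read_length_bp=100)
print(n_genomes)
print(average_genome_size(1_000_000, 100, 300, 1000))
print(copies_per_genome(family_hits=60, family_length_bp=2000,
                        read_length_bp=100, n_genomes=n_genomes))
```

Under these assumptions the read length cancels out of the genome size estimate, which reduces to total reads times marker length divided by marker hits. Averaging over several single-copy markers rather than relying on one gene would be a natural extension.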

Next call: Nov 13
