Moore Notes 8 3 15
From OpenWetWare
Group Call
- Participants: Katie, Tom, Jonathan, Josh, Kit, Patrick, Stephen, Dongying, Guillaume
- Guillaume: Pangea website update
- Metadata repository linked to data sets
- No new data sets
- Josh and Kit: spatial autocorrelation of taxa and functions in metagenomes
- Central Park data as a start: all collected at one time, less confounding
- OTUs with spatial autocorrelation (out of approx. 13,000; e.g., OTU 566966)
- KOs present in each OTU, predicted with PICRUSt: table
- KOs with spatial autocorrelation (e.g., KO1850 vs. KO1852)
- KOs present in OTUs with autocorrelation (e.g., KO1061), see also distribution
- Tom: what if the copy number of a KO is what matters for autocorrelation?
- KOs with different relationships to OTU autocorrelation (summary)
- Jonathan: run a simulation to see whether PICRUSt induces any particular biases
- Any functional trends?
- Archaea, photosynthesis
- Next: Same analyses on Tara Oceans
- Stephen: make sure to follow the email thread, use ShotMAP annotations
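The notes do not say which autocorrelation statistic was used for the OTUs and KOs. As a minimal sketch only, assuming Moran's I with binary neighbor weights (sites within a chosen radius count as neighbors) and a permutation null that shuffles abundances across sites, the per-OTU or per-KO test could look like the following. The function names, the `radius` parameter, and the toy data are hypothetical and are not taken from the group's analysis.

```python
import math
import random


def morans_i(values, coords, radius):
    """Moran's I for one feature (e.g., one OTU's abundance across sites).

    Assumption: binary weights, w_ij = 1 if sites i and j are within
    `radius` of each other (and i != j), else 0.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0   # sum over neighbor pairs of dev_i * dev_j
    w_sum = 0.0   # total weight W
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= radius:
                cross += dev[i] * dev[j]
                w_sum += 1.0
    denom = sum(d * d for d in dev)
    if w_sum == 0 or denom == 0:
        return float("nan")  # no neighbors, or no variation in the feature
    return (n / w_sum) * (cross / denom)


def morans_i_permutation_test(values, coords, radius, n_perm=999, seed=0):
    """One-sided permutation p-value for positive spatial autocorrelation."""
    rng = random.Random(seed)
    observed = morans_i(values, coords, radius)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between value and location
        if morans_i(shuffled, coords, radius) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: ten sites on a line, abundance increasing along the transect.
sites = [(float(i), 0.0) for i in range(10)]
gradient = [float(i) for i in range(10)]
stat, p_value = morans_i_permutation_test(gradient, sites, radius=1.0)
```

With roughly 13,000 OTUs tested this way, the resulting p-values would need a multiple-testing correction before calling any single OTU or KO autocorrelated.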
- Stephen: Strain level variation (slides)
- PhyloCNV pipeline to:
- Estimate abundance of species (genome clusters) using PhyEco markers
- Predict presence/absence of genes in strains in a sample
- Identify SNPs in core genes
- Uses a custom reference database of genome clusters
- Provides species designation for unannotated
- Splits some species groups
- About 5% errors in taxonomy annotation
- Writing methods paper with a few applications
- Applied to human gut worldwide
- Reference genomes capture a low percentage of diversity, especially in Africa and South America
- Jaccard dissimilarity between strains from two hosts varies across bacterial species, and can be more than 50%
- Statistical test for clustering (or lack of clustering) of strains (gene content or SNPs) by geographic location of host
- Tom: See Anthony Amend paper on null distributions (June ISME J)
- UniFrac for multiple groups: do it for all pairs
- Katie: accounting for branch lengths
- Deep branches (major clades) mixed vs. not - how to discover where in the tree the signal is
- Ideas for applying to Tara Oceans
- PhyloCNV pipeline to
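The strain comparison and the geographic clustering test discussed in Stephen's section can be illustrated with a small sketch. It assumes each strain is represented as a set of gene identifiers, and it uses a simple label-permutation null (shuffling host locations), which is an assumption here and not necessarily the null distribution the group settled on. All names and the toy data are hypothetical; this is not PhyloCNV code.

```python
import random
from itertools import combinations


def jaccard_dissimilarity(genes_a, genes_b):
    """1 - |A intersect B| / |A union B| for two gene-content sets."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0  # two empty gene sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def mean_within_location(strains, locations):
    """Mean Jaccard dissimilarity over strain pairs from the same location.

    Assumption: at least one location contributes two or more strains.
    """
    dists = [
        jaccard_dissimilarity(strains[i], strains[j])
        for i, j in combinations(range(len(strains)), 2)
        if locations[i] == locations[j]
    ]
    return sum(dists) / len(dists)


def location_clustering_test(strains, locations, n_perm=999, seed=0):
    """Permutation p-value: are same-location strains more similar than chance?"""
    rng = random.Random(seed)
    observed = mean_within_location(strains, locations)
    shuffled = list(locations)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign host locations at random
        if mean_within_location(strains, shuffled) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy example: five strains per location with location-specific gene content.
toy_strains = [{"g1", "g2", "g3"}] * 5 + [{"g4", "g5", "g6"}] * 5
toy_locations = ["A"] * 5 + ["B"] * 5
observed, p_value = location_clustering_test(toy_strains, toy_locations)
```

The same test applies to SNP profiles by swapping the distance function. Flipping the comparison (counting permutations with a mean at least as large as observed) tests for a lack of clustering instead.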