Moore Notes 8 3 15

From OpenWetWare

Group Call

  • Participants: Katie, Tom, Jonathan, Josh, Kit, Patrick, Stephen, Dongying, Guillaume
  • Guillaume: Pangea website update
    • Metadata repository linked to data sets
    • No new data sets
  • Josh and Kit: spatial autocorrelation of taxa and functions in metagenomes
    • Central Park data as a starting point: all samples collected at one time, so less confounding
    • OTUs with spatial autocorrelation (out of approx. 13,000; e.g., OTU 566966)
    • KOs present in each OTU, as predicted with PICRUSt: table
    • KOs with spatial autocorrelation (e.g., KO1850 vs. KO1852)
      • KOs present in OTUs with autocorrelation (e.g., KO1061), see also distribution
      • Tom: what if the copy number of a KO is what matters for autocorrelation?
      • KOs with different relationships to OTU autocorrelation (summary)
        • Jonathan: run a simulation to see whether PICRUSt introduces any particular biases
    • Any functional trends?
      • Archaea, photosynthesis
    • Next: Same analyses on Tara Oceans
      • Stephen: make sure to follow the email thread, use ShotMAP annotations
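The notes above do not say which autocorrelation statistic was used. As one possible illustration, the sketch below computes Moran's I for a single OTU (or KO) abundance vector across sampling sites, with a permutation p-value. The function names, the binary distance-band weighting, and the `bandwidth` parameter are all assumptions made for this example, not the analysis presented on the call.

```python
import numpy as np

def morans_i(values, coords, bandwidth=1.0):
    """Moran's I with binary distance-band weights (an assumed weighting scheme)."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between sampling sites.
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=-1))
    # Sites are neighbors if they lie within the band; a site is not its own neighbor.
    w = ((d > 0) & (d <= bandwidth)).astype(float)
    z = x - x.mean()
    return len(x) / w.sum() * (z @ w @ z) / (z @ z)

def permutation_pvalue(values, coords, bandwidth=1.0, n_perm=999, seed=0):
    """One-sided p-value: how often shuffled abundances give I >= the observed I."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, coords, bandwidth)
    null = np.array([
        morans_i(rng.permutation(values), coords, bandwidth)
        for _ in range(n_perm)
    ])
    p = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p

# Hypothetical toy data: ten sites on a transect, abundance rising along it.
sites = [(i, 0.0) for i in range(10)]
gradient = list(range(10))
i_obs, p_val = permutation_pvalue(gradient, sites)
```

Running this per OTU and per KO would give two sets of autocorrelation calls to compare, which is also a natural place to plug in the simulation Jonathan suggested for checking prediction-induced bias.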
  • Stephen: Strain level variation (slides)
    • PhyloCNV pipeline to:
      • Estimate abundance of species (genome clusters) using PhyEco markers
      • Predict presence/absence of genes in strains in a sample
      • Identify SNPs in core genes
    • Uses a custom reference database of genome clusters
      • Provides species designation for unannotated genomes
      • Splits some species groups
      • About 5% errors in taxonomy annotation
    • Writing methods paper with a few applications
    • Applied to human gut worldwide
      • Reference genomes capture a low percentage of diversity, especially in Africa and South America
      • Jaccard dissimilarity between strains from two hosts varies across bacterial species, and can exceed 50%
      • Statistical test for clustering (or lack of clustering) of strains (gene content or SNPs) by geographic location of host
        • Tom: See Anthony Amend paper on null distributions (June ISME J)
        • UniFrac for multiple groups: do it for all pairs
        • Katie: accounting for branch lengths
        • Deep branches (major clades) mixed vs. not: how to find where in the tree the signal is
    • Ideas for applying to Tara Oceans
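
The gene-content comparison and the clustering test mentioned in the strain-variation notes can be sketched roughly as follows. This is a minimal illustration that assumes each strain is represented as a set of gene identifiers and each host has a single location label. The function names are hypothetical, and the test is a plain label-permutation test, not UniFrac and not the null models from the paper Tom mentioned. It also ignores tree structure and branch lengths, which Katie raised.

```python
import itertools
import random

def jaccard_dissimilarity(genes_a, genes_b):
    """1 - |A ∩ B| / |A ∪ B| for two gene presence sets."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - len(a & b) / len(union)

def between_minus_within(profiles, labels):
    """Mean between-location dissimilarity minus mean within-location dissimilarity.

    Assumes at least two locations and at least one location with two or more strains.
    """
    within, between = [], []
    for i, j in itertools.combinations(range(len(profiles)), 2):
        d = jaccard_dissimilarity(profiles[i], profiles[j])
        (within if labels[i] == labels[j] else between).append(d)
    return sum(between) / len(between) - sum(within) / len(within)

def label_permutation_test(profiles, labels, n_perm=999, seed=0):
    """Shuffle location labels to ask whether strains cluster by location."""
    rng = random.Random(seed)
    observed = between_minus_within(profiles, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_minus_within(profiles, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: five strains per location with disjoint gene content.
profiles = [{"g1", "g2", "g3"}] * 5 + [{"g4", "g5", "g6"}] * 5
labels = ["site_A"] * 5 + ["site_B"] * 5
observed, p_value = label_permutation_test(profiles, labels)
```

A large positive statistic with a small p-value suggests clustering by location. To test for a lack of clustering, the same null distribution can be read from the other tail.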