Moore Notes 3 30 15


Group Call

  • Participants: Katie, Tom, Patrick, Stacia, Dongying
  • ShotMAP paper is in review; no updates
  • MicrobeCensus has been published in Genome Biology
  • CAMI competition (assembly, taxonomic annotation of assemblies, binning)
    • HUMAnN2, potentially with MicrobeCensus
    • PhyloSift
  • Patrick: variable (and stable) gene families across metagenomes (slides)
    • Linear model for log abundance of gene family with study effects
    • Test statistic is variance of residuals (for each gene family)
    • Tom: be cautious about low-abundance families (need 100 reads for an accurate mean estimate)
      • Patrick did use only universally present families, which tend to be fairly abundant
    • The percentage of unmapped reads affects overall abundance for all families
    • JE: how to get from gene family abundance to pathways/functions?
      • Patrick: for now just doing enrichment tests
      • Correlation of families might predict shared functions
    • Testing if residual variance is larger than expected given mean abundance
      • Bootstrap null with centering and scaling (estimated by Poisson)
    • Normalizing for gene length and average genome size
    • Overlaying with phylogenetic breadth data
      • Two-component signaling pathways have small PD, but are very stable (e.g., quorum sensing, sporulation)
      • Tom: sample PD vs. database PD
      • JE: Does KEGG have many families of unknown function?
        • Repeat on SFams novel gene families to get a hit list (Stacia) and look at covariation with known genes
  • Next Calls:
    • April 13: Tom and Tara Oceans
    • April 27: Stacia novel gene families
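
Illustrative sketch of the variability analysis Patrick described (not the code used in the analysis; all function names here are hypothetical). It assumes that abundance is normalized as reads per kilobase per genome equivalent, that the linear model has study as its only (categorical) effect, so the least-squares fit reduces to subtracting each study's mean, and that log(count + 1) is used to avoid log(0). The null is a simple parametric Poisson simulation, a stand-in for the centered and scaled bootstrap null mentioned in the call, whose details are not recorded in these notes.

```python
import numpy as np


def rpkg(read_count, gene_length_bp, total_bp, avg_genome_size_bp):
    """Normalize a read count for gene length and average genome size.

    Assumed form: reads per kilobase of gene per genome equivalent.
    """
    genome_equivalents = total_bp / avg_genome_size_bp
    return read_count / (gene_length_bp / 1000.0) / genome_equivalents


def residual_variance(log_abund, study):
    """Residual variance of one gene family after removing study effects.

    With study as the only categorical predictor, the least-squares fit
    is the per-study mean, so residuals are deviations from that mean.
    """
    log_abund = np.asarray(log_abund, dtype=float)
    study = np.asarray(study)
    studies = np.unique(study)
    resid = np.empty_like(log_abund)
    for s in studies:
        mask = study == s
        resid[mask] = log_abund[mask] - log_abund[mask].mean()
    dof = log_abund.size - studies.size  # samples minus fitted study means
    return float(resid @ resid / dof)


def poisson_null_pvalue(counts, study, n_sim=1000, seed=0):
    """Compare observed residual variance to a Poisson sampling-noise null.

    Assumption: under the null, each sample's count is Poisson with mean
    equal to its study's mean count (no biological variation within study).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    study = np.asarray(study)
    observed = residual_variance(np.log(counts + 1.0), study)

    expected = np.empty_like(counts)
    for s in np.unique(study):
        mask = study == s
        expected[mask] = counts[mask].mean()

    null = np.array([
        residual_variance(np.log(rng.poisson(expected) + 1.0), study)
        for _ in range(n_sim)
    ])
    # Add-one correction keeps the empirical p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_sim)
    return observed, float(p_value)
```

A family whose observed residual variance exceeds nearly all simulated values would be flagged as variable; a family that sits well inside the null distribution would be a candidate for stable. Tom's caution about low-abundance families applies here as well, since the log transform is least reliable at small counts.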