Group Call

  • Participants: Jonathan, Tom, Josh, Ladan, Sarah, Guillaume, Dongying, Patrick, Stacia, Stephen
  • Patrick: Gene functions necessary for life in an environment or variable across samples
    • slides
    • Application: healthy human gut (lots of data)
      • JE: stool isn't really a single niche/environment
    • Taxonomic variability and functional stability previously noted
    • Model accounts for mean overall and per study
      • Mean correlated with variance (as expected for count data)
      • Different definitions of "healthy" across studies
      • Test statistic is the residual from the model (unexplained variance)
    • Two possible null hypotheses: high variance or high variance given mean
      • Null distribution by simulation from fitted model
      • Significantly variable and invariable gene families
      • JE: also show genes with significant study effects
    • Variable gene families are enriched for transporters, PTS systems, and nitrate metabolism
      • Variability and abundance correlate with the taxonomic breadth of the gene family
      • Abundance of these is correlated with average genome size
      • Ian Paulsen, Milton Saier papers on transporter copy number and genome size
        • Broader specificity of transporters in smaller genomes
        • Subfamilies necessary to see differences, e.g., flux versus affinity
    • Invariable gene families are enriched for central metabolism, ribosome, vitamin biosynthesis, non-mevalonate isoprenoid biosynthesis (cell wall), exon junction complex (why?)
      • Stephen: Probably not host contamination (see plot)
    • Phylogenetic logistic regression to look for evidence of selection
      • Tests for association of gut habitat with presence/absence of gene family, across genomes
      • Gut-depleted gene families are associated with invariability
      • JE: Risky to annotate taxa by environment; Lozupone 2008 annotation is old
      • Kostas (Georgia Tech): ecological correlates of genome properties; Jenna in the Eisen lab did some analysis of correlates with LGT
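
Sketch of the variability test from Patrick's section (per-study means, residual variance as the test statistic, null distribution by simulation). This is an illustration only, not Patrick's actual code. Assumptions: abundances are already log-transformed, noise is Gaussian rather than a count model, and the null residual SD (`null_sd`, e.g. the value expected given the gene family's mean) is supplied by the caller. All function names are hypothetical.

```python
import random
from collections import defaultdict


def residual_variance(values, studies):
    """Pooled within-study variance: what is left after removing each study's mean.

    The per-study mean absorbs the overall mean, so only one mean per study is fitted.
    Assumes at least one study has more than one sample (otherwise dof is zero).
    """
    by_study = defaultdict(list)
    for v, s in zip(values, studies):
        by_study[s].append(v)
    ss = 0.0
    for vals in by_study.values():
        m = sum(vals) / len(vals)
        ss += sum((v - m) ** 2 for v in vals)
    dof = len(values) - len(by_study)  # samples minus number of fitted study means
    return ss / dof


def simulate_null(studies, null_sd, n_sim=1000, seed=0):
    """Simulate the test statistic under the null for the same study layout.

    Study means are subtracted by the statistic, so simulating around zero is
    equivalent to simulating around the fitted study means.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_sim):
        sim = [rng.gauss(0.0, null_sd) for _ in studies]
        stats.append(residual_variance(sim, studies))
    return stats


def empirical_p(observed, null_stats):
    """Return (p_variable, p_invariable): upper- and lower-tail empirical p-values."""
    n = len(null_stats)
    p_variable = (1 + sum(s >= observed for s in null_stats)) / (n + 1)
    p_invariable = (1 + sum(s <= observed for s in null_stats)) / (n + 1)
    return p_variable, p_invariable


# Toy example (made-up numbers): one gene family, two studies with different means
# but very little within-study spread, tested against a null SD of 1.0.
studies = ["A"] * 4 + ["B"] * 4
log_abundance = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
observed = residual_variance(log_abundance, studies)
null_stats = simulate_null(studies, null_sd=1.0, n_sim=500)
p_variable, p_invariable = empirical_p(observed, null_stats)
```

In the toy example the large between-study difference is absorbed by the study means, so the gene family comes out as a candidate for "invariable" (small lower-tail p-value) rather than "variable". Running this per gene family would also need multiple-testing correction, which is not shown.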