Moore Notes 8 13 14

From OpenWetWare

TJS taking notes in KP's stead

Participants: TJS, PB, SN, SW, DW, SH

Topics for discussion

  • Patrick's presentation
    • analyzed MetaHIT cohort with ShotMAP (N=39)
        • annotated into MetaCyc, SFam, KO
        • have BMI, IBD, gender, age covariates
      • corrected for AGS using MicrobeCensus
      • conducted rank-based tests of correlation (Wilcoxon & Kendall)
        • corrected for multiple testing with q-values
    • using q <= 0.25, ibd-3 (none v. CD v. UC) is the only covariate with any real differentiation
    • SFams have relatively few significant hits
    • There are 429k families in at least 1 sample. Multiple-test correction is a problem
      • Was one (or more) of the samples processed with round 2?
    • Can we filter the families that we are testing to improve discovery?
      • considered various family properties that are independent of the test statistic
        • as we filter by mean, we do see some gain in discovery, though in SFams it requires throwing out lots of data
        • variance filtering shows a similar pattern to mean, as expected
        • filtering on fraction of zeros only impacts ibd-3 and is generally helpful
          • Q: how does the number of samples impact these results?
          • Q: can you label the cutoffs on the final image?
          • Q: in SFams, lots of families are eliminated; could this be due to round 2 and the poor-recall families (which are usually low abundance) that SN identified through simulation?
          • Observation: This is a lot like the pipeline optimization, but at the test statistic level. Should fit into narrative nicely
        • Coefficient of Variation has inconsistent results
          • SFams ibd-3 has a result we don't trust. Not exactly sure what's happening here
        • filtering does seem to help with ibd-3, with the biggest gains for SFams. Fraction observed helps all 3 and can use a single, strict cutoff for all databases (85%).
    • pathway enrichment (Fisher's exact test)
      • Used q-family < 0.25 and q-pathway of 0.25 or 0.1
      • has done KEGG and MetaCyc; for SFams will need interpro2go
      • found more modules higher in CD than lower
        • Slide 21 has some interesting examples
          • LPS. Let's check to make sure this is real.
          • cobalamin (B12) biosynthesis showing up in both KO and MC
          • Oddly, short chain fatty acid biosynthesis elevated in CD. What is going on here?
      • Lower in CD
        • Ribosomes from all three domains; a little worrisome given that AGS correction should take care of this
          • What is the distribution of AGS by disease state? What is the abundance of ribosomes before and after disease state?
        • Also, why is sulfate lower? Morgan et al. showed higher in IBD. Do see some similarities
    • PCA on gene family abundances
      • differentiation does seem to exist in KO for CD v. rest in PC2, PC3, but generally little distinction
      • Might need to drop UC-21, which appears to be an outlier (and has low seq quality)
      • Can we recapitulate what they did with taxonomy?
        • Used MetaPhlAn, but don't see the same separation.
        • May need to contact the authors to see what's going on.
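
Illustrative sketch (not Patrick's actual code) of the filter-then-correct idea discussed above: drop families that are observed in too few samples, then correct the remaining p-values for multiple testing. Assumptions: Benjamini-Hochberg adjustment stands in for the q-value method used in the analysis; the 85% cutoff is read as "observed in at least 85% of samples"; all function names, the toy abundance table, and the toy p-values are hypothetical and made up for illustration.

```python
def fraction_observed(abundances):
    """Fraction of samples in which a family has non-zero abundance."""
    return sum(1 for a in abundances if a > 0) / len(abundances)


def filter_families(table, min_fraction=0.85):
    """Keep families observed in at least min_fraction of samples.

    The filter uses only the abundance pattern, not the covariate,
    so it is independent of the test statistic.
    """
    return {fam: row for fam, row in table.items()
            if fraction_observed(row) >= min_fraction}


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_hits(pvalue_by_family, families, threshold=0.25):
    """Number of the given families with adjusted p-value <= threshold."""
    names = sorted(families)
    adjusted = benjamini_hochberg([pvalue_by_family[f] for f in names])
    return sum(1 for q in adjusted if q <= threshold)


# Toy data (hypothetical): 4 families x 10 samples, one p-value per family.
toy_table = {
    "fam_a": [3, 5, 2, 4, 6, 1, 2, 3, 4, 5],  # observed in 10/10 samples
    "fam_b": [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # observed in 1/10 samples
    "fam_c": [1, 2, 0, 3, 4, 2, 1, 5, 2, 3],  # observed in 9/10 samples
    "fam_d": [0, 0, 0, 0, 0, 2, 0, 0, 0, 0],  # observed in 1/10 samples
}
toy_pvalues = {"fam_a": 0.10, "fam_b": 0.80, "fam_c": 0.20, "fam_d": 0.90}

hits_unfiltered = count_hits(toy_pvalues, toy_table)
hits_filtered = count_hits(toy_pvalues, filter_families(toy_table))
```

In this toy case no family passes the 0.25 threshold when all four are tested, but both well-observed families pass once the two sparse families are removed before correction, since fewer tests mean a smaller correction penalty.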

Next week's call:

  • Josh & Tom
  • MetaCenter meeting update?