Moore Notes 2 9 11

From OpenWetWare

Group Call

  • James: SAR for niche mapping project
    • Plots with different average numbers of reads per cell: plot 1 (0.4), plot 2 (4), plot 3 (40), plot 4 (400)
    • At very high sampling depth, all areas will have all genera
      • This could be biologically real, since bacterial genera are broad and we're not looking at particularly niche-driven taxa (like hyperthermophiles)
      • Filtering out taxa that occur in fewer than 10 samples
    • There is a linear pattern on the log-log scale at shallower depths
    • Is the non-zero probability output by the GLM a realistic framework for this?
      • The non-zero probabilities are driving the flattening
      • The minimum p-hat depends on the number of samples/reads going into the model fitting
  • Should we look at model mis-specification and the dependence of results on sampling depth in general?
    • Can probably just acknowledge limitations for now
  • Josh: additional niche mapping updates
    • Using AUC > 0.7 as the criterion for including taxa omits 6 common genera (including Pelagibacter and Prochlorococcus)
    • Draft manuscript is being edited and will be sent out to the group soon (1-week turnaround OK for everyone)
  • Alex: working on phylum-specific richness maps
    • Need to adjust for numbers of genera in each phylum
    • Waiting for new AUC values from Josh
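
Sketch for the sampling-depth point in James's item (at very high depth, every cell ends up with every genus). This is only an illustration under stated assumptions, not the model used in the project: reads are assumed to be Poisson-sampled, so a genus with relative abundance p is detected with probability 1 - exp(-N p) at mean depth N, and the abundance vector is a made-up geometric series. The function names are hypothetical; only the four depths come from the notes.

```python
import math


def detection_probability(rel_abundance, mean_reads):
    """P(at least one read of a genus), assuming Poisson sampling
    with mean mean_reads * rel_abundance (an assumption, not from the notes)."""
    return 1.0 - math.exp(-mean_reads * rel_abundance)


def expected_richness(rel_abundances, mean_reads):
    """Expected number of genera detected in one cell at a given mean depth."""
    return sum(detection_probability(p, mean_reads) for p in rel_abundances)


def geometric_abundances(n_genera, ratio=0.8):
    """Hypothetical relative abundances: a geometric series normalised to sum to 1."""
    weights = [ratio ** k for k in range(n_genera)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    abundances = geometric_abundances(20)
    # Mean reads per cell used for plots 1-4 in the notes.
    for depth in (0.4, 4, 40, 400):
        print(depth, round(expected_richness(abundances, depth), 2))
```

Under these assumptions expected richness rises monotonically with depth and approaches the total number of genera, so saturation at high depth can arise from sampling alone, whether or not it is also biologically real.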