Moore Notes 8 19 09

From OpenWetWare

Group Call

  • Sam will email questions about simulations
  • Update on svn or git server
    • Srijak will follow up
  • Problems with Skype
    • Gizmo
    • Google
    • Gladstone teleconference system
  • GOS OTUs
    • (1) identify 16S reads
      • mpiBLAST on genbeo
        • a large query against a relatively small db isn't an optimal situation for mpiBLAST
      • Tom is running regular BLAST on one node for now
      • STAP / reducing db size in the initial pass (use all of greengenes for classification later)
        • Dongying: get greengenes down to a few sequences (e.g. the 300 from STAP)
        • JE: there are two versions; use the smaller ~300-organism set (spans the tree)
      • switching roles of db and query
      • how to deal with small fragments of 16S?
        • mostly on ends of longer reads
        • GOS is Sanger sequencing
      • see paper from CAMERA (Bioinformatics, May 2009)
    • (2) align small fragments
      • Program to use?
        • STAP
        • mothur tools
        • NAST alignment server or a shorter-fragment version (GAST?)
        • Infernal (via RDP?): how automated is it?
      • Small reads
        • Dongying: <200bp is hard
        • most reads are longer
        • ignore tiny things
        • replace a small read with a longer one from the db if the read is a near-perfect match (just to assign it to an OTU, not to define one)
      • Use gap penalties to stop splitting of reads
    • (3) find OTUs
      • Katie: do we need a maybe/unknown category (besides present/absent)?
      • Dongying looked at which parts of the molecule are most informative
      • might want to use a minimum alignment length
      • develop a reliability measure
      • James: for abundance, don't want to throw away any data
      • Jess: what about using other genes (i.e. proteins)?
        • Hard to define similarity cutoffs
        • Schloss paper is a first step
        • Cuts on tree, rather than percent identity
        • Marcel (Eisen lab) already compared cuts on the tree vs. percent identity on 16S
          • monophyletic groups correspond with 99% or 97% OTUs
          • but this wasn't tried on fragmentary data
          • a tree might be better for non-overlapping reads
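
The step (3) ideas of a percent-identity cutoff, a minimum alignment length, and a maybe/unknown category can be sketched together in a few lines of Python. This is only an illustration, not something the group decided on. The function names (`identity`, `greedy_otus`), the greedy seed-based clustering, the `min_overlap` default, and the rule for when a read is left unresolved are all assumptions made for the example. The 97% threshold is the only number taken from the notes. Input is assumed to be fragments already placed in one common alignment, with `-` for gaps and unsequenced columns.

```python
from typing import List, Optional

GAP = "-"


def identity(a: str, b: str, min_overlap: int = 50) -> Optional[float]:
    """Percent identity over columns where both aligned reads have a base.

    Returns None (cannot tell) when the two reads share fewer than
    min_overlap columns, e.g. fragments from different ends of 16S.
    """
    if len(a) != len(b):
        raise ValueError("reads must come from the same alignment")
    shared = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not shared or len(shared) < min_overlap:
        return None
    matches = sum(1 for x, y in shared if x == y)
    return matches / len(shared)


def greedy_otus(seqs: List[str], threshold: float = 0.97,
                min_overlap: int = 50) -> List[Optional[int]]:
    """Greedy seed-based OTU assignment (hypothetical scheme).

    Each read joins the first OTU whose seed it matches at >= threshold.
    A read that matches no seed starts a new OTU only if it could be
    compared to every existing seed; otherwise it is labeled None
    (the maybe/unknown category) rather than inflating the OTU count.
    """
    seeds: List[int] = []            # index of the seed read for each OTU
    labels: List[Optional[int]] = []
    for i, read in enumerate(seqs):
        assigned = None
        incomparable = False
        for otu, seed_idx in enumerate(seeds):
            ident = identity(read, seqs[seed_idx], min_overlap)
            if ident is None:
                incomparable = True
            elif ident >= threshold:
                assigned = otu
                break
        if assigned is None and not incomparable:
            seeds.append(i)
            assigned = len(seeds) - 1
        labels.append(assigned)
    return labels


if __name__ == "__main__":
    reads = [
        "ACGTACGT----",  # seeds OTU 0
        "ACGTACGA----",  # 7/8 identical to the first read: new OTU
        "--------ACGT",  # no overlap with either seed: unknown
        "ACGTACGT----",  # identical to the first read: OTU 0
    ]
    print(greedy_otus(reads, threshold=0.97, min_overlap=4))
```

Reads left as None are not discarded, so they could still be counted for abundance or placed later by another method, such as a tree-based approach for non-overlapping fragments.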