Moore Notes 12 2 09

Subgroup Call

OTUs
- version 1 of pipeline is done!
  - Sam's simulation results might lead to modifications
  - JE will talk to CAMERA: Tom will follow up Thurs
- JE's question about FastTree
  - FastTree pseudo-counts method (with max=3.0) and mutual overlap with a third sequence account for non-overlapping reads
    - Sam: we need to emulate this in the simulation analyses
  - we use phylogenetic, not sequence, distance in MOTHUR
  - Tom will follow up
- JG: Sogin and Welch talk
  - did simulations to show that MOTHUR generates too many clusters with large pyrosequencing data sets
  - new algorithm: pairwise alignment rather than multiple alignment
- James is using OTUs (computing metrics)
  - posted raw and reformatted output of MOTHUR on edhar (see email)
  - lots of noise: big variation in number of reads and number of OTUs across sites
    - especially a few sites with few reads and/or OTUs
    - maybe due to 1% cutoff, will try others
    - idea from paper (Biers et al.): get most similar full-length sequence from greengenes http://www.citeulike.org/group/6072/article/4095375
- Josh: do we still need the wrapper script?
  - not essential right now
  - good to have for pipeline
- publication on pipeline
  - focus on the identification of OTUs?
    - maybe interesting if we did a whole bunch of datasets
  - probably need software (CAMERA? Just Perl module OK for pub?)
  - and/or James's (Josh's?) analyses
  - include simulations, e.g. just split up reference sequences
  - comparison with Biers et al. results, Sogin paper? Jenna's data?
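The "just split up reference sequences" simulation idea above could look roughly like the sketch below. It is only an illustration: the function name, the fixed read length, and the step size are assumptions made here, not the parameters used in Sam's simulations.

```python
def split_into_reads(sequence, read_length, step):
    """Cut one reference sequence into fixed-length fragments ("reads").

    Hypothetical helper: returns (start, fragment) pairs, taking a fragment
    of read_length characters every `step` positions. Sequences shorter
    than read_length yield no reads.
    """
    if read_length <= 0 or step <= 0:
        raise ValueError("read_length and step must be positive")
    reads = []
    for start in range(0, len(sequence) - read_length + 1, step):
        reads.append((start, sequence[start:start + read_length]))
    return reads


if __name__ == "__main__":
    # Toy sequence, not real data.
    for start, fragment in split_into_reads("ACGTACGTAC", read_length=4, step=3):
        print(start, fragment)
```

With a step smaller than the read length the fragments overlap; with a larger step they leave gaps, which gives the non-overlapping reads discussed under the FastTree question.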

Simulations
- Sam has simulated data on genbeo
  - for a reference db with 20 sequences (5 chosen by maxPD, 15 chosen at random)
  - all parameters on wiki
  - 5 repeats of each combination
  - this makes a lot of data sets
  - directory name gives the parameter details; file names within each directory are the same (so keep directories separate)
  - will reorganize and put on edhar
- Tree building
  - which methods? RAxML (two ways), FastTree, pplacer (?)
  - which models?
    - WAG (in AMPHORA) vs. JTT (more like FastTree)
    - cat/cat+gamma
- Next steps
  - Sam will make a wiki page with assignments
  - Steve: let's set up naming conventions (Sam will do)
  - Sam will keep track of data sets that have same parameter values due to rounding
  - Steve: should share code/scripts over svn
  - Tom: data should be in svn too
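One possible way to handle the naming conventions and the rounding issue listed under Next steps is sketched below. The parameter names (`rate`, `len`), the number of digits kept, and the directory name format are all hypothetical; the real parameters are the ones listed on the wiki.

```python
from collections import defaultdict
from itertools import product


def parameter_grid(grid):
    """Expand {name: [values]} into one dict per parameter combination."""
    names = sorted(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]


def directory_name(params, repeat, digits=1):
    """Build a directory name that encodes rounded parameters and the repeat."""
    parts = [f"{name}{round(value, digits)}"
             for name, value in sorted(params.items())]
    return "_".join(parts) + f"_rep{repeat}"


def rounding_collisions(settings, digits=1):
    """Group settings that become identical once their values are rounded."""
    groups = defaultdict(list)
    for params in settings:
        key = tuple((name, round(value, digits))
                    for name, value in sorted(params.items()))
        groups[key].append(params)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # Made-up values for illustration only.
    settings = parameter_grid({"rate": [0.11, 0.14, 0.5], "len": [100.0]})
    for params in settings:
        for repeat in range(1, 6):  # 5 repeats of each combination
            print(directory_name(params, repeat))
    print(rounding_collisions(settings))
```

Any group returned by `rounding_collisions` would map to the same directory name, so those data sets need either more digits in the name or an explicit note that they are duplicates.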
