Moore Notes 3 4 09
From OpenWetWare
Group Conference Call
- Data sharing
- genbeo (behind a firewall; need to use sftp; a few terabytes of data)
- put a note on the wiki with readme info
- include non-metagenomic datasets, maybe in a subdirectory
- CollectiveX or other conversation system
- how to have huge discussions? Let's try it.
- MetaSim
- problems with command line
- how many species to include and how related (random?)
- FASTA file vs. alignments to AMPHORA profiles (folks want alignments)
- ComboDb vs. AMPHORA
- different sets of sequences (e.g. for rpoB)
- almost all sequences in ComboDb are in AMPHORA profiles
- ask Martin where he got the sequences
- would be nice if we can link between ComboDb and AMPHORA with a unique id
- tree building
- pplacer discussion
- computational time for ML tree estimation
- on an 8-core computer, full ML with 700 seqs takes 9 hours without bootstrapping
- using the reference seq tree as a guide tree cuts that to 1-2 hours
- 100s of sequences are OK, 1000s are not
- Steve's results: trees with reads only vs. with reference sequences (in two ways)
- reference tree helps to parcel out reads; without it, the reads get lumped together
- could be a problem for clades where we have reads but no reference seqs
- possible visit of Josh to Eugene
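
Follow-up sketch for the ComboDb vs. AMPHORA item above: one possible way to get a shared unique id is to derive it from the sequence itself, so that identical sequences in the two collections get the same id regardless of their headers. This is only an illustration under assumptions, not how either resource actually stores its records. The function names, the example headers, and the choice of an MD5 checksum over the gap-stripped, uppercased sequence are all hypothetical.

```python
import hashlib


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def sequence_id(seq):
    """Hypothetical unique id: MD5 of the sequence with gaps and case normalized."""
    cleaned = "".join(seq.split()).replace("-", "").replace(".", "").upper()
    return hashlib.md5(cleaned.encode("ascii")).hexdigest()


def link_by_sequence(fasta_a, fasta_b):
    """Return (header_a, header_b) pairs whose sequences share the same id."""
    index = {}
    for header, seq in parse_fasta(fasta_a):
        index.setdefault(sequence_id(seq), []).append(header)
    links = []
    for header, seq in parse_fasta(fasta_b):
        for header_a in index.get(sequence_id(seq), []):
            links.append((header_a, header))
    return links


# Made-up records standing in for a ComboDb FASTA file and an AMPHORA alignment.
combodb_example = ">combo_1 rpoB\nMKTAYIAK\n>combo_2 rpoB\nMSTNPKPQ\n"
amphora_example = ">amph_9\nMKTAY-IAK\n>amph_10\nMLLLL\n"

links = link_by_sequence(combodb_example, amphora_example)
print(links)
```

Because alignment gaps are stripped before hashing, an aligned AMPHORA entry can still match its unaligned ComboDb counterpart. Sequences that differ by even one residue (for example, trimmed ends) would not be linked by this approach.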