Moore Notes 5 13 09
Group Call
- JE just got a budget from Gail
- PIs will follow up on this later
- Also need to talk about hiring
- ASM
- JE and JG will be there
- Kelly will too
- OK to talk about our methods in development?
- Building trees from large alignments
- Steve: concatenated alignment of 31 markers (~7000 aa long, ~5000 reads, 570 genomes-AMPHORA)
- Trying to build tree
- FastTree works
- Wants to use likelihood
- Memory issue? Test with a debug tool
- Trying to use the FastTree tree (or the genome tree) as a constraint in RAxML
- Martin: used PhyML for whole genome tree
- Try bowtie? only for human...
- JE: ask Sanderson's group
- Any reason not to use a guide tree?
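On the memory question above, a back-of-the-envelope estimate can be sketched as follows. It assumes a likelihood program that keeps one conditional likelihood vector per inner node, with 20 amino-acid states, 4 rate categories, and 8-byte doubles, and that all ~5000 reads and 570 genomes are tips of a single tree. These are illustrative assumptions, not measurements of any particular program; actual usage depends on the implementation.

```python
def likelihood_memory_bytes(n_taxa, n_sites, n_states=20,
                            n_rate_cats=4, bytes_per_value=8):
    """Rough size of the conditional likelihood arrays for one tree.

    Assumes one vector per inner node of an unrooted binary tree;
    the default parameter values are assumptions, not program settings.
    """
    inner_nodes = n_taxa - 2  # an unrooted binary tree with n tips has n - 2 inner nodes
    return inner_nodes * n_sites * n_states * n_rate_cats * bytes_per_value


# Figures from the notes: ~5000 reads + 570 genomes, ~7000 aa alignment.
n_taxa = 5000 + 570
estimate = likelihood_memory_bytes(n_taxa, 7000)
print(f"{estimate / 1e9:.1f} GB")  # prints "24.9 GB"
```

Under these assumptions the arrays alone run to tens of gigabytes, which would fit with FastTree finishing while a full likelihood search struggles.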
- Simulator is revised
- Need to grow data sets to get bigger
- Subsamples from (a subset of) AMPHORA db
- Need input from Srijak
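One way to grow the test data sets is to draw nested random subsamples, so that each larger set contains the smaller ones. The sketch below assumes the AMPHORA reference genomes can be addressed by simple identifiers; the `genome_NNN` names, the subset sizes, and the function name are hypothetical, and this is not the simulator's actual code.

```python
import random


def nested_subsamples(ids, sizes, seed=0):
    """Return {size: subset}, where each larger subset contains the smaller ones."""
    rng = random.Random(seed)  # fixed seed so the subsets are reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    # Taking prefixes of one shuffled list makes the subsets nested.
    return {n: shuffled[:n] for n in sorted(sizes) if n <= len(shuffled)}


# Hypothetical identifiers standing in for the 570 reference genomes.
genome_ids = [f"genome_{i:03d}" for i in range(570)]
subsets = nested_subsamples(genome_ids, [50, 100, 200, 400])
for size, members in subsets.items():
    print(size, len(members))
```

Nesting keeps runs comparable across sizes: any change in results from one size to the next comes from the added genomes rather than from a fresh random draw.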
- Binning/OTUs: Kunin paper
- Extrinsic approaches
- ref genome closely related to reads, BLAST, pull out those reads (as in Rusch paper)
- Intrinsic approaches
- assemble (a little)
- nucleotide composition: works for low complexity ecosystems
- Which to use depends on what you want to do, e.g.
- how does number of bins grow with spatial scale? Taxonomic classification -> AMPHORA on markers works here
- to estimate total number of taxa in all oceans -> will AMPHORA underestimate? better with more markers
- are pairs of reads from the same taxon?
- move to DNA sequences to resolve fine level distinctions
- no individual gene will get the tail of the distribution (rare organisms)
- OPFs/environmental gene tags
- alternative: 454 sequence 1 million rRNAs from 1 sample (deep coverage) vs. metagenomics
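The nucleotide composition approach listed under intrinsic approaches can be illustrated with tetranucleotide frequencies. This is a minimal sketch that assumes 4-mers and a plain Euclidean distance as the composition signal; published binning tools use more careful normalizations, and the function names and example sequences here are made up for illustration.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_freqs(seq):
    """Relative frequency of each 4-mer in seq (4-mers with ambiguous bases are ignored)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean(a, b):
    """Euclidean distance between two frequency vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Made-up fragments: two AT-rich, one GC-rich.
frag_a = "ATATTTAATTATAAATTTATATTAAATAT"
frag_b = "TTATAAATATTTTAATATATAAATTTATA"
frag_c = "GCGGCCGCGCCGGGCGCGCCGCGGCCGCG"
fa, fb, fc = map(tetranucleotide_freqs, (frag_a, frag_b, frag_c))
print(euclidean(fa, fb) < euclidean(fa, fc))
```

Short reads give noisy frequency vectors, which is one reason the notes pair this approach with light assembly and restrict it to low-complexity ecosystems.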
- Plan for GOS
- markers plus more genes
- use reference genomes
- iterative algorithm based on a guessed set of dominant taxonomic groups
- follow up Skype call: Josh, James, Dongying, Jonathan