Moore Notes 5 13 09

From OpenWetWare


Group Call

JE just got a budget from Gail
- PIs will follow up on this later
- Also need to talk about hiring

ASM
- JE and JG will be there
- Kelly will too
- OK to talk about our methods in development?

Building trees from large alignments
- Steve: concatenated alignment of 31 markers (~7000 aa long, ~5000 reads, 570 genomes-AMPHORA)
  - Trying to build tree
  - FastTree works
  - Wants to use likelihood
  - Memory issue? Test with a debug tool
  - Trying to use the FastTree tree (or the genome tree) as a constraint in RAxML
- Martin: used PhyML for whole genome tree
- Try bowtie? only for human...
- JE: ask Sanderson's group
- Any reason not to use a guide tree?
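On the memory question above, a back-of-envelope estimate may help before reaching for a debug tool. The sketch below assumes the rule of thumb often quoted for RAxML under a GAMMA model: the conditional likelihood vectors take roughly (taxa - 2) x sites x states x rate categories x 8 bytes. The function name and the default of 4 rate categories are assumptions for illustration, and the taxon and site counts are the approximate figures from Steve's alignment, so the result is only a rough upper bound (site pattern compression and gappy read rows would lower the real figure).

```python
def likelihood_memory_bytes(n_taxa, n_sites, n_states=20,
                            n_rate_categories=4, bytes_per_float=8):
    """Rough size in bytes of the conditional likelihood vectors for one tree.

    Assumes one vector per inner node, one entry per site, state and rate
    category, stored as double-precision floats.
    """
    inner_nodes = n_taxa - 2  # an unrooted binary tree has n - 2 inner nodes
    return (inner_nodes * n_sites * n_states
            * n_rate_categories * bytes_per_float)


# Approximate figures from the notes: 570 genomes plus ~5000 reads, ~7000 aa.
n_taxa = 570 + 5000
n_sites = 7000

estimate = likelihood_memory_bytes(n_taxa, n_sites)
print(f"about {estimate / 1e9:.1f} GB for the likelihood vectors")
```

If the estimate is anywhere near right, the full likelihood search would need tens of GB, which would fit with FastTree working while the likelihood run does not.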

Simulator is revised
- Need to grow the data sets to larger sizes
- Subsamples from (a subset of) AMPHORA db
- Need input from Srijak
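One possible way to grow the simulated data sets is to draw nested subsamples of the reference genomes, so that each larger set contains the smaller ones. This is only a sketch: the genome identifiers, the sizes and the function name are hypothetical, and the real simulator may subsample differently.

```python
import random


def nested_subsamples(genome_ids, sizes, seed=0):
    """Return {size: list of ids}, where each larger sample contains the smaller ones.

    Shuffles once with a fixed seed, then takes prefixes of the shuffled list,
    so the result is reproducible and the samples are nested by construction.
    Sizes larger than the number of available genomes are skipped.
    """
    rng = random.Random(seed)
    shuffled = list(genome_ids)
    rng.shuffle(shuffled)
    return {size: shuffled[:size]
            for size in sorted(sizes) if size <= len(shuffled)}


# Hypothetical identifiers standing in for the 570 genomes in the AMPHORA db.
genomes = [f"genome_{i}" for i in range(570)]
subsets = nested_subsamples(genomes, sizes=[10, 50, 100, 570])

for size, ids in subsets.items():
    print(size, ids[:3])
```

Nesting keeps the comparison across sizes clean, since any change in the results comes from the added genomes rather than from a fresh random draw.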

Binning/OTUs: Kunin paper
- Extrinsic approaches
  - ref genome closely related to reads, BLAST, pull out those reads (as in Rusch paper)
- Intrinsic approaches
  - assemble (a little)
  - nucleotide composition: works for low complexity ecosystems
- Which to use depends on what you want to do, e.g.
  - how does number of bins grow with spatial scale? Taxonomic classification -> AMPHORA on markers works here
  - to estimate total number of taxa in all oceans -> will AMPHORA underestimate? better with more markers
  - are pairs of reads from the same taxon?
  - move to DNA sequences to resolve fine level distinctions
  - no individual gene will get the tail of the distribution (rare organisms)
- OPFs/environmental gene tags
- alternative: 454 sequence 1 million rRNAs from 1 sample (deep coverage) vs. metagenomics
- Plan for GOS
  - markers plus more genes
  - use reference genomes
  - iterative algorithm based on a guessed set of dominant taxonomic groups
- follow-up Skype call: Josh, James, Dongying, Jonathan
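For the nucleotide composition approach listed under intrinsic binning, a minimal sketch is to turn each read or contig into a tetranucleotide frequency vector and compare vectors by distance. The choice of k = 4, the Euclidean distance and the example sequences are assumptions for illustration, not the method of any paper discussed above. Strands are not merged by reverse complement here.

```python
from itertools import product
from math import sqrt

K = 4
# All 256 tetranucleotides in a fixed order, so vectors are comparable.
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def composition_vector(seq):
    """Return the tetranucleotide frequencies of seq as a list of 256 floats."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - K + 1):
        kmer = seq[i:i + K]
        if kmer in counts:  # skips windows containing N or other ambiguity codes
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[kmer] / total for kmer in KMERS]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Made-up sequences: two AT-rich fragments and one GC-rich fragment.
at_rich_1 = "ATTATAAATTTATATAATTAAATATTTAATATAT"
at_rich_2 = "TTAATATATTAAATTTATAATATTATAAATTTAA"
gc_rich = "GCCGGCGCGGCCGCGGGCCGCGCCGGCGGCGCGC"

d_same = euclidean_distance(composition_vector(at_rich_1),
                            composition_vector(at_rich_2))
d_diff = euclidean_distance(composition_vector(at_rich_1),
                            composition_vector(gc_rich))
print(f"AT-rich vs AT-rich: {d_same:.3f}")
print(f"AT-rich vs GC-rich: {d_diff:.3f}")
```

On single short reads these vectors are very sparse and noisy, which fits the points above that composition works mainly for low complexity ecosystems and that a little assembly helps first.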

