Moore Notes 10 2 13
From OpenWetWare
Group Call
- Participants: Katie, Stephen, Josh, Tom, Ladan, Sarah, Guillaume, Dongying, Patrick
- Guillaume: Jenna has been working with PICRUSt and QIIME
- She will be helping Guillaume run analyses
- Josh: Tara Oceans data
- Shotgun metagenomics samples from Pacific and Atlantic
- 243 samples, 68 locations
- 5 m depth
- 40-450 million reads per sample
- Will be available at the end of November
- Katie: Any gaps in their sampling locations?
- Josh: hard to find out, well distributed, some bias towards Mediterranean
- Might have to wait until released to figure this out
- Should get in touch with senior PIs: Bork, Acinas, Hingamp, Raes, Follows, Sullivan
- Tom will drop Sullivan a line copying Josh, Jonathan, Katie
- Sarah: phylogeography background, has focused on species in vertebrate studies
- Phylogeography looks at distribution of variants
- How phylogenetic distribution of genetic variants scales across space and time
- For bacteria, she would do phylogeography of function using SFams database
- BLAST samples against SFams, focusing on families with known functions (e.g., proteorhodopsin)
- Tom: concerned about having enough resolution in a metagenomic read to do this
- Katie: look at Sam's paper, maybe do some additional simulations
- Maybe looking at reads that cluster together in tree would increase confidence
- Tom: might be good to work with nucleotide sequences (for phylogeny, not for read classification)
- Needs to think about how to address different sources of variants, in a family specific way, in order to set expectations/null distributions
- How to synergize with Josh?
- Use the same ShotMAP runs (e.g., on Tara Oceans)
- Niche modeling plus phylogeography
- To predict distributions of variants, which might be more important functionally than distributions of families
- Example: nitrate reductase subclade that alters its enzymatic reaction (and output of the metabolic pathway)
- Ladan: Predicting functions of SFams with no GO annotations
- Mapped SFams into a network (weighted, undirected edges) based on Pearson correlation of presence-absence across genomes
- Tried to find tightly connected subnetworks with extreme sets algorithm
- Finds groups of nodes where removing a piece has a cost greater than removing the whole group
- There may be extreme sets within extreme sets (hierarchical, tree-like structure)
- Could potentially map functions from annotated members of the extreme set
- Most sets are nearly all annotated or mostly not annotated
- Stephen: how does the phylogenetic distribution of the genomes impact the results?
- More info here: https://docs.google.com/file/d/0B5MwVN20vJJzZjJJZW9nOFIzems/edit?usp=sharing
- Dongying: tried a "phylogenetically independent correlation" (from the 80s), but it didn't make sense and eliminated some good information
- Stephen: could evaluate results based on known annotations
- Ladan: tried this to compare her algorithm to MCL
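
A minimal sketch of the network idea from Ladan's update, not her actual code. It builds weighted, undirected edges from Pearson correlation of presence-absence profiles, then checks by brute force whether a group of nodes is an extreme set in the sense described in the notes: every proper piece of the group has a larger cut weight than the whole group. The toy matrix, the function names, the `min_r` cutoff, and the choice to keep only positive correlations are all assumptions made for illustration. The brute-force check is exponential in group size and is only meant to show the definition, not to stand in for the extreme sets algorithm itself.

```python
from itertools import combinations

import numpy as np


def correlation_network(presence, min_r=0.0):
    """Build {(i, j): r} edges between families from presence-absence data.

    presence: array of shape (n_families, n_genomes) holding 0/1 values.
    min_r: hypothetical cutoff; only correlations above it become edges.
    """
    presence = np.asarray(presence, dtype=float)
    r = np.corrcoef(presence)  # each row is treated as one variable
    edges = {}
    for i, j in combinations(range(presence.shape[0]), 2):
        # Constant rows give NaN correlations; skip them.
        if np.isfinite(r[i, j]) and r[i, j] > min_r:
            edges[(i, j)] = float(r[i, j])
    return edges


def cut_weight(edges, group):
    """Total weight of edges with exactly one endpoint inside group."""
    group = set(group)
    return sum(w for (i, j), w in edges.items() if (i in group) != (j in group))


def is_extreme_set(edges, group):
    """True if every nonempty proper subset of group has a larger cut weight.

    A single node is trivially extreme under this definition.
    """
    group = tuple(group)
    whole = cut_weight(edges, group)
    for k in range(1, len(group)):
        for piece in combinations(group, k):
            if cut_weight(edges, piece) <= whole:
                return False
    return True


# Toy data: 4 families (rows) across 6 genomes (columns).
presence = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
edges = correlation_network(presence)
pair_is_extreme = is_extreme_set(edges, (0, 1))
mixed_is_extreme = is_extreme_set(edges, (0, 2))
```

In this toy example, families 0 and 1 co-occur perfectly and have no positive link to the rest, so they form an extreme set, while families 0 and 2 do not. Note that raw correlation across genomes ignores how closely related the genomes are, which is the concern Stephen raised.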