Moore Notes 5 17 11
From OpenWetWare
Group Call
- Open access
- Protein families in GOS
- Sam: protein family phylogenetics simulation project
- read length, reference Db, phylogenetic method
- rpoB, lolC, 16S - rpoB and 16S are very similar
- UniFrac ROC curves on enterotypes (weighted UniFrac is more sensitive to tree error)
- could run more proteins, e.g., one or two from each of Steve's clusters
- Steve: built trees and estimated taxonomic composition for each AMPHORA gene (plus 16S) separately
- compared patterns of PD across genes
- different genes give different PD patterns across GOS samples - clustered into 6 groups
- correlations with environment are different for the 6 groups
- JE: are all of these still more similar to each other than to other genes?
- KP: scale would change, but negative correlation
- JE: compare UniFrac dissimilarity to other beta-diversity measures to make sure it isn't an issue with UniFrac per se
- JG: how does relative abundance play into results? May depend on gene length
- MEGAN (whole metagenome + pplacer) vs. AMPHORA approach - some taxa are very different in terms of presence/absence and 5-10% different in terms of relative abundance
- should look at MEGAN with a subset of genes to eliminate this variable (vs. methods)
- Overlap with AMPHORA2 paper
- Synergy between Sam and Steve's approaches
- Sam will test if she can get more gene families running, plus pplacer in simulations
- Can we quantify which gene families are better for phylogeny-based analyses?
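
For reference, the comparison of PD patterns across genes discussed above could be sketched roughly as follows. This is only a minimal illustration, not the code used by anyone on the call: the toy trees, sample names, and the functions `faith_pd` and `pearson` are all hypothetical, PD is taken to be Faith's PD including the path to the root, and a plain Pearson correlation stands in for whatever comparison or clustering was actually used.

```python
from math import sqrt

def faith_pd(tree, tips_present):
    """Faith's PD: sum of branch lengths on the union of paths from observed tips to the root.

    `tree` maps each non-root node to (parent, branch_length). Hypothetical format.
    """
    visited = set()
    total = 0.0
    for tip in tips_present:
        node = tip
        # Walk toward the root, stopping at the root or at a branch already counted.
        while node in tree and node not in visited:
            parent, length = tree[node]
            visited.add(node)
            total += length
            node = parent
    return total

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy tree (made up): t1 and t2 share an internal node n1; t3 hangs off the root.
TOY_TREE = {
    "t1": ("n1", 1.0),
    "t2": ("n1", 1.0),
    "n1": ("root", 0.5),
    "t3": ("root", 2.0),
}

# A second made-up tree standing in for a different marker gene (same taxa, other branch lengths).
OTHER_TREE = {
    "t1": ("n1", 0.2),
    "t2": ("n1", 0.3),
    "n1": ("root", 1.5),
    "t3": ("root", 0.4),
}

# Made-up samples: which tips were observed in each.
SAMPLES = {
    "sample_1": {"t1"},
    "sample_2": {"t1", "t2"},
    "sample_3": {"t1", "t3"},
    "sample_4": {"t1", "t2", "t3"},
}

names = sorted(SAMPLES)
pd_gene_a = [faith_pd(TOY_TREE, SAMPLES[s]) for s in names]
pd_gene_b = [faith_pd(OTHER_TREE, SAMPLES[s]) for s in names]

# Correlation of per-sample PD between the two genes; a low value would mean
# the two genes give different PD patterns across samples.
print(pearson(pd_gene_a, pd_gene_b))
```

With one PD vector per gene, the pairwise correlations could then be fed into any clustering method to group genes by PD pattern.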