Moore Notes 2 9 15
From OpenWetWare
Group Call
- Participants: Katie, Jonathan, Tom, Stephen, Guillaume, Dongying, Ladan, Stacia, Patrick, Josh
- Upcoming meetings
- JGI users meeting
- ASM in New Orleans
- AAAS this week (Katie speaking at the main meeting and at the citizen science satellite meeting)
- MicrobeCensus changes have been pushed recently
- Tom will check effects on ShotMap
- ShotMap final loose ends
- Target this Friday for these
- Submit a week or so later
- Tara Oceans update
- PATRIC genomes as reference db
- JE: check for phylogenetically novel genomes
- Genome clusters
- Gene clusters within genome clusters
- DIAMOND for read alignment
- Faster than RAPSearch2 when there are a lot of reads
- Much of the compute goes to indexing the huge protein db
- Josh working on prioritizing using environmental variables
- Stephen: Strain variation in gene copy number in metagenomes (slides)
- Used PhyEco markers to identify genome clusters
- Bacteria-specific markers perform best (Bacterial and Archaeal markers are probably too conserved)
- Ribosomal proteins have high sequence conservation (too high? maybe due to codon bias?)
- Perpetual updating
- Discussion of this paper: http://www.cell.com/cell/abstract/S0092-8674(15)00013-6
- JE: how does this relate to mapping errors (e.g., repeats)?
- Future directions for Stephen given this paper
- They did not generate methods that are easily used by others
- Create data resources (e.g., genome clusters, pangenomes)
- Analyze more data, better data
- Use genes outside KEGG
- Look at SNPs
- Unanswered biological questions: Co-variation of genes (genomic islands), biogeography, core genes, gene flux across species
- JE: Look more broadly at covariation of genes (present/absent, not just copy number variation)
- Applications to metatranscriptome analysis, "phylogenetic RNA-seq"
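
JE's suggestion above, looking at covariation of gene presence/absence rather than only copy number, could be sketched roughly as follows. This is a minimal illustration and not a method discussed on the call: the gene names, sample IDs, and the choice of Jaccard similarity as the covariation score are all assumptions made for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of sample IDs (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def presence_covariation(presence):
    """Score every gene pair by how often the two genes are detected in the same samples.

    presence: dict mapping gene name -> set of sample IDs where the gene was detected.
    Returns a dict mapping (gene_a, gene_b) -> Jaccard similarity, with names sorted.
    """
    return {
        (g1, g2): jaccard(presence[g1], presence[g2])
        for g1, g2 in combinations(sorted(presence), 2)
    }


# Hypothetical toy data: geneA and geneB always co-occur, geneC never occurs with them.
presence = {
    "geneA": {"s1", "s2", "s3"},
    "geneB": {"s1", "s2", "s3"},
    "geneC": {"s4"},
}
scores = presence_covariation(presence)
```

Gene pairs with scores near 1 would be candidates for co-transferred units such as genomic islands. A real analysis would also need to account for sequencing depth and for how presence is called in each sample.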