Moore Notes 1 15 14

From OpenWetWare

Group Call

Participants: Jonathan, Katie, Tom, Josh, Sarah, Dongying, Guillaume, Stephen, Patrick

PICRUSt update: http://edhar.genomecenter.ucdavis.edu/~gjospin/picrust_test/picrust_test/Jan_13_2014_modules/
- Found a bug in the mapping from KEGG Orthology groups (KOs) to modules, but fixing it did not change the results
  - Modules are better correlated than individual KOs, but still not highly correlated between PICRUSt and ShotMAP
- The fix also made it run faster (it now searches only the module ID instead of all family members)
- Ran full data set plus top 100
  - Correlations a bit higher for top 100
  - However residuals are pretty large
- How bad would PICRUSt output be for niche modeling?
  - Josh recommends looking at logit-transformed relative abundances
- Patrick: could be useful to look at each KO (or module) across samples to see if it is correlated between PICRUSt and ShotMAP
- Let's put this on hold for now, and see how long it takes for the metagenomic data to arrive
- Patrick might want to apply PICRUSt to estimate protein family abundances in mammalian microbiomes with metabolomics data
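
A minimal sketch of the per-KO (or per-module) comparison suggested above, combined with the logit transform of relative abundances. Everything here is an assumption for illustration: the function names, the nested-dictionary layout (feature, then sample, then relative abundance), the clipping constant eps, the choice of Pearson correlation, and the toy values. It is not the pipeline the group actually ran.

```python
import math

def logit(p, eps=1e-6):
    """Logit of a relative abundance, clipped to [eps, 1 - eps] to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def per_feature_correlation(picrust, shotmap, min_samples=3):
    """Correlate logit abundances across shared samples for each shared KO/module."""
    result = {}
    for feature in picrust.keys() & shotmap.keys():
        samples = sorted(picrust[feature].keys() & shotmap[feature].keys())
        if len(samples) < min_samples:
            result[feature] = float("nan")  # too few samples to say anything
            continue
        x = [logit(picrust[feature][s]) for s in samples]
        y = [logit(shotmap[feature][s]) for s in samples]
        result[feature] = pearson(x, y)
    return result

# Toy, made-up relative abundances (hypothetical KOs and samples).
picrust = {
    "K00001": {"s1": 0.01, "s2": 0.02, "s3": 0.04},
    "K00002": {"s1": 0.10, "s2": 0.20, "s3": 0.30},
}
shotmap = {
    "K00001": {"s1": 0.02, "s2": 0.04, "s3": 0.08},
    "K00002": {"s1": 0.30, "s2": 0.20, "s3": 0.10},
}
corr = per_feature_correlation(picrust, shotmap)
for feature in sorted(corr):
    print(feature, round(corr[feature], 3))
```

A rank-based correlation (e.g., Spearman) may be a better fit if the residuals stay large; Pearson is used here only to keep the sketch dependency-free.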

EFI collaboration
- They do all analyses based on UniProt IDs (vs. genomes)
  - KB database as source of input sequences for current analysis pipeline
- How many Sfams have no Pfam annotation?
  - Annotated Pfams: ~61% of Sfams have no Pfam
  - Pfam-B families (unannotated): TBD
- What percent of InterPro has an Sfam?
  - For comparison, only ~19% of InterPro (UniProtKB) has no Pfam
- Should annotate and score families
  - Jonathan: should use metrics to score these (phylogenetic breadth), not just family size
  - Tom: also how connected the Sfam is in family network space
  - Stephen: from gut (genomes plus assembled proteins), consistently present, maybe correlated with a phenotype
  - Jonathan: antibiotic resistance and synthesis genes
  - This framework could be useful for multiple environments and future grants
  - Call it the "most wanted" list
- Downstream analyses they can do
  - Genome neighborhood based functional prediction
  - For pathway analysis, start with input (e.g., solute carriers) and try to annotate the rest of the pathway
  - Synthesize/clone proteins
  - Crystal structures
  - In vitro enzyme activity assays
- Blue Waters: 20 million integer processor hours
- To do: get list of InterPro with no Pfam (or other annotation)
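
One possible way to turn the scoring ideas above into a ranked "most wanted" list is sketched below. The Family fields, the weights, and the toy families are all hypothetical; the notes only say that phylogenetic breadth and network connectivity should count alongside family size, not how they should be combined.

```python
from dataclasses import dataclass

@dataclass
class Family:
    name: str
    size: int            # number of member sequences
    n_phyla: int         # proxy for phylogenetic breadth (assumed metric)
    network_degree: int  # connections in family network space (assumed metric)
    has_pfam: bool       # True if the family already has a Pfam annotation

def most_wanted(families, weights=(0.5, 0.3, 0.2)):
    """Rank unannotated families by a weighted sum of max-normalized metrics."""
    w_breadth, w_degree, w_size = weights
    candidates = [f for f in families if not f.has_pfam]
    if not candidates:
        return []
    # "or 1" guards against division by zero when every candidate scores 0.
    max_phyla = max(f.n_phyla for f in candidates) or 1
    max_degree = max(f.network_degree for f in candidates) or 1
    max_size = max(f.size for f in candidates) or 1

    def score(f):
        return (w_breadth * f.n_phyla / max_phyla
                + w_degree * f.network_degree / max_degree
                + w_size * f.size / max_size)

    return sorted(candidates, key=score, reverse=True)

# Toy, made-up families for illustration only.
families = [
    Family("Sfam_A", size=1000, n_phyla=1, network_degree=0, has_pfam=False),
    Family("Sfam_B", size=100, n_phyla=10, network_degree=5, has_pfam=False),
    Family("Sfam_C", size=5000, n_phyla=20, network_degree=9, has_pfam=True),
]
ranked = most_wanted(families)
print([f.name for f in ranked])
```

With these weights a small but phylogenetically broad and well-connected family outranks a large, narrow one, which is the behavior Jonathan's comment asks for. Other criteria from the discussion (consistent presence in the gut, correlation with a phenotype) could be added as further terms.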

Tim Laurent is leaving
- Get him to document what he did with Sfams several months ago (build 2 families)
- Ad posted
- Jonathan will follow up with Katie re: hire and support from his lab on EFI project
