Moore Notes 1 21 15

From OpenWetWare

Discussion of TARA Oceans data

Participants: Katie, Josh, Tom, Guillaume, Stephen

Stephen:
- Data embargo issue
- Updated summary (slides)

Analysis discussion:
- What do we want to do with the data?
  - Start with aims of proposal
- How to preprocess?
  - They will likely release eggNOG abundances
  - They may map reads to assemblies (gene catalog)
  - Do we need something more/different?
    - Database
    - Classification thresholds
    - AGS normalization
- DIAMOND vs. RAPSearch2
  - Do a quick comparison (correlation) of bit scores
    - If highly correlated, can use previously identified thresholds
- Many (667) samples to process
  - Prioritize the prokaryote size fraction, then protists, then viruses
  - Prioritize open ocean (all?), surface waters (approximately 216 samples)
  - Start with metagenomes
- Josh will look at ecological variability (MESS plots)
  - Can we do global predictions?
  - Are there samples we would drop and therefore do not need to run for read classification?
- How much QC is needed?
  - Stephen: QC probably hasn't been done, but it is also not necessary
  - Better to keep track of quality and use that info downstream
  - Illumina looks better than 454
  - Could QC one library and compare protein family abundances pre and post QC
- Are the size fractions reliable?
- Stephen will start AGS analyses right away
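
A minimal sketch of the quick DIAMOND vs. RAPSearch2 bit score comparison discussed above. It assumes both tools were run on the same reads against the same database and that both outputs are tab-separated tables with the query ID in column 1, the subject ID in column 2, and the bit score in the last column (this layout is an assumption and should be checked against the actual output files). The function names are hypothetical, and Pearson correlation is just one reasonable choice of "correlation".

```python
import math


def read_bit_scores(lines):
    """Map (query, subject) -> best bit score from tabular hit lines.

    Assumes tab-separated fields with the bit score in the last column;
    blank lines and lines starting with '#' are skipped.
    """
    scores = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        key = (fields[0], fields[1])
        bits = float(fields[-1])
        # Keep the highest-scoring alignment for each query/subject pair.
        if key not in scores or bits > scores[key]:
            scores[key] = bits
    return scores


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant values")
    return sxy / math.sqrt(sxx * syy)


def bit_score_correlation(diamond_lines, rapsearch_lines):
    """Correlate bit scores over hits reported by both tools.

    Returns (correlation, number_of_shared_hits).
    """
    diamond = read_bit_scores(diamond_lines)
    rapsearch = read_bit_scores(rapsearch_lines)
    shared = sorted(diamond.keys() & rapsearch.keys())
    r = pearson([diamond[k] for k in shared], [rapsearch[k] for k in shared])
    return r, len(shared)
```

Each argument can be an open file handle or a list of lines. Only hits found by both tools enter the correlation, so the number of shared hits is returned alongside it; a high correlation over a small shared fraction would not by itself justify reusing the earlier thresholds.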

