1. Page 2 - "it is not possible to close a metagenome" - meaning we don't have the technical capacity to do this now?
2. Page 2 - what is this 4th approach for bringing "statistical analysis to the analysis of metagenomic sequences"?
3. Page 2 - how does the BSR approach differ from DOTUR, AMPHORA, ZORRO, etc.?
4. Page 3 - what did they do with Bacillus? It looks like they are assigning OPF > 1 for one population. What does this mean biologically? Is this some type of population-level trait variation/diversity measurement?
5. Page 4 - why expect 100% overlap when sampling two halves of B. anthracis?
Kunin et al. 2008
1. What is the difference between rRNA and rDNA analyses (when discussing primers, Eisen PLoS, metatranscriptomics; page 558)?
2. Page 559 - how do we read Figure 2?
Get larger contigs in communities dominated by a few species.
3. Page 562 - how big are the fosmids for our Mediterranean project?
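We aren't using fosmids. Fosmids are a different kind of cloning vector (based on the F plasmid and propagated in E. coli) that carries large inserts. DNA can be sheared to get average insert sizes of 3, 8 and 40 kbp; the 40 kbp size is the fosmid range.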
4. Page 562 - what is the difference between contig size and read depth? Sounds like GC content could be an important trait.
Read depth is the number of reads that you have for a certain portion of the genome (another word would be coverage).
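To make the contrast with contig size concrete, here is a minimal sketch of how per-base read depth could be computed once reads have been placed on a contig. The function names (`read_depth`, `mean_depth`) and the half-open `(start, end)` read coordinates are assumptions made for illustration; they are not taken from Kunin et al. or from any particular assembler.

```python
from typing import List, Tuple


def read_depth(contig_length: int, reads: List[Tuple[int, int]]) -> List[int]:
    """Per-base depth for reads given as half-open (start, end) intervals."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (contig_length + 1)
    for start, end in reads:
        start = max(0, start)
        end = min(contig_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    # A running sum turns the difference array into depth at each position.
    depth = []
    running = 0
    for i in range(contig_length):
        running += diff[i]
        depth.append(running)
    return depth


def mean_depth(depth: List[int]) -> float:
    """Average depth (coverage) across the contig."""
    return sum(depth) / len(depth) if depth else 0.0


# Hypothetical toy data: three reads placed on a 10 bp contig.
toy_reads = [(0, 5), (3, 8), (4, 10)]
toy_depth = read_depth(10, toy_reads)
print(toy_depth)
print(mean_depth(toy_depth))
```

Contig size here is just `contig_length` (how long the assembled stretch is), while depth is how many reads stack up at each position along it. The two are independent: a short contig can be deeply covered and a long one thinly covered.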
5. Page 565 - Figs 3, 4
Fig 3 - shows a misalignment. Fig 4 - we should see just one peak per color.
6. Page 565 - would we ever be building contigs in our pipeline?
7. Page 565 - "the major cause of misassembly in genomic projects is repetitive regions that can be resolved in the finishing process"
Finishing process - what happens after the automated assembly steps: hand-curating contigs and assemblies to make sure there are no problems.
8. How does AMOS differ from AMPHORA?
AMPHORA works with 31 marker genes only, not entire genome assembly.
9. Page 566 - what is finishing? "Genome rearrangements such as insertions, deletions and inversions will break assemblies, whereas point mutations usually will not." Why?
Point mutations are SNPs, and a single-base difference doesn't cause problems: the reads still overlap well enough to be assembled together.
10. Page 567 - what is "generic" gene prediction?
The GOS data are nucleotide data. AMPHORA works with amino acid (protein) sequences. To get from one to the other, someone has run ORF-finding methods to produce translated versions of the data.
11. Page 567 - "Treating all ORFs as putative genes usually produces prohibitive amounts of data, contains too much noise, and is therefore very hard to use."
12. Page 568 - what does it mean to perform gene calling on both reads and contigs?
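As a rough illustration of what an ORF-finding step does, here is a minimal sketch of a naive six-frame ORF scan (ATG to the first in-frame stop codon, on both strands). This is not the method used for GOS or by AMPHORA; the function names and the `min_codons` cutoff are assumptions for illustration. It stops at the nucleotide ORF and omits the translation to amino acids, which would need a codon table.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq: str, frame: int, min_codons: int) -> List[str]:
    """ORFs (ATG through stop codon, inclusive) in one forward reading frame."""
    found = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i  # open a candidate ORF at the first ATG
        elif start is not None and codon in STOP_CODONS:
            if (i + 3 - start) // 3 >= min_codons:
                found.append(seq[start:i + 3])
            start = None  # close it and look for the next ATG
    return found


def find_orfs(seq: str, min_codons: int = 3) -> List[Tuple[str, int, str]]:
    """Scan all six frames; return (strand, frame, orf_sequence) tuples."""
    seq = seq.upper()
    results = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for orf in orfs_in_frame(strand_seq, frame, min_codons):
                results.append((strand, frame, orf))
    return results


# Hypothetical toy read containing one short ORF on the forward strand.
print(find_orfs("CCATGAAATTTTAGGC"))
```

A scan like this reports every open stretch regardless of whether it is a real gene, and an ORF that runs off the end of a read is simply missed. Real gene callers add statistical models and length filters on top of this basic idea.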
13. Page 568 - discuss context annotation.
14. Page 569 - will we not be using evolutionary distance methods due to the problems brought up here?
15. Page 570 - understand supervised and unsupervised procedures?
16. Discuss Figure 7.