Moore Notes 8 18 09
Conference Call about computing OTUs from GOS data
1. Tom has algorithms set up to run on genbeo.
2. Options: a) BLAST all genes, b) align raw reads, c) align XX
3. After BLAST against Greengenes, for sequences with high agreement, noncoding sequence needs to be trimmed.
4. Steve suggested using something like NAST or RDP to kick out post-trim short reads (e.g. 50 bp) that align poorly. Josh asked: what about padding the sequence with a reference Greengenes sequence if there is really high agreement? Tom says the entire sequence doesn't evolve at the same rate; this gets into developing a model that accounts for the conservation rate of different sites. There is so much data out there that we could use known 16S sequences to derive a prior gamma for different sites. Jonathan is keen on using as much reference data as possible to understand metagenomic data. We could attempt to devise some type of scheme to derive the gamma parameter. A relative conservation score is kind of like the gamma parameter. For Steve's project, it is a likelihood parameter that varies among gene families. We could also do this within a gene family (i.e. 16S). If we had some variability score for each region of 16S, that would inform the BLAST step. It would be a post-BLAST analysis.
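
One possible way to get a "variability score for each region of 16S" is sketched below in Python. It is only an illustration, not something decided on the call: it uses per-column Shannon entropy over an alignment of reference sequences as a stand-in for the conservation score, and it does not fit a gamma rate model. The function names (column_entropy, site_variability, window_mean) and the toy alignment are hypothetical; a real run would use a reference 16S alignment such as one exported from Greengenes.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the A/C/G/T characters in one alignment column.

    Gaps and ambiguity codes are ignored; an all-gap column scores 0.0.
    """
    residues = [c for c in column.upper() if c in "ACGT"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def site_variability(aligned_seqs):
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]


def window_mean(scores, start, end):
    """Mean variability over alignment columns [start, end)."""
    return sum(scores[start:end]) / (end - start)


# Toy alignment (made up for illustration, not real 16S data).
toy_alignment = [
    "ACGTAC-T",
    "ACGAAC-T",
    "ACGCACGT",
    "ACGGACGA",
]
scores = site_variability(toy_alignment)
```

A score of 0 marks a fully conserved column and 2 bits marks a column where all four bases are equally common. Averaging with window_mean over the columns covered by a BLAST hit would give a rough per-region score that could be applied post-BLAST, for example to judge how much weight a short trimmed read's alignment deserves.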