Moore Notes 1 16 13
From OpenWetWare
Group Call
- Participants: Jonathan, Katie, Dongying, Guillaume, Stephen, Josh
- Guillaume: updating SFams
- Downloading new genomes (>2000, about 5000 total)
- Doing sifting
- All-vs-all LAST for MCL of new sequences next
- Stephen: LAST designed for "best hit" searches
- Maybe the m-parameter will produce a distance matrix that is too sparse or not symmetric
- Katie: Compare results to BLAST for sequences that we had previously
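Stephen's concern about a sparse or asymmetric matrix could be checked on the all-vs-all output before running MCL. The following is a minimal sketch, assuming the hits have already been parsed into (query, subject, score) tuples; the function name `hit_matrix_stats` and the tuple layout are hypothetical, not part of LAST or MCL.

```python
def hit_matrix_stats(hits, n_seqs):
    """Summarize how sparse and how asymmetric an all-vs-all hit table is.

    hits   -- iterable of (query, subject, score) tuples (assumed layout)
    n_seqs -- total number of sequences that were searched
    """
    # Keep each directed query -> subject pair once, ignoring self-hits.
    pairs = {(q, s) for q, s, _ in hits if q != s}
    possible = n_seqs * (n_seqs - 1)  # number of directed off-diagonal cells
    # A pair is one-way if the reverse hit was never reported.
    one_way = {(q, s) for (q, s) in pairs if (s, q) not in pairs}
    return {
        "density": len(pairs) / possible if possible else 0.0,
        "asymmetric_fraction": len(one_way) / len(pairs) if pairs else 0.0,
    }


# Toy data: a<->b is reciprocal, a->c is one-way, c->c is a self-hit.
hits = [("a", "b", 50), ("b", "a", 48), ("a", "c", 30), ("c", "c", 99)]
stats = hit_matrix_stats(hits, n_seqs=3)
print(stats)
```

Running the same summary on the LAST and BLAST results for the previously analyzed sequences would give a direct view of the comparison Katie suggested.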
- Stephen: SFam clans
- Currently use consensus sequence (regardless of family size)
- Another distance might be better
- PRC is a good profile-vs-profile method, but very slow
- SCOOP looks like a good option
- Dongying: Could compute multiple consensus sequences per family for larger families
- Stephen: classification thresholds
- For each family, compare the minimum-scoring true positive with the maximum-scoring false positive
- Best if these don't overlap, but sometimes they do
- In PFAMs, the false positives above the min true positive threshold are all from a related family (by definition)
- Stephen investigated this in SFams (5%), PFAMs (6%), TIGRFAMs (3%)
- Round 2 genomes analysis shows that these problematic families are very common
- Family-specific thresholds might be a good idea, though they would make classification more complicated
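The threshold check described above (minimum-scoring true positive versus maximum-scoring false positive) can be sketched as follows. This is an illustration under assumptions: the scores are taken to be available as (score, is_true_member) pairs per family, and the name `threshold_overlap` is hypothetical.

```python
def threshold_overlap(scores):
    """Report whether true- and false-positive score ranges overlap.

    scores -- list of (score, is_true_member) pairs for one family
              (assumed input layout)
    """
    tp = [s for s, is_true in scores if is_true]
    fp = [s for s, is_true in scores if not is_true]
    if not tp or not fp:
        # With only one class present there is nothing to overlap.
        return {"overlap": False, "fp_above_min_tp": 0}
    min_tp, max_fp = min(tp), max(fp)
    return {
        "min_tp": min_tp,
        "max_fp": max_fp,
        # The family is problematic when a false positive scores at
        # least as high as the weakest true positive.
        "overlap": max_fp >= min_tp,
        "fp_above_min_tp": sum(1 for s in fp if s >= min_tp),
    }


clean = threshold_overlap([(120, True), (95, True), (40, False)])
messy = threshold_overlap([(120, True), (60, True), (75, False), (30, False)])
print(clean["overlap"], messy["overlap"], messy["fp_above_min_tp"])
```

A family-specific threshold could then be placed between `max_fp` and `min_tp` whenever the two ranges do not overlap.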
- Stephen: hmmsearch performance
- More sensitive than BLAST for full-length sequences
- But metagenomic data (i.e., short sequences) are better classified using BLAST/LAST
- Not specific to SFams
- Particularly true for large families
- Need to increase read length to find inflection point
- Sequencing error is not a major factor
- Need a better, updated web presence
- Jonathan: WordPress provides website/blogging software
- Open, self-install version (.org) needs to be hosted on a server (e.g., DreamHost)
- Commercial version (.com) is hosted, free for a little storage, $20/year for more
- Will have iseem.org point there
- Keep notes and private pages on OWW for now
- Dongying: Statistical question
- How to compare clusters?
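The notes do not record an answer to the cluster-comparison question. One standard starting point, offered here only as an assumption about what might fit, is the Rand index: the fraction of sequence pairs on which two clusterings agree (both place the pair together, or both place it apart). The sketch below assumes each clustering is a dict mapping sequence ID to cluster label; the function name `rand_index` is hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    labels_a, labels_b -- dicts mapping item ID -> cluster label
                          (assumed layout); only shared IDs are compared.
    """
    ids = sorted(set(labels_a) & set(labels_b))
    agree = total = 0
    for x, y in combinations(ids, 2):
        same_a = labels_a[x] == labels_a[y]
        same_b = labels_b[x] == labels_b[y]
        agree += same_a == same_b  # True counts as 1
        total += 1
    # Fewer than two shared items: treat as trivially identical.
    return agree / total if total else 1.0


a = {"s1": 0, "s2": 0, "s3": 1, "s4": 1}
b = {"s1": "x", "s2": "x", "s3": "x", "s4": "y"}
print(rand_index(a, b))
```

The plain Rand index is not corrected for chance agreement, so it tends to look high when there are many small clusters; an adjusted variant may be more suitable for family-sized data.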