Classification, prediction and modelling of bacteriocins (and archaeocins)
Protein- or peptide-based antibiotics have been discovered in all three domains of life (named bacteriocins, eukaryocins and archaeocins). A considerable amount of research has been conducted on their structure and mechanisms of activity. This includes several ways of classifying them: by method of killing (pore formation, DNase or other nuclease activity, inhibition of murein production, etc.), by genetics (encoded on large plasmids, small plasmids, or the chromosome), by molecular weight and chemistry (large protein or polypeptide, with or without a sugar moiety, containing atypical amino acids such as lanthionine) and by method of production (ribosomal, ribosomal with post-translational modifications, or non-ribosomal).
With modern sensitive tools for finding sequence similarity, I see homology between seemingly unrelated gene-encoded bacteriocins, so there is a chance that sequence information could be used to derive a new classification scheme. Ideally, that scheme would later make it possible to build a tool for predicting bacteriocins from genomes. Given the recent increase in interest in novel peptide-based antimicrobial drugs, there is also hope that an evolutionary approach to bacteriocin research could lead to new insights in that area too (although this hope is so far unsupported).
- data collection (literature, PDB structures, existing classifications, protein annotations)
- data clustering (sequence based)
- enriching the dataset by running HHsenser on each cluster from the previous step
- re-clustering of the enriched dataset
- construction of manually curated alignments for each cluster and later profiles for genome-wide scans
- exhaustive search for bacteriocins in available genomes and phylogenetically enriched clustering
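
The sequence-based clustering steps in the plan above could be prototyped along the following lines. This is only a minimal sketch under stated assumptions, not the method proposed here: it swaps in k-mer Jaccard similarity with greedy clustering as a crude stand-in for a real homology search, and the function names, the k-mer size, the threshold and the toy sequences are all hypothetical (the sequences are arbitrary strings, not real bacteriocins).

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two k-mer sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_cluster(seqs: Dict[str, str],
                   threshold: float = 0.3,
                   k: int = 3) -> List[List[str]]:
    """Greedy clustering: each sequence joins the first cluster whose
    representative it resembles, otherwise it founds a new cluster.

    Sequences are visited longest first (ties broken by name); the first
    member of each cluster acts as its representative.
    """
    order = sorted(seqs, key=lambda name: (-len(seqs[name]), name))
    clusters: List[List[str]] = []
    representatives: List[Set[str]] = []
    for name in order:
        profile = kmers(seqs[name], k)
        for cluster, rep in zip(clusters, representatives):
            if jaccard(profile, rep) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing representative was similar enough.
            clusters.append([name])
            representatives.append(profile)
    return clusters


# Arbitrary toy input (hypothetical, not real bacteriocin sequences).
toy = {
    "toy_a": "ACDEFGHIKLMNPQRSTVWY",
    "toy_b": "ACDEFGHIKLMNPQRSTVWA",  # toy_a with the last residue changed
    "toy_c": "YWVTSRQPNMLKIHGFEDCA",  # toy_a reversed, shares no 3-mers
}
clusters = greedy_cluster(toy)
print(clusters)
```

In a real pipeline the similarity function would be replaced by scores from a sensitive homology search, and the resulting clusters would feed the enrichment and alignment steps listed above.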