Wikiomics:Searching for 3D functional sites in a protein structure
Given a protein structure, which are the potentially interesting sites? Approaches which are based only on sequence patterns or backbone architecture are often insufficient to find similarities between sites of similar biochemical function.
The methods shown here use the 3D arrangement of protein atoms to find putative functional sites, such as ligand-binding sites or catalytic sites.
Search by comparison against annotated sites
Comparing 3D structures locally at the atomic level is not a simple problem, and there is no standard method in this field. However, many recent techniques are available through web servers, which makes them relatively easy to use.
An advantage of comparing a query protein structure against 3D sites of known biological activity is that the matched sites can be examined side by side, and their similarity investigated further, either visually or with other tools.
Methods and tools
PdbFun is a web server for identifying local structural similarities between annotated residues in proteins. It gives fast access to the whole PDB, organized as a database of annotated residues, and helps select any residue subset by combining the available features. Query and target selections are compared with a fast, sequence-independent 3D comparison algorithm that represents each amino acid by a single point located at its centroid.
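The one-point-per-residue representation can be sketched as follows. This is only an illustration, not PdbFun's code: the `residue_centroids` function and the tuple layout of `atoms` are hypothetical, and averaging over all atoms of a residue (rather than, say, side-chain atoms only) is an assumption.

```python
from collections import defaultdict

def residue_centroids(atoms):
    """Return {(chain, resnum, resname): (x, y, z)} centroids.

    `atoms` is an iterable of (chain, resnum, resname, x, y, z) tuples,
    a simplified, hypothetical stand-in for parsed PDB ATOM records.
    """
    # Accumulate coordinate sums and atom counts per residue.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for chain, resnum, resname, x, y, z in atoms:
        acc = sums[(chain, resnum, resname)]
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1
    # Divide each sum by the atom count to obtain the centroid.
    return {
        key: (sx / n, sy / n, sz / n)
        for key, (sx, sy, sz, n) in sums.items()
    }

# Toy input: two atoms for Tyr12 and one atom for His32 (made-up coordinates).
atoms = [
    ("A", 12, "TYR", 0.0, 0.0, 0.0),
    ("A", 12, "TYR", 2.0, 0.0, 0.0),
    ("A", 32, "HIS", 1.0, 1.0, 1.0),
]
centroids = residue_centroids(atoms)
```

Reducing each residue to a single point makes the comparison cheap, at the cost of losing side-chain orientation.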
PINTS [4, 5] assigns types to certain side-chain atoms of amino acids. Two atoms of the same type, such as the oxygens of a carboxyl group (in Asp or Glu), can be considered equivalent. The search is based on interatomic distances and the scoring is based on RMSD values.
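To give a feel for distance-based comparison, the sketch below computes a distance RMSD (dRMSD) between two sets of already-matched atoms. This is a swapped-in, superposition-free measure chosen for brevity, not the scoring used by PINTS; the function name `distance_rmsd` and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations

def distance_rmsd(site_a, site_b):
    """Root-mean-square difference between corresponding interatomic distances.

    `site_a` and `site_b` are equal-length lists of (x, y, z) tuples in which
    atom i of one site is assumed to be matched with atom i of the other.
    """
    if len(site_a) != len(site_b) or len(site_a) < 2:
        raise ValueError("sites must contain the same number (>= 2) of matched atoms")
    squared_diffs = []
    for i, j in combinations(range(len(site_a)), 2):
        d_a = math.dist(site_a[i], site_a[j])  # distance inside site A
        d_b = math.dist(site_b[i], site_b[j])  # same atom pair inside site B
        squared_diffs.append((d_a - d_b) ** 2)
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))

# A 3-4-5 triangle, a translated copy of it, and a distorted version.
site_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
site_b = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 5.0, 1.0)]
site_c = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

same = distance_rmsd(site_a, site_b)       # identical geometry
different = distance_rmsd(site_a, site_c)  # one edge stretched
```

Because only internal distances are used, the value does not depend on how the two structures are positioned in space. One known limitation of this kind of measure is that it cannot distinguish a site from its mirror image.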
pvSOAR [12, 13] uses the centroids of the amino acids forming pockets and the pseudosequence they form when ordered by position in the chain: if a pocket is made of amino acids Ala45, Tyr12, Ser124 and His32, then the corresponding sequence would be Tyr-His-Ala-Ser. The default comparison procedure uses an alignment between the sequences associated with two pockets. This constraint can be removed if only two pockets are being compared.
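A pocket pseudosequence of this kind can be built in a few lines. The sketch assumes that residues are simply ordered by residue number within a single chain; the `pocket_pseudosequence` function and the label format ("His57") are hypothetical and are not part of pvSOAR.

```python
import re

def pocket_pseudosequence(residues):
    """Order pocket residues by residue number and join their names.

    `residues` is a list of labels such as "His57" (three-letter residue
    name followed by the residue number).
    """
    parsed = []
    for label in residues:
        match = re.fullmatch(r"([A-Za-z]{3})(\d+)", label)
        if match is None:
            raise ValueError(f"unrecognized residue label: {label!r}")
        parsed.append((int(match.group(2)), match.group(1)))
    parsed.sort()  # sort by residue number
    return "-".join(name for _, name in parsed)

# Residues listed in arbitrary order, as they might come out of a pocket detection step.
pseudosequence = pocket_pseudosequence(["Ser195", "His57", "Asp102"])
print(pseudosequence)  # His-Asp-Ser
```

The resulting short sequences can then be aligned with ordinary sequence alignment methods before any 3D comparison is attempted.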
SiteEngine uses surface-exposed functional groups that describe the physico-chemical properties of amino acids. It is possible to compare a protein structure against a given site on the web server. The program is also available for download.
SPASM/RIGOR was the first web server to propose sequence- and fold-independent search in 3D structures of proteins. It represents each residue by its C-alpha atom or by the centroid of its side chain.
SuMo [16, 17, 18] uses chemical groups with their own geometry and symmetry, plus a complementary local shape comparison technique. It does not require a low RMSD between two sites to consider them similar, although local pairwise matching is required. Given a protein structure, it will scan the PDB for similar ligand-binding sites and return a list of sites, sorted by decreasing size. Clicking on each individual result gives a parallel view of the matched sites.
Prediction of functional sites from geometrical or physico-chemical properties
These tools do not try to match 3D sites between a query and sites of biological importance. Instead, they associate sites with a given function based on the geometry or the chemistry of the protein.
- SARIG predicts functional sites using residue interaction graphs (contact maps).
- WebFEATURE [20, 21] scans a protein structure for local environments of a given type. An RNA version, naFEATURE, also exists.
- THEMATICS [23, 24, 25] predicts catalytic sites from deviations in the theoretical titration curves of proteins.
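As an illustration of the residue interaction graph idea, the sketch below links residues whose representative points lie within a distance cutoff and ranks them by closeness centrality. The 6 Å cutoff, the one-point-per-residue representation, the choice of closeness as the graph measure and the function names are all assumptions made for this example; it is not the SARIG implementation.

```python
import math
from collections import deque

def contact_graph(points, cutoff=6.0):
    """Adjacency sets linking residues whose points lie within `cutoff` angstroms."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def closeness(adj, source):
    """Closeness of `source`: reachable nodes divided by the sum of path lengths."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest paths in an unweighted graph
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Five toy residues on a line, 5 angstroms apart: only neighbours are in contact.
points = [(5.0 * i, 0.0, 0.0) for i in range(5)]
adj = contact_graph(points)
scores = [closeness(adj, i) for i in range(len(points))]
```

In this toy chain the central residue gets the highest score because it is, on average, the fewest steps away from all the others.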
Prediction using phylogenetic information
Combined with projections onto 3D structures, the degree of conservation of aligned residues within a family of proteins can indicate amino acids which are functionally important.
- Evolutionary Trace (Cambridge) [26, 27]
- Evolutionary Trace Viewer and Evolutionary Trace report_maker (Baylor) [26, 28]
- ConSurf 
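The notion of per-position conservation can be illustrated with Shannon entropy computed on the columns of a multiple sequence alignment. This is a deliberately simple measure used here as a sketch; it ignores the phylogenetic tree and is not the algorithm used by Evolutionary Trace or ConSurf. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    freqs = [n / total for n in Counter(residues).values()]
    return sum(-p * math.log2(p) for p in freqs)

def conservation_profile(alignment):
    """Entropy of each column; lower values mean stronger conservation."""
    return [column_entropy(col) for col in zip(*alignment)]

# Toy alignment of four sequences, three columns wide.
alignment = ["HDS", "HES", "HDS", "HAT"]
profile = conservation_profile(alignment)
```

Positions with low entropy are candidates for functional importance, and mapping these scores onto the 3D structure shows whether they cluster in space.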
References
- Ausiello G, Zanzoni A, Peluso D, Via A, and Helmer-Citterich M. pdbFun: mass selection and fast comparison of annotated PDB residues. Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W133-7. DOI:10.1093/nar/gki499
- Ivanisenko VA, Pintus SS, Grigorovich DA, and Kolchanov NA. PDBSiteScan: a program for searching for active, binding and posttranslational modification sites in the 3D structures of proteins. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W549-54. DOI:10.1093/nar/gkh439
- Ivanisenko VA, Pintus SS, Grigorovich DA, and Kolchanov NA. PDBSite: a database of the 3D structure of protein functional sites. Nucleic Acids Res. 2005 Jan 1;33(Database issue):D183-7. DOI:10.1093/nar/gki105
- Russell RB. Detection of protein three-dimensional side-chain patterns: new examples of convergent evolution. J Mol Biol. 1998 Jun 26;279(5):1211-27. DOI:10.1006/jmbi.1998.1844
- Stark A, Sunyaev S, and Russell RB. A model for statistical significance of local similarities in structure. J Mol Biol. 2003 Mar 7;326(5):1307-16.
read  first
- Wallace AC, Laskowski RA, and Thornton JM. Derivation of 3D coordinate templates for searching structural databases: application to Ser-His-Asp catalytic triads in the serine proteinases and lipases. Protein Sci. 1996 Jun;5(6):1001-13. DOI:10.1002/pro.5560050603
- Porter CT, Bartlett GJ, and Thornton JM. The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D129-33. DOI:10.1093/nar/gkh028
- Bartlett GJ, Porter CT, Borkakoti N, and Thornton JM. Analysis of catalytic residues in enzyme active sites. J Mol Biol. 2002 Nov 15;324(1):105-21.
- Torrance JW, Bartlett GJ, Porter CT, and Thornton JM. Using a library of structural templates to recognise catalytic sites and explore their evolution in homologous families. J Mol Biol. 2005 Apr 1;347(3):565-81. DOI:10.1016/j.jmb.2005.01.044
- Wallace AC, Borkakoti N, and Thornton JM. TESS: a geometric hashing algorithm for deriving 3D coordinate templates for searching structural databases. Application to enzyme active sites. Protein Sci. 1997 Nov;6(11):2308-23. DOI:10.1002/pro.5560061104
successor of PROCAT 
- Barker JA and Thornton JM. An algorithm for constraint-based structural template matching: application to 3D templates with statistical analysis. Bioinformatics. 2003 Sep 1;19(13):1644-9.
successor of TESS 
- Binkowski TA, Adamian L, and Liang J. Inferring functional relationships of proteins from local sequence and spatial surface patterns. J Mol Biol. 2003 Sep 12;332(2):505-26.
- Binkowski TA, Freeman P, and Liang J. pvSOAR: detecting similar surface patterns of pocket and void surfaces of amino acid residues on proteins. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W555-8. DOI:10.1093/nar/gkh390
- Shulman-Peleg A, Nussinov R, and Wolfson HJ. Recognition of functional sites in protein structures. J Mol Biol. 2004 Jun 4;339(3):607-33. DOI:10.1016/j.jmb.2004.04.012
- Kleywegt GJ. Recognition of spatial motifs in protein structures. J Mol Biol. 1999 Jan 29;285(4):1887-97. DOI:10.1006/jmbi.1998.2393
- Jambon M, Imberty A, Deléage G, and Geourjon C. A new bioinformatic approach to detect common 3D sites in protein structures. Proteins. 2003 Aug 1;52(2):137-45. DOI:10.1002/prot.10339
describes the basic method, which has been considerably refined since. Read  for a good understanding of the current method and the concepts on which it relies.
- Jambon M, Andrieu O, Combet C, Deléage G, Delfaud F, and Geourjon C. The SuMo server: 3D search for protein functional sites. Bioinformatics. 2005 Oct 15;21(20):3929-30. DOI:10.1093/bioinformatics/bti645
application note about the SuMo web server
- Jambon M. A bioinformatic system for searching functional similarities in 3D structures of proteins. PhD thesis, 2003.
- Amitai G, Shemesh A, Sitbon E, Shklar M, Netanely D, Venger I, and Pietrokovski S. Network analysis of protein structures identifies functional residues. J Mol Biol. 2004 Dec 3;344(4):1135-46. DOI:10.1016/j.jmb.2004.10.055
- Wei L and Altman RB. Recognizing protein binding sites using statistical descriptions of their 3D environments. Pac Symp Biocomput. 1998:497-508.
- Liang MP, Banatao DR, Klein TE, Brutlag DL, and Altman RB. WebFEATURE: An interactive web tool for identifying and visualizing functional sites on macromolecular structures. Nucleic Acids Res. 2003 Jul 1;31(13):3324-7.
- Banatao DR, Altman RB, and Klein TE. Microenvironment analysis and identification of magnesium binding sites in RNA. Nucleic Acids Res. 2003 Aug 1;31(15):4450-60.
- Ko J, Murga LF, Wei Y, and Ondrechen MJ. Prediction of active sites for protein structures from computed chemical properties. Bioinformatics. 2005 Jun;21 Suppl 1:i258-65. DOI:10.1093/bioinformatics/bti1039
- Shehadi IA, Abyzov A, Uzun A, Wei Y, Murga LF, Ilyin V, and Ondrechen MJ. Active site prediction for comparative model structures with thematics. J Bioinform Comput Biol. 2005 Feb;3(1):127-43.
- Ko J, Murga LF, André P, Yang H, Ondrechen MJ, Williams RJ, Agunwamba A, and Budil DE. Statistical criteria for the identification of protein active sites using Theoretical Microscopic Titration Curves. Proteins. 2005 May 1;59(2):183-95. DOI:10.1002/prot.20418
- Lichtarge O, Bourne HR, and Cohen FE. An evolutionary trace method defines binding surfaces common to protein families. J Mol Biol. 1996 Mar 29;257(2):342-58. DOI:10.1006/jmbi.1996.0167
- Innis CA, Shi J, and Blundell TL. Evolutionary trace analysis of TGF-beta and related growth factors: implications for site-directed mutagenesis. Protein Eng. 2000 Dec;13(12):839-47.
- Mihalek I, Res I, and Lichtarge O. A family of evolution-entropy hybrid methods for ranking protein residues by importance. J Mol Biol. 2004 Mar 5;336(5):1265-82. DOI:10.1016/j.jmb.2003.12.078
- Glaser F, Pupko T, Paz I, Bell RE, Bechor-Shental D, Martz E, and Ben-Tal N. ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics. 2003 Jan;19(1):163-4.
- Polacco BJ and Babbitt PC. Automated discovery of 3D motifs for protein function annotation. Bioinformatics. 2006 Mar 15;22(6):723-30. DOI:10.1093/bioinformatics/btk038
uses the same technique as SPASM 
- Schmitt S, Kuhn D, and Klebe G. A new method to detect related function among proteins independent of sequence and fold homology. J Mol Biol. 2002 Oct 18;323(2):387-406.