Computational, Biophysical and High-Throughput Studies of Protein-Protein Recognition
We are using computational and experimental methods to understand, predict and design protein-protein interactions.
Protein-protein interactions mediate almost all biological processes important to life. Large-scale efforts in proteomics and structural genomics are rapidly identifying new protein complexes and providing high-resolution structures of some of the components. Our incomplete understanding of how to predict, design or disrupt such interactions, however, is a significant barrier to fully exploiting these data.
The goal of our laboratory is to better understand the versatility and specificity of protein-protein interactions through a combined program of bioinformatic analysis, computational design, and experimental characterization. New computational methods recently developed for the design of protein folds are potentially powerful tools for re-engineering proteins to form specific interactions. Designed proteins could act as inhibitors, activators or stabilizing partners for molecules of biological interest; they could be used as specific dimerization reagents, as standards for the development of competitive binding assays or as initial models for the design of small-molecule peptidomimetics, among other potential applications.
Our work to date has focused on computational and structural studies of coiled-coil proteins.
Coiled-Coil Protein Interactions
Coiled coils are a common and important protein-protein interaction motif that consists of two or more α-helices that wrap around each other with a superhelical twist. These proteins are characterized by a repeating sequence of seven amino acids, (abcdefg)n, in which the a- and d-position residues are hydrophobic and the e- and g-position residues are usually polar or charged. The regular sequence makes it possible to predict the occurrence of coiled coils in genomic sequence data. We estimate that 5% of all proteins in S. cerevisiae, C. elegans, A. thaliana and D. melanogaster contain a coiled-coil region. It is likely that most of these coiled coils mediate protein-protein interactions or oligomerization.
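The heptad pattern described above can be illustrated with a minimal Python sketch. It is not a coiled-coil prediction method used by the laboratory. It only assigns the register letters a through g along a sequence and reports the fraction of a- and d-position residues that are hydrophobic. The residue set, the function names and the idealized test sequence are all assumptions made for illustration.

```python
# Assumption: a simple, illustrative set of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AVILMFWC")
REGISTER = "abcdefg"


def heptad_positions(sequence, start="a"):
    """Assign a heptad letter to every residue, beginning at register `start`."""
    offset = REGISTER.index(start)
    return [REGISTER[(offset + i) % 7] for i in range(len(sequence))]


def core_hydrophobic_fraction(sequence, start="a"):
    """Fraction of a- and d-position residues that are hydrophobic."""
    positions = heptad_positions(sequence, start)
    core = [aa for aa, pos in zip(sequence, positions) if pos in "ad"]
    if not core:
        return 0.0
    return sum(aa in HYDROPHOBIC for aa in core) / len(core)


def best_register(sequence):
    """Starting register that places the most hydrophobic residues at a and d."""
    return max(REGISTER, key=lambda s: core_hydrophobic_fraction(sequence, s))


# Idealized toy repeat (not a natural protein): Leu at a and d, polar elsewhere.
toy = "LKELEDK" * 4
print(best_register(toy), core_hydrophobic_fraction(toy, "a"))
```

Scanning a longer sequence with a sliding window and this kind of score is one simple way the regularity of the repeat could be exploited; real predictors use considerably richer statistics.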
An important, unanswered question about coiled coils is how their interaction specificity is encoded in their sequence. We call this the “partnering problem” for coiled coils and are studying it using both computational and experimental approaches.
Because the coiled coil has a very simple structure, it is particularly amenable to computational modeling. In my postdoctoral work, I developed a method to predict the core structure of certain coiled-coil heterodimers from their sequences. Using a series of coiled coils that we developed to test the calculations, we showed that it is possible to make accurate predictions of both the structures and the relative stabilities of different complexes. Therefore, this approach can be used to address the partnering problem.
We are also studying interaction specificity experimentally, focusing on the bZIP transcription factors. In this class of proteins, the coiled-coil region determines the homo- or heterodimerization specificity of the transcription factor, which in turn influences its DNA-binding properties and biological function. To determine how sequence encodes interaction preferences in this class of proteins, we have used microarray technology to measure all of the pairwise interactions between 49 human bZIP peptides. We also measured the relative stabilities of eight different bZIP homo- and heterodimers in solution, and have solved crystal structures for three of these complexes. In addition to providing a wealth of information about molecules important for human biology, these measurements provide a starting point for testing computational methods for predicting interaction specificity. In collaboration with Mona Singh in the Princeton University Computer Science Department, we are developing and testing such methods, using machine learning and molecular modeling approaches.
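To give a sense of the size of an all-against-all measurement, the short Python sketch below enumerates the possible pairings of 49 peptides and ranks the partners of one peptide by a score. It assumes that each pair is counted once regardless of orientation, which the text does not specify. The peptide labels, the function names and the score values are hypothetical placeholders, not experimental data.

```python
from itertools import combinations_with_replacement


def unordered_pairs(names):
    """All homo- and heterodimer pairings, each counted once."""
    return list(combinations_with_replacement(names, 2))


def rank_partners(scores, peptide):
    """Partners of `peptide`, strongest score first.

    `scores` maps frozenset({a, b}) to a number; a homodimer is frozenset({a}).
    """
    partners = []
    for pair, value in scores.items():
        if peptide in pair:
            others = pair - {peptide}
            partner = next(iter(others)) if others else peptide  # homodimer case
            partners.append((partner, value))
    return sorted(partners, key=lambda item: item[1], reverse=True)


# Placeholder labels standing in for the 49 peptides.
peptides = [f"bZIP_{i:02d}" for i in range(1, 50)]
pairs = unordered_pairs(peptides)
print(len(pairs))  # 49 homodimers plus 1176 heterodimers

# Made-up scores for three pairings, purely to show the ranking step.
toy_scores = {
    frozenset({"P1", "P2"}): 0.9,
    frozenset({"P1"}): 0.2,
    frozenset({"P2", "P3"}): 0.5,
}
print(rank_partners(toy_scores, "P1"))
```

Keying the scores by an unordered set keeps a pairing and its reverse from being stored twice; if both orientations were measured separately, an ordered tuple key would be the more natural choice.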
Computational Design of Protein-Protein Interactions
Another way to understand factors that mediate protein association is through the process of design. The field of protein design has seen exciting advances recently with the application of fast search algorithms to the problem of side chain positioning. This has allowed, for example, the design of a new protein that adopts a zinc-finger fold but has little sequence similarity to any known zinc-finger protein (Dahiyat, B. I. & Mayo, S. L. (1997) Science 278, 82-87). Although there are still significant hurdles to be overcome, such successes suggest that the design of custom proteins with useful functions may be within reach.
We are applying computational methods developed for the design of protein folds to protein interfaces. The general approach involves an extensive search through protein sequence and conformational space. Solutions that correspond to possible new designs are selected using a molecular mechanics-based energy function. The process provides a computational alternative to experimental selection techniques such as phage display. Our goal is to provide insights into the biophysics of molecular recognition, and to apply computation-aided design to the molecular engineering of proteins of biological interest. Close feedback between design and experimental testing is central to our approach.
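The search-and-select idea can be sketched with a deliberately simplified Python example. It enumerates every candidate sequence at a handful of interface positions and keeps the one with the lowest energy against a fixed target. Everything here is an assumption made for illustration: the three-letter alphabet, the charge-only energy term and the function names are toy stand-ins. An actual design calculation would also search side-chain conformations, use a molecular mechanics-based energy function, and rely on fast search algorithms rather than brute-force enumeration.

```python
from itertools import product

# Toy alphabet (assumption): Glu is negative, Lys is positive, Gln is neutral.
CHARGE = {"E": -1, "K": +1, "Q": 0}


def toy_energy(design, target):
    """Sum of charge products at facing positions; lower is more favorable."""
    return sum(CHARGE[a] * CHARGE[b] for a, b in zip(design, target))


def design_partner(target, alphabet="EKQ"):
    """Exhaustively search sequence space for the lowest-energy partner."""
    best_seq, best_energy = None, float("inf")
    for candidate in product(alphabet, repeat=len(target)):
        energy = toy_energy(candidate, target)
        if energy < best_energy:  # ties keep the first candidate found
            best_seq, best_energy = "".join(candidate), energy
    return best_seq, best_energy


# Four facing positions on a hypothetical target helix.
print(design_partner("EKEK"))
```

Exhaustive enumeration scales as the alphabet size raised to the number of positions, which is why practical design methods need pruning or stochastic search once more than a few positions and conformations are considered.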