From OpenWetWare

What is directed protein evolution?

Directed evolution is a powerful method for altering the properties of biological parts and systems. Directed protein evolution employs iterative rounds of mutation and artificial selection to generate new proteins with desirable functions [1].


Overview of directed evolution [1].

Biological molecules have a unique ability to rapidly evolve in response to strong selective pressure [1]. Protein engineers exploit this evolvability to generate new and useful proteins through successive rounds of mutation and selection [2] [3]. This approach is known as directed protein evolution, and it involves four basic steps [1]:

  1. A parent protein sequence is selected, or multiple sequences for some methods.
  2. The parent sequence is mutated to generate a library of functional variants.
  3. Variants are evaluated for their ability to perform the desired function.
  4. The process is repeated starting with step 2 until the desired function is achieved.

The parent sequence is chosen because its existing function is perceived to be close to the desired function, and a library of variants is generated using one or more sequence diversification techniques [1]. High-throughput functional screens and genetic selection methods are used to identify library members with enhanced target function, and those variants serve as parent sequences in successive rounds of mutation and selection [1]. The cycle is repeated until the desired function is achieved.
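The mutate, evaluate, repeat cycle described above can be sketched as a toy simulation. This is only an illustration under strong assumptions: the `fitness` function (fraction of positions matching a hypothetical target sequence), the mutation rate, and the library size are all invented for the example and do not correspond to any real assay or protocol.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def fitness(seq, target):
    """Toy fitness (an assumption): fraction of positions matching a target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def mutate(seq, rate, rng):
    """Redraw each residue at random (possibly to itself) with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else aa for aa in seq
    )


def directed_evolution(parent, target, library_size=200, rate=0.05,
                       max_rounds=100, seed=0):
    """Iterate steps 2-4: build a library, screen it, keep the best variant."""
    rng = random.Random(seed)
    history = [fitness(parent, target)]
    for _ in range(max_rounds):
        if history[-1] == 1.0:  # desired function reached
            break
        library = [mutate(parent, rate, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: fitness(s, target))
        # Keep the current parent if no variant improves on it.
        if fitness(best, target) > fitness(parent, target):
            parent = best
        history.append(fitness(parent, target))
    return parent, history


if __name__ == "__main__":
    # Hypothetical 10-residue sequences, chosen only for the demonstration.
    best, history = directed_evolution("MATAYLLKQG", "MKTAYIAKQR")
    print(best, history)
```

Because a new parent is accepted only when it scores higher, the recorded fitness never decreases from round to round. Real campaigns have no known target sequence; the screen or selection plays the role of `fitness`.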

The process of directed evolution allows protein engineers to search protein fitness landscapes. For a detailed discussion of protein fitness landscapes, see the review by Romero and Arnold [1].

Library construction

After a parent sequence is chosen, a library of functional mutants must be generated. Common methods used for library construction include error-prone PCR and DNA shuffling [4] [5].

Error-prone PCR is a technique for introducing random point mutations into cloned sequences, in which modifications to standard PCR conditions increase the error rate of nucleotide incorporation during amplification [4] [5]. Common methods for decreasing polymerase fidelity include adding manganese ions, increasing the magnesium concentration, and using an imbalanced ratio of dNTPs [4] [5]. A number of commercial kits for error-prone PCR are available, such as the GeneMorph II random mutagenesis kit from Agilent Technologies.
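The effect of error-prone PCR on a library can be sketched with a simple model in which every base of the template is substituted independently with a fixed probability. This is a simplification: real error-prone PCR has mutational bias and accumulates errors over amplification cycles, and the `error_rate` parameter and example template below are assumptions for illustration only.

```python
import random

BASES = "ACGT"


def error_prone_copy(template, error_rate, rng):
    """Copy a DNA template, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Substitute with one of the three other bases, chosen uniformly.
            copy.append(rng.choice([b for b in BASES if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def make_library(template, error_rate, size, seed=0):
    """Generate `size` independently mutated copies of the template."""
    rng = random.Random(seed)
    return [error_prone_copy(template, error_rate, rng) for _ in range(size)]


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    template = "ATGGCTAGCAAAGGAGAAGAA"  # hypothetical 21-nt template
    library = make_library(template, error_rate=0.05, size=10)
    for variant in library:
        print(variant, hamming(variant, template))
```

Under this model the expected number of mutations per variant is simply the template length multiplied by `error_rate`, which is the quantity that the reaction conditions are tuned to control.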

DNA shuffling is a technique for “in vitro homologous recombination of pools of selected mutant genes” [6]. In this method, parental sequences are fragmented by DNase I and then reassembled by PCR. Recombination events occur as fragments anneal at regions of sufficient sequence homology [6]. After reassembly, chimeric sequences are amplified by PCR and cloned into an appropriate vector.
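A rough way to picture the outcome of DNA shuffling is as template switching between aligned parents. The sketch below is not a model of the fragmentation and reassembly chemistry; it assumes pre-aligned parents of equal length and permits a switch only where the preceding few bases are identical in both parents, as a crude stand-in for annealing at a region of homology. The `switch_prob` and `window` parameters and the example sequences are hypothetical.

```python
import random


def shuffle_once(parents, switch_prob, rng, window=3):
    """Build one chimera by template switching between aligned parents."""
    length = len(parents[0])
    current = rng.randrange(len(parents))
    chimera = []
    for i in range(length):
        if i >= window and rng.random() < switch_prob:
            candidate = rng.randrange(len(parents))
            # Switch only if the preceding `window` bases are identical in
            # both parents (a crude proxy for sufficient local homology).
            if parents[candidate][i - window:i] == parents[current][i - window:i]:
                current = candidate
        chimera.append(parents[current][i])
    return "".join(chimera)


def shuffle_library(parents, size, switch_prob=0.1, seed=0):
    """Generate `size` chimeras from aligned, equal-length parent sequences."""
    if len({len(p) for p in parents}) != 1:
        raise ValueError("parents must be aligned and of equal length")
    rng = random.Random(seed)
    return [shuffle_once(parents, switch_prob, rng) for _ in range(size)]


if __name__ == "__main__":
    # Two hypothetical parents differing at three positions.
    parents = ["ATGGCTAGCAAAGGAGAAGAACTT",
               "ATGGCAAGCAAGGGAGAAGAGCTT"]
    for chimera in shuffle_library(parents, size=5, switch_prob=0.3):
        print(chimera)
```

Every position of a chimera comes from one of the parents at that same position, so the library recombines existing mutations rather than creating new ones.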

See methods of DNA assembly page for alternative methods of gene construction.

Screening and selection

Once a library of mutants is generated, its members must be evaluated for their ability to perform the desired function. To do this, protein engineers employ a variety of high-throughput assays. Successful assays allow researchers to test a large number of variants while maintaining a connection between phenotype (the evolved protein function) and genotype (the DNA sequence encoding the evolved function) [7]. These assays can be categorized along two axes: “in vivo” versus “in vitro”, and “selection” versus “screening” [8]. The most important distinction is between a screen and a selection. In a selection, only cells expressing proteins that exhibit the desired function survive. In a screen, by contrast, cells expressing any variant survive, and variants are distinguished by phenotype. A typical screen can test ~10^4 variants [8], often because researchers must pick individual colonies, grow liquid cultures, and run the activity assays by hand. A typical selection can test on the order of 10^6 to 10^8 variants [8]. Selections are limited by transformation efficiency, which is ~10^6 in yeast and ~10^8 in E. coli.
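To see why the difference in throughput between screens and selections matters, it helps to compare these numbers with the size of sequence space. The sketch below counts amino acid variants at a given mutational distance from a parent and estimates the expected fraction sampled at least once, using the approximation 1 - exp(-N/V) for N clones drawn uniformly from V variants. The uniform-sampling assumption and the 300-residue protein length are illustrative only; real libraries are biased, and error-prone PCR cannot reach every amino acid substitution through single nucleotide changes.

```python
import math


def n_variants(length, n_mutations, alphabet=20):
    """Number of sequences differing from the parent at exactly n positions."""
    return math.comb(length, n_mutations) * (alphabet - 1) ** n_mutations


def expected_coverage(n_tested, n_possible):
    """Expected fraction of variants seen at least once (uniform sampling)."""
    if n_possible <= 0:
        raise ValueError("n_possible must be positive")
    return -math.expm1(-n_tested / n_possible)  # equals 1 - exp(-N/V)


if __name__ == "__main__":
    length = 300  # hypothetical protein length
    for n_mut in (1, 2):
        total = n_variants(length, n_mut)
        for label, tested in (("screen", 1e4), ("selection", 1e8)):
            cov = expected_coverage(tested, total)
            print(f"{n_mut} mutation(s), {label}: {total} variants, "
                  f"coverage {cov:.4f}")
```

Under these assumptions, a screen of 10^4 clones covers most single mutants of a 300-residue protein but only a tiny fraction of its roughly 1.6 × 10^7 double mutants, whereas a selection of 10^8 clones covers nearly all of them.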

The most basic cell-based screening methods involve transforming a library of mutants into bacteria and identifying individual colonies or cultures that exhibit the desired function. These assays maintain a link between phenotype and genotype that is “achieved naturally by introducing plasmid DNA encoding the protein into a cell” [7]. Millions of sequence variants can be transformed into cells, and transformation conditions can be adjusted so that, statistically, each transformed cell carries a single vector encoding a single sequence variant. These individual cells can then be isolated by growth on solid media [7].
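One common way to reason about the statistics of transformation is to assume that the number of plasmids taken up per cell follows a Poisson distribution. That model is an assumption of this sketch and is not stated in the cited source; `lam`, the mean number of plasmids per cell, is a hypothetical parameter. The function below computes, among cells that took up at least one plasmid, the fraction that took up exactly one.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def fraction_single(lam):
    """Among transformed cells (k >= 1), the fraction carrying exactly one plasmid."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return poisson_pmf(1, lam) / (1.0 - poisson_pmf(0, lam))


if __name__ == "__main__":
    for lam in (0.01, 0.1, 0.5, 1.0, 2.0):
        print(f"mean plasmids per cell = {lam}: "
              f"single-plasmid fraction = {fraction_single(lam):.3f}")
```

In this model, lowering the mean uptake per cell raises the fraction of transformants that carry a single variant, at the cost of fewer transformed cells overall.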

List of more advanced high-throughput screening and selection methods:

  1. phage display
  2. ribosome display
  3. mRNA-peptide fusion
  4. plasmid display
  5. cell-surface display
  6. n-hybrid systems
  7. in vitro compartmentalization
  8. spatial address

Improving whole cell fluorescence of GFP by directed evolution

Comparison of the fluorescence of different GFP constructs in whole E. coli cells [2].

Wild-type green fluorescent protein (GFP) is routinely used as a reporter of gene regulation. However, some applications require a stronger whole cell fluorescence signal [2]. Willem Stemmer and colleagues therefore set out to construct a GFP mutant with enhanced whole cell fluorescence. The group first constructed a synthetic GFP gene with improved codon usage. They then performed successive cycles of DNA shuffling, each followed by a visual screen for the brightest E. coli colonies. Using this method, they generated a mutant with a whole cell fluorescence signal 45-fold greater than that of a standard commercially available GFP sequence [2].

Directed evolution of a thermostable esterase

Model of pNB esterase constructed based on homology to esterases of known structures [3].

It had previously been suggested that enzyme thermostability is incompatible with high catalytic activity at low temperature [3]. However, by constraining both properties during directed protein evolution, the Frances Arnold group created a thermostable esterase that maintains high catalytic activity at lower temperatures. To do this, the group generated mutant libraries through error-prone PCR and DNA shuffling and developed a 96-well plate screen for thermostability and activity. After six generations of mutagenesis and screening, they obtained a thermostable p-nitrobenzyl esterase mutant that retains catalytic activity at low temperature (30 °C) [3].

Summary and future directions

The papers highlighted in this article demonstrate that directed protein evolution strategies "circumvent our profound ignorance of how a protein's sequence encodes its function by using iterative rounds of random mutation and artificial selection to discover new and useful proteins" [1]. In addition to generating novel proteins, directed evolution studies have helped protein engineers elucidate the "mechanisms of adaptation" that give rise to new protein functions [1]. As protein engineers continue to employ directed evolution strategies, new screening and selection methods continue to be developed. The more advanced methods listed above, such as phage display and in vitro compartmentalization, all maintain a link between phenotype and genotype.

iGEM connection

The UC Davis 2012 iGEM team used directed evolution to engineer an E. coli strain that more efficiently degrades ethylene glycol than previous strains. Although not strictly a directed protein evolution project, their work demonstrates the ability of biological molecules and systems to rapidly evolve under strong selective pressure.


  1. Romero PA and Arnold FH. Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol, 2009.

  2. Crameri A, Whitehorn EA, Tate E, Stemmer WP. Improved green fluorescent protein by molecular evolution using DNA shuffling. Nat Biotechnol, 1996.

  3. Giver L, Gershenson A, Freskgard PO, Arnold FH. Directed evolution of a thermostable esterase. Proc Natl Acad Sci USA, 1998.

  4. Cadwell RC and Joyce GF. Randomization of genes by PCR mutagenesis. Genome Res, 1992.

  5. Abou-Nader M and Benedik MJ. Rapid generation of random mutant libraries. Bioeng Bugs, 2010.

  6. Stemmer WP. Rapid evolution of a protein by in vitro DNA shuffling. Nature, 1994.

  7. Lin H and Cornish VW. Screening and selection methods for large-scale analysis of protein function. Angew Chem Int Ed Engl, 2002.

  8. Leemhuis H, Kelly RM, Dijkhuizen L. Directed evolution of enzymes: library screening strategies. IUBMB Life, 2009.