Unnatural Amino Acids
The genetic code for the translation of RNA into protein is one of the most ancient and universal innovations in the evolution of life on Earth. Nearly all life forms use the same redundant code to incorporate the 20 canonical amino acids into proteins. In two exceptions, selenocysteine and pyrrolysine, stop codons have been retooled to code for a 21st amino acid. Expansion of the genetic code to include noncanonical, or unnatural, amino acids (UAAs) holds promise for improving and diversifying protein function, generating proteins that would normally require posttranslational modification, and studying the genetic code itself. Technology for creating proteins bearing UAAs has progressed steadily over the last ~30 years and includes both in vitro and in vivo methods.
In vitro Synthesis
Solid-phase Synthesis and Chemical Ligation
Solid-phase peptide synthesis (SPPS) was developed by Bruce Merrifield in the early 1960s, for which he received the Nobel Prize in 1984. In this method, the C-terminal amino acid is anchored via a linker to an insoluble support. Both its N-terminus and its side chain are protected from reaction. The N-terminus is typically protected by a Boc or Fmoc group, and the side chains can be protected by a variety of groups. In the first step, the N-terminus is deprotected. The next desired amino acid in the chain is then added to the column and coupled to the free N-terminus. This cycle of deprotection and addition is repeated until the chain is complete. The side chains are then deprotected, and the completed peptide is eluted from the column. This method is useful for producing peptides of up to ~50 amino acids, and noncanonical amino acids may be readily incorporated.
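One common way to think about the ~50 residue limit is that every deprotection and coupling cycle is slightly imperfect, so losses compound along the chain. The sketch below illustrates that arithmetic only. The function name and the 99% per-cycle yield are illustrative assumptions, not values from the sources cited on this page.

```python
def spps_full_length_fraction(n_residues: int, cycle_yield: float) -> float:
    """Estimate the fraction of resin-bound chains that reach full length.

    Assumes the first residue is already anchored to the support, so a
    peptide of n_residues needs n_residues - 1 deprotect/couple cycles,
    and that each cycle succeeds independently with probability cycle_yield.
    """
    if n_residues < 1:
        raise ValueError("n_residues must be at least 1")
    if not 0.0 <= cycle_yield <= 1.0:
        raise ValueError("cycle_yield must lie between 0 and 1")
    return cycle_yield ** (n_residues - 1)


# Illustrative comparison at an assumed 99% yield per cycle.
for length in (10, 50, 100, 200):
    fraction = spps_full_length_fraction(length, 0.99)
    print(f"{length:>3} residues: {fraction:.3f} of chains full length")
```

Under these assumptions the full-length fraction falls steadily with chain length, which is one reason longer targets are usually assembled from shorter segments by ligation.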
Native chemical ligation is used to produce larger peptides and full proteins. This method requires uniquely reactive functionalities at the N- and C-termini of the peptides to be joined, and allows the use of unprotected peptide segments. In one method of native chemical ligation, the thiolate group of the N-terminal cysteine residue of one unprotected peptide attacks the C-terminal thioester of a second unprotected peptide. This reversible transthioesterification step is chemoselective and regioselective and forms a thioester-linked intermediate. The intermediate rearranges irreversibly by an intramolecular S,N-acyl shift, which results in a native peptide bond at the ligation site. The method may be repeated to make long peptides and proteins.
Synthesis via Chemically Aminoacylated tRNAs
Proteins with unnatural amino acids may also be produced biosynthetically. In this technique, truncated tRNAs are enzymatically ligated to chemically aminoacylated nucleotides, effectively decoupling the identity of the tRNA from that of the attached amino acid. One can then use a cell-free translation system to synthesize proteins with the unnatural amino acid incorporated at the codon complementary to the anticodon of the tRNA used, typically a stop codon such as Amber (UAG). The protein may then be isolated and its properties analyzed.
In vivo Approaches
Amino Acid Auxotroph Substitution
Strains auxotrophic for a canonical amino acid can incorporate close structural analogs into proteins. Cells may be grown in the presence of the canonical amino acid, then removed from that growth medium and inoculated into medium that contains none of the canonical amino acid but an excess of a close structural analog. Although the analog usually cannot sustain exponential growth, the nondividing cells remain viable and can overexpress proteins containing it.
In vivo Amber Codon Suppression
In the late 1990s and early 2000s, the Schultz group at Scripps developed the technology to generate organisms with an expanded, 21-amino-acid genetic code.[8, 9, 10] UAAs have since been genetically encoded in organisms including a variety of bacteria, yeast, and human cells. These systems are diagrammed at right and generally consist of the following components:
- An orthogonal Amber stop codon (UAG) suppressor tRNA
- An evolved aminoacyl-tRNA synthetase (aaRS) that charges a specific unnatural amino acid onto the Amber suppressor tRNA
- A selectable (antibiotic resistance) marker with at least one in-frame Amber codon
- An exogenously supplied unnatural amino acid
The primary challenge in developing these systems is fulfilling the criteria of aaRS/tRNA orthogonality and aminoacylation specificity. The best starting point is to import an aaRS/tRNA pair from a different domain of life. The orthogonality of the pair must then be improved by rounds of positive selection, which retain individuals that successfully incorporate the unnatural amino acid of choice, and rounds of negative selection, which ensure that canonical amino acids are not incorporated at the Amber codon. This process is diagrammed below.
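The logic of alternating positive and negative selection can be sketched as two filters applied to a library of synthetase variants. This is a deterministic toy model under stated assumptions: the `SynthetaseVariant` class and its boolean fields are hypothetical, and the negative round is assumed to use a toxic reporter carrying in-frame Amber codons, a detail not specified in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseVariant:
    """Hypothetical aaRS variant described by what it charges onto the tRNA."""
    name: str
    charges_uaa: bool
    charges_canonical: bool


def positive_selection(library):
    """UAA present, Amber codon in a resistance marker.

    A cell survives if its variant charges any amino acid onto the
    suppressor tRNA, because the marker is then read through.
    """
    return [v for v in library if v.charges_uaa or v.charges_canonical]


def negative_selection(library):
    """UAA absent, Amber codons in an assumed toxic reporter.

    Variants that charge a canonical amino acid read through the Amber
    codons, express the toxin, and are removed from the population.
    """
    return [v for v in library if not v.charges_canonical]


library = [
    SynthetaseVariant("inactive", charges_uaa=False, charges_canonical=False),
    SynthetaseVariant("promiscuous", charges_uaa=True, charges_canonical=True),
    SynthetaseVariant("wild-type-like", charges_uaa=False, charges_canonical=True),
    SynthetaseVariant("specific", charges_uaa=True, charges_canonical=False),
]

survivors = negative_selection(positive_selection(library))
print([v.name for v in survivors])
```

Only variants that charge the UAA and reject canonical amino acids pass both filters. Real selections are probabilistic and typically repeated over several rounds.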
Many orthogonal tRNA/aaRS pairs have been developed, and the source organism for each pair is typically from a different domain of life than the host organism in which the pair will be used. Different pairs require varying degrees of engineering and directed evolution.
The tyrosyl tRNA/synthetase pair of Methanocaldococcus jannaschii, an archaeon, is one of the most commonly evolved orthogonal pairs for the incorporation of unnatural amino acids. This pair was originally chosen because the identity elements of its tyrosyl tRNA differ from those of the E. coli tyrosyl tRNA, and because its aaRS has a very minimal anticodon loop binding domain and lacks any editing mechanism that would proofread UAA-tRNA ligation. This tRNA-aaRS pair is used for the incorporation of UAAs 1-15, 17-26, 31, 32, 34-36, 41-44, 46, and 48-50 in the figure at the bottom of the page.
A number of methanogenic archaea, including Methanosarcina barkeri, naturally encode pyrrolysine as a 21st amino acid at Amber (UAG) codons. This system is unique in that it evolved naturally, and it is highly orthogonal and efficient in other species of bacteria even without extensive optimization. This system has been used to incorporate UAAs 40, 51, 59, 60, 61-68, and 69-71 in the figure at the bottom of the page.
Limits on Orthogonality
While papers describing incorporation systems for specific UAAs typically profess a high degree of orthogonality and fidelity of UAA incorporation, the reproducibility of these results varies considerably from system to system. Fidelity of incorporation can be measured directly using mass spectrometry and N-terminal (Edman) sequencing of proteins, but a standard rough measure of orthogonality is the ability of a strain carrying a selectable marker (e.g., chloramphenicol resistance, Cam) with in-frame Amber codons to grow with and without the UAA of choice. If the tRNA/aaRS pair is highly orthogonal, the bacteria should grow only in the presence of the UAA under these conditions. If canonical amino acids can be charged by the introduced aaRS, or if an endogenous aaRS recognizes the introduced tRNA, then canonical amino acids will be incorporated at Amber codons and the bacteria will grow with or without the UAA. Different pairs fare variably in this test: some, such as the pair for L-DOPA (21), grow better in the absence of the UAA than in its presence, while others (22, 23) have displayed better orthogonality in the author's personal experience. These challenges may be overcome by using advanced techniques for the evolution of better orthogonal pairs, or by reengineering strains to encourage orthogonality.
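The with/without-UAA growth test described above amounts to a small decision table, sketched here. The function name and the wording of each interpretation are illustrative assumptions; in practice growth is quantitative rather than yes/no, and the third case in particular has several possible explanations.

```python
def interpret_growth_assay(grows_with_uaa: bool, grows_without_uaa: bool) -> str:
    """Give a rough reading of the Amber-marker growth test.

    Inputs are whether a strain carrying a resistance marker with in-frame
    Amber codons grows under selection with and without the UAA supplied.
    """
    if grows_with_uaa and not grows_without_uaa:
        return "consistent with an orthogonal, UAA-specific pair"
    if grows_with_uaa and grows_without_uaa:
        return "canonical amino acid likely incorporated at Amber"
    if grows_without_uaa:
        # Growth only without the UAA: background suppression occurs, and
        # the UAA (or its incorporation) may impair growth.
        return "background suppression; UAA may impair growth"
    return "no detectable Amber suppression"


for with_uaa in (True, False):
    for without_uaa in (True, False):
        print(with_uaa, without_uaa, "->", interpret_growth_assay(with_uaa, without_uaa))
```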
Release Factor 1 (RF1) recognizes the termination codons UAA (here the stop codon, not an unnatural amino acid) and UAG, and is responsible for stopping translation at these codons. While obviously important for the proper functioning of translation, RF1 also limits the amount of full-length protein produced from a gene containing an in-frame Amber codon, because it competes with the Amber suppressor tRNA at the ribosome. This problem is compounded with each additional Amber codon in the gene, leading to a rapid dropoff in the full-length protein isolated from genes with more than one in-frame stop codon.
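The compounding effect can be illustrated with a simple competition model. The sketch assumes that at each Amber codon the suppressor tRNA and RF1 compete with fixed relative rates, and that successive Amber codons are independent. Both function names and all rate values are hypothetical and chosen only for illustration.

```python
def amber_suppression_efficiency(trna_rate: float, rf1_rate: float) -> float:
    """Probability that the suppressor tRNA, not RF1, decodes one Amber codon.

    Modeled as a simple competition between two rates in arbitrary units.
    """
    if trna_rate < 0 or rf1_rate < 0 or trna_rate + rf1_rate == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return trna_rate / (trna_rate + rf1_rate)


def full_length_protein_fraction(efficiency: float, n_amber: int) -> float:
    """Fraction of translation events that read through all n_amber codons."""
    if n_amber < 0:
        raise ValueError("n_amber must be non-negative")
    return efficiency ** n_amber


# Assumed rates: RF1 wins three times as often as the suppressor tRNA.
efficiency = amber_suppression_efficiency(trna_rate=1.0, rf1_rate=3.0)
for n in range(4):
    print(n, "Amber codons:", full_length_protein_fraction(efficiency, n))
```

In this model, setting `rf1_rate` to zero gives an efficiency of 1, so the full-length fraction no longer depends on the number of Amber codons.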
Until recently, it was thought that RF1 was essential for cell survival. Several methods have recently been used to make RF1 conditionally inessential, enabling its knockout. Mukai et al. introduced, on a plasmid, all seven essential genes that normally end in Amber codons, with each gene instead ending in the UAA stop codon. Johnson et al. "fixed" the expression of RF2, the other primary release factor in E. coli. Both of these measures enabled the knockout of RF1. The benefit of the knockout can be seen in the amount of full-length GFP produced in these strains: multiple Amber stop codons may be present in the GFP reading frame and still yield functional, full-length GFP when RF1 is knocked out. Furthermore, these strains do not grow in the absence of the unnatural amino acid. Mukai et al. theorize that this is because, in the absence of the unnatural amino acid, ribosomes stall at Amber stop codons, limiting their availability for translating other proteins and resulting in the degradation of essential proteins whose coding sequence ends in UAG.
References
- Longstaff DG, Larue RC, Faust JE, Mahapatra A, Zhang L, Green-Church KB, and Krzycki JA. pmid:17204561.
- Böck A, Forchhammer K, Heider J, and Baron C. pmid:1838215.
- Dawson PE and Kent SB. pmid:10966479.
- Schnölzer M and Kent SB. pmid:1566069.
- Hecht SM, Alford BL, Kuroda Y, and Kitano S. pmid:248056.
- Noren CJ, Anthony-Cahill SJ, Griffith MC, and Schultz PG. pmid:2649980.
- Link AJ, Mock ML, and Tirrell DA. pmid:14662389.
- Liu DR and Schultz PG. pmid:10220370.
- Wang L, Xie J, and Schultz PG. pmid:16689635.
- Wang L and Schultz PG. pmid:11564556.
- Liu CC and Schultz PG. pmid:20307192.
- Park HS, Hohn MJ, Umehara T, Guo LT, Osborne EM, Benner J, Noren CJ, Rinehart J, and Söll D. pmid:21868676.
- Srinivasan G, James CM, and Krzycki JA. pmid:12029131.
- Mukai T, Hayashi A, Iraha F, Sato A, Ohtake K, Yokoyama S, and Sakamoto K. pmid:20702426.
- Johnson DB, Xu J, Shen Z, Takimoto JK, Schultz MD, Schmitz RJ, Xiang Z, Ecker JR, Briggs SP, and Wang L. pmid:21926996.
- Muir TW. pmid:12626339.