From OpenWetWare


Translation is the third stage of protein synthesis and proceeds in three phases: initiation, elongation, and termination. The process involves "reading" an mRNA's codons to build a polypeptide chain.


In prokaryotes, each amino acid is first covalently bonded to its tRNA. [1] Initiation then occurs when the ribosome binds near the 5' end of the mRNA and positions itself at the start codon. During elongation the ribosome reads successive codons and extends the polypeptide, building a particular protein. The process halts once a stop codon (UAA, UAG, or UGA) is reached, because no tRNA can bind to these codons. Instead, a "release factor" protein recognizes the stop codon, releases the finished polypeptide, and leads to disassembly of the ribosome/mRNA complex.
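The reading of codons up to a stop codon can be sketched in a few lines of Python. This is a minimal illustration, not a model of the ribosome: the function name `translate` is hypothetical, the standard genetic code is assumed, and the first AUG is simply taken as the start codon.

```python
from itertools import product

# Standard genetic code in the conventional UCAG ordering; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Translate from the first AUG until a stop codon (assumption: first AUG is the start)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: no tRNA matches, translation ends
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("GGAUGUUUGGCUAAGG"))  # MFG
```

In the example the reading frame is AUG UUU GGC UAA, so the peptide is Met-Phe-Gly and the trailing nucleotides after the stop codon are ignored.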

An illustration of this process is shown:

A ribosome translating a protein that is secreted into the endoplasmic reticulum. tRNAs are colored dark blue.


Like prokaryotic translation, eukaryotic translation has three phases. However, two types of initiation exist. Usually, to initiate translation, proteins must interact with the 5' cap at the 5' end of the mRNA. The cap is a modified nucleotide: a methylated guanine (7-methylguanosine) linked to the mRNA through a 5'-to-5' triphosphate bridge. This type of initiation is cap-dependent. The other type is cap-independent and is usually mediated by an Internal Ribosome Entry Site (IRES). Elongation and termination proceed in a manner similar to prokaryotes.

Internal Ribosome Entry Site (IRES)

An IRES is a nucleotide sequence found in the middle of an mRNA that allows translation to initiate without starting at the 5' cap. The ribosome is "guided" to the IRES by IRES trans-acting factors (ITAFs). In recent years, this type of initiation has been shown to be used in processes such as apoptosis, where a specific region of the mRNA needs to be expressed to induce cell death. [2]

Bacterial Ribosome Binding Sites

mRNA showing an RBS with the start codon AUG

A ribosome binding site (RBS) is a region of prokaryotic mRNA located 6-7 nucleotides upstream of the start codon AUG; the canonical RBS is called the Shine-Dalgarno sequence (5′–GGAGGU–3′). The ribosome base pairs with this site through its own rRNA, and with the start codon through the initiator tRNA. What makes this an interesting topic of research is that the Shine-Dalgarno sequence is not the "optimal" RBS for all expression processes. In other words, the RBS affects the rate at which a particular Open Reading Frame (ORF) is translated. This happens in two general ways: i) the rate at which ribosomes are recruited to the mRNA and initiate translation depends on the sequence of the RBS, and ii) the RBS can affect the stability of the mRNA, and therefore the number of proteins made over the lifetime of the mRNA.
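As a rough illustration of the spacing described above, the following Python sketch scans an mRNA for a Shine-Dalgarno motif sitting 6-7 nt upstream of an AUG. The function name `find_rbs_sites` and its parameters are hypothetical, and the gap is measured here as the number of nucleotides between the end of the motif and the A of AUG, which is an assumption about how the 6-7 nt distance is counted.

```python
def find_rbs_sites(mrna, motif="GGAGGU", min_gap=6, max_gap=7):
    """Return (motif_start, gap, aug_start) for every AUG with the motif at an allowed gap.

    The gap is the number of nucleotides between the last base of the motif
    and the first base of the AUG (an assumed convention).
    """
    sites = []
    aug = mrna.find("AUG")
    while aug != -1:
        for gap in range(min_gap, max_gap + 1):
            motif_start = aug - gap - len(motif)
            if motif_start >= 0 and mrna[motif_start:motif_start + len(motif)] == motif:
                sites.append((motif_start, gap, aug))
        aug = mrna.find("AUG", aug + 1)  # keep looking for further start codons
    return sites


# Hypothetical test sequence: leader, motif, 6-nt spacer, start codon, one more codon.
example = "AA" + "GGAGGU" + "AAACAA" + "AUG" + "GCU"
print(find_rbs_sites(example))  # [(2, 6, 14)]
```

Exact string matching is deliberately strict; real RBSs vary in sequence, so a search that tolerates mismatches would find more candidate sites.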

RBS Sequence Design

The first variable to take into account is the sequence of the RBS itself, specifically the degree to which the RBS is complementary to the free 3' end of the 16S rRNA. If this hybridization is highly favorable, the time needed to complete the initiation phase is reduced, allowing elongation to begin sooner. The start codon AUG must lie downstream of the RBS, at a distance of 6-7 nt. This spacing is dictated by the geometry of the ribosome and where it attaches to the mRNA. Beyond the exact sequence of the RBS, the overall sequence of the mRNA also affects translation rates. The iGEM registry has a nice collection of RBSs (the iGEM RBS catalog). Secondary structure in the mRNA has been shown to affect translation rates [3], because other nucleotides in the mRNA compete with the ribosome for binding to the RBS. Many attempts have been made to computationally predict sequences that minimize secondary structure in DNA/RNA; CircDesigNA is one such tool.
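A crude way to compare candidate RBS sequences is to count how many consecutive bases can pair with the 3' end of the 16S rRNA. The sketch below is only a proxy for hybridization strength, not a free-energy calculation: it assumes the E. coli anti-Shine-Dalgarno region 5'-ACCUCCUUA-3', counts only Watson-Crick pairs (G-U wobble pairs are ignored), and the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Sequence that pairs perfectly with `rna` when aligned antiparallel."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def longest_complementary_run(rbs, anti_sd="ACCUCCUUA"):
    """Length of the longest stretch of `rbs` that can pair with `anti_sd`.

    `anti_sd` is the assumed 3' end of the 16S rRNA, written 5' to 3'.
    """
    target = reverse_complement(anti_sd)  # what a perfectly pairing mRNA would read
    best = 0
    for i in range(len(rbs)):
        for j in range(len(target)):
            k = 0
            while (i + k < len(rbs) and j + k < len(target)
                   and rbs[i + k] == target[j + k]):
                k += 1
            best = max(best, k)
    return best


print(longest_complementary_run("GGAGGU"))  # 6
print(longest_complementary_run("GGACGU"))  # 3
```

A single mismatch in the middle of the motif halves the longest paired stretch in this toy score, which hints at why small changes in the RBS can change initiation rates. Tools that predict secondary structure use thermodynamic models instead of a simple match count.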

Codon Optimization

The genetic code is read in units of three nucleotides, called codons. Codons serve three functions: 1) the start codon, AUG, initiates translation; 2) stop codons terminate translation because they do not code for any amino acid; 3) codons specify amino acids, serving as the template from which the protein is assembled. Crick, Brenner et al. were the first to show that the genetic code is read in triplets. [4] In a separate in vitro experiment, Nirenberg and Matthaei translated a poly-uracil (UUUUUUU...) sequence and discovered that the only amino acid incorporated was phenylalanine. Using similar methods, all codon combinations (4^3 = 64 total) were assigned to the twenty amino acids or to stop signals. This has led to work determining, both theoretically and experimentally, which codons work best for a particular gene that needs to be expressed. [5] One study randomly varied the codons used to express GFP and found a 250-fold range in protein levels across the 154 genes that were synthesized. On further investigation, however, the authors determined that secondary structure in the mRNA accounted for more than half of the variation in expression; codon bias did not affect the results as much as anticipated. A review paper discussing different redesign strategies, including modification of initiation regions, alteration of mRNA structure, and use of different codon biases, gives a more thorough look at these topics. [6]

Table of codon usage for human cells
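Randomly varying the codons of a gene without changing the protein can be sketched as follows. This is a minimal illustration assuming the standard genetic code; the function names are hypothetical and the short coding sequence is an invented example, not a sequence from the cited study.

```python
import random
from itertools import product

# Standard genetic code in the conventional UCAG ordering; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Group the codons by the amino acid (or stop signal) they encode.
SYNONYMS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMS.setdefault(amino_acid, []).append(codon)


def split_codons(cds):
    """Split a coding sequence into complete codons, dropping any trailing partial codon."""
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def protein(cds):
    """Amino acid sequence encoded by `cds`, with "*" for stop codons."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def random_synonymous_variant(cds, rng):
    """Replace every codon with a randomly chosen codon for the same amino acid."""
    return "".join(rng.choice(SYNONYMS[CODON_TABLE[c]]) for c in split_codons(cds))


original = "AUGGCUAAAGGUUAA"  # hypothetical example: Met-Ala-Lys-Gly-stop
variant = random_synonymous_variant(original, random.Random(0))
print(protein(original) == protein(variant))  # True
```

Every variant encodes the same protein, so differences in expression between variants must come from the mRNA sequence itself, for example through its secondary structure or its codon usage.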

Eukaryotes vs. Prokaryotes

The heart of codon optimization lies in the fact that bacteria do not always prefer the same codons for a given amino acid that eukaryotic cells do. This has given rise to online tools, such as tables of E. coli codon usage, that help determine the sequence needed to express a gene in a given host organism.
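One simple form of codon optimization is to replace each codon with the codon the host uses most often for that amino acid. The sketch below derives those preferences from a set of reference genes supplied by the user; the function names and the toy reference sequences are hypothetical, and real tools rely on measured codon usage tables for the host.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional UCAG ordering; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def split_codons(cds):
    """Split a coding sequence into complete codons."""
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def preferred_codons(reference_genes):
    """Map each amino acid to its most frequent codon in the reference genes."""
    counts = Counter(c for gene in reference_genes for c in split_codons(gene))
    best = {}
    for codon, n in counts.items():
        amino_acid = CODON_TABLE[codon]
        if amino_acid not in best or n > counts[best[amino_acid]]:
            best[amino_acid] = codon
    return best


def recode(cds, preferred):
    """Swap each codon for the host-preferred synonym; keep it if none is known."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in split_codons(cds))


# Toy "host" genes (invented for illustration, not real E. coli sequences).
host_genes = ["AUGGCGGCGGCUAAA", "AUGGCGAAAAAGAAA"]
preferences = preferred_codons(host_genes)
print(recode("AUGGCUAAGGCA", preferences))  # AUGGCGAAAGCG
```

Always choosing the most frequent codon is only one strategy, and it ignores effects such as mRNA secondary structure that can influence expression as well.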

Alternative Genetic Codes and Orthogonal Ribosomes

There has been a push to create a synthetic genetic code that can incorporate unnatural amino acids during translation. However, because 61 of the 64 codons in the universal genetic code already code for a particular amino acid, assigning codons to unnatural amino acids is quite difficult. One attempt to get around this creates an entirely new genetic code that uses a mutated ribosome, allowing for the encoding of more than 200 unnatural amino acids. [7] This genetic code is based on a four-nucleotide codon as opposed to the traditional three-nucleotide codon. To create a ribosome that can decode quadruplet codons, the authors synthetically evolved an orthogonal ribosome and paired it with mutually orthogonal aminoacyl-tRNA synthetase–tRNA pairs.
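The appeal of a four-nucleotide codon is easiest to see by counting. The short sketch below enumerates triplet and quadruplet codons; the helper name `all_codons` is hypothetical, and the count only shows how many codons are possible, not how many have been assigned in practice.

```python
from itertools import product

BASES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}


def all_codons(length):
    """Every possible codon of the given length over the four RNA bases."""
    return ["".join(p) for p in product(BASES, repeat=length)]


triplets = all_codons(3)
sense_triplets = [c for c in triplets if c not in STOP_CODONS]
quadruplets = all_codons(4)

print(len(triplets))        # 64
print(len(sense_triplets))  # 61
print(len(quadruplets))     # 256
```

The triplet code leaves no unassigned codons, whereas a quadruplet code offers 4^4 = 256 codons, which is where the room for more than 200 additional amino acids comes from.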

Protein Expression with Rare Codons

As mentioned previously, proteins are sometimes expressed poorly in a heterologous host such as E. coli. The problem is compounded when the gene of interest is rich in rare codons, that is, codons whose matching tRNAs are present in low quantities in the host. To overcome this difficulty, attempts have been made to change the codons in the gene so that the protein is more readily expressed in the host. An alternative approach is to alter the host organism instead. [8] [9] One such attempt addresses low expression levels of recombinant proteins from parasitic organisms: the authors transformed the host with a plasmid coding for three tRNAs (Arg, Ile, Gly) that recognize the rare codons in their expression plasmid, enhancing expression levels. Previously, Brinkmann et al. used a co-transformation method, supplying the tRNA Arg gene along with the gene of interest to boost protein production.
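Before choosing between recoding a gene and supplementing the host with extra tRNAs, it helps to know how many rare codons the gene contains. The sketch below reports their positions and overall fraction. The function name is hypothetical, and the default set of rare codons (AGA and AGG for Arg, AUA for Ile, GGA for Gly) is an assumption chosen to match the three tRNAs mentioned above; it should be replaced with measured codon usage for the actual host.

```python
# Assumed set of codons that are rare in E. coli (illustrative, not authoritative).
RARE_CODONS = frozenset({"AGA", "AGG", "AUA", "GGA"})


def rare_codon_report(cds, rare=RARE_CODONS):
    """Return (positions, fraction) of rare codons in a coding sequence.

    Positions are zero-based codon indices; the fraction is relative to
    the total number of complete codons.
    """
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    positions = [index for index, codon in enumerate(codons) if codon in rare]
    fraction = len(positions) / len(codons) if codons else 0.0
    return positions, fraction


# Hypothetical gene fragment: AUG AGA GGA AUA GCU AAA
print(rare_codon_report("AUGAGAGGAAUAGCUAAA"))  # ([1, 2, 3], 0.5)
```

Runs of consecutive rare codons, as in this example, are the kind of pattern where supplying additional tRNA genes or recoding the gene would be expected to matter most.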



Laursen2005 (PMID 15755955): Initiation of protein synthesis in bacteria.

Gold1990 (PMID 2199797): Overview of high-level translation initiation.

Crick1961 (PMID 13882203): General nature of the genetic code for proteins.

Kudla2009 (PMID 19359587): Coding-sequence determinants of gene expression in Escherichia coli.

Lopez-Lastra2005 (PMID 16238092): Study on the growing biological relevance of cap-independent translation initiation.

Neumann2010 (PMID 20154731): Encoding unnatural amino acids using a quadruplet-decoding ribosome.

Brinkmann1989 (PMID 2515992): Expression levels of recombinant genes in E. coli depend on the availability of the dnaY gene product.

Baca2000 (PMID 10704592): A method for high-level overexpression of Plasmodium and other AT-rich parasite genes in E. coli.

Gustafsson2004 (PMID 15245907): Codon bias and heterologous protein expression.