BBRFC14

RFC 14: Protein Domain Fusions in BB-2 Assembly

Tom Knight

November 23, 2008

Related RFCs: 9, 11, 12, 13

Keywords: BB-2 assembly, protein fusions, protein tags

Purpose:

Original Biobrick assembly left scars that put adjacent coding regions out of frame and, when in frame, made poor amino acid choices for protein fusions. With the BB-2 assembly standard, protein fusions are easy and leave an in-frame Ala-Ser scar. This gives us the opportunity to rethink the partitioning of protein coding regions, with a much greater emphasis on the modularity of the protein structure. In particular, it is now desirable to think about N terminal domains such as export tags, C terminal domains such as degradation tags, and both N and C terminal tags and cleavage sites.

RFC 13 describes a proposed fragmentation of coding regions into a Head, zero or more Domains, and a Tail region. The scar linking these regions is undefined there, but here we consider the specific case of BB-2 scars, with sequence GCTAGT, coding for the amino acid sequence Ala-Ser.
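The scar translation stated above can be checked with a few lines of Python. This is only a minimal sketch: the names `CODON_TABLE`, `translate`, and `BB2_SCAR` are illustrative and not part of any RFC, and the lookup table covers just the codons needed here.

```python
# Minimal codon lookup covering only the codons used in this example.
CODON_TABLE = {"ATG": "Met", "GCT": "Ala", "AGT": "Ser"}

# BB-2 scar sequence as given in the text.
BB2_SCAR = "GCTAGT"


def translate(dna):
    """Translate an in-frame DNA string into a list of three-letter residues."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


print("-".join(translate(BB2_SCAR)))  # Ala-Ser
```

Because the scar is six bases long, it adds exactly two codons and keeps the reading frame of the parts on either side.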

Proposal:

Specifically, we propose here a rethink of the RBS/CDS region, resulting in a fusion of the ribosomal binding site with the start codon. A typical RBS/CDS start would then consist of an RBS, a spacer, and an ATG start codon. The assembly of this RBS/Start with a protein Domain would result in an RBS and an N terminal coding region consisting of Met-Ala-Ser, followed by the Domain. This standardizes the initial few amino acids of each coding region, in combination with the ribosomal binding site, as a way of creating more standard translation levels. It is well known that the first few codons have a significant effect on translation efficiency, so standardizing those, along with the RBS, should result in less expression variability. This standardization could not be performed along with protein export tags, but could, in most cases, be used with N terminal tags.
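The junction described above can be sketched in silico. This is an illustration only: the domain sequence `AAAGGTGAA` is a made-up placeholder, the helper names `assemble` and `translate` are hypothetical, and the sketch models just the joined sequence, not the restriction and ligation steps that produce it.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

BB2_SCAR = "GCTAGT"  # BB-2 scar, codes for Ala-Ser


def assemble(*parts):
    """Join parts in order, placing a BB-2 scar at every junction."""
    return BB2_SCAR.join(parts)


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


start_codon = "ATG"        # the coding end of an RBS/Start part
domain = "AAAGGTGAA"       # hypothetical Domain part (Lys-Gly-Glu)

cds = assemble(start_codon, domain)
print(cds)             # ATGGCTAGTAAAGGTGAA
print(translate(cds))  # MASKGE
```

Every Domain joined this way begins with the same Met-Ala-Ser codons, which is the standardization the proposal relies on.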

In BB-2 assembly, the translation efficiency of the Met-Ala-Ser N terminal region (with the specified codons) is approximately 60% of that of the most efficient initiation sequence, AUGAAA (Met-Lys) (Looman87, Stenstrom01, Stenstrom01a).

For proteins needing custom N terminal codons, such as export signal tags, these can be constructed as custom RBS/Head domains. See (Zalucki07, Zalucki07a, Zalucki08) for a discussion of codon usage in export tag leaders.

Note that the initial f-Met in coding regions constructed with BB-2 assembly is highly susceptible to protease cleavage by the methionine aminopeptidase (map) in E. coli (see Frottin06), so mature proteins would have an N-terminal alanine.
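A rough model of this N-terminal processing follows. It assumes the commonly cited rule that the initiator methionine is removed when the second residue has a small side chain (Gly, Ala, Ser, Thr, Cys, Pro, Val). That rule is a simplification brought in for illustration and is not stated in this RFC; the function name is hypothetical.

```python
# Assumption: residues after which the initiator Met is typically removed.
SMALL_RESIDUES = set("GASTCPV")


def mature_n_terminus(protein):
    """Return the protein with the initiator Met removed when the rule applies."""
    if len(protein) >= 2 and protein[0] == "M" and protein[1] in SMALL_RESIDUES:
        return protein[1:]
    return protein


print(mature_n_terminus("MASKGE"))  # ASKGE: the BB-2 Met-Ala-Ser start loses its Met
print(mature_n_terminus("MKGE"))    # MKGE: a Met-Lys start is left intact
```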

More specifically, I propose a re-worked set of Head RBS + ATG components in the registry, consisting of many of the existing RBS components, using the BB-1 scar as the RBS-ATG spanning sequence. A typical RBS+ATG part might then have the sequence AGGAGGACTAGATG, with the BB-1 mixed site preceding the ATG start codon. This would retain some consistency of sequence with the existing RBS measurements, while moving to the new BB-2 standard.

Existing protein coding regions would be reworked to remove the initial start codon, stop codon, and any tags or degradation tails.

A new set of Tail parts, consisting of TAATAA stops, would be constructed. Additionally, a set of E. coli degradation tails of different lifetimes would be made, each including the TAATAA stop codon pair.
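Putting the pieces together, a Head, a Domain, and a stop-only Tail can be joined and translated as a check. This is a sketch under stated assumptions: the Domain sequence is a made-up placeholder, the helper names are hypothetical, and it assumes the Tail is joined through a BB-2 scar like any other part, which places an extra Ala-Ser ahead of the stop codons.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

BB2_SCAR = "GCTAGT"       # codes for Ala-Ser
HEAD = "AGGAGGACTAGATG"   # example RBS+ATG part from the text
TAIL = "TAATAA"           # double stop Tail from the text
DOMAIN = "AAAGGTGAA"      # hypothetical Domain part (Lys-Gly-Glu)


def assemble(parts):
    """Join parts in order with a BB-2 scar at each junction."""
    return BB2_SCAR.join(parts)


def translate_from(seq, start):
    """Translate from index `start`; return (protein, reached_stop)."""
    residues = []
    for i in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            return "".join(residues), True
        residues.append(residue)
    return "".join(residues), False


construct = assemble([HEAD, DOMAIN, TAIL])
start = len(HEAD) - 3  # the ATG is the last codon of the Head part
protein, stopped = translate_from(construct, start)
print(protein, stopped)  # MASKGEAS True
```

Note the C-terminal Ala-Ser in the output: under the assumption above, a bare stop Tail still contributes one scar to the protein.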

A new set of protein fusion tags would also be made. A partial list includes this set (partially from the excellent resource in the tables of Kimple04):

protein purification tags, antibody epitopes

FLAG DYKDDDDK pmid 8770418 http://www-users.med.cornell.edu/~jawagne/FLAG-tag.html

c-MYC EQKLISEEDL

HA YPYDVPDYA

6HIS HHHHHH (should this really be 7His? Interspersed AA for patent reasons?)

Strep AWRHPQFGG Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly

GST (n-terminal)

Chitin binding

BCCP biotin carboxyl carrier protein PMID: 14736427 PMID: 10470036 PMID: 10211839

VSVG

Thrombin cleavage domain LVPRGS (Leu-Val-Pro-Arg-Gly-Ser); factor Xa cleavage domain

Calmodulin tag

S-tag

GFP protein tags

maltose binding protein

Thioredoxin

protein splicing domains (inteins)

surface display proteins LPP/OmpA Neisseria IgA1
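At the protein level, stacking tags from this list with BB-2 scars can be sketched as follows. Only the tags whose sequences are given above are included. The cargo peptide `KGE` is a made-up placeholder, the function name is hypothetical, and the sketch assumes a Met-only Head and a stop-only Tail, each joined through an Ala-Ser scar.

```python
# Tag peptides copied from the list above (one-letter amino acid codes).
TAGS = {
    "FLAG": "DYKDDDDK",
    "c-MYC": "EQKLISEEDL",
    "HA": "YPYDVPDYA",
    "6HIS": "HHHHHH",
    "Strep": "AWRHPQFGG",
    "Thrombin": "LVPRGS",
}

SCAR_PEPTIDE = "AS"  # Ala-Ser, encoded by the BB-2 scar GCTAGT


def fused_protein(parts):
    """Met, then each part preceded by a scar, then the scar before the Tail."""
    return "M" + "".join(SCAR_PEPTIDE + p for p in parts) + SCAR_PEPTIDE


cargo = "KGE"  # hypothetical protein domain
print(fused_protein([TAGS["6HIS"], TAGS["Thrombin"], cargo]))
# MASHHHHHHASLVPRGSASKGEAS
```

Each junction costs two residues, so a construct with n parts between Head and Tail carries n + 1 Ala-Ser pairs in addition to the tag sequences themselves.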

References:

Frottin F, Martinez A, Peynot P, Mitra S, Holz RC, Giglione C, Meinnel T. The proteomics of N-terminal methionine cleavage. Mol Cell Proteomics. 2006 Dec;5(12):2336-49. Epub 2006 Sep 8. PMID: 16963780

Kimple ME, Sondek J. Overview of affinity tags for protein purification. Curr Protoc Protein Sci. 2004 Sep;Chapter 9:Unit 9.9. Review. PMID: 18429272

Looman AC, Bodlaender J, Comstock LJ, Eaton D, Jhurani P, de Boer HA, van Knippenberg PH. Influence of the codon following the AUG initiation codon on the expression of a modified lacZ gene in Escherichia coli. EMBO J. 1987 Aug;6(8):2489-92. PMID: 3311730

Stenstrom CM, Jin H, Major LL, Tate WP, Isaksson LA. Codon bias at the 3'-side of the initiation codon is correlated with translation initiation efficiency in Escherichia coli. Gene. 2001 Jan 24;263(1-2):273-84. PMID: 11223267

Stenstrom CM, Holmgren E, Isaksson LA. Cooperative effects by the initiation codon and its flanking regions on translation initiation. Gene. 2001 Aug 8;273(2):259-65. PMID: 11595172

Zalucki YM, Power PM, Jennings MP. Selection for efficient translation initiation biases codon usage at second amino acid position in secretory proteins. Nucleic Acids Res. 2007;35(17):5748-54. PMID: 17717002

Zalucki YM, Jennings MP. Experimental confirmation of a key role for non-optimal codons in protein export. Biochem Biophys Res Commun. 2007 Mar 30;355(1):143-8. Epub 2007 Jan 31. PMID: 17291454

ExPASy PeptideCutter: The cleavage specificities of selected enzymes and chemicals http://ca.expasy.org/tools/peptidecutter/peptidecutter_enzymes.html