From OpenWetWare

For more help understanding this, here is a ChatGPT prompt you can use to ask questions. Warning: tread carefully, ChatGPT is sometimes wrong.

Internal Restriction Sites

If your gene contains recognition sites for any of the restriction enzymes you want to use, you need to remove those sites. For the BglBricks format, the enzymes are BamHI, BglII, EcoRI, and XhoI. This section of the tutorial explains one of many ways to remove a site, using SOEing (splicing by overlap extension). It also demonstrates how to design oligos for homology-based assembly; the same design applies to Gibson cloning.

As an example of this, let's look at a salicylate promoter basic part, Bca1111. This part confers exogenous salicylate-dependent transcription of downstream genes.

(Optional) An aside about Parts and Features

Now, I'm going to go on an aside here because Bca1111 is a part that brings up some concepts worth exploring. If you want, ignore this block for now and come back to it after finishing the tutorial. It won't really help you understand this particular tutorial. With that disclaimer...

Let me point out an unusual aspect of BioBrick basic parts before we dive into the construction file. A BioBrick basic part can be thought of as a wrapped sequence, flanked by particular restriction sites, that cannot be constructed by BioBrick standard assembly. By wrapped, I mean the sequence has been concatenated to the 5' and 3' sequences used for assembly. In other words, anything that cannot be described as a composite part is a basic part.

In the case of basic parts such as the ceaB open reading frame described in the first section, the open reading frame for the protein cannot be trivially deconstructed further into biologically meaningful subsequences. Let's call this type of primitive biological element a feature. The part you designed is therefore a "basic part", and the sequence between its restriction sites exactly matches the sequence of the feature.

There are many ways of formalizing the relationship between part and feature, and different definitions of which word corresponds to which concept. I think there is some consensus, or at least it is not my lone opinion, that two independent concepts exist here: one is a composition or fabrication primitive, and the other is a biological primitive, a DNA sequence that directly encodes some biochemical function within the cell. A part has a sequence, as does a feature. They may be exactly the same sequence, but usually the feature is a substring of the part's sequence. Finally, the relationship between the two concepts is that a part can be annotated in a particular region by a feature.

Now, a BioBrick part can map directly onto one particular feature, or it can have multiple features. BioBrick basic part Bca1111 is an example of a part with multiple features. It has an entire gene cassette: a promoter, a ribosome binding site, the NahR open reading frame, and a terminator. Together these features produce the transcription factor protein NahR. Additionally, Bca1111 contains the Psal promoter, which is activated by NahR. We could BioBrick all of these fundamental features and then assemble something with the same utility as Bca1111 as a composite part. This procedure is the essence of refactoring: splitting a naturally occurring sequence into basic parts and reassembling them into a composite part that maintains the activity of the original. In some instances this is useful, since subtle properties of the original cassette can be altered or improved by refactoring. In other instances it just creates more work for you. So, in practice, these 'complete chunks' of DNA provided as parts are often very useful when refactored variations on the natural sequence or original non-BioBrick sequence don't work as well as the original.

Alright, let's look at the construction file: Construction of salicylate promoter basic part

PCR    ca1110F  ca1111R     pBACr899  A
PCR    ca1111F  ca899R      pBACr899  B
SOE    ca1110F  ca899R      A   B     pcrpdt
Digest pcrpdt   EcoRI,BamHI 1         pcrdig
Digest pBca1100 EcoRI,BamHI 1         plasdig
Ligate pcrdig   plasdig               pBca1100-Bca1111

# Sequences
oligo ca1110F ctctggaattcatgAGATCTGCGATCCCGCGAAGAACC
oligo ca1111F catgaagtagatTtcgccaatgtc
oligo ca1111R gacattggcgaAatctacttcatg

During the basic parts tutorial, I had you include all the sequences in the construction file, just as the oligo sequences are presented here. In real life, I wouldn't do that. I'd include only information that cannot be easily pulled from elsewhere. I've also omitted the sizes of the pcrpdt digest. However, when you supply answers to quizzes/midterms, you should write out the fully 'verbose' version of the construction file. For ease of use, I've done it the more 'real-world' way here. You can go ahead and download the 3 relevant sequence files:


Note: When you launch these files with ApE the window should resemble the image at right. If you aren't seeing the little blocks of color, you've somehow lost the annotations in the file. Do a right click on the file links and directly launch the files in ApE rather than a text-only editor.

Open up pBACr899 in your editor. Predict what the product of PCR with oligos ca1110F and ca899R would be. Look for EcoRI/BamHI/BglII/XhoI restriction sites in that PCR product. Notice anything wrong? There is a single BglII site in the sequence. It must be removed, or the future use of BglII during assembly would result in internal cleavage of the part. This construction file will result in a basic part without the internal BglII site.
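If you'd rather let a script do the searching, here is a minimal Python sketch of the site scan. The recognition sequences are the standard ones for the four BglBricks enzymes. The function name find_sites and the toy sequence are made up for illustration; the toy sequence is not pBACr899, so paste in your own predicted PCR product.

```python
# Recognition sequences for the four BglBricks enzymes. All four are
# palindromic, so scanning one strand is enough.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "XhoI": "CTCGAG",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


# Hypothetical toy sequence (NOT pBACr899) with one internal BglII site.
toy = "atgaccgtagatctcgccaatgtctaa"
print(find_sites(toy))
```

An empty result means none of the four sites is present. Any hit inside the part itself, as opposed to the sites you add with the external oligos, needs to be removed.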

The construction file is telling you to perform 2 separate PCR reactions with the ca### oligos using pBACr899 as template. The names of those PCR products are "A" and "B". Oligos ca1111F and ca1111R will not match the template exactly, so you'll need to use the tricks described in the first tutorial to figure out where they would anneal. Go ahead and predict the products of those reactions. Now let's examine them.

Copy the last 24 bases of "A" and search for this sequence in "B". You should find it on the 5' (left) end of the sequence. This is the homology region between the two PCR products. Instead of using restriction enzymes on these PCR products, the construction file has you gel-purify, or gp them. This procedure will physically separate your shorter PCR products from the plasmid DNA template still present in the reaction.

In the next step, you set up another PCR reaction using a mixture of the gel-purified A and B fragments. The two oligos in the reaction anneal to the ends of the fragments. Notice that these oligos are the same two oligos you used in the first PCR simulation of this tutorial. They amplify the entire nahR-Psal cassette. Indeed, if you used pBACr899 as template for this reaction, you would obtain a PCR product that retained the internal BglII site. This is why the plasmid template must be separated from the A and B PCR products before this third reaction, the assembly reaction.

The assembly reaction is an example of a non-canonical PCR reaction. Some people call it "SOEing," some call it "overlap PCR", and it is also somewhat similar to a method called "Quikchange". Here's what happens during the reaction:

As illustrated at left, during the initial denaturation step of the PCR, everything is denatured into single strands. Upon annealing, the strands anneal to homologous sequences. They could anneal to their original partner strand (i.e., green to green), or they could anneal to the complementary end of the other fragment (green to red). When this second kind of annealing pairs the 3' ends of two strands, each strand has a recessed 3' end, which is the substrate for polymerization. Polymerase then fills in the remainder of the sequence, giving rise to a DNA containing both the red and green fragments. This product then becomes the substrate for PCR amplification with the external oligos.
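At the sequence level, the net effect of the overlap extension is to merge the two fragments at their shared end. Here is a minimal Python sketch of that merge. The function name soe_join and both flanking sequences are hypothetical; only the 24-base overlap is taken from the tutorial (it is the ca1111F sequence). The sketch models the top strand only and ignores the biochemistry.

```python
def soe_join(frag_a, frag_b, min_overlap=20):
    """Merge two fragments that share a terminal overlap.

    Looks for the longest suffix of frag_a that is also a prefix of
    frag_b. Returns the merged sequence, or None when the shared region
    is shorter than min_overlap.
    """
    a, b = frag_a.lower(), frag_b.lower()
    for n in range(min(len(a), len(b)), min_overlap - 1, -1):
        if n > 0 and a[len(a) - n:] == b[:n]:
            return a + b[n:]
    return None


# The shared 24 bases are the ca1111F sequence from the construction file.
overlap = "catgaagtagatttcgccaatgtc"
# Hypothetical flanks, invented for this example.
frag_a = "acgtacgtaacggttaacgg" + overlap
frag_b = overlap + "ttgacctgaaggcatacaaa"

product = soe_join(frag_a, frag_b)
print(len(frag_a), len(frag_b), len(product))
```

The overlap is counted once in the product, so the product is shorter than the sum of the two fragments by the length of the homology region.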

The rest of the construction file should look familiar to you from your previous tutorial exercises. Go ahead and simulate the rest of it and confirm that this results in the desired product.

Another (optional) aside

Before we move on to the design section, let's take an aside and look at plasmid pBca1100. This plasmid is fairly similar to pBca9145. In fact, it matches pBca9145 exactly external to the EcoRI and XhoI sites. The difference is that a cassette is inserted between BamHI and XhoI containing a ribosome binding site, the red fluorescent protein (mRFP1) open reading frame, and a terminator. Without a promoter, pBca1100 doesn't confer production of the red protein product. Upon insertion of a transcription-initiating element within the BglII/BamHI region, the downstream gene should be expressed. I call pBca1100 an "RFP reporter" for Biobrick promoter basic parts. What do you suppose the phenotype is of cells harboring this plasmid upon growth in the presence or absence of exogenous salicylate? Think about it. This and other plasmids containing cassettes between EcoRI/BglII or between BamHI/XhoI can be useful during assembly of the Biobricks because they confer readily-observable phenotypes to the bacteria. Because these plasmids maintain the uniqueness of all 4 Biobrick enzymes, their relative positioning, and the specific locations of BamHI and BglII, they do not interfere with any (currently described) methods of Biobrick assembly.


Make a normal construction file

Here's how to design the construction. First, proceed as you did for the normal case. In fact, go ahead and write an entire construction file ignoring the internal sites. You'll still need the external oligos, which are the same in both schemes, so you haven't wasted your time. Similarly, you'll still need to pick a vector to paste the part into, digest that plasmid, note the digestion products, and give the product of the experiment a name.

Find the restriction site in your sequence

Find the restriction site. To remove it, you'll want to make a point mutation at one position within the site. In our nahR example, we mutate agatCt to agatTt. You have to be careful in making this mutation. Not only must it destroy the restriction site, but it must also maintain the function of the underlying feature. For things like ribosome binding sites, promoters, and terminators, this is quite tricky but fortunately rare. Those elements tend to be short and are unlikely to contain internal restriction sites. The problem case is almost always a restriction site present in an open reading frame. It is critical that you maintain the coding of your open reading frame part when you make this mutation. This is possible because of the degeneracy of the genetic code.

Design the silent mutation

1) Exploit the degeneracy of the genetic code

Consider the sequence AGATCT. Translated in the 0 frame, this sequence encodes two amino acids Arg (AGA) Ser (TCT). There are many other codons for Arg including all the CGN codons. In the 0 frame, then, you could replace Agatct with Cgatct and maintain Arg-Ser coding. This type of mutation is termed a "silent" mutation.

2) Note the frame of the restriction site

Changing an AGATCT to CGATCT is not a one-size-fits-all solution to removing BglII sites, though. Consider the following open reading frame parts:

  M  R  S  *
  M  K  I  *
  M  K  D  L  *  

Each contains the BglII site, but each encodes a different amino acid sequence. The solution to this problem starts with translating your open reading frame. Select the entire open reading frame from start codon to stop codon, paste it into your editor or web program, and let the computer translate it showing the DNA sequence above the amino acid sequence. Find the restriction site in the DNA and look at the codons that flank the site. Note what amino acids they encode, and then use a genetic code table to identify an alternate codon.
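To see the frame issue concretely, here is a small Python sketch that translates an open reading frame and reports the codons overlapping the BglII site. It uses the standard genetic code. The three DNA sequences are ones I made up to match the three peptides above (other codon choices would work too), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate an ORF (start codon through stop codon), codon by codon."""
    orf = orf.upper()
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf) - 2, 3))


def codons_over_site(orf, site="AGATCT"):
    """Return (codon, amino acid) pairs for the codons that overlap the site."""
    orf = orf.upper()
    start = orf.find(site)
    if start == -1:
        return []
    first = start // 3
    last = (start + len(site) - 1) // 3
    return [(orf[3 * i:3 * i + 3], CODON_TABLE[orf[3 * i:3 * i + 3]])
            for i in range(first, last + 1)]


# Assumed example ORFs, one per reading frame of the BglII site.
for orf in ("ATGAGATCTTAA", "ATGAAGATCTAA", "ATGAAAGATCTGTAA"):
    print(translate(orf), codons_over_site(orf))
```

In the first frame the site covers exactly two codons; in the other two frames it straddles three, which is why the same base change is not silent in every frame.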

3) Avoid rare codons

The third issue you need to consider is that some codons in the genetic code should be avoided: AGA and AGG. The tRNAs for these codons are rare in E. coli, and genes containing these codons sometimes express poorly. There are always multiple silent mutation options, so do something other than introduce AGA or AGG.
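The three considerations above can be combined into a brute-force search: try every single-base change inside the site, keep the ones that leave the translation unchanged, and drop any that introduce AGA or AGG. The following Python sketch does that for the first BglII site in an open reading frame. The function name silent_fixes and the example sequences are hypothetical, and the sketch only screens against the two codons named above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate an ORF (start codon through stop codon), codon by codon."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf) - 2, 3))


def silent_fixes(orf, site="AGATCT", avoid=("AGA", "AGG")):
    """List single-base silent mutations that destroy the first `site`.

    Each result is (0-based position, original codon, mutated codon).
    Mutations that create a codon listed in `avoid` are skipped.
    """
    orf = orf.upper()
    start = orf.find(site)
    if start == -1:
        return []
    protein = translate(orf)
    fixes = []
    for pos in range(start, start + len(site)):
        for base in "ACGT":
            if base == orf[pos]:
                continue
            mutant = orf[:pos] + base + orf[pos + 1:]
            codon_start = pos - pos % 3
            new_codon = mutant[codon_start:codon_start + 3]
            if site in mutant or new_codon in avoid:
                continue
            if translate(mutant) == protein:
                fixes.append((pos, orf[codon_start:codon_start + 3], new_codon))
    return fixes


# Assumed toy ORFs with the site in two different frames.
print(silent_fixes("ATGAGATCTTAA"))
print(silent_fixes("ATGAAGATCTAA"))
```

Note that the options differ between the two frames. The script only tells you which changes are silent; you still have to choose among them.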

Give it a try

Using nahR as an example, go ahead and give this a try. Is the mutation we introduce with oligos ca1111F and ca1111R silent? What other mutations would be silent?

Design the mutagenic oligos

Now we need to design the oligos that will introduce the point mutation. You are going to order 2 additional oligos for this. The two sequences will be reverse complements of one another. All the normal rules for PCR still apply here--you still need 6 bases of perfect homology on the 3' ends, good G/C content and base balance, etc. For the overlap PCR reaction to work, you want at least 20 bp homology, so these oligos should be at least 20 bp in length. Note that the two oligos for nahR were 24 bp in length. Sometimes the sequence present in this overlap region doesn't have a good base balance or the other desirable properties, and making the sequences a little longer will help guarantee success. Oligos are cheap, so it is always better to make a little longer oligo than risk a failed assembly. In general, though, you want to locate your silent mutation site in the sequence, put it right in the middle of your ~20bp sequence, choose that sequence, and order it and its reverse complement. I leave it to you now to figure out how to write up the construction file. If you're worrying about the bubble that results from these oligos annealing to their non-exact homologous templates, go here.
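As a sanity check on the design rule, this Python sketch builds a mutagenic oligo pair from a template, a mutation position, and the new base. Run on the 24 bases around the nahR BglII site, it reproduces ca1111F and ca1111R. The wild-type sequence here is inferred from ca1111F by reverting the agatTt change described earlier; the function names and the flank parameter are my own, for illustration.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Reverse complement, preserving upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]


def mutagenic_oligos(template, pos, new_base, flank=12):
    """Return (forward, reverse) oligos of length 2 * flank.

    The base at template[pos] is replaced by new_base, written in upper
    case so the mutation is easy to spot. The forward oligo covers
    template[pos - flank : pos + flank]; the reverse oligo is its
    reverse complement.
    """
    template = template.lower()
    start, end = pos - flank, pos + flank
    if start < 0 or end > len(template):
        raise ValueError("not enough flanking sequence around the mutation")
    forward = template[start:pos] + new_base.upper() + template[pos + 1:end]
    return forward, reverse_complement(forward)


# Wild-type context of the nahR BglII site (inferred from ca1111F).
wild_type = "catgaagtagatctcgccaatgtc"
fwd, rev = mutagenic_oligos(wild_type, pos=12, new_base="t")
print(fwd)
print(rev)
```

With flank=12 the mutated base has 12 bases on its 5' side and 11 on its 3' side in each oligo, comfortably more than the 6 perfectly matched 3' bases the normal PCR rules ask for.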

Test your construction file!!!

Always always always! The easiest mistake to make in these construction files is to pair each external oligo with the wrong one of the two internal oligos during the first 2 PCRs. So, predict the products of the various PCRs, and keep in mind that in each case your "forward" oligo should match your template exactly, while your "reverse" oligo should anneal as its reverse complement.
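Here is one way to automate that check: a deliberately simplified PCR predictor in Python. It assumes an oligo anneals if its last few 3' bases match the template exactly, which lets the mutagenic mismatch ride along into the product. The template, the external oligos ext_f and ext_r, and the anneal cutoff of 10 are all hypothetical stand-ins; only ca1111F and ca1111R come from the construction file.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    return seq.lower().translate(COMPLEMENT)[::-1]


def predict_pcr(fwd, rev, template, anneal=10):
    """Predict a PCR product from the 3' ends of two oligos.

    Only the last `anneal` bases of each oligo must match exactly, so 5'
    tails and internal mismatches are carried into the product. Returns
    None if either oligo fails to anneal in the expected orientation.
    """
    fwd, template = fwd.lower(), template.lower()
    rev_rc = reverse_complement(rev)
    f_at = template.find(fwd[-anneal:])      # forward oligo, as written
    r_at = template.find(rev_rc[:anneal])    # reverse oligo, as revcomp
    if f_at == -1 or r_at == -1:
        return None
    f_end = f_at + anneal
    if r_at < f_end:
        return None
    return fwd + template[f_end:r_at] + rev_rc


ca1111F = "catgaagtagatTtcgccaatgtc"
ca1111R = "gacattggcgaAatctacttcatg"

# Hypothetical template: invented flanks around the wild-type site.
template = ("ggctaacgttacgcatgcaaggtcctgaac"
            "catgaagtagatctcgccaatgtc"
            "ttgacctgaaggcatacaaactggtcaagg")
ext_f = template[:20]                        # stand-in external forward
ext_r = reverse_complement(template[-20:])   # stand-in external reverse

product_a = predict_pcr(ext_f, ca1111R, template)
product_b = predict_pcr(ca1111F, ext_r, template)
wrong = predict_pcr(ext_f, ca1111F, template)  # mispaired oligos
print(product_a[-24:] == product_b[:24], wrong)
```

With the correct pairing, the last 24 bases of A equal the first 24 bases of B. With the mispaired oligos, the sketch returns None because the second oligo points the wrong way. A real simulation tool should handle more cases than this sketch does, such as multiple annealing sites.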

Internal Restriction Site Quiz

Construct a BglBricks basic part in plasmid pBca9145 for the kdsB open reading frame from Rhizobium leguminosarum. Proceed just as you did in the first tutorial, using BglII and XhoI restriction sites, and make sure you clone the part in the right orientation: the open reading frame goes 5'->3'. The sequence of this gene is available in NCBI under accession number AM236080. Name your external oligos qbs001 and qbs002. Name the internal oligos qbs003 and qbs004. You do not need to include the sequences for "kdsB" or "pBca9145-Bca1089" in your submission because they will be injected during simulation.

  • How long is this open reading frame sequence?
  • What internal restriction site is present in the sequence?
  • What codons and peptide sequence overlap this restriction site?
  • Design oligos to make your basic part
  • Write up the construction file


If you have any comments or want to report a potential error in the tutorial, please email me (Chris Anderson) at