Arking:JCAOligoTutorial5

Assembly and Design of Composite Parts
The methods for fabricating DNAs remain a cost- and time-limiting aspect of synthetic biology research. One day, all you'll do is type the sequence into some machine and it will pop out the other end, and you won't really need to understand anything about it. Actually, that black box exists today--there are companies like DNA 2.0, Blue Heron, and Geneart which do "gene synthesis". You can type a sequence up to around 20,000 bp long into their websites, and 2 weeks later they'll mail you a plasmid containing that sequence. Unfortunately, it's pricey -- it costs around $0.39 per base -- and what they're going to do to make it isn't wildly different from what you're learning to do in these tutorials (it still involves assembly). For an overview of methods, check out "Gene synthesis demystified", Trends in Biotechnology, 27(2):63-72, 2009 (PMID 19111926).

In practice, folks who do a large volume of DNA manipulation, at least in academia, do things in two phases. The rationale goes like this: many of the DNAs examined in a project are similar to one another. They might vary by only a single base pair, or perhaps they just involve replacing one out of 4 genes in the sequence with a different gene. Let's say you had a 10000 bp sequence and you needed to change just 1 bp from T to C in that sequence for your next experiment. If you make it by complete gene synthesis it will cost you $3900 and 2 weeks of waiting. If you follow the general procedure you learned for removing restriction sites (the SOEing method) or various other site directed mutagenesis techniques, it will cost something more like $40 and 3 days. So, gene synthesis often doesn't make sense. It might have made sense for the first construct, but all later constructs fabricated in this manner would be a waste of time and money.
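To make the arithmetic concrete, here is a minimal Python sketch using the figures quoted above (about $0.39 per base for synthesis, about $40 for a mutagenesis). The function and constant names are made up for illustration, and the prices are this tutorial's rough estimates rather than real vendor quotes.

```python
# Rough cost comparison using the figures quoted in the text.
# All names are illustrative; prices are the tutorial's estimates, not vendor quotes.

SYNTHESIS_CENTS_PER_BP = 39      # about $0.39 per base
MUTAGENESIS_COST_DOLLARS = 40    # about $40 for SOEing or site-directed mutagenesis


def synthesis_cost_dollars(length_bp: int) -> float:
    """Cost of ordering the whole sequence by complete gene synthesis."""
    return length_bp * SYNTHESIS_CENTS_PER_BP / 100


construct_bp = 10_000
full = synthesis_cost_dollars(construct_bp)
print(f"Gene synthesis: ${full:,.0f}")
print(f"Mutagenesis:    ${MUTAGENESIS_COST_DOLLARS:,.0f}")
print(f"Synthesis costs roughly {full / MUTAGENESIS_COST_DOLLARS:.0f} times more")
```

The point of the comparison is that the synthesis cost scales with the length of the whole construct, while the mutagenesis cost is roughly fixed no matter how long the unchanged part of the sequence is.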

The BioBrick approach to cloning (which we're learning in these tutorials) makes those two phases very concrete and specific: you make basic parts and then you assemble them into composite parts. It assumes that most of the constructs you want to examine are permutations of genes, promoters, terminators, etc. that you or someone else have already synthesized, and all you need to do is stitch them together. Because the BioBrick stitching reactions are relatively fast (at least compared to total gene synthesis) and very cheap and robust, it is a desirable route for building large DNAs when most of the constructs are permutations of different preexisting parts.

You've already seen how to make basic parts, and in practice folks in the real world still do things this way. Whether it makes sense to create basic parts from natural sources like genomic DNAs, though, depends on your situation and on the relative costs of labor and materials. At a gene synthesis price point of around $0.15 per base, you'd be crazy to do anything other than gene synthesis: acquiring a plasmid from a repository such as Addgene typically costs $65 on its own, and acquiring genomic DNA from an organism of interest costs around $200. So, it is often the case that you'll want to use gene synthesis methods to fabricate basic parts.

The BioBrick approach is very popular within the iGEM community and in some academic labs. In the Anderson Lab in particular (my lab) we use all sorts of methods as appropriate, but BioBrick-like approaches remain popular. In general, though, there are 2 distinct types of assembly: cut-and-paste methods and PCR-based methods.

The BioBrick methods all use a standard set of restriction enzymes and T4 DNA ligase to join DNAs together, so they are one family among several cut-and-paste methods. The original BioBrick scheme involved the restriction enzymes EcoRI, XbaI, SpeI, and PstI, and this is why you incorporated those sites into your basic parts in the tutorials. Subsequently, a number of alternative sets of restriction sites have appeared. You've already seen the BglBrick method, which uses EcoRI, BglII, BamHI, and XhoI, in these tutorials. The other commonly used cut-and-paste method is "Golden Gate" cloning, which employs type IIS restriction enzymes that reach over and cut outside their recognition site to generate sticky ends of arbitrary sequence. Other than that, it is essentially the same as the BioBrick methods--you build basic parts and then add restriction enzymes and ligase to assemble the fragments. The advantage of Golden Gate is that you can do the assembly in one pot as long as all your basic parts have matching sticky ends. The disadvantage is that your basic parts all have to be designed with matching sticky ends, so they are not "idempotent" parts like in the BioBrick method. Golden Gate parts are restricted to fixed assembly trees, so you can't fabricate any arbitrary order of parts. However, for many applications you don't need this flexibility, and it's a very robust, cheap, and fast way of assembling things.
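The "fixed assembly tree" restriction of Golden Gate can be illustrated with a small Python sketch. This is a simplified model, not a real design tool: each part is just a name plus a left and a right 4-nt overhang, both written on the same strand so that "matching" means equal strings. The part names and overhang sequences are invented for illustration.

```python
# Sketch: check whether Golden Gate parts can assemble in a given order.
# Each part is modeled as (name, left_overhang, right_overhang); the 4-nt
# overhang sequences below are invented for illustration.

from typing import List, Tuple

Part = Tuple[str, str, str]


def can_assemble(parts: List[Part]) -> bool:
    """True if each part's right overhang equals the next part's left overhang."""
    return all(
        left[2] == right[1]
        for left, right in zip(parts, parts[1:])
    )


promoter = ("promoter", "GGAG", "TACT")
rbs_cds = ("rbs_cds", "TACT", "AATG")
terminator = ("terminator", "AATG", "GCTT")

print(can_assemble([promoter, rbs_cds, terminator]))   # designed order: True
print(can_assemble([rbs_cds, promoter, terminator]))   # shuffled order: False
```

In this model the parts only fit together in the order they were designed for, which is the trade-off described above: one-pot assembly in exchange for giving up arbitrary part order.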

The PCR-based methods have been around even longer than the original BioBrick method. The first was SOEing, the mechanism for which you've already seen. You design oligos that match the tail ends of each fragment so that you can use PCR to create homology between them. You can then assemble them in a subsequent PCR reaction. SOEing has largely been eclipsed by other homology-based chemistries. The SLIC method similarly involved using oligos to create homology arms on the ends of the fragments, but then employed a cocktail of enzymes involving T4 Polymerase (not to be confused with T4 ligase) to stitch the fragments together in a one-pot reaction. The new one referred to by folks in the field as the "Gibson method" is very similar to SLIC in the sense that you do PCRs with custom oligos to add homology arms, but the assembly reaction involves a different cocktail of enzymes that turns out to be more robust than SLIC. There are other homology-based assembly methods that follow the overall scheme of 1) find plasmids that have fragments of sequence you want in your product 2) design oligos that can amplify off those sources and add homology arms between them 3) PCR up the fragments and 4) do some assembly chemistry to make it all come together. The advantages of the PCR-based methods are 1) it's the fastest way to assemble 2) You can assemble in any order and any length just as with BioBrick methods but unlike with Golden Gate 3) You can introduce small snippets of additional sequence in your oligos used during the initial PCRs.  You can also make mutations using those oligos.  Also, you don't end up with BioBrick scars between your parts.  The PCR methods are usually much cheaper and faster than total gene synthesis, but substantially more expensive than the cut-and-paste methods.  
The extra expense comes from the custom oligos needed to bridge each junction ($5 to $10 per junction) and from the higher error rates of PCR and homology-based joining methods, which require complete sequencing of the final product if you really want to be sure that it is error free.
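As a rough illustration of step 2 in the scheme above (designing oligos that add homology arms), here is a minimal Python sketch. The 20-nt arm and annealing lengths and the example sequences are assumptions chosen for illustration; real designs are tuned for melting temperature and for the particular assembly chemistry.

```python
# Sketch: design the two oligos that create homology across one junction.
# The 20-nt lengths are an assumption for illustration only.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def junction_oligos(upstream: str, downstream: str, arm: int = 20, anneal: int = 20):
    """Return (reverse oligo for the upstream fragment,
    forward oligo for the downstream fragment)."""
    # Forward oligo: a 5' tail copied from the end of the upstream fragment,
    # followed by the start of the downstream fragment (the annealing region).
    forward = upstream[-arm:] + downstream[:anneal]
    # Reverse oligo: anneals to the end of the upstream fragment and carries
    # a 5' tail matching the start of the downstream fragment.
    reverse = reverse_complement(upstream[-anneal:] + downstream[:arm])
    return reverse, forward


# Hypothetical fragment sequences, made up for this example.
frag_a = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTC"
frag_b = "GGATCCAAACTCGAGTAAGGATCTCCAGGCATCAAATAAAACGAAAG"

rev_a, fwd_b = junction_oligos(frag_a, frag_b)
print("Reverse oligo for fragment A:", rev_a)
print("Forward oligo for fragment B:", fwd_b)
```

Each junction needs its own pair of oligos, which is where the per-junction oligo cost mentioned above comes from.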

The general idea of assembly is that you reuse existing DNAs and stitch them together into larger DNAs. Whether it's PCR-based or cut-and-paste based, you get pretty dramatic cost and time savings by doing assembly rather than "starting from scratch" each time using total gene synthesis. Actually, gene synthesis itself involves assembly -- due to error rates in gene synthesis, it also follows a two-phase process of synthesizing gene-length DNAs (now called synthons rather than parts) and then using homology-based or cut-and-paste-based assembly chemistries to assemble them into longer lengths.

So, unless money and time are no object to you, you'll want to learn these assembly methods. Here, we'll go through a BioBrick-based assembly scheme and show how you can re-use preexistent composite parts. Regardless of which assembly methods you choose to use, step 1 is to write out what you want and then determine which preexistent parts you'll make it from.

Step 1: Identify the basic parts available
You need to establish what basic parts will be needed to make what you want. There's now an automatic way of doing this using new software called j5, but it's still routine to do it by inspection. Look at the complete list of available basic parts and try to make a complete cassette out of them. If there is some "missing" function in the BioBrick library, make a basic part for it.

Step 2: Write out the design for your composite part
Once you've identified the appropriate basic parts, write them down in order. So, you might make something like this:

I0500.b0034.Bca1117.b0015.r0040.Bca1046.b0032.E0040.b0016.Bca1046.b0034.E1010.b0015
Pbad  rbs   Cre     term  Ptet  Lox     rbs   GFP   term  Lox     rbs   RFP   term

What this thing is supposed to do is constitutively produce GFP under normal growth conditions. When the cells are exposed to arabinose, the Cre protein gets made, the region between the two Lox sites would be excised, and the cells turn red. ...at least that's what it's designed to do.
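The intended behavior can be checked with a toy logic model in Python. This is only a sketch of the design logic, not a simulation of real recombination: the design is an ordered list of (part, role) pairs taken from the line above, Cre is assumed to delete everything between the two Lox sites and leave one Lox behind, and a terminator is assumed to stop expression completely. The function names are made up for illustration.

```python
# Sketch: which reporter is read from Ptet before and after Cre excision?
# A toy model of the design logic only.

DESIGN = [
    ("I0500", "Pbad"), ("b0034", "rbs"), ("Bca1117", "Cre"), ("b0015", "term"),
    ("r0040", "Ptet"), ("Bca1046", "Lox"), ("b0032", "rbs"), ("E0040", "GFP"),
    ("b0016", "term"), ("Bca1046", "Lox"), ("b0034", "rbs"), ("E1010", "RFP"),
    ("b0015", "term"),
]


def excise_between_lox(design):
    """Remove everything from the first Lox up to (not including) the second Lox."""
    lox_positions = [i for i, (_, role) in enumerate(design) if role == "Lox"]
    if len(lox_positions) < 2:
        return list(design)
    first, second = lox_positions[0], lox_positions[1]
    return design[:first] + design[second:]   # one Lox site remains


def reporter_from_ptet(design):
    """Return the first reporter downstream of Ptet before a terminator, if any."""
    roles = [role for _, role in design]
    start = roles.index("Ptet")
    for role in roles[start + 1:]:
        if role == "term":
            return None
        if role in ("GFP", "RFP"):
            return role
    return None


print(reporter_from_ptet(DESIGN))                       # before arabinose: GFP
print(reporter_from_ptet(excise_between_lox(DESIGN)))   # after Cre acts: RFP
```

In the model, the terminator after GFP is what keeps RFP silent until the GFP cassette is removed.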

Before you start making something like this, you need to analyze it a little more. There is a good chance that some of the substructures of the composite have already been made. For example, we already have 3 useful parts in our toolbox:

E0241: b0032.E0040.b0016
       rbs   GFP   term

I13507: b0034.E1010.b0015
        rbs   RFP   term

Bca9089: b0034.Bca1117
         rbs   Cre

Step 3: Minimize the Design
So, we can simplify our design as:

I0500.Bca9089.b0015.r0040.Bca1046.E0241.Bca1046.I13507
Pbad  (cre)   term  Ptet  Lox     (gfp) Lox     (rfp)
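The substitution done by inspection here can also be sketched in Python. This is a simple greedy heuristic (scan left to right, try the longest known composite first), offered as an illustration under that assumption rather than as how any particular design tool works. The composite definitions are the three parts listed in Step 2; the function name is made up.

```python
# Sketch: replace runs of basic parts with composite parts that already exist.
# Greedy left-to-right matching, longest composite first.

KNOWN_COMPOSITES = {
    "E0241": ["b0032", "E0040", "b0016"],
    "I13507": ["b0034", "E1010", "b0015"],
    "Bca9089": ["b0034", "Bca1117"],
}


def minimize(design, composites=KNOWN_COMPOSITES):
    """Return the design with known sub-assemblies collapsed into single parts."""
    by_length = sorted(composites.items(), key=lambda kv: len(kv[1]), reverse=True)
    result, i = [], 0
    while i < len(design):
        for name, members in by_length:
            if design[i:i + len(members)] == members:
                result.append(name)       # reuse the existing composite
                i += len(members)
                break
        else:
            result.append(design[i])      # no composite starts here
            i += 1
    return result


full = "I0500.b0034.Bca1117.b0015.r0040.Bca1046.b0032.E0040.b0016.Bca1046.b0034.E1010.b0015"
print(".".join(minimize(full.split("."))))
# I0500.Bca9089.b0015.r0040.Bca1046.E0241.Bca1046.I13507
```

Collapsing the 13 basic parts to 8 pieces removes 5 junctions that would otherwise have to be built.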

This is as simple a construction as could be designed based on the current set of available parts. Let's now abstract this design a little further as just:

A.B.C.D.E.F.G.H

Step 4: Parallel (or convergent) synthesis
In this part of the tutorial, I'm not going to tell you how you actually go about making the junctions between parts. It's pretty complicated and we're constantly updating the protocols, so we'll cover this when you actually get to lab. For now, just keep in mind that you install each "." one at a time. The most efficient way of making the above part is therefore to assemble it convergently: split it in half, then half again, and so on until you only have single pairs left.

Round 3        A.B.C.D.E.F.G.H
Round 2     A.B.C.D       E.F.G.H
Round 1   A.B     C.D   E.F     G.H

So, now we have a plan for putting the thing together. In round 1, we join A with B, C with D, E with F, and G with H. Each of these dimeric composite parts will be assigned a number like E0241. Next time that dimer is called for in a construction, you'll be starting one step ahead. In round 2, we join AB with CD and EF with GH. In round 3, we join ABCD with EFGH, and then we're done!
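The round-by-round plan can be generated with a short Python sketch. It joins neighbors pairwise in each round, which for a number of parts that is a power of two (like the 8 here) gives the same tree as splitting in half repeatedly. The function name is made up for illustration, and an odd leftover part is simply assumed to wait for the next round.

```python
# Sketch: plan a convergent assembly by joining neighbors pairwise each round.

def plan_rounds(parts):
    """Return a list of rounds; each round is a list of (left, right) joins."""
    current = list(parts)
    rounds = []
    while len(current) > 1:
        joins, next_level = [], []
        for i in range(0, len(current) - 1, 2):
            joins.append((current[i], current[i + 1]))
            next_level.append(current[i] + "." + current[i + 1])
        if len(current) % 2:              # an odd part waits for the next round
            next_level.append(current[-1])
        rounds.append(joins)
        current = next_level
    return rounds


rounds = plan_rounds("A.B.C.D.E.F.G.H".split("."))
for number, joins in enumerate(rounds, start=1):
    print(f"Round {number}: " + "   ".join(f"{l} + {r}" for l, r in joins))
```

For 8 parts this gives 3 rounds and 7 junctions in total. Building the same construct one part at a time would need the same 7 junctions but 7 sequential rounds, since the joins within a round can be done in parallel.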