20.20/Biocomputing/Specs

BioBricks / Registry of Standard Biological Parts <-- convenience link

Questions & Action Items

 * Specify exactly what molecules will be going in each device


 * Figure out nuclear import/export as necessary (LOL EUKARYOTES)


 * We need a testing and debugging plan!


 * We need a 6-person, 6-month work plan for testing/debugging!


 * We need to practice our presentation!


 * General: we are working in yeast, so we need to verify that everything we use will work in yeast. (This is an issue because mostly these seem to be from bacteria or phage, and tested in those organisms.)
 * Natalie seems to be the local yeast person.
 * In re: degradation tags in particular, Natalie writes: The protein degradation system in yeast is, indeed, not the same as the ones in bacteria (where the 3 letter acronyms are often the amino acids that target proteins to be degraded). Here's a link to a review of a temperature inducible "degron" in S. cerevisiae: pubmed link. There are other links from that page that might be helpful.
 * There are some BioBricks specifically for yeast. However, they are very few, and the only ones that look immediately applicable are the fluorescent proteins, which aren't that crucial anyway (see Parts List>Biobricks>Misc below). There is also a terminator, but that page doesn't give any sequence information, and I can't find it elsewhere.


 * How do you use a YAC?
 * Natalie writes: You could always just start at NCBI and find a clone by searching for "yeast artificial chromosome." And here's a review I found by doing a quick glance at the Pubmed results from "YAC review": pubmed link. There's tons of great data. TK put some of it together on OWW here: Yeast artificial chromosomes


 * How do we get a repressor binding site by itself, separately from a repressible promoter? (We can't just put two promoters in a row, one CLB1 and one repressible -- what we need is a CLB1 promoter that is artificially made repressible by something of ours. Whatever we use to shut off the first bit, it can't be something that will shut off all the CLB1 in the whole cell!)


 * Choice of repressible promoters. We have three (LacI, TetR, Lambda cI). We only really need two. Suggest using LacI and Lambda cI since their pages say "high" and "strong", whereas TetR's says "medium".


 * At many points we will have to make a new protein generator BioBrick, starting from no more information than the sequence for the protein coding region. In these cases, what affects our choice of RBS, terminator(s), and filler text? Can we just look at the ones that are precedented in working BioBricks and choose the strongest / most appropriate ones?


 * When composing our own BioBricks, in what cases is it desirable to use the TACTAGAG filler text that seems to be in common use? Is it meaningful in some way? (NB: the previous judgment is made only from looking at one group's BioBricks... but that set includes working devices, so nyah.)


 * Hin invertase seems to need Fis as a cofactor. This will make one bit rather irregularly shaped. Luckily, Fis doesn't do anything by itself (it's just a helper for Hin) so I'm pretty sure we can have it under another copy of the same promoter as Hin, and put the Fis generator someplace in the vicinity of the Hin generator on the genome. (Luckily the Fis binding sites are already a BioBrick!)

Overall Design
See the brainstorms page for the system design as of Tech Spec Review.

We are planning to embed the whole system in a Yeast Artificial Chromosome (YAC); see below for details.

Transcriptional Regulation
At Tech Spec our system relied on inhibiting the activity of already-expressed invertases, rather than regulating their expression. We moved away from this because we don't know of any naturally occurring proteins that inhibit invertases. One idea we had was to engineer our own inhibitors: create RNA fragments that imitate the DNA sequences to which the invertases bind, so they would competitively inhibit the inversion reaction. This was a really neat idea, but relatively speculative (e.g., will DNA invertases even bind RNA at all, even if the sequence is right?). The nice thing about this design was that it probably would have worked more tightly than our current system, because the invertases were being regulated/pulsed more directly. Now we're going to hope that a combination of degradation tags on invertases, and the whole system being shut off by chromatin packing for mitosis, will be enough to keep reversion under control. Given that invertases aren't that fast anyway, this may not be a fundamental problem (though certainly something that would need to be tuned by experiment).

Concurrency
At Tech Spec, the bit flips happened in series. This turns out to be a bad thing because inversion is a relatively slow reaction (on the order of the length of a cell cycle in bacteria), and if you need several of them to go in series then it doesn't scale. Our new system passes the state changes down two separate paths: a fast path using repressors, and a slow path using invertases. The fast path still technically happens in series, but it's plenty fast enough to avoid the cell-cycle-length problem. Basically, repressors do the same thing the invertases were going to do -- turn things off early -- but without a change in actual state; all the repressors can do is temporarily change a promoter's activity, not semi-permanently change its state (orientation). Meanwhile, prompted by the fast path, the necessary state changes in the slow path all happen at roughly the same time. Woo!
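To see why the two-path design scales better, here is a back-of-the-envelope sketch. The timings (a 90-minute inversion, a 5-minute repression step) are made-up placeholders, not measurements, and the function names are ours. It also assumes the slow path overlaps completely, which is only roughly true.

```python
def serial_time(n_bits, t_invert):
    """Old design: each bit waits for the previous bit's inversion to finish."""
    return n_bits * t_invert


def two_path_time(n_bits, t_repress, t_invert):
    """New design: repressors relay the signal in series (fast path),
    while the inversions (slow path) run roughly in parallel."""
    return n_bits * t_repress + t_invert


# Hypothetical timings in minutes: 90 per inversion, 5 per repression step.
for n in (3, 8):
    print(n, "bits:", serial_time(n, 90), "vs", two_path_time(n, 5, 90))
```

The serial cost grows with the slow step, while the two-path cost grows only with the fast step.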

Lower Bound
We are now aiming for our system to give a lower bound on the number of cell divisions. The reasoning is that invertases are more likely to erroneously fail to invert than to erroneously invert, so any errors will make the count read low. So the last bit is different now. When it is toggled to ON, it activates a fluorescent protein reporter, and then turns the whole system off by repressing the first bit. This will give the system a hard upper limit and prevent it from cycling (which would confuse people, especially if there are relatively few bits). This would also have the nice effect of allowing researchers to FACS out those cells that (mostly) have or have not yet reached a certain borderline age.

Devices List
All our devices are really specialized cases of a general bit template. There are two bit halves: a pulser part and a bit part.

Generalized Bit
Activator A & Repressor A are bit-internal -- when the pulser is activated, they act in concert on the promoter on the bit part, producing a pulse of Repressor B & Invertase B expression -- these will work on the next bit. At this point, Repressor B quickly acts on the repressible promoter in the pulser of the next bit. If the promoter is ON then it is turned OFF, and if it is OFF then Repressor B doesn't really do anything. The next bit then takes its action, propagating the signal. Meanwhile, Invertase B goes to work toggling the orientation of the promoter on the next bit, essentially making 'permanent' the temporary change Repressor B made. NB: The promoter is ineffective/OFF while it's being inverted, but this does not harm our system. If the next bit needs to be toggled to OFF, then Repressor B does it quickly (by activity) and then Invertase B does it slowly (by orientation). If the next bit needs to be toggled to ON, Repressor B doesn't really do anything and the Invertase B does it slowly. It's OK for the ON-wards toggle to be slow/late, as it will be, because only OFF-wards toggles are supposed to cause a signal to propagate down the chain of bits. A bit's ON-wards toggle can take as long as it likes because it need not, and will not, propagate any signal.
 * Pulser part: X-[Repressible promoter]-X---[Activator A gene]--[IRES]--[Repressor A gene]
 * Bit part: [A-activatable promoter][Repressor A binding site]---[Repressor B gene]--[IRES]--[Invertase B gene]

First Bit
The first bit is under the control of a Cyclin CLB1 promoter instead of an invertible/repressible promoter, placing the whole system under control of the cell cycle. (Specifically, the whole thing goes just before mitosis.) Immediately after the promoter is a repressor binding site that the final bit will use to turn the system off. The rest of the first bit is the same as a generalized bit.

Final/Reporter bit
The final bit is completely different: it only has a "first half", a pulser part. Following the invertible/repressible promoter (which is normal), there is a GFP gene and a gene for a repressor that will travel back to the first bit and turn the whole system off.

Our system: three bits as proof of concept
In theory, this system can be scaled to arbitrarily large numbers of bits. In practice, we are limited by the number of distinct repressors and invertases we can find. For our project we will specify three bits (a first bit, an intermediate bit, and a reporter bit), towards a proof of concept.
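The counting behaviour described in the device sections above can be sketched as a toy model. This is one reading of the design, not a validated simulation. It assumes every invertible bit starts OFF, that the CLB1-driven first bit fires once per cell cycle, and (as a simplification of ours) that a missed inversion changes nothing and passes no signal onward. The names `cell_cycle`, `divisions_until_reporter` and `p_miss` are made up for illustration.

```python
import random


def cell_cycle(bits, p_miss=0.0, rng=random):
    """Advance the invertible bits by one cell cycle.

    bits[0] is the intermediate bit and bits[-1] the reporter bit. The first
    (CLB1-driven) bit is not stored: it simply fires once per cycle.
    """
    bits = list(bits)
    if bits[-1]:                   # reporter ON: first bit is repressed
        return bits
    for i, was_on in enumerate(bits):
        if rng.random() < p_miss:  # simplification: a missed inversion
            break                  # changes nothing and passes no signal
        bits[i] = not was_on       # invertase flips the promoter
        if not was_on:             # ON-ward toggle: nothing propagates
            break
    return bits


def divisions_until_reporter(n_invertible, p_miss=0.0, rng=random,
                             max_cycles=10_000):
    """Count cell cycles until the reporter bit turns ON (None if never)."""
    bits = [False] * n_invertible  # assumption: everything starts OFF
    for n in range(1, max_cycles + 1):
        bits = cell_cycle(bits, p_miss, rng)
        if bits[-1]:
            return n
    return None


# Our three-bit system has two invertible bits (intermediate + reporter).
print("ideal:", divisions_until_reporter(2))
rng = random.Random(0)
print("with 20% missed inversions:",
      [divisions_until_reporter(2, 0.2, rng) for _ in range(5)])
```

In this model, missed inversions can only delay the reporter, never fire it early, which is the sense in which a lit reporter gives a lower bound on the number of divisions.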

Parts List: By Device
Nomenclature:
 * In this section, I have used 'gene' to mean what BioBricks calls a 'generator': an RBS, a protein coding region, and a couple of terminators.
 * Repressors, activators, and invertases are numbered by the device where they act, not the device where they are generated.
 * Each bit is numbered 1, 2, 3...
 * The pulser part of bit 1 is 1a
 * The bit part of bit 1 is 1b

First Bit (1)

 * Pulser part (a)
    * Cyclin CLB1 promoter
    * Repressor 1a binding site
    * Activator 1b gene
    * IRES
    * Repressor 1b gene
 * Bit part (b)
    * Inducible promoter (by Activator 1b)
    * Repressor 1b binding site
    * Repressor 2a gene
    * IRES
    * Invertase 2a gene

Intermediate Bit (2)

 * Pulser part (a)
    * Repressible promoter (by Repressor 2a)
    * Invertase 2a recognition sites, around promoter
    * Activator 2b gene
    * IRES
    * Repressor 2b gene
 * Bit part (b)
    * Inducible promoter (by Activator 2b)
    * Repressor 2b binding site
    * Repressor 3a gene
    * IRES
    * Invertase 3a gene

Final/Reporter bit (3)

 * Pulser part only!
    * Repressible promoter (by Repressor 3a)
    * Invertase 3a recognition sites, around promoter
    * Fluorescent reporter gene
    * IRES
    * Repressor 1a gene (acts to shut off whole system at first bit)
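Since regulators are numbered by the device where they act, the parts list above can be checked mechanically: every regulator a device makes should have a matching site in the device it is named after, and every site should have something making its regulator. The sketch below is a hypothetical bookkeeping aid. The dictionary layout and the function name `wiring_problems` are ours, and the entries are transcribed by hand from the lists above.

```python
# Each device: what it responds to (sites/promoters) and what it makes.
DEVICES = {
    "1a": {"responds_to": {"Repressor 1a"},
           "makes": {"Activator 1b", "Repressor 1b"}},
    "1b": {"responds_to": {"Activator 1b", "Repressor 1b"},
           "makes": {"Repressor 2a", "Invertase 2a"}},
    "2a": {"responds_to": {"Repressor 2a", "Invertase 2a"},
           "makes": {"Activator 2b", "Repressor 2b"}},
    "2b": {"responds_to": {"Activator 2b", "Repressor 2b"},
           "makes": {"Repressor 3a", "Invertase 3a"}},
    "3a": {"responds_to": {"Repressor 3a", "Invertase 3a"},
           "makes": {"Fluorescent reporter", "Repressor 1a"}},
}


def wiring_problems(devices):
    """List regulators with no target site, and sites with no regulator."""
    made = set()
    for dev in devices.values():
        made |= dev["makes"]
    problems = []
    for label, dev in devices.items():
        for product in sorted(dev["makes"]):
            target = product.rsplit(" ", 1)[-1]   # e.g. "2a" in "Repressor 2a"
            if target in devices and product not in devices[target]["responds_to"]:
                problems.append(f"{product} (made in {label}) has no site in {target}")
        for regulator in sorted(dev["responds_to"] - made):
            problems.append(f"{label} expects {regulator}, which nothing makes")
    return problems


print(wiring_problems(DEVICES) or "parts list is self-consistent")
```

The CLB1 promoter is left out on purpose, since it is driven by the cell cycle rather than by anything we make.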

Repressors, binding sites, & repressible promoters

 * 1a
    * Repressor 1a:
    * Rep 1a binding site:
 * 1b
    * Rep 1b:
    * Rep 1b binding site:
 * 2a: LacI
    * Rep 2a:
    * Repressible promoter 2a:
 * 2b
    * Rep 2b:
    * Rep 2b binding site:
 * 3a: Lambda cI
    * Rep 3a:
    * Repressible promoter 3a:

Inducible promoters & activators

 * 1b
    * Activator 1b:
    * Inducible promoter 1b:
 * 2b
    * Activator 2b:
    * Inducible promoter 2b:

Invertases & cut sites
We can choose any 2 of Hin, FimB, and Cre. Hin has the advantage of already being a BioBrick and the disadvantage of requiring a Fis cofactor. FimB and Cre have neither of these pros/cons as far as we know. Ergo we can choose whichever two we like, maybe taking other factors into consideration.
 * 2a
    * Invertase 2a:
    * Inv 2a Recognition sites:
 * 3a
    * Invertase 3a:
    * Inv 3a Recognition sites:
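For the record, the "any 2 of 3" choice above can be listed out along with the extra parts each pair would drag in. This is only a tally of what is stated here (Hin needs Fis; FimB and Cre need nothing extra as far as we know); the table layout and function name are ours.

```python
from itertools import combinations

# Cofactor requirements as stated in this section (None = none known).
INVERTASES = {
    "Hin": {"cofactor": "Fis"},
    "FimB": {"cofactor": None},
    "Cre": {"cofactor": None},
}


def candidate_pairs(invertases):
    """Return every pair of invertases with the cofactors that pair needs."""
    pairs = []
    for a, b in combinations(invertases, 2):
        needed = (invertases[a]["cofactor"], invertases[b]["cofactor"])
        pairs.append(((a, b), sorted(c for c in needed if c)))
    return pairs


for pair, extras in candidate_pairs(INVERTASES):
    print(pair, "extra parts:", extras or "none")
```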

IRES

 * We only need one

Degradation tags
Do LVA et al work in yeast? cf. Natalie's pointers

Filler text / other joining
TACTAGAG? (What is this?)

Misc

 * Cyclin CLB1 promoter
 * Fluorescent reporter gene: use yeast codon optimized FP?
 * Nuclear localization tag

Repressors & Repressible Promoters

 * LacI system:
    * LacI repressible promoter: "high transcription"
    * LacI generator containing this LacI protein coding region with LVA degradation tag
 * TetR system:
    * TetR repressible promoter: "medium strength promoter"
    * TetR generator containing this TetR protein coding region with LVA degradation tag
 * Lambda cI system:
    * Lambda cI repressible promoter: "strong promoter"
    * Lambda cI generator containing this Lambda cI protein coding region with LVA degradation tag

Comments:
 * We only really need two of the three above since we're making 3 bits. Suggest using LacI and Lambda cI since they are advertised as "strong/high".
 * All three are tagged DNA Available and Experience: Works.
 * It's good that they all have degradation tags already put on. We may have to choose strong promoters for the repressors, but we can be confident they will switch off when they are switched off.

Invertases

 * Hin
 * Hin invertase protein coding region
 * We will want to add RBS and terminators. All of the repressor generators listed above use the same RBS (B0034) and terminators (B0010 and B0012); can we use them here as well? (I looked at some other random protein generator BioBricks and these seem to be in wide use (in working parts, too). But the page for B0012 says it doesn't work. What?)

Promoters that get inverted

 * Promoters
 * Can we use the same one on each bit? After all, all it needs to do is be constitutively on and work in yeast.
 * Invertase sites
 * hixC site (for Hin invertase) -- is symmetrical
 * In usage, hixC sites seem to be surrounded by the 'filler text' TACTAGAG (see, for example, several parts by "Davidson and Missouri Western", including both working and nonworking devices).

Degradation tags
We are probably going to want to put degradation tags on lots of things. Many extant BioBricks seem to be using the LVA degradation tag, which is listed as "DNA Planning" and "Experience: None", although it is used in parts listed as "DNA Available" and "Experience: Works". But there are loads of degradation tags, and I don't understand the differences between them.
 * There's a set of tags that have three-letter-acronym names: LVA, LAA, AAV, ASV. I guess these are interrelated somehow? They don't really have any distinguishing information listed.
 * There's another set of tags with different names: AANDENYALAA "very fast", AANDENYNYADAS "fast", AANDENYADAS "moderately fast". These have a bit more meta information, as well as references.
 * Question: do these degradation tags even work in yeast? They seem to be dependent on the presence of certain proteases/mechanisms -- where are these tags from, what organisms are they tested in?

Misc

 * Fis: promotes Hin inversion
 * Fis binding sites BioBrick
 * GFP generator -- more details on the specific protein encoded are here
 * Do we want to use instead one of the following yeast codon-optimized FPs? eYFP, eCFP, another YFP
 * Nuclear localization tag

Invertases

 * Cre-Lox
 * Inducible Cre generator -- we should be able to take this apart into what we want

'Filler text'
In what cases is it necessary or desirable to put filler text in between parts on a BioBrick we compose? Looking at several protein generator BioBricks containing invertase sites, the appropriate thing seems to be to use the filler text TACTAGAG in between everything, so that the device goes promoter - TACTAGAG - RBS - TACTAGAG - hixC site - TACTAGAG - protein coding region - TACTAGAG - hixC site - TACTAGAG - terminator 1 - TACTAGAG - terminator 2 - TACTAGAG - fis binding site. So, where does this TACTAGAG come from? Is it good for some reason (even if that reason is nothing more than "it's short and utterly meaningless")?
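One likely answer, worth checking against the Registry's assembly documentation: TACTAGAG looks like the "scar" left behind when two BioBrick parts are joined by standard assembly, i.e. a SpeI-cut end ligated to an XbaI-cut end. If so, it is not something we choose to add; it appears automatically between any two parts composed that way. The sketch below reproduces the junction on the top strand only. The prefix and suffix strings are the standard BioBrick ones as we understand them, the two toy parts are placeholders, and `join_parts` is a made-up helper.

```python
# Standard BioBrick flanks as we understand them (assumption: verify
# against the Registry before relying on these).
PREFIX = "GAATTCGCGGCCGCTTCTAGAG"   # EcoRI, NotI, XbaI sites
SUFFIX = "TACTAGTAGCGGCCGCTGCAG"    # SpeI, NotI, PstI sites
SPEI = "ACTAGT"                     # SpeI cuts after the first A: A^CTAGT
XBAI = "TCTAGA"                     # XbaI cuts after the first T: T^CTAGA


def join_parts(upstream, downstream):
    """Top-strand result of ligating a SpeI-cut upstream part
    to an XbaI-cut downstream part (parts must lack internal sites)."""
    left = upstream + SUFFIX
    right = PREFIX + downstream
    cut_left = left.rindex(SPEI) + 1    # keep everything through the A
    cut_right = right.index(XBAI) + 1   # drop everything through the T
    return left[:cut_left] + right[cut_right:]


# Toy placeholder parts, not real sequences.
print(join_parts("AAAA", "CCCC"))
```

The compatible CTAG overhangs ligate, but the resulting mixed site is recognised by neither enzyme, which is why the composite can be cut and extended again.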

Invertases

 * FimB

IRES
The first thing to do is figure out how these actually work.
 * There appear to be at least two IRES native to yeast: YAP1 and p150, cf. this paper (with p150 being stronger, and also (conveniently) able to work even when chopped up in pieces). Unfortunately, this is disputed and re-disputed.
 * Of course, we can also use IRES from viruses, some of which (a) work in yeast and (b) are actually well characterized (gasp!)

Other

 * CLB1 promoter (should be easy)
 * Fis generator
 * Sequence for Fis is at the bottom of this page
 * All it needs is a promoter, an RBS, and a terminator
 * ...and maybe an LVA degradation tag
 * Should it be under the same promoter as Hin, so they're expressed at the same time? At first pass this seems to be the thing to do.

YAC stuff
We are probably going to embed the whole system in a YAC, rather than in plasmids. (Adam Arkin advised us in this direction and we seem to agree with him.) The idea behind using plasmids was to separate the bits so that they don't crosstalk with each other when they're not supposed to. But this may not be that much of a concern, because each bit uses different internal proteins. As well, we could separate them somewhat by putting them in different regions of the YAC (Is this true?). The main advantage of YACs is that they are stable and reliable. Plasmids get ejected or mis-sorted in mitosis (and the effects are compounded with multiple plasmids). Even getting yeast to take up multiple plasmids might be a challenge. And carrying many plasmids creates a higher metabolic load.

"Minimal size for a YAC is between 50kb and 100kb, while maximum sizes are 1Mb to 3Mb." Don't know where/whether our system will fall in this window, but it certainly won't be off the big end. If it's off the small end, we can surely just add filler or some random other possibly-useful device.
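A quick sanity check for the size window quoted above. The system size used here is a made-up placeholder (we have not totalled our parts yet), and the constants take the conservative ends of the quoted ranges.

```python
MIN_YAC_KB = 100     # conservative end of the quoted 50-100 kb minimum
MAX_YAC_KB = 1000    # conservative end of the quoted 1-3 Mb maximum


def filler_needed_kb(system_kb):
    """Return how many kb of filler would bring the construct up to MIN_YAC_KB."""
    if system_kb > MAX_YAC_KB:
        raise ValueError("system is larger than the conservative YAC maximum")
    return max(0, MIN_YAC_KB - system_kb)


# Hypothetical 30 kb system: how much padding would it need?
print(filler_needed_kb(30), "kb of filler")
```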

YACs include telomeres, a centromere, a couple of other things to make them behave like chromosomes, and (usually) selectable markers, plus some means of introducing your DNA of interest.

TK's page points to a couple of premade YAC kits.
 * One that seems to be basically made of restriction enzyme sites in the middle.
 * This one looks relatively easy: it comes as two plasmids, one for each end of the chromosome. Comes with ampR and TRP1 for selection.
 * A bunch of others...

Check function of the BioBricks we compose

 * Lone repressor sites (near inducible promoters on pulser parts)
 * IRES
 * We may need to state a list of viral IRESes that the literature says work in yeast, so we can just try them and see which ones work
 * Protein generator devices derived from protein coding sequence alone (low priority; this is probably easy)

Pulses

 * How discrete can we get them?
 * Adjust stability of activator/repressor mRNAs
 * Also adjust degradation of pulse products

Degradation tags

 * Do LVA et al work in yeast? Does it have the right proteases? Or should we look for yeast-native degradation tags that aren't listed in BioBricks yet?
 * Must tune this together with the pulser

Nuclear localization tags

 * Which parts need it? Some will have native yeast localization, or yeast will be able to work with the original organism's localization sequences (suboptimally).
 * Apparently it is also possible to get nuclear localization by diffusion, just by overexpressing the hell out of the protein of interest

Tagging in general

 * Adding too many tags can affect folding and/or activity
 * For each part: does it retain folding & function with the appropriate tags? Test this in vitro and in vivo.
 * If a part needs both a nuclear localization and a degradation tag, does it matter which one comes first? Or can we put one on each terminus of the protein?

How best to fit the system into a YAC

 * Which of the YAC kits listed on TK's site is easiest / most appropriate?
 * How should we spread the parts around? Do we want them all close together, or widely separated by bit, or all the bit parts in one place and all the pulser parts in another, or...?

Archive
Invertase inhibition: How are we going to get inhibition of invertases at the protein level?
 * In nature they mostly seem to be regulated/inhibited at the transcription level; can we rework the pulser idea to take advantage of this?
 * Can we find invertases that have naturally occurring protein-level inhibitors?
 * Can we inhibit them ourselves by producing RNA that mimics the recognition sites and binds the invertases?
 * Will DNA invertases even bind RNA at all, even if the sequence is right?
 * Will they bind better if there are fewer uracils (i.e. non-thymines)? This isn't something we can control.
 * The DNA that codes for this inhibiting RNA would necessarily be made of recognition sites, and would attract the invertases to come flip it. How might this affect the invertase activity that we actually want? Would it help or hurt the pulse discreteness? (Presumably since this RNA is nothing but a decoy it would not be hurt by being inverted?)

Kay writes: You can do the regulation at the transcription level...that was my original idea, and is much more implementable with current parts. Attach a repressor coding region and an activator coding region to the flipping promoter, then tag the repressor so it goes away faster. For a short time during the switch, you will have activator but not repressor...if you put their operator sites upstream of an invertase gene, you will get a pulse of invertase expression.
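A toy calculation of the pulse Kay describes, under loud assumptions: once the flipping promoter shuts off, activator and repressor both decay exponentially (the tagged repressor faster), and the downstream promoter counts as active only while the activator is above, and the repressor below, a shared threshold. All numbers, units and names here are illustrative, not measured.

```python
import math


def pulse_window(a0, r0, deg_a, deg_r, threshold):
    """Return (start, end) of the interval with activator but no repressor,
    or None if the activator is gone before the repressor is."""
    t_rep_gone = max(0.0, math.log(r0 / threshold) / deg_r)
    t_act_gone = max(0.0, math.log(a0 / threshold) / deg_a)
    if t_act_gone <= t_rep_gone:
        return None
    return (t_rep_gone, t_act_gone)


# Hypothetical values: equal starting levels, repressor decays 10x faster.
window = pulse_window(a0=100, r0=100, deg_a=0.1, deg_r=1.0, threshold=10)
print("pulse from t=%.1f to t=%.1f" % window)
```

In this picture, pulse width is set by the gap between the two decay rates, so tuning the degradation tag on the repressor (and on the activator) is the main knob for pulse discreteness.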

This is what we have in our current diagram. (NB: this disagrees with the diagrams on the brainstorms page!) Now we're going to hope that a combination of degradation tags on invertases, and the whole thing being shut off by chromatin packing for mitosis, will be enough to keep reversion under control. Given that invertases aren't that lightning-fast anyway, this may not be a fundamental problem (though certainly something that would need to be tuned by experiment).