20.20/Biocomputing/Brainstorms

Basic Idea
Chain invertase genes and promoters together, with invertase recognition sites positioned to (in)activate their own or each other's promoters.

Gene would look like this: ---[P1]--[Inv1]-[P2]--[Inv2]-[P3]--[Inv3]--- etc...

Construct promoters such that their activity can be toggled by some subset of the invertases in the chain. E.g. Inv1 is initially ON --> Inv1 acts on P1 and P2 --> P1 inactivated, P2 activated --> New state: Inv1 OFF, Inv2 ON. Inv2 then acts on other promoters, and so on. We define the sequence of possible states by the placement of invertase recognition sites. Writing this will be akin to solving a logic puzzle.

The whole chain of invertases is somehow put under the control of an S-phase initializing factor. This way, after Inv1 acts, Inv1 is OFF and Inv2 is ON... but Inv2 isn't expressed yet because the cell enters S phase (expression is turned off and replication happens). Both daughter cells inherit the Inv1 OFF, Inv2 ON state. The process then iterates.
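
The state progression described above can be sketched as a toy simulation. This is only a sketch under assumptions that are not in these notes: each promoter is reduced to a boolean, exactly one invertase acts per cell cycle, and the ON invertase flips its own promoter OFF and the next promoter ON. The names `step` and `run_cycles` are made up for illustration.

```python
def step(promoters):
    """Advance the chain by one cell cycle.

    promoters[i] is True when promoter i (and thus invertase i) is ON.
    Assumption: the single ON invertase turns its own promoter OFF
    and the next promoter in the chain ON.
    """
    state = list(promoters)
    if True not in state:
        return state  # all OFF: terminal state, nothing left to act
    i = state.index(True)
    state[i] = False
    if i + 1 < len(state):
        state[i + 1] = True
    return state


def run_cycles(n_invertases, n_cycles):
    """Return the list of states visited, starting with only Inv1 ON."""
    state = [True] + [False] * (n_invertases - 1)
    history = [state]
    for _ in range(n_cycles):
        state = step(state)
        history.append(state)
    return history


if __name__ == "__main__":
    for cycle, s in enumerate(run_cycles(3, 4)):
        print(cycle, ["ON" if p else "OFF" for p in s])
```

In this toy model the chain ends in the all-OFF state and stays there, which matches the note further down that the all-OFF state is only usable as an endpoint.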

To read the number/state of the chain, PCR and sequence. We should probably come up with at least a speculative other way of reporting the number, because it seems like most people are going to want to filter out the cells that have gone through N cycles and then do something else with them. Some method of assaying invertase expression?

Actionable Questions

 * How exactly do invertases work at the base-pair level?
 * Are their recognition sites directional? Does it make a difference if the recognition sequences are "pointing the same way" vs. "pointing in different directions"?
 * It seems to be typical that DNA between same-orientation sites will be excised circularly, and DNA between opposite-orientation sites will be inverted, cf. Wikipedia on Cre recombinase and the Shufflons review. This could be problematic -- we don't want anything excised, only inverted. Can we work around?
 * What would be ideal: invertases that can work when their sites are compatibly oriented, and cannot work when they are incompatibly oriented. Whether or not these invertases are reversible-with-compatible-orientation is less important, although it would be really nice if one inversion made the sites either incompatibly oriented or permanently borked (I've worked out some preliminary diagrams for the latter case). This is not strictly necessary if we can fine-tune the expression/regulation so that each invertase really only works once before it's degraded and expression gets turned off.
 * Do we have an option of inversions that can or cannot be undone?
 * Answer seems to be `yes': apparently there are invertases that are completely reversible or completely irreversible, as well as ones that can reverse but prefer one state to the other.
 * In the case of irreversible invertases: how does the irreversibility work in terms of base pairs? Can we take a sequence that's been irreversibly inverted and make it so that it can be inverted again? (Possibly by inverting a smaller piece of DNA around one of the sites? This ties in to the directionality question.)
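
To make the orientation question concrete, here is a rough sketch of the excision-vs-inversion rule mentioned above (same-orientation sites excise a circle, opposite-orientation sites invert). It is a deliberate simplification and an assumption on our part: DNA is a list of `(name, orientation)` pairs, the sites themselves are left unchanged by the reaction, and `recombine` is a hypothetical helper, not a model of any particular enzyme.

```python
def recombine(dna, site):
    """Apply a recombinase to the first two copies of `site` in `dna`.

    dna is a list of (name, orientation) pairs, orientation being +1 or -1.
    Returns (new_dna, excised_circle), where excised_circle is None
    when nothing was cut out.
    """
    idx = [k for k, (name, _) in enumerate(dna) if name == site]
    if len(idx) < 2:
        return list(dna), None  # fewer than two sites: nothing happens
    i, j = idx[0], idx[1]
    inner = dna[i + 1:j]
    if dna[i][1] == dna[j][1]:
        # Same orientation: the intervening DNA leaves as a circle carrying
        # one site; one site stays behind on the original molecule.
        return dna[:i + 1] + dna[j + 1:], inner + [dna[j]]
    # Opposite orientation: the intervening DNA is inverted in place,
    # so its order is reversed and each element changes strand.
    flipped = [(name, -o) for name, o in reversed(inner)]
    return dna[:i] + [dna[i]] + flipped + [dna[j]] + dna[j + 1:], None


if __name__ == "__main__":
    inverted_repeat = [("A", 1), ("site", 1), ("P", 1), ("site", -1), ("B", 1)]
    direct_repeat = [("A", 1), ("site", 1), ("P", 1), ("site", 1), ("B", 1)]
    print(recombine(inverted_repeat, "site"))
    print(recombine(direct_repeat, "site"))
```

The second call shows the failure mode we want to avoid: with directly repeated sites the promoter `P` ends up on the excised circle instead of being flipped.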


 * Is there some kind of list/database of invertases? Especially one that gives their recognition sequences, and the directionality/reversibility information in the previous question?
 * It's difficult to search for them individually because "invertase" sometimes brings up a certain type of sucrase, and "recombinase" seems to bring up overly broad results (including all kinds of DNA snipping and splicing and whatevering enzymes, not just invertases).
 * The only invertase in BioBricks is Hin invertase. Seems to be well-known. TK is listed as an author; we could ask him about it.

 * How many invertase recognition sites can we put around/in a promoter before its function is compromised?
 * The answer seems to be that we can't put any within the promoter, but that we can put an arbitrary number around the promoter, especially because they tend to be small, like restriction enzyme sites.

Sometime during G1 or G2, whichever invertases are ON need to do their thing exactly once, and then they need to not act on the DNA any more until the next cell cycle.
 * How can we put the whole mechanism under the control of some S-phase-linked protein, such that the timing is correct?
 * Which time is better, G1/S transition or G2/M transition?


 * How can we use that "biological edge detection" thing that Kay mentioned, to get a brief pulse of expression?


 * The Hin invertase BioBrick has an LVA degradation tag added; how fast does this work? Can we add it to other invertases and BioBrickify them?


 * Not a question, but something to note: we can't use the state where all invertases are OFF, except as an endpoint.

 * Can we make this cyclic?
 * This would only be really useful if we could combine it with some indicator of how many counter-cycles have passed. E.g. Raffi's idea of "GFP with invertase sites that turns on after X cell cycles". E.g. GFP- indicates states 1-8, GFP+ indicates states 9-16. This could increase our range. Alternatively, make the cycle long enough that it will take yeast cells a nontrivial number of hours to replicate that number of times.
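
The GFP range-extension idea (GFP- for states 1-8, GFP+ for states 9-16) amounts to using GFP as an extra high-order bit on top of a cyclic chain. A tiny sketch of the decoding, where `decode_count` and the default chain length of 8 are just illustrative assumptions taken from the example numbers:

```python
def decode_count(chain_state, gfp_on, chain_length=8):
    """Combine the chain state (1..chain_length) with a GFP flag.

    Assumption: GFP switches ON once, after the chain has gone
    through one full pass, and stays ON afterwards.
    """
    if not 1 <= chain_state <= chain_length:
        raise ValueError("chain_state out of range")
    return chain_state + (chain_length if gfp_on else 0)


if __name__ == "__main__":
    print(decode_count(3, gfp_on=False))  # state 3 on the first pass
    print(decode_count(3, gfp_on=True))   # state 3 on the second pass
```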

 * What does yeast's invertase space look like?
 * How many useful invertases are there that yeast isn't using? How do we avoid our invertases doing unpredictable things to the yeast genome, or yeast's endogenous invertases messing with our chain?

 * How exactly does one take an arbitrary gene and get yeast to express it?
 * This question is certainly answerable by a quick literature search. (Is 'transform' or 'transfect' the right word to use here?)

Invertases

 * Hin recombinase
 * Cre recombinase

Thoughts on Ribozyme
Main idea: something like an RNA antibody or a generalized/larger tRNA. Would bind to a specified RNA sequence, change conformation, and trigger some change in an associated protein (anything from GFP activation to a super complicated kinase cascade).

Other than what we thought of for 3 Ideas, why did Sussman think this was so important??

Trivial note: we should agree on what to call this among ourselves. Aptamer, antibody, signal RNA, tRNA++, ribozyme... for simplicity's sake we should pick a term and stick with it. Unless the word "it" will do, now that we're only working on one idea.

NB: It's "aptamer", not "aptomer" -- no wonder we couldn't find any information on Google etc.

Ideas for 3 Ideas
1. Simple computations

Something along the lines of what Dr. Weiss mentioned in the virus and diabetes part of the presentation: either developing something that uses a simple logic gate to combine two feedback mechanisms (ANDed, or WHATEVERed, for precise targeting) a la that awesome virus, or some sort of program for cell differentiation. Imagine being able to program stem cells to differentiate according to environment! Alioth: we could actually program for astrocytes, perhaps! ...Something to talk to TK about, esp. in the context of the model organism he's developing: it may provide us with a great platform for this work.

2. Ontology

The BioBricks project has a significant problem dealing with a general ontology to categorize parts, and figuring out how to make parts work together in a consistent and regular way. This is just as much a problem as actually developing parts, since it's about making the framework within which we think about them. This could include developing the rudiments of a programming language which "compiles" to BioBricks as well as a classification system for all BioBricks by function. This is neat because it takes advantage of things like *'s 6-ness, Alioth's linguistics, and whatever it is that I do.

3. Neural Nets

Thomas DeMarse seems to have done some neat work in programming brain cells to actually do things in a simulated environment. Alioth: could you email him and ask about this work? You seem to know the most about brain development. I can help with any of the embodied-mind stuff he talks about in his earlier work (I did a project on that kind of thing last term).

4. Intercellular computation

One of the big problems is noise and breakdown. Why not have multiple cells computing single problems? The Game of Life is demonstrably Turing-complete.

Maybe ask jakebeal, tk, gjs? Kay: thoughts?

5. Development of a new part

Unfortunately, I have no idea what this would look like. *This* is something to ask gjs, tk, and jakebeal about.

Thought: Why is everyone focusing on logic gates of the same type as are the `atoms' of electronic circuits? Do/should complex protein networks use different computational atoms? Perhaps better suited to the cellular environment (noise! scalability!), or perhaps different simply because they manipulate actual physical molecules, not electronic pulses. This is more of a pure-science question, but we can twist it into coming up with some kind of protein-computational atom that's not the equivalent of a logic gate. (Though...do natural protein networks even *use* atoms, or are they just big nonmodular kluges?)

6. Turing machines

DNA is a tape read by a head of enzyme complexes which interpret and process the tape according to its instructions. DNA processing is, in other words, about as close to a proposed Turing machine as you can get. Devising computational models along these lines may, in fact, be the way to go with biology! After all, we *know* it can avoid noise, dissipation, conflicts, etc., because we have empirical evidence that it already works! This is huge, because it a) takes advantage of existing systems and b) is Something Completely Different. Of course, it may be pretty far out (but I would bet even money that this becomes The Way To Go within 20 years...)

7. Aptamers (inspired by Sussman & Knight)

"Antibody" made of RNA. One end binds to a specified RNA sequence (by complementary base pairing like a tRNA); the other end binds to some kind of standard indicator (visible reporter protein, kinase cascade to perform some other function, ...). This would eventually be the first part of a two-part system, where the second part is whatever you need to do whatever needs to be done in response to the RNA sequence being detected. Feasible, well defined -- in particular, it's a lot easier to predict the secondary/tertiary structure of RNA than protein. Perhaps we could even borrow from what's known about the structure & assembly of tRNAs & other aptamers. Possibly immediately useful for gene expression assays.

It remains to be seen/remembered how exactly Sussman thought this could trigger universal computation. We should probably hit up both him and TK again soon.

Presentation Notes
1 & 2. It would be neat if we could open with two movies: the first one would show either a chip fab process or some sort of large computer system at work. The second would show DNA processing. We should be able to cut the second video out from "Inner Life of a Cell." Thoughts on video #1?

3. Intro: what is biological computing?

Here we introduce the notion of biological computing as well as the advantages hinted at in 1 & 2.

4, 5, 6. The three ideas. Format follows outline on OpenWetWare wiki.

7. Contributions - This is a summation of the project, and what we've discussed overall. This is the last slide, and (according to phw) the most important one, regardless of the actual content. It takes its content from 3-6 and is a brief summary/index.

We have Keynote. It makes things look nice.

3 Ideas Presentation Outline
[These are the notes from which I compiled my broken-glass diagram -- Kelly Drinkwater 19:09, 1 April 2008 (EDT)]

(It looks like we're going to do the aptamer idea for the actual design project, so the presentation of the other two ideas is less important, but still an opportunity to talk about cool stuff.)

 * Introduction: What is Biocomputing?

Limitations of silicon. Biocomputing uses the same sort of abstract mechanisms, just implemented in a different medium.

Advantages:
 - Way smaller
 - Individual components not faster... but massively parallel!
 - Can survive in living environments, where silicon/electronics would degrade

Challenges:
 - noise
 - imprecision / stochastic effects (reliable behavior only emerges as an average of many unreliable components) (Inner Life of Cell -- kinesin interpretive dance?? HAHA)
 - more stuff
 - Beating electronic computers. Obviously, electronic computers are our direct competitors.


 * Idea 1. Intercellular communication (Game of Life)

One of the major problems with biological computing is noise. If you try to cram everything into a single cell, it has to be small, and you have a lot of difficulty with random molecular motion because you're working in a liquid medium at room temperature. Making things bigger tends to make molecular-level noise less of a problem. So why not use multiple cells, each containing a smaller piece of machinery, and have them communicate with each other? We can also take advantage of the cell's native signalling machinery.
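
One way to put a number on "more cells, less noise" is a back-of-the-envelope majority vote. This is our own illustrative assumption, not something from the discussion: each cell independently gets the right answer with probability `p_correct`, and the population's answer is whatever a strict majority says. `majority_correct` is a hypothetical helper name.

```python
from math import comb


def majority_correct(n_cells, p_correct):
    """Probability that a strict majority of n_cells independent cells is right."""
    need = n_cells // 2 + 1  # smallest number of correct cells that wins the vote
    return sum(
        comb(n_cells, k) * p_correct**k * (1 - p_correct) ** (n_cells - k)
        for k in range(need, n_cells + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 11, 101):
        print(n, majority_correct(n, 0.9))
```

Real cells are unlikely to fail independently (shared environment, shared signals), so this is at best an optimistic bound on what redundancy buys.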

See this highly usable Life applet, which we could show in class instead of laboriously making our own animated GIF of e.g. a glider gun. It looks like the cells are moving around and communicating with each other, but actually they are just individually obeying very simple rules. If a live cell has fewer than two live neighbors, it gets lonely and dies. If it has more than three, it dies of overcrowding. Live cells with two or three live neighbors are happy and stay alive. And any dead cell with exactly three live neighbors comes to life. From these simple rules, you get wildly complex behavior at the system level. Cells can form self-sustaining structures like the glider, and you can get recursive structures (glider gun; glider gun gun). You can also make logic gates. It's been demonstrated that the Game of Life can act as a universal computer.
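
The rules above fit in a few lines of code, which might be a handy fallback if the applet isn't available in class. A minimal sketch on an unbounded grid; `life_step` is our own name, and representing the board as a set of live `(x, y)` coordinates is just one convenient choice.

```python
from collections import Counter


def life_step(live):
    """One generation of Conway's Game of Life on an unbounded grid.

    live is a set of (x, y) coordinates of living cells.
    """
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly three neighbors; survival on two or three.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


if __name__ == "__main__":
    glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
    cells = glider
    for generation in range(5):
        print(generation, sorted(cells))
        cells = life_step(cells)
```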

(Recall Ron Weiss' video of GFP cells doing amorphous Life. Can we get a copy???) Ron Weiss has made these glowing bacteria that play Life amorphously. The emergent behaviour is not so precise but you still get the same sort of complexity. Our possible followup: create a cell that's easily programmable with an arbitrary set of simple GoL-like rules that produce GoL-like behavior.


 * Idea 2. DNA Turing machine

A Turing machine is a theoretical universal computer. It consists of a tape with symbols on it, and a head that moves along the tape and interprets the symbols.
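
For the presentation it may help to show how little machinery that definition needs. Below is a minimal simulator sketch; the rule-table format, the `"start"`/`"halt"` state names, and the function name `run_turing_machine` are all our own illustrative choices, and the bit-flipping example machine is just a toy.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 (left) or +1 (right). The machine stops when it
    reaches the state "halt" or after max_steps steps.
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(pos, blank)
        new_symbol, move, state = rules[(state, symbol)]
        cells[pos] = new_symbol
        pos += move
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Toy machine: walk right, flipping 0 <-> 1, and halt at the first blank.
FLIP_BITS = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", +1, "halt"),
}

if __name__ == "__main__":
    print(run_turing_machine(FLIP_BITS, "0110"))
```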

We don't use Turing machines as computers because different types of computers are more convenient to implement electronically. But in biology, we already have a lot of the things a Turing machine requires.

DNA sounds like the perfect thing to make a Turing machine tape out of. The read/write head could be made of an enzyme complex. This could work really fast and be really small ==> massively parallel processing!

A group at the Weizmann Institute has actually implemented such a machine, but it's fairly rudimentary. We could look at either improving its computational power or designing it to perform computations in parallel: although each machine is very slow, running lots of computations simultaneously could make the net effect seem very fast.


 * Idea 3. Ribozymes

We spoke to Gerald Sussman, a professor in Course 6 associated with the Amorphous Computing project, about biological computers and he suggested that we think about computers other than universal Turing machines. In particular, he suggested that a very useful component of a biological computer would be something that triggered a large protein signal, a cascade, in response to the presence of a particular RNA molecule.

Antibodies can target a molecule really precisely, but it's hard to make an arbitrary antibody, and even harder to design another protein from scratch, because it's so hard to predict the secondary, tertiary, and quaternary structure from the sequence.

RNA, being single-stranded, tends to fold up into shapes much as proteins do. And if we want to bind to RNA, like Dr. Sussman suggested, we can take advantage of the fact that RNA will base-pair with itself: in other words, we can very easily make RNA enzymes, called ribozymes, that bind to RNA and change shape in response. That change in shape could be the trigger for some kind of protein cascade.

Right away you can see that this is much easier to customize for any RNA sequence of interest -- just rewrite the sequence for that particular part.
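
"Just rewrite the sequence" can be taken quite literally: the binding region is the reverse complement of the target. A small sketch, assuming plain Watson-Crick pairing only (no G-U wobble, no secondary-structure checks); `binding_region` is a hypothetical helper and the example sequence is made up.

```python
# Watson-Crick pairs for RNA; G-U wobble pairs are deliberately ignored here.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def binding_region(target_rna):
    """Return the RNA sequence (5'->3') that base-pairs with target_rna (5'->3')."""
    target = target_rna.upper()
    # Strands pair antiparallel, so read the target backwards while complementing.
    return "".join(COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    print(binding_region("AUGGCU"))
```

A real design would also have to check that the binding region doesn't fold back on the rest of the ribozyme, which this sketch does not attempt.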

Particularly useful for computing because nucleic acids are a good data storage medium -- so then we need something that can *react* to the data and take action on it: ribozymes!