Endy:Notebook/BioBrick Open Language Specification/Comments

Discussion on OpenWetWare

 * Symbiosis with SynBioSS
 * This is a list of the information that SynBioSS Designer needs to generate a reaction network, and thus, it would be beneficial if we could represent it using BOL. Some of this is obvious, but for the sake of completeness I have included everything I could think of.
 * Components/Devices
 * BioBrick ID: To allow for the design of custom BioBricks, I think this property should be defined by default but not mandatory.
 * BioBrick nickname: Mandatory. For naming custom bricks, and convenience; "pTet" is more descriptive than "BBa_R0040".
 * BioBrick type: Promoter, RBS, Coding DNA, Terminator, etc. SynBioSS only cares about these four types, but obviously BOL should support every type in the Parts Registry.
 * BioBrick order: For devices and whatnot.
 * Type-Specific Characteristics
 * Promoters
 * Constitutively ON/OFF: That is, whether a promoter is "ON" or "OFF" when no protein is bound to it.
 * Operator Site Name(s): Such as tetO, lacO, etc....
 * Operator Site Location(s): A start/end is all we need.
 * Location of -10/-35 sites: Start/end. We are interested in this because the leakiness of a system is affected by the location of any operator site(s) relative to these sites.
 * Coding DNA
 * Corresponding protein: For example, BBa_E0040 codes for GFP.
 * Regulatory/System Information
 * I do not know if the scope of BOL will eventually include entire systems, instead of just BioBrick devices. If so, the following information would be helpful to us:
 * Protein Properties
 * Constitutive or Non-Constitutive: Self-explanatory.
 * Protein "type": The general behavior of the protein. Currently, SynBioSS can handle activators, repressors, and reporters.
 * Binding Sites: A list of operator sites that the protein can bind to.
 * Complex: This is just an integer representing the number of protein monomers that are in an active complex. For example, TetR and LacI bind DNA as dimers(2) and tetramers(4), respectively.
 * Effector Properties
 * Such as aTc, IPTG, etc....
 * Binding behavior: Protein(s) that the effector binds to.
 * Act in Concert: For every protein/effector pair, this is a yes-no question. Some proteins (e.g. LuxR) will not bind to DNA until a small molecule (e.g. HSL) first binds to them.
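 * For discussion, here is a rough sketch of how the list above might map onto plain data records. This is not a proposed BOL syntax and not how SynBioSS stores things internally; every class and field name is made up for illustration, and the sequence coordinates and the custom part nicknames are placeholders, not real values.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple

# (start, end) within the part; the coordinate convention is an assumption.
Span = Tuple[int, int]


class PartType(Enum):
    PROMOTER = "promoter"
    RBS = "rbs"
    CODING = "coding"
    TERMINATOR = "terminator"


@dataclass
class OperatorSite:
    name: str        # e.g. "tetO"
    location: Span


@dataclass
class Part:
    nickname: str                        # mandatory
    part_type: PartType
    biobrick_id: Optional[str] = None    # optional, so custom bricks are allowed
    # Promoter-only fields
    on_when_unbound: Optional[bool] = None
    operator_sites: List[OperatorSite] = field(default_factory=list)
    minus_35: Optional[Span] = None
    minus_10: Optional[Span] = None
    # Coding-DNA-only field
    protein: Optional[str] = None


@dataclass
class Protein:
    name: str
    constitutive: bool
    kind: str                  # "activator", "repressor", or "reporter"
    binding_sites: List[str]   # operator site names
    complex_size: int = 1      # monomers in the active complex


@dataclass
class Effector:
    name: str
    binds_to: List[str]
    # protein name -> True if the protein needs the effector before it binds DNA
    acts_in_concert: Dict[str, bool] = field(default_factory=dict)


# "BioBrick order" is simply the order of the list. Values are illustrative.
device: List[Part] = [
    Part("pTet", PartType.PROMOTER, "BBa_R0040", on_when_unbound=True,
         operator_sites=[OperatorSite("tetO", (1, 19))],   # placeholder span
         minus_35=(20, 25), minus_10=(43, 48)),            # placeholder spans
    Part("myRBS", PartType.RBS),                           # custom part, no ID
    Part("GFP", PartType.CODING, "BBa_E0040", protein="GFP"),
    Part("myTerminator", PartType.TERMINATOR),             # custom part, no ID
]
tetr = Protein("TetR", constitutive=True, kind="repressor",
               binding_sites=["tetO"], complex_size=2)
hsl = Effector("HSL", binds_to=["LuxR"], acts_in_concert={"LuxR": True})
```

 * Keeping the promoter-only and coding-only fields optional on one record is just the shortest way to write it down; separate record types per BioBrick type would work equally well.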
 * --Francois 18:55, 12 July 2009 (EDT): Cesar, the symbols look good. Make sure that for the reverse terminator that the "T" is formed correctly (see me for more explanation). Also, the bidirectional terminator seems different (shorter) than what you would expect if you superposed the forward and reverse terminators.
 * Richard Mar 15:18, 1 June 2009 (PST) : I've looked through the symbols and noticed that the dimensions vary between symbols. I think the easiest way to get the symbols to consistently line up is to use a single template for all the symbols, with 100x63 for the dimensions (maybe 3 pixels for the line through the middle and 30 pixels on either side of the line). Another idea would be to use a huge template, maybe 600x300 pixels, to account for more complex symbols and still allow for standardized dimensions, and just vectorize the symbols to maintain clean images at various sizes.
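 * To make the single-template idea concrete, here is a minimal sketch that builds a blank 100x63 template as SVG text, with the 3-pixel line in the middle and 30 pixels on either side. The function name and the choice of SVG are assumptions for illustration only; because the template carries a viewBox, the same drawing can be rendered larger without redrawing it.

```python
WIDTH = 100                   # template width in pixels
MARGIN = 30                   # space above and below the middle line
LINE = 3                      # thickness of the line through the middle
HEIGHT = 2 * MARGIN + LINE    # 30 + 3 + 30 = 63


def blank_template(scale: float = 1.0) -> str:
    """Return an SVG document containing only the middle line.

    The viewBox stays at 100x63 so symbol coordinates never change;
    only the rendered width and height grow with `scale`.
    """
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{WIDTH * scale:g}" height="{HEIGHT * scale:g}" '
        f'viewBox="0 0 {WIDTH} {HEIGHT}">'
        f'<rect x="0" y="{MARGIN}" width="{WIDTH}" height="{LINE}" fill="black"/>'
        f'</svg>'
    )


small = blank_template()      # rendered at 100x63
large = blank_template(6)     # same drawing rendered at 600x378
```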
 * Douglas Densmore 17:30, 18 May 2009 (EDT) : I used the symbols today in a PowerPoint presentation and I noticed that it is very difficult to get the symbols to line up correctly. When I move them in PowerPoint they only move in discrete increments. Perhaps I could change a setting to allow continuous movement so I can line them up better, but I figured I would make this comment anyway.
 * Cesar: Julius sent me a screenshot of Cartoonist a while back. The symbols were quite expressive.  We should synchronize the Cartoonist and BOGL symbols.  As suggested by Drew, I've changed the release version to 0.1.  This release is by no means a final set of symbols.  It's an evolutionary step forward.
 * Douglas Densmore 16:21, 15 April 2009 (EDT): We have recently ported "Cartoonist" to Clotho. Cartoonist is a simple app given to us by Julius Lucks (via Sarah Richardson) that takes a sequence, looks at the features present, and then creates an image which "stitches" icons together for the features. Naturally the icons we use are now not the ones proposed here. One of my students (Richard Mar) is going to add his 2 cents here about what he thinks. Right now we have many more feature icons than BOGL supports. My initial concern is that the icons proposed here are not expressive enough.
 * Cesar: Julius, we will discuss your suggestions in the next lab meeting. Please feel free to place suggested symbols in the "Proposed" column of the table.  The suggestions can be "moused out" squiggles using MS Paint.
 * Julius B. Lucks 13:42, 14 April 2009 (EDT): Several comments about the symbol set:
 * How can the symbols handle overlapping reading frames?
 * For overlapping reading frames, it will be necessary to show which translational terminator belongs to each frame. Not sure how you want to do this - lines connecting the start and stop?  Color codes? ...
 * Transcriptional terminators are not represented in the set. They also have a direction associated with them so the terminator needs asymmetry in the left-right direction to reflect this.
 * Promoters and RBSs (and other elements) have associated 'strengths'. It would be nice if these symbols had some way to reflect that - maybe with shading, or gradient shading?  It would be too much to ask for a 'strength' scale that could be universal, but having these options to at least show relative strengths within a certain construct class would be useful.
 * Promoters can be constitutive or inducible - how can that be depicted? Should protein operator sites be shown inside the promoter if it is inducible?  Should there be some other inducible symbol?  We have used wavy arrows for this in the past.
 * What about protein binding sites?
 * Methylation sites?
 * The restriction site should have some way of specifying which enzyme.
 *  Douglas Densmore 18:29, 13 April 2009 (EDT) : Here are some of the first questions we need to answer regarding defining a language (taken from Al Aho from Columbia (creator of AWK)):
 * Programming model
 * Character set and lexical conventions
 * Names, scopes, bindings, and lifetimes
 * Data types
 * Expressions and assignment statements
 * Control flow
 * Procedures and control abstraction
 * Data abstraction and object orientation
 * Concurrency
 * Of the items in this list we should first (in my opinion) decide on the data types and lexical conventions. We can begin by deciding what the primitives of the language are. Are the symbols we have proposed the primitives? Can we compose these into higher-level objects? What is composition? It is a set of symbols and an order (more?). Are we going to enforce composition rules? What are the restrictions on the DNA element we are describing (circular, linear, etc.)?
 * Anyway, once an RFC comes out, I will add these sections along with a first pass on what they should be. We can also propose the initial syntax. To try these ideas out, this summer I can commit to creating a Clotho plug-in and compiler for BOGL.
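 * As a strawman for the questions above, here is one way the primitives and composition could look as data types: a composition is an ordered tuple of symbols plus a topology (linear or circular). None of this is settled. The symbol set is a small subset chosen for brevity, and the single composition rule checked here (every ORF must be immediately preceded by an RBS) is purely hypothetical, included only to show what enforcing a rule might involve.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Symbol(Enum):
    PROMOTER = "promoter"
    RBS = "rbs"
    ORF = "orf"
    TERMINATOR = "terminator"


class Topology(Enum):
    LINEAR = "linear"
    CIRCULAR = "circular"


@dataclass(frozen=True)
class Composition:
    symbols: Tuple[Symbol, ...]            # the order is significant
    topology: Topology = Topology.LINEAR


def rule_violations(c: Composition) -> List[int]:
    """Return the indices of ORFs not immediately preceded by an RBS.

    Hypothetical rule, for illustration only. On a circular element the
    symbol before index 0 is the last symbol, so the check wraps around.
    """
    problems = []
    n = len(c.symbols)
    for i, sym in enumerate(c.symbols):
        if sym is not Symbol.ORF:
            continue
        if i == 0 and c.topology is Topology.LINEAR:
            problems.append(i)             # nothing precedes it
            continue
        if c.symbols[(i - 1) % n] is not Symbol.RBS:
            problems.append(i)
    return problems


cassette = (Symbol.PROMOTER, Symbol.RBS, Symbol.ORF, Symbol.TERMINATOR)
rotated = (Symbol.ORF, Symbol.TERMINATOR, Symbol.PROMOTER, Symbol.RBS)

ok_linear = rule_violations(Composition(cassette))
bad_linear = rule_violations(Composition(rotated))
ok_circular = rule_violations(Composition(rotated, Topology.CIRCULAR))
```

 * The rotated example is the reason topology belongs in the type: the same symbol order breaks the rule on a linear element and satisfies it on a circular one.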
 * Cesar: Barry, I agree with all your points except color. I think the community-at-large will have greater interest in bioengineering artifacts than in EE ones.  I think using color is more inviting.  Nonetheless, I will create both black-white and color symbols during the next iteration.  We'll see if my hypothesis will survive the test of time.  Here are my rough thoughts on color:
 * Green for promoters
 * Red for all terminators
 * Blue for open reading frames
 * Barry: Big fan of a standard symbol set! I would suggest that an effort be made to keep the icons as simple as possible (see this svg file of electrical symbols that are mostly just black lines as an example).  I think there are a couple of reasons to aim for simplicity -
 * If you encode meaning in color and complex graphical elements, the symbols become hard to approximate on a white board/sketch without loss of information.
 * Simple symbols won't go out of fashion as gradient fills, drop shadows and certain color-palettes may do in a few years.
 * The simpler the images, the more time you spend thinking about what they mean rather than about the images themselves (see Chart Junk for this idea applied to charts).
 * Easier to make, easier to edit, easier to support in software etc.
 * (Sorry for being old-fashioned).
 * Drew: Agreed that we want depictions across levels of abstraction. To this end, please remember the Polkadorks ~2004:  http://parts.mit.edu/wiki/index.php/IAP2004:Polkadorks   Very simple device level diagrams, with solid PoPS wires, and dashed biochemical-specific wires:  http://parts.mit.edu/wiki/index.php/Image:Intro21-SystemDiagram.jpg  (Note also the green and red input and output tabs on the devices).  (We also had a representation of PoPS "pass-through" in which a PoPS input signal passes through a device, providing a matched output.  This was shown as a solid horizontal yellow bar, connecting input and output tabs.)  At the next level down, see the "exploded" devices view, depicting functional parts:  http://parts.mit.edu/wiki/index.php/Image:Intro22-DeviceDiagram.jpg
 * Herbert Sauro: Consider both logical and physical representations. This one looks closer to a physical representation. Do you have any plans for a logical representation? That is, a high level description that gives the logical design of the network with all the regulatory links and other notation required to describe its functional specification and one that could if required be converted into a mathematical representation (and a physical representation) suitable for analysis. One could also imagine  higher level representations where a toggle switch, for example, might be represented as a single box rather than in terms of all its individual components.
 * Reshma 08:20, 11 February 2009 (EST): The graphics might be more useful in a vector graphics format so that they scale nicely and can be used across presentations/publications/posters/wikis etc. The Registry icons are not currently available as vector graphics; hence most folks are forced to make their own.
 * Cesar 12:50, 11 February 2009 (PST): Completely agree and will begin to convert the symbols to vector graphics. Everyone is welcome to help make the changes!
 * Reshma 08:20, 11 February 2009 (EST): Probably each icon should have a bar behind it as an iconic representation of the DNA. (See the Registry icons for an example.)  The bar tends to be useful when depicting a sequence of parts.
 * Cesar 16:26, 15 February 2009 (PST): Completely agree. I will add this change to the new vector graphics-based symbols going up this week.  The symbols on the Registry are an excellent starting point.
 * Reshma 08:20, 11 February 2009 (EST): It might make more sense to have icons that have identical widths and identical heights (though the width doesn't necessarily need to equal the height). I suspect that consistent heights and widths would make it easier to string together graphics automatically via computer programs.
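 * A small sketch of why identical widths help a program string icons together: with a fixed width, the horizontal position of each icon is just its index times the width, so no per-icon measurement is needed. The function name and the default width of 100 are assumptions for illustration.

```python
from typing import List, Tuple


def layout(parts: List[str], icon_width: int = 100) -> Tuple[List[Tuple[str, int]], int]:
    """Place same-width icons side by side.

    Returns (icon name, x offset) pairs and the total image width.
    """
    placements = [(name, i * icon_width) for i, name in enumerate(parts)]
    return placements, len(parts) * icon_width


placements, total_width = layout(["promoter", "rbs", "orf", "terminator"])
```

 * With icons of varying width, the same program would have to look up each icon's size and keep a running sum of offsets instead.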
 * Drew: Review and include the icons from the Registry
 * Cesar 15:36, 13 February 2009 (PST): I'm adding the Registry symbols ASAP.
 * Cesar 17:17, 15 February 2009 (PST): Done.
 * Provide a graphical notation that assists in making the description and development of standard biological parts more consistent and time efficient.