TABASCO

Latest revision as of 09:46, 6 May 2011


Summary

TABASCO is a simulator for gene expression at single-base resolution. The rules of transcription and translation (initiation, elongation, termination, and the interactions of polymerases and proteins) are defined a priori; Tabasco then automatically tracks the state of the system as it develops, which makes simulation at such high resolution computationally feasible. Tabasco was designed to help us better understand bacteriophage gene expression. More generally, it should be useful to anyone interested in explicitly simulating hypotheses about protein-DNA interactions and their relation to gene expression (e.g., eukaryotic gene expression initiation). A research article describing Tabasco has been published (BMC Bioinformatics 2007, 8:480, doi:10.1186/1471-2105-8-480).

Note regarding trademark

Our use of the word TABASCO here refers to a simulator of gene expression systems, or other systems composed of elementary chemical reaction events that can be ordered along one or more dimensions. The origins of our use of the word were as an acronym, abbreviating the words “Transcription And Binding And Serious Computational Overhead.” The word “Tabasco” is also a registered trademark of the McIlhenny Company for use in connection with pepper sauces, clothing, and other consumer products. The TABASCO simulator is neither affiliated with nor sponsored or endorsed by the McIlhenny Company and our use of the TABASCO name is not intended to suggest any such affiliation, sponsorship, or endorsement. “Tabasco” can also refer to a state in Mexico.

People

Sriram Kosuri, Jason Kelly, Drew Endy

Methods

Tabasco avoids the ‘combinatorial explosion’ problems of earlier approaches by tracking the position and state of proteins and genetic elements on the DNA and generating the appropriate reactions dynamically, for example blocking a promoter while a polymerase traverses it. To improve computational efficiency, Tabasco uses a Gibson-accelerated Gillespie stochastic simulation algorithm (SSA) to compute the timing of reaction events and the resulting time evolution of the genetic system.
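
As a rough illustration of the kind of stochastic simulation algorithm mentioned here, the sketch below implements Gillespie's plain direct method for a simple birth-death process (production at a constant rate, degradation proportional to copy number). It is not TABASCO's code and does not include the Gibson acceleration or any DNA position tracking; the class name GillespieSketch, the method simulate, and all rate values are hypothetical.

```java
import java.util.Random;

/** Hypothetical illustration; not part of the TABASCO distribution. */
final class GillespieSketch {

    /**
     * Simulates a birth-death process with Gillespie's direct method:
     * production at constant rate kMake, degradation at rate kDecay * n.
     * Returns the molecule count at time tEnd.
     */
    static int simulate(double kMake, double kDecay, int n0, double tEnd, long seed) {
        Random rng = new Random(seed);
        int n = n0;
        double t = 0.0;
        while (true) {
            double aMake = kMake;          // propensity of production
            double aDecay = kDecay * n;    // propensity of degradation
            double aTotal = aMake + aDecay;
            if (aTotal <= 0.0) {
                return n;                  // no reaction can fire any more
            }
            // The waiting time to the next event is exponential with rate aTotal.
            double u = 1.0 - rng.nextDouble();   // in (0, 1], avoids log(0)
            t += -Math.log(u) / aTotal;
            if (t > tEnd) {
                return n;                  // next event falls after the end time
            }
            // Choose which reaction fires, in proportion to its propensity.
            if (rng.nextDouble() * aTotal < aMake) {
                n++;
            } else {
                n--;
            }
        }
    }

    public static void main(String[] args) {
        // The same seed reproduces the same trajectory.
        System.out.println(simulate(0.5, 0.01, 0, 1500.0, 42L));
    }
}
```

Fixing the seed makes a run reproducible, which is the same reason a simulator of this kind would expose a random seed as a parameter.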

Software and Installation

Software

The software for Tabasco is freely available with the publication as Additional File #3 (.zip file): http://www.biomedcentral.com/content/supplementary/1471-2105-8-480-s3.zip

The simulator has been tested only on J2SE version 1.4.2 and higher. A basic knowledge of running Java programs is needed. You can either compile the source with a Java compiler to create bytecode, or use the precompiled bytecode from the download above.

Installation instructions (for Unix systems)

  1. Download the extractable archive (.zip file).
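  2. Extract the archive using the following command:
     tar -xvzf tabasco.tar.gz 
     (This command applies to a .tar.gz archive; if the file you downloaded is a .zip archive, a tool such as unzip is needed instead.)
    • Windows users should be able to use WinZip or other free programs (or download Cygwin to make the process more Unix-like).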
  3. Make sure java is in your path.
    • To check, type
      java -version
      and make sure the command is found (this is also a good time to check that your version is 1.4.2 or higher). If you do not have an up-to-date version of Java, visit the download site and follow the installation instructions.

Usage

Three main classes can be run. TabascoSimulator.class executes the simulation and is the longest program. Averager.class averages output from multiple TabascoSimulator simulations. TabascoJpegMake.class is the visualizer; it produces stacks of JPEG images of the data that can later be made into movies.

TabascoSimulator

The TabascoSimulator class executes simulations. The basic usage is as follows.

java TabascoSimulator inputfilename outputfilename [random seed]
  • The inputfilename is the location of the input file.
  • The outputfilename is the prefix to be used for the output files.

The optional random seed overrides the corresponding value in the input file. It is useful when a script is running simulations on a cluster.
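
The seed argument lends itself to scripted batches. The following is a minimal sketch, assuming a modern JDK (Java 8 or later, newer than the 1.4.2 the simulator was tested on). The class SeedSweepSketch and its method are hypothetical and not part of the distribution, and giving each run its own output prefix is an assumption made here so that runs do not overwrite each other's files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of the TABASCO distribution. */
final class SeedSweepSketch {

    /**
     * Builds one "java TabascoSimulator inputfilename outputfilename seed"
     * command per seed, starting at firstSeed and counting upward.
     */
    static List<List<String>> buildCommands(String inputFile, String outputPrefix,
                                            long firstSeed, int runs) {
        List<List<String>> commands = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long seed = firstSeed + i;
            // Assumption: a distinct prefix per seed keeps the outputs separate.
            String prefix = outputPrefix + "seed" + seed + "-";
            commands.add(Arrays.asList(
                    "java", "TabascoSimulator", inputFile, prefix, Long.toString(seed)));
        }
        return commands;
    }

    public static void main(String[] args) {
        List<List<String>> commands =
                buildCommands("../examples/t7_input_file.txt", "../output-test-", 1L, 3);
        for (List<String> command : commands) {
            // Only prints the command lines; nothing is launched here.
            System.out.println(String.join(" ", command));
        }
    }
}
```

Each printed line can be handed to a cluster scheduler, or launched locally with the standard java.lang.ProcessBuilder class.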

A sample input file is included in the distribution. To run it, follow this procedure:

  1. Go to the TabascoWeb/classes directory by typing
    cd TabascoWeb/classes
  2. Try to run a sample program
    java TabascoSimulator ../examples/t7_input_file.txt ../output-test- 

Averager

The Averager class averages output from the TabascoSimulator. The basic usage is as follows.

java Averager #_of_iterations outputfilename
  • The #_of_iterations is an integer that specifies the number of simulations to average together.
  • The outputfilename is the prefix to be used for the output files.

The example simulation run in TabascoSimulator can be averaged as follows:

java Averager 2 ../output-test-
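
The idea behind averaging several runs can be sketched as follows. This is not the Averager source, and it does not read Tabasco's output files, whose format is not described here; it assumes each run is already in memory as an array of values sampled at the same time points. The class AveragingSketch and its method are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical illustration; not the Averager class from the distribution. */
final class AveragingSketch {

    /** Averages runs[r][t] over all runs r, returning the mean at each time point t. */
    static double[] averageRuns(double[][] runs) {
        if (runs.length == 0) {
            throw new IllegalArgumentException("need at least one run");
        }
        int points = runs[0].length;
        double[] mean = new double[points];
        for (double[] run : runs) {
            if (run.length != points) {
                throw new IllegalArgumentException("runs differ in length");
            }
            for (int t = 0; t < points; t++) {
                mean[t] += run[t];           // accumulate the sum per time point
            }
        }
        for (int t = 0; t < points; t++) {
            mean[t] /= runs.length;          // turn the sums into means
        }
        return mean;
    }

    public static void main(String[] args) {
        // Two made-up runs with three time points each.
        double[][] runs = { {1.0, 2.0, 3.0}, {3.0, 4.0, 5.0} };
        System.out.println(Arrays.toString(averageRuns(runs)));
    }
}
```

Because individual stochastic runs differ from one another, averaging over more runs gives a smoother estimate of the mean behaviour.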

TabascoJpegMake

This class makes a set of JPEG images from the output files to visualize the DNA and molecule levels as a function of time.

java TabascoJpegMake output-filename mol-input-filename dna-file1 dna-file2 ...
  • output-filename is the filename prefix for the JPEG images that are output.
  • mol-input-filename is the name of the Molecule file that you want to visualize.
  • dna-file1, dna-file2, etc. are the DNA files that will also be part of the visualization.

For the example input, you can visualize the output by running the following command.

java TabascoJpegMake ../vis- ../output_AVG.txt ../output-test-DNA_phage1_sim1.txt \
    ../output-test-DNA_phage2_sim1.txt ../output-test-DNA_phage3_sim1.txt

Examples

A QuickTime movie shows TABASCO being used to simulate gene expression for the first 1500 seconds of bacteriophage T7 development: http://www.biomedcentral.com/content/download/supplementary/1471-2105-8-480-s1.mov