BE.180: Difference between revisions

From OpenWetWare
 
(127 intermediate revisions by 4 users not shown)
== BE.180 -- Biological Engineering Programming ==
{{Template:BE.180}}


BE.180 is a new course that will be offered for the first time in the Spring of 2006.  This is a required course for second semester sophomores who are majoring in Biological Engineering.  [[User:Endy | Drew Endy]] is leading the development of the course.
<div style="padding: 10px; width: 598px; border: 5px solid #000000;">


The current course description is:
'''Spring 2006'''
"Example problems from biological engineering are used to develop structured computer programming skills and explore the theory and practice of complex systems design and construction."


After thinking a bit, talking with Tom Knight, and thinking some more, the theme of the course is starting to focus on the idea of designing and coding the CAD environment for Biological Engineering (aka Engineering of Biology, aka "synthetic biology: an engineering technology based on living systems").  The way this might work is for a different CAD environment feature to be used each week as a motivating problem for exploring concepts in structured system design and implementation, and computer programming.  As a result of this approach, there may not be any *single* language used in the course -- we'll use the languages best suited to the problems.
'''Instructor:''' [[Drew Endy]] (endy at mit dot edu)


== Current Tasks ==
'''TAs:''' [[Lauffenburger:Laura_Sontag | Laura Sontag]] (sontag at mit dot edu) and [[Sabrina_Spencer | Sabrina Spencer]] (spencers at mit dot edu)
#Collect list of features that can serve as motivating examples (finish at 11/3 meeting)


== Future Tasks ==
'''Lecture:'''  T/R noon-1 (36-156)
#Prioritize list of features, evaluating on coverage of concepts, fun factor, and feasibility
#Develop teaching modules


==Next Planning Meeting==
'''Office Hours:'''  M 3-4pm and W 12-1pm (68-329)


#Thursday November 3, 2005, 6-7:30 (immediately following BE.526), 56-614
'''Welcome to BE.180, MIT Biological Engineering's programming course!''' For many of you this will be the first time learning to program computers; others may already be programming gurus.  Regardless of your background, upon completing BE.180 you'll have learned how to solve complex biological engineering problems using computational approaches.  You'll also discover the rudiments of how to program DNA, the genetic material that runs inside all living organisms.  Along the way, you'll be exposed to powerful ideas that underlie all of modern engineering.  We hope that you have a great experience with the course!


== Feature Sandbox (add features ideas here, no idea is bad in the sandbox) ==
==<font color="red">Announcements</font>==
 
#'''Assignment 5 has been [http://mit.edu/endy/www/courses/180/hw5/Twenty180hw5.pdf posted]. It is due at 5pm on May 4.'''
# Analysis of Sequence Data for Patterns / Features
#'''[[BE.180:Assignment4 | Assignment 4]] solutions have been posted'''
#* Manipulation of data/text
#Assignment 4 has been posted: [[BE.180:Assignment4]].  It was due at 5pm on April 25.
#* Pattern Recognition (regular expressions), Logic
#Rolling Stone article on Asilomar rDNA conference [http://web.mit.edu/endy/www/courses/iap2005/reading/RollingStone(189)37.pdf posted].  
# Tuning Codon Usage
#Exam 1 solutions posted. [[User:Endy|Endy]] 01:08, 4 April 2006 (EDT)
#* Codon tables
#Assignment 3 is cancelled; please study the [[BE.180:Devices | Devices]] page and prepare for our in-class review on Tuesday. [[User:Endy|Endy]] 15:53, 18 March 2006 (EST)
#* Expression optimization
#Old Announcements: [[BE.180:Old Announcements]]
#* Watermarking/Sign your work
#* Restriction site removal/addition
#* Design a genetic code such that every point mutation is a nonsense mutation (HINT: should you use a 4-base code?)
# Some sort of graphing/visual depiction
# Some sort of modeling/dynamic systems
#*Stochastic simulations
#*Feedback
#*Non-linear systems
#**Perhaps an extension for BE.181 - use an ODE solver in 180, then in 181 you study how that ODE solver worked
# Biological information representation (note, this is a half-baked idea, please contribute/revise -[[Reshma Shetty | RS]])
#*There are hierarchies of biological components (parts, devices and systems) and associated information.  There are hierarchical data structures.  How should biological information be represented?  How do you search this information?  Such a topic could conceivably cover topics/concepts like data structures, searching, object-oriented programming etc.
# Search part-part junctions for secondary structure (e.g., ORF/RBS/Operator junctions)
#* Call-out to external tool
# Clustering of parts, devices and systems
#*How do you group or categorize parts? How do you measure "distances" between systems?  This becomes important as the registry gets bigger and as we move to a network of registries.  I want to find all BioBricks that are "similar" to mine.
#*clustering methods
#*distance measures
# Homology modeling (useful for synthetic transcription factors or designing linkers?)
#* Call-out to external tool
#* Simulated annealing, energy minimization, MD
# Designing sequences to be synthesized subject to various constraints (e.g. design overlapping oligos with similar melting temps)
# Obfuscating Biological Programming
#* Ok, I know this is probably a terrible idea, but this is something I'd be interested in [[Austin Che | AC]]
#* What's the ''worst'' way to design a biological system?
#* How do I design a sequence that will have the functionality of poliovirus that could not be detected by current algorithms (e.g. whatever Blue Heron uses)
#* Cellular Perl to solve everyone's quick hacking needs
# Automated design and layout
#* If I have a circuit with some repressors, I'd rather a system automatically determine which repressors work best together. How to best determine a kind of family fitness?
#'''Information storage, compression and transmission''' - how well does biology do at storing information in an efficient manner?  Where is information rich and where is it not (ties into finding sequence motifs above)?  Maybe some information theory could feature here.--[[User:Bcanton|BC]]
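As one possible starting point for the information-theory idea above, here is a minimal sketch that measures Shannon entropy (bits per base) over a sliding window of a sequence. The function names <code>shannon_entropy</code> and <code>windowed_entropy</code> and the window size are illustrative assumptions, not part of any course material.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Return the Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(seq)
    total = len(seq)
    # Sum -p * log2(p) over the observed symbol frequencies.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(seq, window):
    """Entropy of every full-length window, sliding one base at a time."""
    return [
        shannon_entropy(seq[i:i + window])
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical sequence: a low-complexity run followed by mixed bases.
    example = "AAAAAAAAACGTACGT"
    for position, bits in enumerate(windowed_entropy(example, 4)):
        print(position, round(bits, 3))
```

A window of identical bases scores 0 bits and a window with all four bases equally represented scores 2 bits, so low-scoring stretches are candidates for "information-poor" regions. This ignores everything biology actually does with the sequence; it only counts symbols.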
 
==Concepts in system design and implementation, and computer programming==
In parallel with developing a features list, it may be appropriate to generate a list of fundamentals that students should take away from the course.  (Credit to Bree for also suggesting this.)  Features could be matched with concepts from CS (and lessons from biology) as appropriate.  Started brainstorming a list here.  --[[Reshma Shetty | RS]]
 
I think that it's important to make the teaching of programming concepts the primary goal, and the applications to biological engineering should be secondary.  So students should all take away how to program, and in the process they may develop interesting and/or useful tools.  Maybe the way to do it is to decide on the relevant programming concepts, and then come up with features as teaching examples and assignments that cover these concepts (not the other way around).  I think that there is nothing wrong with teaching it like a normal introductory programming course, but using all biologically relevant examples and problems along the way.  --[[Ty Thomson | TMT]]
<br>[this is a good suggestion, but if your primary goal is to learn how to program computers you should take [http://mitpress.mit.edu/sicp/ 6.001] -- [[User:Endy|Endy]] 20:04, 6 Oct 2005 (EDT)].
<br>What are the prerequisites for the course? If you don't explicitly require 6.001 or 1.00, you'll probably get a number of students who effectively don't know how to program. I don't think that you can require either of those courses without adding them to the BE SB requirements, which isn't currently the case. So you may need to teach it as an intro programming course. -[[User:Jkm|Jkm]] 15:52, 10 Oct 2005 (EDT)
 
#Data structures: linked lists, trees, stacks etc.  Classes and inheritance?
#* Abstraction
#Searching: depth-first, breadth-first, heuristic
#* Backtracking, pruning, efficiency analysis
#Sorting
#Computation: FSMs, Turing machines etc.
#'''Documentation'''
#* Readable commenting and coding styles
#* How not to program like biology
#Recursion
 
* Explicit vs. implicit algorithms: A computer program [generally] embodies an explicit algorithm, i.e. a very specific logic flow that is spelled out in the code. Biological systems, in contrast, embody implicit algorithms -- there is no "master intelligence" making sure that the right things happen at the right time; it all happens as a consequence of the basic laws of physics and chemistry, i.e. the behavior is emergent at a very basic level. From that perspective, it might be instructive to compare "designed" programs with "evolved" programs, e.g. programs produced via Genetic Programming. See, for example, [http://hampshire.edu/lspector/push.html PushGP] - [[Alex Mallet | AM]]
* If most people have no programming background, using multiple languages may be a bad idea, because they'll not only have to get their head around fundamental programming concepts but also repeatedly have to ramp up on different syntaxes, language idiosyncrasies and capabilities etc. That could become pretty frustrating. - [[Alex Mallet | AM]]
*'''Digital vs. continuous systems''' - This may fall under abstraction, but it would be cool to explicitly deal with the pros/cons of digital and continuous systems.  This could lead to a discussion of why biology might use one or the other and why programming might use one or the other and how we can approximate both when programming.--[[User:Bcanton|BC]]
*For abstraction - originally give them a function (or a class) that they use as a black box. Then, as a later assignment (when they know a little more), have them rewrite that function with the restriction that the black box stay the same.
*Thinking more about programming languages, I could understand having two - a traditional first language (Java, C) and MATLAB. In my experience, something like Java is easier for a lot of basic CS topics - data structures, OOP, etc. I find MATLAB simpler for data manipulation, though.
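The black-box abstraction exercise suggested above could look something like the following sketch. Students would first call <code>reverse_complement</code> without reading it, then later write their own version that must behave identically. Both function names and the choice of reverse complement as the example are assumptions made for illustration.

```python
# Lookup table shared by the first implementation.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Version handed out as a black box: explicit loop over the bases."""
    result = ""
    for base in seq:
        # Prepending each complemented base reverses the order as we go.
        result = COMPLEMENT[base] + result
    return result


def reverse_complement_v2(seq):
    """Later rewrite: same inputs and outputs, different internals."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))


if __name__ == "__main__":
    # The interface is unchanged, so callers cannot tell the two apart.
    for dna in ["GCAT", "AAAA", "ATGCGT"]:
        assert reverse_complement(dna) == reverse_complement_v2(dna)
        print(dna, reverse_complement(dna))
```

Both versions assume uppercase A, C, G and T only. The first raises <code>KeyError</code> on any other character while the second passes it through unchanged, which is itself a useful discussion point about what "the black box stays the same" should mean.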
 
== Biology/CS Matching ==
 
#Simple Control structures (case, for, etc) - Give them a DNA sequence, ask for the mRNA sequence and the protein sequence (given a codon table? or writing their own?)
#*Probably an early exercise - can you manipulate information and pass it back and forth between functions.
#Nonlinear systems - Modeling an oscillator or switch. Work through it analytically, then reproduce the result computationally (stolen from 7.81). Basically, replicate Collins/switch or Elowitz/oscillator models.
#*Elowitz modified his simulation to compare discrete vs continuous. Could be a chance to introduce the idea of stochasticity, why it's important, how to model it.
#Data structures, classes, OOP, etc - Annotation. If I give you a stretch of DNA, how do you represent all of the associated data (object of class DNA contains objects of class Promoter, each of which contains a location/strength/whatever). Includes writing the procedures to pass data between objects?
#Binary Trees - Evaluating lineages. This is getting hazy, but basically feeding off of Elowitz's noise measurements. Noise calculation ends up playing a significant role in BE.309, so this might be a good time to introduce the idea. -[[User:Jkm|Jkm]] 00:08, 4 Nov 2005 (EST)
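For the nonlinear-systems item above, a rough sketch of the computational half might look like this. It assumes the dimensionless two-repressor mutual-inhibition form commonly used to describe a genetic toggle switch, du/dt = alpha1/(1 + v^beta) - u and dv/dt = alpha2/(1 + u^gamma) - v, and integrates it with a plain forward-Euler step. The parameter values, step size and function name are illustrative assumptions, not values taken from any paper or from the course.

```python
def simulate_toggle(u0, v0, alpha1=10.0, alpha2=10.0,
                    beta=2.0, gamma=2.0, dt=0.01, steps=5000):
    """Integrate a two-repressor toggle model with forward Euler.

    u and v are the (dimensionless) concentrations of the two repressors.
    Returns the final (u, v) after `steps` steps of size `dt`.
    """
    u, v = u0, v0
    for _ in range(steps):
        # Each repressor is produced at a rate inhibited by the other
        # and is lost by first-order decay.
        du = alpha1 / (1.0 + v ** beta) - u
        dv = alpha2 / (1.0 + u ** gamma) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


if __name__ == "__main__":
    # Two starting points on opposite sides of the symmetric diagonal.
    print(simulate_toggle(2.0, 1.0))
    print(simulate_toggle(1.0, 2.0))
```

With these assumed parameters, the two starting points settle into different states (one with u high, one with v high), which is the switch-like behavior the exercise is after. A real assignment would probably replace the hand-rolled Euler loop with a library ODE solver and add a stochastic version for comparison.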
 
== Programming Concepts and Tutorials ==
# High-level concepts
## Abstraction
### Data abstraction
### Abstraction of functions / reusable code (black-boxing your functions)
## Recursion
## Object orientation
### Inheritance
## Scope
### Local, global variables
## Algorithms
### Sorting, binary trees, linked lists, hashing
#Low-level concepts
## Style
### Comment your code
### Indent, doc-strings (automatic documentation of code)
## Control structures
### Expressions (+,==,*, etc)
### If statements
### For loops, while loops
### Functions
## Input / output
## Programming strategies
### Pseudo-code
### Unit testing / modular coding
### Code validation
### Debugging
### Asserts
### Error handling
# Here we will upload an introductory Python tutorial, including:
## Data structures: lists, dictionaries, strings
## Text manipulation
## Regular expressions
## Function declaration
## Biopython
# Here we will upload an introductory MATLAB tutorial, including:
##Data structures: Matrices, vectors
## Matrix manipulation
## Data visualization
## Numerical solvers
## Function declaration
## Systems biology toolbox
# Possible problem sets or examples
## Practice reading/writing files and using dictionaries to translate a string of nucleotides into amino acids.  Each key would be a codon, and each value would be the corresponding amino acid.  Give students a text file with all 64 codons that they have to read into a dictionary.  Output the protein sequence to a file. (Python)
## Finding the start and stop codons in a cDNA sequence; output their positions and the length of the encoded protein.  (Python)
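The two Python problem-set ideas above could be sketched roughly as follows. To stay self-contained, this version builds the 64-codon dictionary in code from the standard genetic code instead of reading it from a text file, and it returns values instead of writing an output file. The function names <code>find_orf</code> and <code>translate</code> and the test sequence are assumptions for illustration.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Each key is a codon, each value the one-letter amino acid ('*' = stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_orf(cdna):
    """Return (start, stop) indices of the first ATG and the first
    in-frame stop codon after it, or None if either is missing.
    Assumes the sequence contains only A, C, G and T."""
    seq = cdna.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(seq) - 2, 3):
        if CODON_TABLE[seq[i:i + 3]] == "*":
            return start, i
    return None


def translate(cdna):
    """Translate the first open reading frame; '' if none is found."""
    orf = find_orf(cdna)
    if orf is None:
        return ""
    start, stop = orf
    seq = cdna.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(start, stop, 3))


if __name__ == "__main__":
    cdna = "GGATGAAATTTTAACC"  # hypothetical test sequence
    start, stop = find_orf(cdna)
    protein = translate(cdna)
    print(start, stop, len(protein), protein)
```

Positions are zero-based indices of the first base of the start and stop codons. Reading the table from a file, as the exercise proposes, would only change how <code>CODON_TABLE</code> is filled.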

Latest revision as of 18:18, 25 April 2006

BE.180 Biological Engineering Programming


Spring 2006

Instructor: Drew Endy (endy at mit dot edu)

TAs: Laura Sontag (sontag at mit dot edu) and Sabrina Spencer (spencers at mit dot edu)

Lecture: T/R noon-1 (36-156)

Office Hours: M 3-4pm and W 12-1pm (68-329)

Welcome to BE.180, MIT Biological Engineering's programming course! For many of you this will be the first time learning to program computers; others may already be programming gurus. Regardless of your background, upon completing BE.180 you'll have learned how to solve complex biological engineering problems using computational approaches. You'll also discover the rudiments of how to program DNA, the genetic material that runs inside all living organisms. Along the way, you'll be exposed to powerful ideas that underlie all of modern engineering. We hope that you have a great experience with the course!

Announcements

  1. Assignment 5 has been posted. It is due at 5pm on May 4.
  2. Assignment 4 solutions have been posted.
  3. Assignment 4 has been posted: BE.180:Assignment4. It was due at 5pm on April 25.
  4. Rolling Stone article on Asilomar rDNA conference posted.
  5. Exam 1 solutions posted. Endy 01:08, 4 April 2006 (EDT)
  6. Assignment 3 is cancelled; please study the Devices page and prepare for our in-class review on Tuesday. Endy 15:53, 18 March 2006 (EST)
  7. Old Announcements: BE.180:Old Announcements