
Programming Concepts and Tutorials

  1. High-level general engineering concepts
    1. Abstraction: Abstraction is a mechanism to reduce and factor out details so that one can focus on few concepts at a time. (From
      1. EXAMPLES: Physics to EE (1800s), in synthetic biology: parts (proteins) --> devices (inverter) --> systems (ring oscillator)
    2. Standardization: Standardization is the process of publicly establishing a technical standard. (From
      1. EXAMPLES: use of the International System of Units (SI) in science, standardization of screw threads and nuts, use of SBML (systems biology markup language), use of FASTA files to carry sequence data
    3. Decomposition: Decomposition, otherwise known as factoring or decoupling, refers to the process by which a complex problem or system is broken down into parts that are easier to conceive, understand, program, and maintain. (From
      1. EXAMPLE: break the construction of a building into smaller, separate tasks to be handled by experts (structural engineer, architect, etc.)
  2. High-level programming concepts
    1. Data abstraction: Data abstraction is the enforcement of a clear separation between the abstract properties of a data type and the concrete details of its implementation. (From
      1. EXAMPLES: lists and dictionaries in Python
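As a brief illustration of the list and dictionary examples above, the sketch below uses both types only through their public operations, without relying on how Python stores them internally. The variable names and sample data are made up for illustration.

```python
# A list keeps items in order. We use it only through its operations
# (append, indexing, len), never through its internal memory layout.
bases = ["A", "C", "G"]
bases.append("T")
first_base = bases[0]
num_bases = len(bases)

# A dictionary maps keys to values. Lookup by key works the same way
# no matter how the mapping is implemented internally.
complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
paired = [complement[b] for b in bases]

print(first_base, num_bases, paired)  # A 4 ['T', 'G', 'C', 'A']
```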
    2. Abstraction of functions / reusable code: Reusability is the likelihood a segment of structured code can be used again to add new functionalities with slight or no modification. Reusable code reduces implementation time, increases the likelihood that prior testing and use has eliminated bugs and localizes code modifications when a change in implementation is required. Subroutines or functions are the simplest form of reuse. A chunk of code is regularly organized using modules or namespaces. The ability to reuse relies on the ability to build larger things from smaller parts, and being able to identify commonalities among those parts. Reusable code can be implemented within the context of the individual, whereas standardization implies a public specification of interface. (From
      1. EXAMPLE: a function "mRNAtoprotein" which converts an mRNA sequence into an amino acid sequence, which can be used as a black box and called without knowing the details of how the function works
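A minimal sketch of the "mRNAtoprotein" idea is given below, written here as `mrna_to_protein` to follow Python naming conventions. The codon table is an assumption made for the demo and is deliberately incomplete; a real table would list all 64 codons.

```python
# Deliberately incomplete codon table (assumption: just enough for the demo).
CODON_TABLE = {
    "AUG": "M", "UUU": "F", "GGC": "G", "AAA": "K",
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}

def mrna_to_protein(mrna):
    """Translate an mRNA string codon by codon, stopping at the first stop codon."""
    protein = []
    # Step through the sequence three bases at a time,
    # ignoring a trailing partial codon.
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        # Raises KeyError for codons missing from the demo table.
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# The caller treats the function as a black box.
print(mrna_to_protein("AUGUUUGGCAAAUAA"))  # MFGK
```

Code that calls `mrna_to_protein` keeps working if the table or the loop is later rewritten, as long as the input and output stay the same.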
    3. Iteration: Iteration is the repetition of a process. It can be used both as a general term, synonymous with repetition, and to describe a specific form of repetition with a mutable state (for example the counter "i" in a "for" loop). When used in the first sense, recursion is an example of iteration. (From
      1. EXAMPLES: for loops, while loops
    4. Recursion: Mathematical recursion involves a function calling on itself over and over until reaching an end state. A commonly used example is the function used to calculate the factorial of an integer. (From
      1. EXAMPLES: Fibonacci numbers: f(n) = f(n − 1) + f(n − 2), factorials
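The two examples above can be sketched as follows. The starting values f(0) = 0 and f(1) = 1 are an assumption, since the outline gives only the recurrence.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n == 0:  # end state (base case): stop calling ourselves
        return 1
    return n * factorial(n - 1)

def fibonacci(n):
    """Return the n-th Fibonacci number, assuming f(0) = 0 and f(1) = 1."""
    if n < 2:  # end states for the two starting values
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

print(factorial(5))                      # 120
print([fibonacci(i) for i in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```

This direct form of `fibonacci` recomputes the same values many times, so it is meant to show the idea rather than to be efficient.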
    5. Object orientation: The idea behind object-oriented programming (OOP) is that a computer program may be seen as composed of a collection of individual units, or objects, that act on each other, as opposed to a traditional view in which a program may be seen as a collection of functions or procedures, or simply as a list of instructions to the computer. (From
      1. EXAMPLES: C++, Java, Python and C#
      2. Other languages with object-oriented features: Ada, BASIC, Lisp, Fortran, Pascal
      3. OOP concepts
        1. Class: the unit of definition of data and behavior; a class (for example, Dog) is the basis of modularity and structure in an object-oriented computer program
        2. Object: an instance of a class; for example, Spot the Dog
        3. Inheritance: a mechanism for creating subclasses; inheritance provides a way to define a (sub)class as a specialization or subtype of a more general class (as Dog is a subclass of Canine). It is intended to help reuse of existing code.
        4. Abstraction: the ability of a program to ignore the details of an object's (sub)class and work at a more generic level when appropriate; for example, Spot the Dog may be treated as a Dog much of the time
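The four concepts above can be sketched with the Dog and Canine example. The method names (`describe`, `speak`) and the helper `introduce` are hypothetical and chosen only for illustration.

```python
class Canine:
    """Class: defines data (name) and behavior (describe)."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a {type(self).__name__}"

class Dog(Canine):
    """Inheritance: Dog is a specialization of Canine and reuses its code."""

    def speak(self):
        return "Woof"

def introduce(animal):
    # Abstraction: works for any Canine without knowing the exact subclass.
    return animal.describe()

spot = Dog("Spot")               # Object: an instance of the class Dog
print(introduce(spot))           # Spot is a Dog
print(spot.speak())              # Woof
print(isinstance(spot, Canine))  # True
```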
    6. Scope: The scope of a variable describes where in a program's text a variable may be used, while extent (or lifetime) describes when in a program's execution a variable has a value. (From
      1. Local variable: A variable that is given local scope. Such a variable is accessible only from the function or block in which it is declared.
      2. Global variable: A variable that does not belong to any particular subroutine and can therefore be accessed from any context in a program.
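A small sketch of the difference between local and global variables in Python follows. The names `counter`, `step`, and `increment` are made up for the example.

```python
counter = 0  # global variable: defined outside any function

def increment():
    step = 1               # local variable: exists only inside increment()
    return counter + step  # a global can be read from inside a function

result = increment()

# Outside the function, the local name "step" is not defined.
try:
    step
    step_visible = True
except NameError:
    step_visible = False

print(result, step_visible)  # 1 False
```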
    7. Algorithm: a finite set of well-defined instructions for accomplishing some task which, given an initial state, will terminate in a corresponding recognizable end-state. Informally, the concept of an algorithm is often illustrated by the example of a recipe, although many algorithms are much more complex; algorithms often have steps that repeat (iterate) or require decisions (such as logic or comparison). (From
      1. EXAMPLES: sort algorithm, search algorithm
      2. Computational complexity theory: the branch of the theory of computation that studies the resources required during computation to solve a given problem. The most common resources are time (how many steps it takes to solve a problem) and space (how much memory it takes).
      3. Big O notation: Big O (standing for "order of") notation is a mathematical notation used to describe the asymptotic behavior of functions. More precisely, it is used to describe an asymptotic upper bound for the magnitude of a function. Big O notation is useful when analyzing algorithms for efficiency. Big O can also be used to describe the error term in an approximation to a mathematical function. (From
        1. EXAMPLE: If every instance of a problem that is n bits long can be solved in at most n² steps, we say the problem has time complexity O(n²).
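To make the O(n²) example more concrete, the sketch below counts the steps taken by a single loop and by a nested loop. As a simplifying assumption, n here is the number of items in a list rather than the number of bits, and the function names are hypothetical.

```python
def count_linear_steps(items):
    """One pass over the input: the step count grows like n."""
    steps = 0
    for _ in items:
        steps += 1
    return steps

def count_pair_steps(items):
    """Visit every ordered pair of items: the step count grows like n squared."""
    steps = 0
    for _ in items:
        for _ in items:
            steps += 1
    return steps

for n in (10, 20, 40):
    data = list(range(n))
    print(n, count_linear_steps(data), count_pair_steps(data))
# 10 10 100
# 20 20 400
# 40 40 1600
```

Doubling n doubles the linear count but quadruples the pairwise count, which is the growth pattern that O(n) and O(n²) describe.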
  3. Low-level programming concepts
    1. Control structures
      1. Expressions and operators (+,==,*, etc)
        1. Expressions: An expression in a programming language is a combination of values, variables, operators, and functions that are interpreted (evaluated) according to the particular rules of precedence and of association for a particular programming language, which computes and then produces (returns, in a stateful environment) another value. The expression is said to evaluate to that value. As in mathematics, the expression is (or can be said to have) its evaluated value; the expression is a representation of that value. (From
        2. Operators: Programming languages generally have a set of operators that are similar to operators in mathematics; in a sense, they are special functions. In addition to arithmetic operations, they often perform boolean operations on truth values and string operations on strings of text. Unlike functions, operators often provide the primitive operations of the language, their names consist of punctuation rather than alphanumeric characters, and they have special infix syntax and irregular parameter-passing conventions. (From
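A few Python expressions illustrate precedence, boolean and string operators, and the view of operators as special functions. The variable names are chosen only for the example.

```python
import operator

total = 2 + 3 * 4      # * binds tighter than +, so this evaluates to 14
grouped = (2 + 3) * 4  # parentheses override precedence: 20

is_equal = total == 14                  # comparison operator yields a boolean
both = is_equal and grouped > total     # boolean operator on truth values
label = "codon" + "-" + str(total)      # + on strings concatenates

# The infix operator + corresponds to an ordinary function in the operator module.
same_result = operator.add(2, 3) == 2 + 3

print(total, grouped, is_equal, both, label, same_result)
# 14 20 True True codon-14 True
```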
      2. Loops: A loop is a sequence of statements which is specified once but which may be carried out several times in succession. (From
        1. Count-controlled loops: (For loops) Loops that repeat a set number of times.
        2. Condition-controlled loops: (While loops) Loops that repeat until some condition changes.
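The two kinds of loop can be sketched in Python as follows. The values used are arbitrary.

```python
# Count-controlled: the body runs exactly five times.
squares = []
for i in range(5):
    squares.append(i * i)

# Condition-controlled: the body repeats until the value drops below 1,
# so the number of repetitions is not fixed in advance.
value = 20.0
halvings = 0
while value >= 1:
    value /= 2
    halvings += 1

print(squares)          # [0, 1, 4, 9, 16]
print(halvings, value)  # 5 0.625
```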
      3. Conditional statements: (If-Then clause) Requests to the computer to make an execution choice based on a given condition. (From
      4. Subroutines: (functions, methods, procedures, or subprograms) A portion of code within a larger program, which performs a specific task and is relatively independent of the remaining code. A subroutine is often coded so that it can be executed ("called") several times and/or from several places during a single execution of the program, possibly even by itself. (From
        1. Functions: Function and procedure often denote a subprogram that takes parameters and may or may not have a return value. Many make the distinction between "functions", that possess return values and appear in expressions, versus "procedures", that possess no return values and appear in statements.
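Python uses `def` for both kinds of subprogram, but the distinction described above can still be sketched. The names `gc_content` and `report` are hypothetical.

```python
def gc_content(sequence):
    """Function: returns a value, so a call can appear inside an expression."""
    gc = sequence.count("G") + sequence.count("C")
    return gc / len(sequence)

def report(sequence):
    """Procedure-style subroutine: called for its effect (printing), returns None."""
    print(f"{sequence}: GC fraction {gc_content(sequence):.2f}")

is_gc_rich = gc_content("GGCA") > 0.5  # call used within an expression
report("GGCA")                         # call used as a statement; prints GGCA: GC fraction 0.75
```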
    2. Input / output
    3. Programming strategies
      1. Pseudo-code: Description of a computer programming algorithm that uses the structural conventions of programming languages, but omits detailed subroutines or language-specific syntax. (From
      2. Unit testing / modular coding
      3. Debugging: what to do when your program does not work
      4. Code validation: how to know whether your program works
      5. Assertions: A programming language construct that indicates an assumption on which the program is based. Programmers add assertions to the source code as part of the development process. They are intended to simplify debugging and to make potential errors easier to find. Since an assertion failure often indicates a bug, many assertion implementations will print additional information about the source of the problem (such as the filename and line number in the source code or a stack trace). Most implementations will also halt the program's execution immediately. (From
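A minimal sketch of an assertion in Python is shown below. The function `mean` and its error message are made up for the example.

```python
def mean(values):
    # Assumption the code relies on: the list is not empty.
    assert len(values) > 0, "mean() requires at least one value"
    return sum(values) / len(values)

print(mean([2, 4, 6]))  # 4.0

# A violated assertion raises AssertionError. Left uncaught, it would halt
# the program and print a traceback with the file name and line number.
try:
    mean([])
except AssertionError as error:
    message = str(error)
    print("Assertion failed:", message)
```

Python skips `assert` statements when run with the `-O` option, so assertions are suited to catching programming mistakes during development rather than to checking user input.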
      6. Error handling
    4. Programming style
      1. Comment your code
      2. Indentation and spacing: Using a logical and consistent indent and spacing style makes one's code more readable. (From
  4. Here we will upload an introductory Python tutorial, including:
    1. Getting started with Python
    2. Data structures: lists, dictionaries, strings
    3. Text manipulation
    4. Function declaration


  5. Here we will upload an introductory MATLAB tutorial, including:
    1. Getting started with MATLAB
    2. Data structures: Matrices, vectors
    3. Matrix manipulation
    4. Function declaration
    5. Data visualization
    6. Numerical solvers
  6. Possible problem sets or examples
    1. Use Python to practice reading/writing files and using dictionaries to translate a string of nucleotides into amino acids. Each key would be a codon, and each value would be the corresponding amino acid. Give students a text file with all 64 codons that they have to read into a dictionary. Output the protein sequence to a file. Then, have students verify their translation using Biopython's Bio.Translate as discussed here.
    2. Use Python to find the start and stop codons in a cDNA sequence; output their positions and the length of the encoded protein.
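One possible approach to the start and stop codon exercise is sketched below. It assumes the first ATG marks the start, scans only that reading frame, and reports 0-based positions; the function name `find_orf` and the test sequence are made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orf(cdna):
    """Return (start, stop, protein_length) for the first ATG and the first
    in-frame stop codon after it, or None if either is missing."""
    start = cdna.find("ATG")
    if start == -1:
        return None
    # Walk codon by codon in the reading frame set by the start codon.
    for i in range(start, len(cdna) - 2, 3):
        if cdna[i:i + 3] in STOP_CODONS:
            # Count the codons from the start codon up to,
            # but not including, the stop codon.
            protein_length = (i - start) // 3
            return start, i, protein_length
    return None

print(find_orf("CCATGAAATTTGGCTAGCC"))  # (2, 14, 4)
```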