MATLAB

From OpenWetWare
Jump to navigationJump to search

About MATLAB

MATLAB is a numerical computing environment and programming language. Created by The MathWorks, MATLAB allows easy matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs in other languages. Although it specializes in numerical computing, an optional toolbox interfaces with the Maple symbolic engine, making it a full computer algebra system. It is used by more than one million people in industry and academia and runs on most modern operating systems, including Windows, Mac OS, Linux and Unix. The current version is MATLAB 7.1 Service Pack 3. It is available for commercial use for approximately US$2000 and US$100 for an academic license with a limited set of Toolboxes.
— from Wikipedia:MATLAB

For Bioinformatics and Systems Biology, there are useful toolboxes available:

Elementary Tutorial

The following is an unstructured quick bare bones introduction to some of MATLAB's important commands/functions. MATLAB is reasonably easy to use and intuitive to pick up, so go ahead and experiment and learn!

GENERAL COMMANDS

clc
clears the command window
clear
removes all variables from the workspace
clear all
removes all variables, globals, functions and MEX links
ls
Returns listing of the current directory (works on Windows as well)
!ls
Use ! to run shell commands both in windows and linux. In windows, !explorer . maybe a useful command, to browse the current directory
pwd
Shows the present working directory (in Windows too)
help <command name>
gives a brief documentation
doc <command name>
gives the complete documentation
lookfor <text>
looks for the string <text> in the first comment line of the HELP text in all M-files (MATLAB functions/scripts) found on MATLABPATH.
whos
It lists all the variables in the current workspace, together with information about their size, bytes, class, etc.
whos -file <filename.mat>
lists the variables in the specified .MAT file.
type <filename>
'cats' or echoes the file to the screen


ESSENTIAL FUNCTIONS

size
returns the size of the matrix
length
returns the maximum dimension of a matrix
A'
transposes a matrix
1:7:100
Generates numbers upto 100 beginning from 1 in steps of 7.
   >> 1:7:100
   
   ans =
   
        1     8    15    22    29    36    43    50    57    64    71    78    85    92    99
   
   >>   
linspace
linspace(a,b,n) generates n linearly spaced points between a and b
logspace
generates logarithmically spaced points
:
: (colon) is a wildcard (similar to * in shell). e.g. A(:,4:end). This returns all rows of A and columns from 4 onwards. end is a keyword that tells MATLAB to take elements untill the last column is reached.
Entering a vector
x= [1 2 3 4 5]
Entering a matrix
I_4= [1 0 0 0; 0 1 0 0 ; 0 0 1 0; 0 0 0 1] Instead of a semi-colon, line-breaks (Enter) will also work.
A([1 3 5 7],[2 4 6 8])
any submatrix can be addressed; it can even be assigned a value. For example A([1 3 5 7],[2 4 6 8])=0 sets those elements to zero!
prod(size(x))
a simple construct for calculating the number of elements in x.
find(A)
returns the indices of the non-zero elements in a vector. For matrices, it is better to give two outputs: [i,j]=find(A).
sqrt(i)
complex variables are handled seamlessly. i and j are both square roots of -1. Avoid i,j in loops, therefore.
inline
Can create functions like [math]\displaystyle{ f(x,y)=\sin(x^2+y^2+xy) }[/math]. e.g.:
   >> f=inline('sin(x.^2+y.^2+x.*y)')
   
   f =
   
        Inline function:
        f(x,y) = sin(x.^2+y.^2+x.*y)
sparse
Use whenever memory problems are possible due to the usage of large matrices. MATLAB intelligently applies the corresponding algorithms. Typically, just saying A=sparse(A) may cause your code to run 10 times faster, if your matrix is large and sparse.

SPECIAL MATRICES

ones(m,n[,p,..])
Generates a matrix of the corresponding size with all elements 1.
zeros(m,n)
Generates a matrix filled with zeroes; useful for initialisation some times.
rand(n)
gives an n-by-n matrix with random entries, chosen from a uniform distribution on the interval (0.0,1.0).
eye(n)
eye(n) is the n-by-n identity matrix.
eye(m,n)
is an m-by-n matrix with 1's on the diagonal and zeros elsewhere.

Note that a single argument usually creates a square matrix for the above functions.

diag
Can be used to extract the diagonal of a matrix or even create diagonal matrices.

INPUT/OUTPUT RELATED

x=1 vs x=1;
; suppresses the output
   >> x=1
   
   x =
   
        1
   
   >> x=1;
   >> 
disp
Suppresses some of the extra lines...
   >> x
   
   x =
   
        1
   
   >> disp(x)
        1
   
   >> 
sprintf
similar to sprintf of C/++.
fprintf
similar to fprintf of C/C++. Use either this or sprintf to print on stdout (command window).
load
loads files. Simplest file is one with equal number of entries in each row separated by space or tab. MATLAB will crib if number of entries in any row are different from others.
importdata
Useful for loading text files and other files you would usually load using 'File>Import...'
xlsread
Can read Microsoft Excel files directly
save
Without filename, all variables are saved to 'matlab.mat'. With filename, all are saved to the corresponding file. You can even specify what variables to save, even with a regular expression!
format long
Displays many digits of precision (default format shows only 4 decimals)
format rat
Displays output as rational numbers!
format
Resets format

MATRIX FUNCTIONS

MATLAB has a host of matrix-related functions. lookfor should come handy for searches.

inv
Inverse of a matrix.
inv vs \
Inverse of a matrix is expensive to compute and to be avoided. MATLAB supports the beautiful slash operator: Just as for [math]\displaystyle{ ax=b }[/math], [math]\displaystyle{ x=b/a }[/math], even for matrix equations (simulataneous equations), i.e. when [math]\displaystyle{ \mathbf{A}\mathbf{x}=\mathbf{b} }[/math] MATLAB allows, x=A\b (try to see it is a b 'over' A!)
*
Matrix multiplication
.*
Element-wise multiplication
.
. does element-wise operations.
For example: A^5 vs A.^5: A^5 multiplies A with itself 5 times (matrix product; possible only when A is square) whereas A.^5 raises every element of A to the 5th power.
norm
Various norms maybe calculated for a matrix.
expm(A) vs exp(A)
exp(A) does [math]\displaystyle{ e^{a_{ij}}\ \forall \ a_{ij} \in \mathbf{A} }[/math], whereas expm is defined as:

[math]\displaystyle{ e^\mathbf{A}=\mathbf{I}+\mathbf{A}+\frac{\mathbf{A^2}}{2!}+\frac{\mathbf{A^3}}{3!}+\dots \infty }[/math] This is of great use while solving systems of differential equations.

svd
Singular Value Decomposition
eig
Eigenvalue/Eigenvector finding
chol
Cholesky Decomposition

POLYNOMIAL FUNCTIONS

polyval(P,X)
returns the value of a polynomial P evaluated at X. P is a vector of length N+1 whose elements are the coefficients of the polynomial in descending powers. [math]\displaystyle{ Y = P(1)*X^N + P(2)*X^{N-1} + ... + P(N)*X + P(N+1) }[/math]. If X is a matrix or vector, the polynomial is evaluated at all points in X.
compan
Can be used for solving polynomial equations. Beautiful function! The eigenvalues of the companion matrix of a polynomial are the roots of the polynomial!
Example: Calculate the cube roots of unity, i.e. the roots of the equation [math]\displaystyle{ x^3-1=0 }[/math]:
   >> eig(compan([1 0 0  -1]))
   
   ans =
   
     -0.5000 + 0.8660i
     -0.5000 - 0.8660i
      1.0000
   >>

ADVANCED

cell
symbolic

FIGURE RELATED FUNCTIONS

figure
Opens a new figure window
clf
Clears the current figure
close
Closes the curent figure window
close all
Closes all figure windows
close all hidden
Closes all figure windows including those hidden (clustergrams are strangely 'hidden')
close (1:gcf)
a workaround for closing all including hidden I guess...
hold on
Holds the current plot on the figure window
waitforbuttonpress
Waits on the plot during a script/function, till a button is pressed, before further execution
subplot
Create a grid of plots in a single figure.
plot
Versatile plot function.
ezplot
Can plot anything you imagine! Just try ezplot('sin(x*y)')!
surf
Plots a surface
mesh
Plots a mesh (similar to surf but ~no solid)
title
Titles a figure
xlabel
Add X-Axis title
ylabel
Add Y-Axis title
colormap
Change the colours of plots/images
spy
Sparsity plot of a matrix — puts blue dots wherever entries are non-zero. Shows number of non-zeroes as well.

PROGRAMMING

  • function name and file name should be the same
  • % comments
  • help and comments
help myfun will return the first block of comments in myfun.m
(see the following example as well)
  • return is implicit. For example
   % my function to calculate max of two numbers
   % 
   function a=mymax(x,y)
   if (x>y)
       a=x;
   else
       a=y'
   end
   return %optional: whatever a is at the end of the 
          %function file, that value will be returned.  
  • programs vs scripts
Scripts don't have a function keyword. They're just 'scripts', which run a series of commands. The variables are all in memory after the execution of the script
  • if (x~=1)
This is a significant difference from C/C++: instead of x!=1, use x~=1.
  • elseif
  • end - ENDs any blocks - for, if, elseif, while etc (roughly equivalent of } in C++; there is no { if (x~=1) itself implies — loosely— if(x!=1){ of C++))
  • for k=1:100 (not i!!)
  • All addressing is from 1:n
  • handling arguments
Multiple arguments can be specified in the function definition
   function [mean,median,mode,variance]=statistics(x)
  • for loops is bad programming... there should be vectorisable alternatives!!

INTEGRATING ODEs

Matlab has a number of excellent tools for numerically integrating systems of ordinary differential equations. Here is a brief HOWTO:

  • Step 1) Convert your system of high order, non-autonomous ODEs into autonomous first order ODEs. For example, if the equation is d2y/dt2 + a*dx/dt + b*x + c = f(t), then create auxillary variables y1 = t, y2 = x, y3 = dx/dt. Then the time derivatives of these auxillary variables will be dy1/dt = 1, dy2/dt = dx/dt = y3, dy3/dt = d2x/dt2 = -a*y3 - b*y2 - c + f(y1), which is a system of first order ODEs.
  • Step 2) Create a Matlab M-file with your equations. If we continue with the example, the file should look like:
        function dY = MyMFile(t,Y)
           a=1;b=2;c=3;
           f = @(t) sin(t);
           dY(1) = 1;
           dY(2) = Y(3);
           dY(3) = -a*Y(3) - b*Y(2) - c + f(Y(1));
           dY=dY';

Save this file with name 'MyMFile.m'. Here, we've also used an "anonymous function" called 'f'. It's a fast way of creating a function with one or more parameters.

  • Step 3) Call a Matlab ODE solver function using the M-file we just created. For example,
        [T,Y] = ode45(@MyMFile, [TStart TEnd], Yo);

This will use the 'ode45' solver to integrate the ODEs in the MyMFile M-file from a beginning time of TStart to an ending time of TEnd with an initial condition Yo. So ...

        Yo = [0;-1;0];

Remember that the length of Yo should be equal to the number of ODEs in your system. You can also use [TStart:TStep:TEnd] to force Matlab to output the solution at TStart, TStart+TStep, and so on until TEnd. Otherwise, the solver will dynamically choose an optimal time step and output the solution at these times.

Matlab comes with at least 7 ODE solvers: ode45, ode23, ode113, ode15s, ode23s, ode23t, and ode23tb. Each uses a different numerical method to numerically integrate the system of ODEs. The order of the method is given in the name. The 'ode23' solver is a 2nd-3rd order method with global error proportional to Δt2 or Δt3, depending on the system of ODEs. If the last letter of the solver is 's', then the numerical method is particularly good for "stiff" systems. A stiff system of ODEs has at least two very different timescales (fast and slow). Stiff solvers are more efficient at solving stiff systems of ODEs. The best practice is to try the 'ode45' solver first and (if it's super slow) then switch to the 'ode23s' or 'ode15s' one. The syntax for each ODE solver is very similar (if not exact).

  • Step 4) Plot the result!
       plot(T,Y)
  • There are many available options to make integrating large systems of ODEs more efficient, such as providing the explicit Jacobian or describing the band structure of the Jacobian. Matlab also supports the execution of events. If you're solving a partial differential equation with finite differences, Matlab can also solve the resulting M(t,y)*dy/dt = f(t,y) problem, where M is the "mass" matrix.

BIOINFORMATICS TOOLBOX

clustergram
useful for clustering microarray data etc. Has a lot of options.

Symbolic Toolbox

Store an equation in GrowthEq using t as an independent variable

syms t;
GrowthEq = K / (1 +  exp(-r* (t - l )));

Differentiate with respect to t

GrowthRate = diff(GrowthEq,t);

plot the function with t = [0..50]

ezplot(GrowthEq,[0 50]);

SBMLToolBox

Troubleshooting

Running Matlab remotely on OS X

External Links