Imperial College/Courses/Fall2009/Synthetic Biology (MRes class)/'R' Tutorial/Basic Commands

From OpenWetWare
Fall 2009 - Synthetic Biology (MRes class)



Introduction to 'R'




Useful Commands and Functions

Program management

  • q() # quit
  • help(…),?…,?help,find # help manual
  • help.start() # help in html format
  • ; # cmd separator
  • # # comment mark
  • ls(), objects() # see which R objects are in the R workspace
  • rm(x,y) # remove x,y from workspace
  • source("file.R") # runs file.R from working directory
  • sink("file.lis") # sends output to file.lis in working dir
  • sink() # output reverts to console
  • .Last.value # value from previous expression
  • save(),dump(),write(),dput(),dget() # write R objects to files (dget() reads back what dput() wrote)
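
A minimal sketch of the workspace and output commands above. The object names are illustrative only, and the files come from tempfile() so that nothing is written to the working directory:

```r
# Create two objects; ';' separates two commands on one line
x <- 1:3; y <- "a"
ls()                               # lists the objects in the workspace

# Send printed output to a temporary file, then revert to the console
out_file <- tempfile(fileext = ".lis")
sink(out_file)
print(summary(x))
sink()
captured <- readLines(out_file)    # the text that print() produced

# Remove y from the workspace
rm(y)
y_exists <- exists("y", inherits = FALSE)

# Write x as R code with dput() and read it back with dget()
obj_file <- tempfile()
dput(x, file = obj_file)
x_copy <- dget(obj_file)
```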

Data management

  • read.table("file.dat",header=TRUE,row.names=1) # read a data frame from a text file
  • scan("ex.data", skip = 1) # read data into a vector, skipping the first line
  • names(islands) # print the names attribute of the islands data set
  • table(rpois(100,5)) # build a contingency table of the counts at each combination of factor levels
  • make.names(…)
  • matrix(data,nrow = 1,ncol = 1,byrow = FALSE,dimnames) #creates a matrix
  • data() # list all available data sets
  • data(package = "datasets") # list the data sets in the datasets package (the package name must be quoted)
  • data(women) # load the data set women
  • file.show # view file
  • attach(women) # attaches database to search path
  • detach("women") # remove database from search path
  • library() # list all available packages
  • library(eda) # load package 'eda'
  • print(x) # prints its argument and returns it invisibly (generic)
  • edit(…) # edit a data frame or matrix
  • summary(height) # a generic function used to produce result summaries
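
A short sketch of the data management commands above. The small table is made-up data passed through the text argument of read.table(), so no external file is assumed:

```r
# Read a small table from inline text (hypothetical data)
txt <- "id height weight
a 150 52
b 160 58
c 170 65"
d <- read.table(text = txt, header = TRUE, row.names = 1)
names(d)             # column names; the first column became the row names
summary(d$height)    # summary of one column

# Load a built-in data set and inspect it
data(women)
names(women)

# Tabulate 100 Poisson counts
set.seed(1)
counts <- table(rpois(100, 5))
counts
```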


Data manipulation

  • mode(object), length(object) # returns mode and length of object
  • str() # displays structure of an arbitrary R object
  • c(1:5, 10.5, "next") # generic fnc which combines args into a vector
  • x[1:10] # indexes vector
  • paste(c("a","b"),1:10) # combine one by one into char vector
  • dim(x) or dim(x) <- c(3,4) # retrieve or set the dimension of an object
  • array # creates or tests for arrays
  • as.matrix(x) # attempts to turn x into a matrix
  • is.matrix(x) # tests if x is a (strict) matrix
  • numeric(3) # produces vector of zeroes of length 3
  • list(x=cars[,1], y=cars[,2]) # collects items together (of different types)
  • unlist # flattens list
  • factor # used to encode a vector as a factor (defines a partition into groups)
  • cbind(0, rbind(1, 1:3)) # combine args by columns or rows
  • as.**** (e.g. as.matrix(x)) # coerce, here a numerical data frame to a numerical matrix
  • is.**** (e.g. is.matrix(x)) # test the type of the argument
  • args(t.test) # displays the argument names of a function
  • margin.table(m,1) # give margin totals of array
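
The following sketch runs several of the commands above on small made-up inputs, to show coercion, recycling and the shapes that result:

```r
v <- c(1:5, 10.5, "next")          # mixed types are coerced to character
mode(v); length(v)

p <- paste(c("a", "b"), 1:4)       # the shorter vector is recycled
p                                  # "a 1" "b 2" "a 3" "b 4"

x <- 1:12
dim(x) <- c(3, 4)                  # x becomes a 3 x 4 matrix, filled by column
is.matrix(x)

m <- cbind(0, rbind(1, 1:3))       # rbind gives 2 x 3; cbind prepends a column of 0s
row_totals <- margin.table(m, 1)   # totals over each row

f <- factor(c("lo", "hi", "lo"))   # levels are sorted: "hi" "lo"
levels(f)
```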

Program control

  • function( arglist ) expr
  • return(value)
  • if(cond) cons.expr else alt.expr
  • for(var in seq) expr
  • while(cond) expr
  • repeat expr
  • break
  • next
  • tapply(1:n, fac, sum) # apply function to each comb of factor levels
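
The control constructs above can be sketched as follows. The function name sum_odd_upto and the test values are hypothetical, chosen only for illustration:

```r
# A function using for, if, next, break and return
sum_odd_upto <- function(n, limit = Inf) {
  total <- 0
  for (i in seq_len(n)) {
    if (i %% 2 == 0) next          # skip even numbers
    if (total + i > limit) break   # stop before exceeding the limit
    total <- total + i
  }
  return(total)
}
sum_odd_upto(5)                    # 1 + 3 + 5 = 9

# while: count the halvings needed to bring 100 down to 1 or below
k <- 0; val <- 100
while (val > 1) { val <- val / 2; k <- k + 1 }

# repeat runs until break is reached
j <- 0
repeat { j <- j + 1; if (j >= 3) break }

# tapply: sum 1:6 within each level of a factor
fac <- factor(c("a", "b", "a", "b", "a", "b"))
group_sums <- tapply(1:6, fac, sum)
```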

Operators

  • + - * / ^ (element by element operations with recycling)
  • %% (mod)
  • %/% (integer division)
  • crossprod
  • %*% (matrix prod, inner product)
  • outer %o% (outer product)
  • a&b (and), a|b (a or b), !a (not a)
  • precedence (highest to lowest): $ [] ^ unary- : (%% %/% %*%) (* /) (+ -) (< > <= >= == !=) ! (& &&) (| ||) ~ (-> ->>) (<- <<-) ?
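
A few small cases illustrating the operators above, including two where precedence matters. The values are arbitrary:

```r
rec <- c(1, 2, 3, 4) + c(10, 20)   # recycling: 11 22 13 24
7 %% 3                             # remainder: 1
7 %/% 3                            # integer division: 2

a <- 1:3; b <- 4:6
inner <- a %*% b                   # 1 x 1 matrix holding the inner product
outer_ab <- a %o% b                # 3 x 3 matrix with entries a[i] * b[j]
cp <- crossprod(a, b)              # same as t(a) %*% b

# ^ binds tighter than unary minus, which binds tighter than ':'
neg_sq <- -2^2                     # -(2^2) = -4
neg_seq <- -1:2                    # (-1):2 = -1 0 1 2

# & binds tighter than |
lg <- TRUE | FALSE & FALSE         # TRUE | (FALSE & FALSE) = TRUE
```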

Mathematical functions

  • solve backsolve forwardsolve t (transpose)
  • uniroot polyroot optimize nlm deriv
  • log log10 sqrt exp sin cos tan acos asin atan cosh sinh tanh gamma lgamma choose lchoose bessel
  • abs sign sum prod diff cumsum cumprod min max pmax pmin range length
  • diag scale nrow ncol length append drop
  • det eigen svd qr chol chol2inv
  • eigen(cbind(c(1,-1),c(-1,1))) # computes eigenvalues and eigenvectors
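
As a brief illustration of a few of the functions above, using a small made-up linear system and a simple root-finding problem:

```r
A <- cbind(c(2, 1), c(1, 3))
b <- c(3, 5)
x <- solve(A, b)                   # solves A %*% x = b
d <- det(A)                        # 2 * 3 - 1 * 1 = 5
At <- t(A)                         # transpose (A is symmetric, so At equals A)

# Eigen-decomposition of the symmetric matrix from the list above
e <- eigen(cbind(c(1, -1), c(-1, 1)))
e$values                           # eigenvalues, largest first

# Root of z^2 - 2 on [0, 2], which lies near sqrt(2)
r <- uniroot(function(z) z^2 - 2, interval = c(0, 2))
r$root
```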


Statistical functions

  • mean var cov cor sd mad median range IQR fivenum quantile mahalanobis
  • sort rev order rank sort.list
  • ceiling floor round trunc signif zapsmall jitter all duplicated unique any lower.tri upper.tri
  • approx approxfun spline splinefun curve
  • mean(x, trim = .10) # (trimmed) mean
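
A small sketch of the trimmed mean and the ordering functions above, on a made-up sample containing one outlier:

```r
x <- c(2, 4, 4, 5, 7, 9, 100)      # hypothetical sample with an outlier
m_all  <- mean(x)                  # pulled upwards by the outlier
m_trim <- mean(x, trim = 0.2)      # drops floor(7 * 0.2) = 1 value from each end
med    <- median(x)

y <- c(30, 10, 20)
sort(y)                            # 10 20 30
order(y)                           # 2 3 1: positions of the sorted values in y
rank(y)                            # 3 1 2: rank of each value of y
rev(y)                             # 20 10 30

five <- fivenum(x)                 # min, lower hinge, median, upper hinge, max
```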

Graphics

  • par(mfrow=c(2,3)) # create 2x3 array of figs filled row-wise
  • plot pairs coplot boxplot boxplot.stats hist stem density pie barplot dotchart qqplot qqnorm qqline ppoints interaction.plot lowess contour persp image stars symbols
  • par axis box lines abline segments points text mtext title labels legend plotmath arrows polygon Hershey plot.window xy.coords rug
  • colors hsv rgb rainbow gray palette
  • mfrow mfcol (multifigure parameters of par)
  • graphics devices: postscript pictex windows png jpeg bmp xfig bitmap
  • locator() # read position of graphics cursor
  • identify() # identifies near point in graphic
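
A minimal sketch of a multi-figure page built from the functions above. It assumes simulated data and writes to a temporary PostScript file, so it also runs without a screen device:

```r
plot_file <- tempfile(fileext = ".ps")
postscript(plot_file)              # open a graphics device writing to a file
par(mfrow = c(2, 3))               # 2 x 3 array of figures, filled row-wise

set.seed(1)
x <- rnorm(100); y <- x + rnorm(100)

plot(x, y, main = "plot"); abline(lm(y ~ x))   # scatter plot with fitted line
hist(x, main = "hist")
boxplot(x, y, main = "boxplot")
qqnorm(x); qqline(x)
plot(density(x), main = "density"); rug(x)
barplot(table(rpois(100, 5)), main = "barplot")

dev.off()                          # close the device and finish the file
```

On an interactive device, locator() and identify() can then be used to read positions from the plot with the mouse.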

Statistical distributions & sampling

  • sample(n) # random permutation
  • sample(x,replace=TRUE) # bootstrap sample
  • set.seed RNGkind .Random.seed
  • Prefixes: d (density) p (distribution function) q (quantile function) r (random deviates)
  • chisq t F norm binom pois exp beta gamma lnorm unif geom cauchy logis hyper nbinom weibull wilcox
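
The prefix and distribution names combine into function names such as dnorm or pbinom. A short sketch, with an arbitrary seed and arbitrary arguments:

```r
d0 <- dnorm(0)                     # density of N(0,1) at 0: 1 / sqrt(2 * pi)
p  <- pnorm(1.96)                  # P(Z <= 1.96), roughly 0.975
q  <- qnorm(0.975)                 # inverse of pnorm, roughly 1.96
p3 <- pbinom(3, size = 10, prob = 0.5)   # P(X <= 3) for Binomial(10, 0.5)

set.seed(42)                       # fix the seed so the draws are reproducible
z <- rnorm(5)                      # five random deviates from N(0,1)

perm <- sample(10)                 # random permutation of 1:10
boot <- sample(z, replace = TRUE)  # bootstrap resample of z
```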

Statistical tests

  • t.test prop.test binom.test wilcox.test kruskal.test ansari.test bartlett.test cor.test fisher.test fligner.test friedman.test ks.test mantelhaen.test mcnemar.test mood.test pairwise.prop.test pairwise.t.test pairwise.wilcox.test print.pairwise.htest prop.trend.test quade.test shapiro.test var.test

  • chisq.gof ks.gof (S-PLUS goodness-of-fit functions; in R use chisq.test and ks.test)

  • contrast contrasts p.adjust pairwise.t.test pairwise.table ptukey qtukey
  • power.prop.test power.t.test print.power.htest
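
As an illustration of how a few of these tests are called, here is a sketch on the built-in sleep data set. The choice of data set and of the p-values passed to p.adjust is arbitrary:

```r
data(sleep)                        # extra hours of sleep under two drugs
g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]

tt <- t.test(g1, g2)               # Welch two-sample t-test (the default)
tt$p.value

# Rank-based alternative; ties in the data cause a warning about exact p-values
wt <- suppressWarnings(wilcox.test(g1, g2))

# Bonferroni adjustment multiplies each p-value by the number of tests
p_adj <- p.adjust(c(0.01, 0.02, 0.04), method = "bonferroni")

# Sample size per group for 80% power to detect a difference of 1 sd
pw <- power.t.test(delta = 1, sd = 1, power = 0.8)
pw$n
```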

Statistical procedures

  • anova aov lm glm loglin manova fitted add1 drop1 resid deviance predict coef effect dummy.coef fitted.values alias step factor * interaction model.tables proj plot summary
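
A minimal sketch of fitting and inspecting models with the functions above, using the built-in women and PlantGrowth data sets as arbitrary examples:

```r
data(women)
fit <- lm(weight ~ height, data = women)   # simple linear regression
coef(fit)                                  # intercept and slope
summary(fit)$r.squared
res <- resid(fit)                          # residuals, one per observation
pred <- predict(fit, newdata = data.frame(height = 66))
anova(fit)                                 # analysis-of-variance table

# One-way ANOVA with a factor as the explanatory variable
aov_fit <- aov(weight ~ group, data = PlantGrowth)
summary(aov_fit)
```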


Data entry and manipulation: (x can be any of several types; y and z are vectors)

Command                           Meaning
x<-c(1, 2, 3, 4)                  Create a vector of numbers
x                                 Prints contents of x
y[2:5]                            Returns 2nd to 5th elements of vector y
y[-3]                             Returns a vector of all elements in y except for the 3rd
y[y<10]                           Sub-vector of all entries in y less than 10
z[y<10]                           Sub-vector of all entries in z for which the corresponding entries in y are less than 10 (y & z must be same length)
x<-list(y=y,z=z), x$y, x$z        Construct a list with two named vectors in it; returns vector y; returns vector z
x<-data.frame(y,z), x$y, x$z      Construct a data frame with two vectors in it; returns vector y; returns vector z
x<-factor(y)                      Converts numeric type y into a factor
is.factor(y)                      Returns "TRUE" if y is a factor
is.numeric(y)                     Returns "TRUE" if y contains numeric data
is.na(y)                          Returns "TRUE" for each entry of y that is missing (NA)
dimnames(x)                       Lists the row and column names of an array or data frame
levels(x)=c("a", "b",…)           Assign names to each factor level
x<-read.table(file="inp.txt")     Read a dataset from an ASCII text file of data. Add "header=TRUE" if the file contains descriptive headers
load("filename")                  Loads R data from filename
save(x, file="filename")          Saves R object x into filename
save.image("filename")            Saves all current R objects into filename
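
The indexing and storage commands in this table can be tried on small made-up vectors. In this sketch the list components are named explicitly so that the $ operator can find them, and the saved file is a temporary one:

```r
y <- c(3, 12, 7, 15, 1, 9)
z <- c("a", "b", "c", "d", "e", "f")
y[2:5]                             # 12 7 15 1
y[-3]                              # all elements except the 3rd
y[y < 10]                          # 3 7 1 9
z[y < 10]                          # "a" "c" "e" "f"

lst <- list(y = y, z = z)          # named components, so lst$y works
dat <- data.frame(y, z)            # columns are named after the vectors
dat$y

f <- factor(c(1, 2, 2, 1))
levels(f) <- c("ctrl", "treat")    # rename the factor levels

is.na(c(1, NA, 3))                 # FALSE TRUE FALSE

rdata <- tempfile(fileext = ".RData")
save(dat, file = rdata)            # the file argument must be named
rm(dat)
load(rdata)                        # restores dat under its original name
```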


Descriptive statistics: (x can be a vector or data frame; y and z are vectors)

Command           Meaning
mean(x)           Calculate mean of vector x (for a data frame, use colMeans(x) or sapply(x, mean))
median(x)         Calculate median of vector x (for a data frame, use sapply(x, median))
sd(x)             Calculate standard deviation of vector x (for a data frame, use sapply(x, sd))
var(x)            Calculate variance of vector x (for a data frame, returns the covariance matrix of its columns)
summary(x)        Calculate summary of vector x (or of all vectors in data frame x)
boxplot(x)        Create boxplot of vector x (or of all vectors in data frame x)
boxplot(x~y)      Create multiple boxplots of data in x, based on categories in y
stripchart(x)     Create stripchart of vector x (or of all vectors in data frame x)
stripchart(x~y)   Create multiple stripcharts of data in x, based on categories in y
hist(y)           Create histogram of vector y (command will not work on a data frame)
qqnorm(y)         Creates a "normal quantile-quantile" plot of y; used to check whether the data in y are normally distributed
plot(z~y)         Makes an "x-y" plot of vector z (vertical axis) vs. vector y (horizontal axis)
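
A short sketch of these commands on the built-in women and ToothGrowth data sets, which are arbitrary choices. The plots are sent to a temporary PostScript file so that the example does not need a screen device:

```r
data(women)
h <- women$height
mean(h); median(h)
sd(h); var(h)
col_means <- sapply(women, mean)   # one mean per column of the data frame
summary(women)                     # summary of every column

plot_file <- tempfile(fileext = ".ps")
postscript(plot_file)
boxplot(len ~ supp, data = ToothGrowth)      # one box per category of supp
stripchart(len ~ supp, data = ToothGrowth)   # one strip per category of supp
hist(h)
qqnorm(h); qqline(h)               # points near the line suggest normality
plot(weight ~ height, data = women)          # weight vertical, height horizontal
dev.off()
```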