Imperial College, Fall 2009 - Synthetic Biology (MRes class): 'R' Tutorial, Basic Commands



Introduction to 'R'

Useful Commands and Functions

Program management

Data management

read.table("file.dat",header=TRUE,row.names=1) # read a data frame from a text file; column 1 holds the row names
scan("ex.data", skip = 1) # read data into a vector, skipping the first line of the file
names(islands) # print the names attribute of the islands data set
table(rpois(100,5)) # build a contingency table of the counts at each combination of factor levels
make.names(…)
matrix(data,nrow = 1,ncol = 1,byrow = FALSE,dimnames) #creates a matrix
data() # list all available data sets
data(package = "datasets") # list the data sets in the datasets package (where R's standard data sets live)
data(women) # load the data set women
file.show("file.dat") # display the contents of a text file
attach(women) # attaches database to search path
detach("women") # remove database from search path
library() # list all available packages
library(eda) # load package 'eda' (its functions are now part of 'stats', which is loaded by default)
print(x) # prints its argument and returns it invisibly (generic)
edit(…) # edit a data frame or matrix
summary(height) # a generic function used to produce result summaries
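
The data management commands above can be tried on the built-in 'women' data set. The following is only a minimal sketch: the temporary file stands in for "file.dat", and with() is shown as an assumed alternative to attach()/detach(); neither comes from the list itself.

```r
# Load and inspect the built-in data set used in the list above
data(women)                      # 15 observations of height and weight
names(women)                     # "height" "weight"
summary(women$height)            # summary of a single column

# Assumed alternative to attach()/detach(): evaluate inside the data frame
mean_height <- with(women, mean(height))

# Round trip through a temporary file (stands in for "file.dat")
tmp <- tempfile(fileext = ".dat")
write.table(women, tmp)          # writes a header line and the row names
women2 <- read.table(tmp, header = TRUE, row.names = 1)
print(dim(women2))               # same shape as the original data frame
```

Using with() avoids leaving a data frame on the search path, which is easy to forget after attach().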

Data manipulation

Program control

Operators

+ - * / ^ (element by element operations with recycling)
%% (mod)
%/% (integer division)
crossprod
%*% (matrix prod, inner product)
outer %o% (outer product)
a&b (and), a|b (a or b), !a (not a)
precedence (highest first): $ [] ^ unary- : (%% %/% %*%) (* /) (+ -) (< > <= >= == !=) ! (& &&) (| ||) ~ (-> ->>) (<- <<-) ?
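
A short sketch of how these operators behave; the vectors and the matrix are made-up illustration values.

```r
x <- c(1, 2, 3, 4)
y <- c(10, 20)
recycled <- x + y            # y is recycled to 10 20 10 20

remainder <- 7 %% 3          # mod
quotient  <- 7 %/% 3         # integer division

A <- matrix(1:4, nrow = 2)   # filled column by column: rows (1 3) and (2 4)
AA <- A %*% A                # matrix product
cp <- crossprod(A)           # same as t(A) %*% A
op <- 1:3 %o% 1:2            # outer product, a 3 x 2 matrix

neg_pow <- -2^2              # ^ binds tighter than unary minus: -(2^2)
neg_seq <- -1:2              # unary minus binds tighter than ':': (-1):2
```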

Mathematical functions

solve backsolve forwardsolve t (transpose)
uniroot polyroot optimize nlm deriv
log log10 sqrt exp sin cos tan acos asin atan cosh sinh tanh gamma lgamma choose lchoose besselI besselJ besselK besselY
abs sign sum prod diff cumsum cumprod min max pmax pmin range length
diag scale nrow ncol length append drop
det eigen svd qr chol chol2inv
eigen(cbind(c(1,-1),c(-1,1))) # computes eigenvalues and eigenvectors
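
As a rough illustration of a few of these functions, the sketch below solves a small linear system, finds a root numerically and repeats the eigen() example. The matrix A, the vector b and the function x^2 - 2 are arbitrary choices made for this example.

```r
# Solve the linear system A %*% sol = b
A <- cbind(c(2, 1), c(1, 3))
b <- c(3, 5)
sol <- solve(A, b)

# Find the root of x^2 - 2 on [0, 2]; the result approximates sqrt(2)
root <- uniroot(function(x) x^2 - 2, interval = c(0, 2))$root

# Eigen decomposition of the symmetric matrix from the list above
e <- eigen(cbind(c(1, -1), c(-1, 1)))
print(e$values)              # eigenvalues, largest first
print(t(A))                  # transpose of A
print(det(A))                # determinant of A
```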

Statistical functions

mean var cov cor sd mad median range IQR fivenum quantile mahalanobis
sort rev order rank sort.list
ceiling floor round trunc signif zapsmall jitter all duplicated unique any lower.tri upper.tri
approx approxfun spline splinefun curve
mean(x, trim = .10) # (trimmed) mean
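
The effect of trimming is easiest to see on data with an outlier. The numbers below are invented for illustration only.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)   # made-up data with one outlier

plain   <- mean(x)               # pulled upwards by the outlier
trimmed <- mean(x, trim = 0.10)  # drops the lowest and highest 10% (1 and 100)
med     <- median(x)

print(quantile(x, c(0.25, 0.75)))  # lower and upper quartiles
print(fivenum(x))                  # Tukey's five-number summary

y <- c(30, 10, 20)
sorted <- sort(y)                # values in increasing order
ord    <- order(y)               # indices that would sort y
rnk    <- rank(y)                # rank of each element
print(round(3.14159, 2))         # two decimal places
print(signif(3.14159, 2))        # two significant digits
```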

Graphics

par(mfrow=c(2,3)) # create 2x3 array of figs filled row-wise
plot pairs coplot boxplot boxplot.stats hist stem density pie barplot dotchart qqplot qqnorm qqline ppoints interaction.plot lowess contour persp image stars symbols
par axis box lines abline segments points text mtext title labels legend plotmath arrows polygon Hershey plot.window xy.coords rug
colors hsv rgb rainbow gray palette
mfrow mfcol (multifigure parameters of par)
graphics devices: postscript pictex windows png jpeg bmp xfig bitmap
locator() # read position of graphics cursor
identify() # identifies near point in graphic
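
A minimal sketch of a multi-figure plot written to one of the graphics devices listed above. The output file is a temporary file chosen for this example; on an interactive session the postscript() and dev.off() calls can simply be left out.

```r
out <- tempfile(fileext = ".ps")      # hypothetical output location
postscript(out)                       # open a file device, no screen needed

par(mfrow = c(1, 2))                  # 1 x 2 array of figures, filled row-wise
x <- seq(0, 2 * pi, length.out = 50)
plot(x, sin(x), type = "l", main = "sin(x)")
abline(h = 0, lty = 2)                # dashed horizontal reference line
hist(rnorm(100), main = "100 normal deviates")

dev.off()                             # close the device and finish the file
```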

Statistical distributions & sampling

sample(n) # random permutation
sample(x,replace=TRUE) # bootstrap sample
set.seed RNGkind .Random.seed
Prefixes: d (density) p (distribution function) q (quantile function) r (random deviates)
chisq t F norm binom pois exp beta gamma lnorm unif geom cauchy logis hyper nbinom weibull wilcox
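
Each prefix is combined with a distribution name, for example dnorm, pnorm, qnorm and rnorm for the normal distribution. The sketch below shows the four prefixes and a simple bootstrap; the seed, the sample size and the number of resamples are arbitrary choices.

```r
set.seed(1)                           # arbitrary seed, for reproducibility only

d0 <- dnorm(0)                        # density of N(0,1) at 0
p  <- pnorm(1.96)                     # P(Z <= 1.96), close to 0.975
q  <- qnorm(0.975)                    # quantile, close to 1.96
x  <- rnorm(50, mean = 10, sd = 2)    # 50 random deviates

perm <- sample(10)                    # random permutation of 1..10

# Bootstrap: resample x with replacement and recompute the mean each time
boot_means <- replicate(200, mean(sample(x, replace = TRUE)))
print(sd(boot_means))                 # bootstrap estimate of the standard error
```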

Statistical tests

t.test prop.test binom.test wilcox.test kruskal.test ansari.test bartlett.test cor.test fisher.test fligner.test friedman.test ks.test mantelhaen.test mcnemar.test mood.test pairwise.prop.test pairwise.t.test pairwise.wilcox.test print.pairwise.htest prop.trend.test quade.test shapiro.test var.test

chisq.gof ks.gof (S-PLUS functions; the R counterparts are chisq.test and ks.test)
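
Most of these tests return an object of class "htest" whose components can be extracted by name. A minimal sketch on simulated data (the two samples and the seed are made up for this example):

```r
set.seed(42)                      # arbitrary seed
a <- rnorm(30, mean = 0)          # simulated sample 1
b <- rnorm(30, mean = 1)          # simulated sample 2

res <- t.test(a, b)               # Welch two-sample t test (the default)
print(res$statistic)              # t statistic
print(res$p.value)                # p-value
print(res$conf.int)               # confidence interval for the mean difference

w_p <- wilcox.test(a, b)$p.value  # rank-based alternative
s_p <- shapiro.test(a)$p.value    # normality check on one sample
```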

Statistical procedures

anova aov lm glm loglin manova fitted add1 drop1 resid deviance predict coef effects dummy.coef fitted.values alias step factor * interaction model.tables proj plot summary
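
As an illustration of how these functions fit together, the sketch below fits a straight line to the built-in 'women' data set and queries the fitted model. The choice of model and the new height value are assumptions made for this example.

```r
fit <- lm(weight ~ height, data = women)   # simple linear regression

print(coef(fit))                 # intercept and slope
print(summary(fit))              # coefficient table, R-squared, etc.
print(anova(fit))                # analysis of variance table

res  <- resid(fit)               # residuals, one per observation
pred <- predict(fit, newdata = data.frame(height = 66))  # predicted weight
```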

Data entry and manipulation: (x can be any of several types; y and z are vectors)
Command	Meaning
x<-c(1, 2, 3, 4)	Create a vector of numbers
x	Prints contents of x
y[2:5]	Returns 2nd to 5th elements of vector y
y[-3]	Returns a vector of all elements in y except for the 3rd
y[y<10]	Sub-vector of all entries in y less than 10
z[y<10]	Sub-vector of all entries in z for which the corresponding entries in y are less than 10 (y and z must be the same length)
x<-list(y=y,z=z), x$y, x$z	Constructs a list with two named vectors in it, Returns vector y, Returns vector z
x<-data.frame(y,z), x$y, x$z	Constructs a data frame with two vectors in it, Returns vector y, Returns vector z
x<-factor(y)	Converts numeric type y into a factor
is.factor(y)	Returns TRUE if y is a factor
is.numeric(y)	Returns TRUE if y contains numeric data
is.na(y)	Returns TRUE for each entry of y that is missing (NA) and FALSE otherwise
dimnames(x)	Lists the row and column names of an array or data frame
levels(x)<-c("a", "b",…)	Assigns names to the levels of factor x
x<-read.table(file="inp.txt")	Read a dataset from an ASCII text file of data. Add header=TRUE if the file contains descriptive headers
load("filename")	Loads R data from filename
save(x, file="filename")	Saves R object x into filename
save.image("filename")	Saves all current R objects into filename
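
The commands in the table can be tried on two small vectors. The values below and the temporary file name are made up for illustration.

```r
y <- c(3, 8, 12, 15, 6, 20)
z <- c("a", "b", "c", "d", "e", "f")

print(y[2:5])                  # 2nd to 5th elements
print(y[-3])                   # everything except the 3rd element
small_y <- y[y < 10]           # entries of y below 10
small_z <- z[y < 10]           # entries of z where y is below 10

x <- data.frame(y, z)          # columns are named after the variables
print(x$y)

f <- factor(c(1, 2, 2, 1))     # numeric codes converted to a factor
levels(f) <- c("low", "high")  # rename the levels
print(is.factor(f))
print(is.na(c(1, NA, 3)))      # TRUE only for the missing entry

tmp <- tempfile(fileext = ".RData")   # hypothetical file name
save(x, file = tmp)            # write x to disk
rm(x)                          # remove it from the workspace
load(tmp)                      # x is restored under its original name
```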

Descriptive statistics: (x can be a vector or data frame; y and z are vectors)
Command	Meaning
mean(x)	Calculate mean of vector x (for the columns of a data frame x, use sapply(x, mean))
median(x)	Calculate median of vector x (for the columns of a data frame x, use sapply(x, median))
sd(x)	Calculate standard deviation of vector x (for the columns of a data frame x, use sapply(x, sd))
var(x)	Calculate variance of vector x (for a data frame x, returns the covariance matrix of its columns)
summary(x)	Calculate summary of vector x (or of all vectors in data frame x)
boxplot(x)	Create boxplot of vector x (or of all vectors in data frame x)
boxplot(x~y)	Create multiple boxplots of data in x, based on categories in y.
stripchart(x)	Create stripchart of vector x (or of all vectors in data frame x)
stripchart(x~y)	Create multiple stripcharts of data in x, based on categories in y.
hist(y)	Create histogram of vector y (command will not work on a data frame)
qqnorm(y)	Creates a "normal quantile-quantile" plot of y; used to check visually whether the data in y are normally distributed
plot(z~y)	Makes an "x-y" plot of vector z (vertical axis) vs. vector y (horizontal axis)
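
A minimal sketch of these commands on the built-in 'women' data set. The grouping factor g and the temporary PDF file are assumptions made for this example; on an interactive session the pdf() and dev.off() calls can be dropped so the plots appear on screen.

```r
col_means <- sapply(women, mean)     # mean of each column
col_sds   <- sapply(women, sd)       # standard deviation of each column
cov_mat   <- var(women)              # covariance matrix of the columns
print(summary(women))                # per-column summaries

# Hypothetical grouping: first 7 rows "short", remaining 8 rows "tall"
g <- factor(rep(c("short", "tall"), c(7, 8)))

out <- tempfile(fileext = ".pdf")    # hypothetical output file
pdf(out)
boxplot(women$weight ~ g)            # one box per category
stripchart(women$weight ~ g)         # one strip per category
hist(women$height)                   # histogram of a single vector
qqnorm(women$height)                 # normal quantile-quantile plot
qqline(women$height)                 # reference line for the Q-Q plot
plot(weight ~ height, data = women)  # weight (vertical) vs. height (horizontal)
dev.off()
```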