Imperial College/Courses/Fall2009/Synthetic Biology (MRes class)/'R' Tutorial/Basic Commands
Introduction to 'R'
Useful Commands and Functions
Program management
- q() # quit
- help(…),?…,?help,find # help manual
- help.start() # help in html format
- ; # cmd separator
- # # comment mark
- ls(), objects() # see which R objects are in the R workspace
- rm(x,y) # remove x,y from workspace
- source("file.R") # runs file.R from working directory
- sink("file.lis") # sends output to file.lis in working dir
- sink() # output reverts to console
- .Last.value # value from previous expression
- save(), dump(), write(), dput(), dget() # write R objects to files (dget reads back what dput wrote)
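
As an illustration of the output-redirection and workspace commands above, here is a minimal sketch. The temporary file stands in for `file.lis` and is an assumption made for the example, not part of the original tutorial.

```r
# Redirect printed output to a file, then restore the console.
out_file <- tempfile(fileext = ".lis")   # hypothetical output file
sink(out_file)          # from here on, printed output goes to out_file
x <- c(2, 4, 6)
print(mean(x))
sink()                  # output reverts to the console

captured <- readLines(out_file)
print(captured)         # the line that print() wrote to the file

# ls() lists workspace objects; rm() removes them
rm(x)
print("x" %in% ls())
```

Calling `sink()` with no arguments is what ends the redirection, so it is worth pairing every `sink(file)` with a closing `sink()`.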
Data management
- read.table("file.dat",header=TRUE,row.names=1) # read a table; first row holds column names, first column holds row names
- scan("ex.data", skip = 1) # read values from a file into a vector, skipping the first line
- names(islands) # print the names attribute of the islands data set
- table(rpois(100,5)) # build a contingency table of the counts at each combination of factor levels
- make.names(…)
- matrix(data,nrow = 1,ncol = 1,byrow = FALSE,dimnames) #creates a matrix
- data() # list all available data sets
- data(package = "datasets") # list the data sets in the datasets package
- data(women) # load the data set women
- file.show # view file
- attach(women) # attaches database to search path
- detach("women") # remove database from search path
- library() # list all available packages
- library(eda) # load package 'eda'
- print(x) # prints its argument and returns it invisibly (generic)
- edit(…) # edit a data frame or matrix
- summary(height) # a generic function used to produce result summaries
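
The reading and inspection commands above can be tried without any external file. The sketch below writes a small made-up table to a temporary file first; the file contents and column names are assumptions for illustration only.

```r
# Write a tiny table to a temporary file (made-up data), then read it back.
dat_file <- tempfile(fileext = ".dat")
writeLines(c("id height weight",
             "a 58 115",
             "b 59 117",
             "c 60 120"), dat_file)

d <- read.table(dat_file, header = TRUE, row.names = 1)
print(names(d))           # column names
print(rownames(d))        # taken from the first column of the file
print(summary(d$height))  # generic summary of one column

# table() counts how often each value occurs
counts <- table(c("x", "y", "x", "x"))
print(counts)
```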
Data manipulation
- mode(object), length(object) # returns mode and length of object
- str() # displays structure of an arbitrary R object
- c(1:5, 10.5, "next") # generic fnc which combines args into a vector
- x[1:10] # indexes vector
- paste(c("a","b"),1:10) # paste element by element (with recycling) into a character vector
- dim(x) or dim(x) <- c(3,4) # retrieve or set the dimension of an object
- array # creates or tests for arrays
- as.matrix(x) # attempts to turn x into a matrix
- is.matrix(x) # tests if x is a (strict) matrix
- numeric(3) # produces vector of zeroes of length 3
- list(x=cars[,1], y=cars[,2]) # collects items together (of different types)
- unlist # flattens list
- factor # used to encode a vector as a factor; defines a partition into groups
- cbind(0, rbind(1, 1:3)) # combine args by columns or rows
- as.**** (e.g. as.matrix(x)) # coerce, here a numerical data frame to a numerical matrix
- is.**** (e.g. is.matrix(x)) # test the type of the argument
- args(t.test) # displays the argument names of a function
- margin.table(m,1) # give margin totals of array
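
A short sketch of several of the data manipulation commands above, using small made-up values so the results are easy to check by hand:

```r
v <- c(1:5, 10.5, "next")   # mixing types coerces everything to character
print(mode(v))
print(length(v))

m <- 1:12
dim(m) <- c(3, 4)           # reshape into a 3 x 4 matrix, filled column by column
print(is.matrix(m))
print(m[2, 3])              # row 2, column 3

cb <- cbind(0, rbind(1, 1:3))   # rbind stacks rows, cbind adds a column of zeros
print(cb)

f <- factor(c("lo", "hi", "lo", "mid"))
print(levels(f))            # levels are sorted alphabetically by default

lst <- list(x = 1:3, y = c("a", "b"))   # a list can hold different types
print(unlist(lst))          # flattened into a single (character) vector
```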
Program control
- function( arglist ) expr
- return(value)
- if(cond) cons.expr else alt.expr
- for(var in seq) expr
- while(cond) expr
- repeat expr
- break
- next
- tapply(1:n, fac, sum) # apply function to each comb of factor levels
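
The control constructs above fit together as in the following sketch. The function name `classify` and the data are hypothetical, chosen only to exercise each construct once.

```r
# A function with if/else
classify <- function(x) {
  if (x %% 2 == 0) "even" else "odd"
}
print(classify(4))

# for loop with next and break
total <- 0
for (i in 1:10) {
  if (i == 3) next      # skip 3
  if (i > 5) break      # stop once i exceeds 5
  total <- total + i    # adds 1, 2, 4 and 5
}
print(total)

# while loop
k <- 0
while (k < 3) k <- k + 1

# tapply: sum the values 1..6 within each level of a factor
fac <- factor(c("a", "b", "a", "b", "a", "b"))
sums <- tapply(1:6, fac, sum)
print(sums)
```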
Operators
- + - * / ^ (element by element operations with recycling)
- %% (mod)
- %/% (integer division)
- crossprod
- %*% (matrix prod, inner product)
- outer %o% (outer product)
- a&b (and), a|b (a or b), !a (not a)
- precedence: $ [] ^ unary- : (%% %/% %*%) (* /) (+ - ?) (< > <= >= == !=) ! (& | && ||) ~ (<- ->)
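
A few of the operators above in action, including two precedence cases that often surprise newcomers. This is a small sketch with made-up vectors.

```r
print(7 %% 3)                      # remainder
print(7 %/% 3)                     # integer quotient
print(c(1, 2, 3, 4) + c(10, 20))   # the shorter vector is recycled

a <- c(1, 2, 3)
b <- c(4, 5, 6)
inner <- a %*% b        # 1 x 1 matrix holding the inner product
outer_ab <- a %o% b     # 3 x 3 outer product
print(inner)
print(dim(outer_ab))

# Precedence: ^ binds tighter than unary minus, which binds tighter than :
neg_sq <- -2^2          # parsed as -(2^2)
rng <- -1:2             # parsed as (-1):2
print(neg_sq)
print(rng)
```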
Mathematical functions
- solve backsolve forwardsolve t(transpose)
- uniroot polyroot optimize nlm deriv
- log log10 sqrt exp sin cos tan acos asin atan cosh sinh tanh gamma lgamma choose lchoose bessel
- abs sign sum prod diff cumsum cumprod min max pmax pmin range length
- diag scale nrow ncol length append drop
- det eigen svd qr chol chol2inv
- eigen(cbind(c(1,-1),c(-1,1))) # computes eigenvalues and eigenvectors
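
As a quick sketch of the linear algebra and root-finding functions listed above (the matrix `A`, vector `b` and the function passed to `uniroot` are made up for the example):

```r
# Solve the linear system A %*% x = b
A <- cbind(c(2, 1), c(1, 3))
b <- c(3, 5)
x <- solve(A, b)
print(x)
print(A %*% x)          # reproduces b

# Eigen-decomposition of the matrix from the list above
e <- eigen(cbind(c(1, -1), c(-1, 1)))
print(e$values)
print(e$vectors)

# Numerical root of t^2 - 2 on the interval [0, 2]
root <- uniroot(function(t) t^2 - 2, interval = c(0, 2))$root
print(root)
```

`uniroot` returns an approximation whose accuracy is governed by its `tol` argument, so the result should be compared with a tolerance rather than tested for exact equality.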
Statistical functions
- mean var cov cor sd mad median range IQR fivenum quantile mahalanobis
- sort rev order rank sort.list
- ceiling floor round trunc signif zapsmall jitter all duplicated unique any lower.tri upper.tri
- approx approxfun spline splinefun curve
- mean(x, trim = .10) # (trimmed) mean
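
The effect of trimming, and the difference between `order` and `rank`, is easiest to see on a small made-up vector with one outlier:

```r
x <- c(1, 2, 3, 4, 100)
m_plain <- mean(x)               # pulled up by the outlier
m_trim  <- mean(x, trim = 0.2)   # drops the lowest and highest 20% first
print(m_plain)
print(m_trim)
print(median(x))

v <- c(30, 10, 20)
print(sort(v))
print(order(v))    # indices that would sort v
print(rank(v))     # rank of each element within v
```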
Graphics
- par(mfrow=c(2,3)) # create 2x3 array of figs filled row-wise
- plot pairs coplot boxplot boxplot.stats hist stem density pie barplot dotchart qqplot qqnorm qqline ppoints interaction.plot lowess contour persp image stars symbols
- par axis box lines abline segments points text mtext title labels legend plotmath arrows polygon Hershey plot.window xy.coords rug
- colors hsv rgb rainbow gray palette
- mfrow mfcol (multifigure parameters, set with par)
- graphics devices: postscript pictex windows png jpeg bmp xfig bitmap
- locator() # read position of graphics cursor
- identify() # identifies near point in graphic
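
A minimal sketch of how the graphics commands above combine. It draws to a PDF file so that it also runs without a display; the temporary file name is an assumption for the example, and `faithful` is one of R's built-in data sets.

```r
plot_file <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(plot_file)                  # open a graphics device that writes to a file
par(mfrow = c(1, 2))            # 1 x 2 array of figures, filled row-wise

x <- seq(0, 2 * pi, length.out = 50)
plot(x, sin(x), type = "l", main = "sin and cos")
lines(x, cos(x), lty = 2)       # add a second curve to the same figure
abline(h = 0, col = "gray")     # horizontal reference line
legend("topright", legend = c("sin", "cos"), lty = 1:2)

hist(faithful$eruptions, col = rainbow(8), main = "Old Faithful eruptions")
dev.off()                       # close the device so the file is written
```

Interactive helpers such as `locator()` and `identify()` need a screen device, so they are left out of this file-based sketch.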
Statistical distributions & sampling
- sample(n) # random permutation
- sample(x,replace=TRUE) # bootstrap sample
- set.seed RNGkind .Random.seed
- Prefixes: d (density) p (distribution function) q (quantile function) r (random deviates)
- chisq t F norm binom pois exp beta gamma lnorm unif geom cauchy logis hyper nbinom weibull wilcox
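
Each prefix is combined with a distribution name from the list above, for example `dnorm`, `pnorm`, `qnorm` and `rnorm` for the normal distribution. A short sketch, with an arbitrarily chosen seed:

```r
print(dnorm(0))                          # density of N(0,1) at 0
print(pnorm(1.96))                       # P(Z <= 1.96)
print(qnorm(0.975))                      # quantile: inverse of pnorm
print(dbinom(2, size = 4, prob = 0.5))   # P(X = 2) for Binomial(4, 0.5)

# Setting the seed makes random draws reproducible
set.seed(1)
draws <- rnorm(5)
set.seed(1)
same <- rnorm(5)
print(identical(draws, same))

perm <- sample(5)                               # random permutation of 1..5
boot <- sample(c(10, 20, 30), replace = TRUE)   # bootstrap sample
print(perm)
print(boot)
```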
Statistical tests
- t.test prop.test binom.test wilcox.test kruskal.test ansari.test bartlett.test cor.test fisher.test fligner.test friedman.test ks.test mantelhaen.test mcnemar.test mood.test pairwise.prop.test pairwise.t.test pairwise.wilcox.test print.pairwise.htest prop.trend.test quade.test shapiro.test var.test
- chisq.test ks.test # goodness-of-fit tests (chisq.gof and ks.gof are the S-PLUS names)
- contrast contrasts p.adjust pairwise.t.test pairwise.table ptukey qtukey
- power.prop.test power.t.test print.power.htest
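
Most of the test functions above return an object of class `htest` whose components can be extracted by name. The following sketch uses two small made-up samples:

```r
a <- c(5.1, 4.9, 5.6, 5.8, 6.0)   # made-up measurements, group A
b <- c(4.1, 4.4, 4.0, 4.6, 4.3)   # made-up measurements, group B

res <- t.test(a, b)               # Welch two-sample t-test by default
print(res)
print(res$statistic)              # t statistic
print(res$p.value)                # p-value
print(res$conf.int)               # confidence interval for the difference in means

# Adjust a set of p-values for multiple testing
adj <- p.adjust(c(0.01, 0.02, 0.03), method = "bonferroni")
print(adj)
```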
Statistical procedures
- anova aov lm glm loglin manova fitted add1 drop1 resid deviance predict coef effect dummy.coef fitted.values alias step factor * interaction model.tables proj plot summary
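
As a sketch of a typical model-fitting workflow with the functions above, the built-in `women` data set (also used earlier on this page) can be fitted with a simple linear regression:

```r
data(women)                                # built-in data: height and weight
fit <- lm(weight ~ height, data = women)   # simple linear regression

print(coef(fit))                 # intercept and slope
print(summary(fit))              # coefficient table, R-squared, etc.
print(anova(fit))                # analysis-of-variance table

res <- resid(fit)                # residuals, one per observation
pred <- predict(fit, newdata = data.frame(height = 65))
print(pred)
```

The same extractor functions (`coef`, `resid`, `fitted`, `predict`, `summary`) also work on objects returned by `glm` and `aov`.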
Command | Meaning |
---|---|
x<-c(1, 2, 3, 4) | Create a vector of numbers |
x | Prints contents of x |
y[2:5] | Returns 2nd to 5th elements of vector y |
y[-3] | Returns a vector of all elements in y except for the 3rd |
y[y<10] | Sub-vector of all entries in y less than 10 |
z[y<10] | Sub-vector of all entries in z for which the corresponding entries in y are less than 10 (z & y must be same length) |
x<-list(y=y, z=z), x$y, x$z | Constructs a list holding two named vectors; x$y returns vector y, x$z returns vector z |
x<-data.frame(y,z), x$y, x$z | Constructs a data frame with two columns; x$y returns column y, x$z returns column z |
x<-factor(y) | Converts numeric type y into a factor |
is.factor(y) | Returns TRUE if y is a factor |
is.numeric(y) | Returns TRUE if y contains numeric data |
is.na(y) | Returns TRUE for each entry of y that is missing (NA) |
dimnames(x) | Lists the row and column names of an array or data frame |
levels(x)=c("a", "b",…) | Assigns names to the levels of factor x |
x<-read.table(file="inp.txt") | Read a dataset from an ascii text file of data. Add “header=TRUE” if the file contains descriptive headers |
load("filename") | Loads R data from filename |
save(x, file="filename") | Saves R object x into filename |
save.image("filename") | Saves all current R objects into filename |
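
The indexing, factor and save/load commands in the table can be tried on made-up data as follows. The vectors and the temporary file name are assumptions for the example. Note how a missing value behaves under logical indexing.

```r
y <- c(3, 12, 7, NA, 15, 1)
z <- c("a", "b", "c", "d", "e", "f")

print(y[2:5])               # 2nd to 5th elements
print(y[-3])                # everything except the 3rd element
print(is.na(y))             # TRUE only where the entry is missing
print(y[y < 10])            # the NA entry is carried along
small <- z[which(y < 10)]   # which() keeps only the TRUE positions
print(small)

df <- data.frame(y, z)      # columns are named after the vectors
print(dimnames(df))

fct <- factor(c(1, 2, 1))
levels(fct) <- c("ctrl", "treat")   # rename the factor levels
print(fct)

rdata_file <- tempfile(fileext = ".RData")   # hypothetical file name
save(df, file = rdata_file)
rm(df)
load(rdata_file)            # restores the object under its original name
print(exists("df"))
```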
Command | Meaning |
---|---|
mean(x) | Calculate mean of vector x (for each column of data frame x, use sapply(x, mean)) |
median(x) | Calculate median of vector x (for each column of data frame x, use sapply(x, median)) |
sd(x) | Calculate standard deviation of vector x (for each column of data frame x, use sapply(x, sd)) |
var(x) | Calculate variance of vector x (for data frame x, returns the covariance matrix of its columns) |
summary(x) | Calculate summary of vector x (or of all columns in data frame x) |
boxplot(x) | Create boxplot of vector x (or of all vectors in data frame x) |
boxplot(x~y) | Create multiple boxplots of data in x, based on categories in y. |
stripchart(x) | Create stripchart of vector x (or of all vectors in data frame x) |
stripchart(x~y) | Create multiple stripcharts of data in x, based on categories in y. |
hist(y) | Create histogram of vector y (command will not work on a data frame) |
qqnorm(y) | Creates a "normal quantile-quantile" plot of y; used to check whether the data in y are normally distributed |
plot(z~y) | Makes an “x-y” plot of vector z vs. vector y |
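
A brief sketch tying the summary and plotting commands in this table together. It assumes the built-in `PlantGrowth` data set (a numeric `weight` column and a factor `group` column) and writes the plots to a temporary PDF file so that no display is needed.

```r
d <- PlantGrowth                    # built-in data: weight (numeric), group (factor)
print(summary(d$weight))
print(sd(d$weight))
group_means <- tapply(d$weight, d$group, mean)   # mean weight per group
print(group_means)

plot_file <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(plot_file)
par(mfrow = c(2, 2))                      # four plots on one page
boxplot(weight ~ group, data = d)         # one box per group
stripchart(weight ~ group, data = d)      # one strip per group
hist(d$weight)
qqnorm(d$weight)
qqline(d$weight)                          # reference line for the Q-Q plot
dev.off()
```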