R Statistics: Difference between revisions

From OpenWetWare
m (updated version)
 
(10 intermediate revisions by 2 users not shown)
=R for Statistical Computing=
{{Back to statistics portal}}
 
[[Image:Rlogo.jpg]]
 
'''R''' is free software for statistical analysis and graphics.<br>
It runs on various UNIX platforms, Windows, and MacOS.<br>
The latest version, 2.12.1, was released on December 16, 2010.<br>
Since 1997, an international core team of about 15 people has developed R.
 
[[Image:R screenshot Unix.jpg|thumb|screenshot of R running on Unix]]


==What is R?==
R is widely used for statistical software development and data analysis, and has become a de-facto standard among statisticians for the development of statistical software. R's source code is freely available under the GNU General Public License, and pre-compiled binary versions are provided for Microsoft Windows, Mac OS X, and several Linux and other Unix-like operating systems. R uses a command line interface, though several graphical user interfaces are available. [http://en.wikipedia.org/wiki/R_%28programming_language%29 wikipedia entry on R]


(Taken from R for beginners)

R is a system for statistical analyses and graphics created by Ross Ihaka and Robert Gentleman. R is both a software and a language considered as a dialect of the S language created by the AT&T Bell Laboratories. S is available as the software S-PLUS commercialized by Insightful. There are important differences in the designs of R and of S: those who want to know more on this point can read the paper by Ihaka & Gentleman (1996) or the R-FAQ, a copy of which is also distributed with R.

R is freely distributed under the terms of the GNU General Public Licence; its development and distribution are carried out by several statisticians known as the R Development Core Team.

R is available in several forms: the sources (written mainly in C and some routines in Fortran), essentially for Unix and Linux machines, or some pre-compiled binaries for Windows, Linux, and Macintosh. The files needed to install R, either from the sources or from the pre-compiled binaries, are distributed from the internet site of the Comprehensive R Archive Network (CRAN) where the instructions for the installation are also available. Regarding the distributions of Linux (Debian, ...), the binaries are generally available for the most recent versions; look at the CRAN site if necessary.

R has many functions for statistical analyses and graphics; the latter are visualized immediately in their own window and can be saved in various formats (jpg, png, bmp, ps, pdf, emf, pictex, xfig; the available formats may depend on the operating system). The results from a statistical analysis are displayed on the screen, some intermediate results (P-values, regression coefficients, residuals, ...) can be saved, written in a file, or used in subsequent analyses.

The R language allows the user, for instance, to program loops to successively analyse several data sets. It is also possible to combine in a single program different statistical functions to perform more complex analyses. The R users may benefit from a large number of programs written for S and available on the internet, most of these programs can be used directly with R.

At first, R could seem too complex for a non-specialist. This may not be true actually. In fact, a prominent feature of R is its flexibility. Whereas a classical software displays immediately the results of an analysis, R stores these results in an "object", so that an analysis can be done with no result displayed. The user may be surprised by this, but such a feature is very useful. Indeed, the user can extract only the part of the results which is of interest.

===Features===
"R has many functions for statistical analyses and graphics; the latter are visualized immediately in their own window and can be saved in various formats (jpg, png, bmp, ps, pdf, emf, pictex, xfig; the available formats may depend on the operating system). The results from a statistical analysis are displayed on the screen, some intermediate results (P-values, regression coefficients, residuals,...) can be saved, written in a file, or used in subsequent analyses.

The R language allows the user, for instance, to program loops to successively analyse several data sets. It is also possible to combine in a single program different statistical functions to perform more complex analyses. The R users may benefit from a large number of programs written for S and available on the internet, most of these programs can be used directly with R.

At first, R could seem too complex for a non-specialist. This may not be true actually. In fact, a prominent feature of R is its flexibility."

[http://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf R for Beginners, Emmanuel Paradis, p5/6]

===History===
R was originally created by Ross Ihaka and Robert Gentleman (hence the name R) at the University of Auckland, New Zealand, and is now developed by the R Development Core Team. R is considered by its developers to be an implementation of the S programming language, with semantics derived from Scheme.

==Install R==
* choose a download mirror: [http://cran.r-project.org/mirrors.html list of mirror sites for R download]
* download the right package for you (Linux/Windows/Mac)
* install the package following the OS-specific instructions

==Download and Install R==
==Use R==
To use R you will have to learn some R commands (see screenshot); unlike most Windows and Mac software, it is not fully menu-based. This may seem tedious at first, but after an initial learning period it will speed up your work and improve its quality.


==Links to tutorials==
There is a lot of free documentation available. Shorter manuals are first in the list:
* [http://cran.r-project.org/doc/contrib/Marthews-BeginnersRcourse.zip Friendly Beginners' R Course] by Toby Marthews, ZIP archive containing examples & 12 page PDF
* [http://cran.r-project.org/doc/contrib/Lemon-kickstart/index.html short introduction to R] by Jim Lemon, HTML v1.6
* [http://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf R for beginners] by Emmanuel Paradis, PDF 76 pages
* [http://mercury.bio.uaf.edu/mercury/R/R.html R tutorial] from Biology Fac,  University of Alaska Fairbanks, HTML
* [http://www.cyclismo.org/tutorial/R/index.html R tutorial] from Maths, Union College, NY - very clear layout, HTML
* [http://zoonek2.free.fr/UNIX/48_R/all.html R tutorial] by Vincent Zoonekynd, Maths, Université P.M. Curie, France, HTML


==Examples for commonly used statistics==
* [http://cran.r-project.org/doc/manuals/R-intro.html Official R project manual], also available as [http://cran.r-project.org/doc/manuals/R-intro.pdf PDF]
 
You can find the complete listings on the R project webpage: [http://cran.r-project.org/manuals.html manuals], [http://cran.r-project.org/other-docs.html contributed manuals].
 
==Publications==
* PLoS: [http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000482 A Quick Guide to Teaching R Programming to Computational Biology Students] by Stephen J. Eglen, Cambridge Computational Biology Institute, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, United Kingdom
 
 
 
==Examples for commonly used statistics==


==Bioconductor & Microarray data Analysis==


==References==
1. [http://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf R for beginners]. By Emmanuel Paradis

== Links ==
* [http://www.r-project.org/ home of the R project]: manuals, FAQs, downloads, ...
* [http://en.wikipedia.org/wiki/R_%28programming_language%29 R on the wikipedia]
* [http://www.versiontracker.com/dyn/moreinfo/macosx/10661 R (Mac build) on versiontracker]

Latest revision as of 14:46, 7 December 2010


R is free software for statistical analysis and graphics.
It runs on various UNIX platforms, Windows, and MacOS.
The latest version, 2.12.1, was released on December 16, 2010.
Since 1997, an international core team of about 15 people has developed R.

screenshot of R running on Unix

What is R?

R is widely used for statistical software development and data analysis, and has become a de facto standard among statisticians for the development of statistical software. R's source code is freely available under the GNU General Public License, and pre-compiled binary versions are provided for Microsoft Windows, Mac OS X, and several Linux and other Unix-like operating systems. R uses a command line interface, though several graphical user interfaces are available. (Source: Wikipedia entry on R)

Features

"R has many functions for statistical analyses and graphics; the latter are visualized immediately in their own window and can be saved in various formats (jpg, png, bmp, ps, pdf, emf, pictex, xfig; the available formats may depend on the operating system). The results from a statistical analysis are displayed on the screen, some intermediate results (P-values, regression coefficients, residuals,...) can be saved, written in a file, or used in subsequent analyses.

The R language allows the user, for instance, to program loops to successively analyse several data sets. It is also possible to combine in a single program different statistical functions to perform more complex analyses. The R users may benefit from a large number of programs written for S and available on the internet, most of these programs can be used directly with R.

At first, R could seem too complex for a non-specialist. This may not be true actually. In fact, a prominent feature of R is its flexibility."

R for Beginners, Emmanuel Paradis, p5/6
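
As a rough illustration of the features quoted above (it is not taken from R for Beginners), the following sketch loops over two data sets that ship with base R, stores each fitted model in an object, extracts only the parts of interest, and saves a table and a plot to files. The choice of data sets, the linear model, and all variable and file names are assumptions made for this example.

```r
# Two built-in data sets, renamed to common columns x and y.
# They are used here only for illustration.
datasets <- list(
  cars  = data.frame(x = cars$speed,   y = cars$dist),
  women = data.frame(x = women$height, y = women$weight)
)

# Loop over the data sets and store each fitted model in an object.
# Nothing is displayed at this point.
fits <- list()
for (name in names(datasets)) {
  fits[[name]] <- lm(y ~ x, data = datasets[[name]])
}

# Extract only the parts of the results that are of interest.
slopes   <- sapply(fits, function(fit) coef(fit)[["x"]])
p_values <- sapply(fits, function(fit) summary(fit)$coefficients["x", "Pr(>|t|)"])

# Intermediate results can be written to a file ...
table_file <- file.path(tempdir(), "slopes.csv")
write.csv(data.frame(dataset = names(fits), slope = slopes, p_value = p_values),
          table_file, row.names = FALSE)

# ... and graphics can be saved in one of the supported formats (PDF here).
plot_file <- file.path(tempdir(), "cars_fit.pdf")
pdf(plot_file)
plot(datasets$cars$x, datasets$cars$y, xlab = "x", ylab = "y")
abline(fits$cars)        # add the fitted regression line
invisible(dev.off())     # close the PDF device so the file is written
```

Typing `slopes` or `summary(fits$cars)` at the prompt afterwards displays the stored results on demand.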

History

R was originally created by Ross Ihaka and Robert Gentleman (hence the name R) at the University of Auckland, New Zealand, and is now developed by the R Development Core Team. R is considered by its developers to be an implementation of the S programming language, with semantics derived from Scheme.

Install R

  • choose a download mirror: list of mirror sites for R download
  • download the right package for you (Linux/Windows/Mac)
  • install the package following the OS-specific instructions

Use R

To use R you will have to learn some R commands (see screenshot), i.e. it's not fully menu based like most Windows and Mac software. This might seem tedious but you will soon realise that while slowing you down initially it will speed up your work and make it better after an initial learning period.
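
To give an idea of what typing commands looks like, here is a minimal sketch of a first session. The numbers are made up for illustration and the reference value of 4 is hypothetical; the functions themselves are part of base R.

```r
# A small numeric vector of made-up measurements (illustrative values only).
x <- c(4.1, 5.3, 3.8, 4.9, 5.0, 4.4)

mean(x)      # arithmetic mean
sd(x)        # sample standard deviation
summary(x)   # minimum, quartiles, mean, maximum

# One-sample t-test against a hypothetical reference value of 4.
# The result is stored in an object rather than printed.
result <- t.test(x, mu = 4)
result$p.value   # extract just the p-value

# Built-in documentation for any function is available with help(), e.g.
# help(t.test)
```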

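There is a lot of free documentation available. Shorter manuals are first in the list:

  • Friendly Beginners' R Course by Toby Marthews, ZIP archive containing examples & 12 page PDF
  • short introduction to R by Jim Lemon, HTML v1.6
  • R for beginners by Emmanuel Paradis, PDF 76 pages
  • R tutorial from Biology Fac, University of Alaska Fairbanks, HTML
  • R tutorial from Maths, Union College, NY - very clear layout, HTML
  • R tutorial by Vincent Zoonekynd, Maths, Université P.M. Curie, France, HTML
  • Official R project manual, also available as PDF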

You can find the complete listings on the R project webpage: manuals, contributed manuals.

Publications


Examples for commonly used statistics

Bioconductor & Microarray data Analysis

Links