Introduction: Why Use R in Computational Biology?
The name “R” refers to the computational environment initially created by Robert Gentleman and Ross Ihaka, similar in nature to the “S” statistical environment developed at Bell Laboratories (http://www.r-project.org/about.html). It has since been developed and maintained by a strong team of core developers (R-core), who are renowned researchers in computational disciplines. R has gained wide acceptance as a reliable and powerful modern computational environment for statistical computing and visualisation, and is now used in many areas of scientific computation. R is free software, released under the GNU General Public License; anyone can inspect all of its source code, and there are no restrictive or costly licensing arrangements. One of the main reasons that computational biologists use R is the Bioconductor project (http://www.bioconductor.org), a set of R packages for analysing genomic data. In many cases, these packages have been provided by researchers to complement the descriptions of algorithms in their journal articles. Many computational biologists regard R and Bioconductor as fundamental tools for their research.

R is a modern, functional programming language that allows for rapid development of ideas, together with object-oriented features for rigorous software development. Its rich set of built-in functions makes it well suited to high-volume analysis and statistical simulation, and its packaging system makes it easy to share code and to use code provided by others. Finally, it generates high-quality graphical output, so that all stages of a study, from modelling and analysis through to publication, can be undertaken within R. For detailed discussion of the merits of R in computational biology, see .