Chapter 15 R

R was originally conceived as a Free/Libre and Open Source Software (FLOSS) package for statistical analyses. In a way, it’s an alternative to other statistical packages, such as SPSS, SAS, Stata, jamovi, and JASP. However, because R was designed to be very extensible, it has been extended enormously, which means that R can now be used for pretty much all tasks a scientist engages in, as well as for many tasks that are less commonplace for scientists but relevant to applied researchers, or basically anybody doing anything with any type of data.

People who use R almost always use another program as an interface to R. The most popular such interface is RStudio, which is also FLOSS. RStudio is discussed in Chapter 16; this chapter explains the basics of R itself, and as such can be read independently of the interface you use.

There is one other program that is important to mention separately, and that is jamovi. The vision of jamovi is to produce an accessible but powerful statistical package. Because jamovi is FLOSS and, like R, was designed to be very extensible, many of the advantages of R also apply to jamovi. In fact, jamovi runs R as its backend. Note, however, that jamovi is much younger, so as yet the number of modules in its library can be counted in the tens, whereas R has thousands of packages available.

15.1 R Packages

R’s extensibility is implemented through the concept of packages. A package is a more or less coherent set of files that offers additional functionality. For example, there are packages for specific statistical analyses, such as power analyses (pwr), single case design analyses (scda), and meta-analyses (metafor); there are packages for related tasks such as interacting with the local file system (fs), quickly locating a file in an RStudio project (here), and integrating R code in Markdown documents (rmarkdown); and there are packages offering functionality that is not even statistical, such as packages for qualitative data analysis (RQDA and rock), rendering 3D images (raytracer), and generative art (e.g., generativeart and jasmines).
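As a minimal sketch of what using a package looks like in practice: a package is installed once and then loaded in every session that needs it. The example below loads tools, a package that ships with R itself, so that it runs without an internet connection. The install.packages() line for pwr is commented out for the same reason, and the file name is made up for illustration.

```r
# Installing a package from CRAN is done once per R installation.
# Commented out here because it needs an internet connection:
# install.packages("pwr")

# Loading a package makes its functions available in this session.
# 'tools' ships with R, so no installation is needed.
library(tools)

# Use a function from the loaded package ("analysis.R" is a made-up name).
extension <- file_ext("analysis.R")
print(extension)
```

The same two steps (install once, load per session) apply to any of the CRAN packages mentioned above.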

The R community developed and maintains the Comprehensive R Archive Network, or CRAN. CRAN is a huge repository of over ten thousand R packages. The community did an impressive job when they designed the CRAN infrastructure, or more accurately, its policies. Contributing a package to CRAN is relatively easy, but does force you to adopt some basic hygiene. For example, you need to have documentation for your package (and this documentation is checked for consistency). In addition, your dependencies on other packages have to be clearly (and machine-readably) documented. These ‘hoops’ you have to jump through ensure that CRAN packages meet some basic quality standards.
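To get a feel for what machine-readable documentation of dependencies means, you can inspect the metadata that every installed package carries in its DESCRIPTION file. The sketch below reads that metadata for stats, chosen only because it ships with R and is therefore always available; which fields are filled in differs from package to package.

```r
# Read the metadata (the DESCRIPTION file) of an installed package.
# 'stats' is used only because it is always installed with R.
desc <- utils::packageDescription("stats")

pkg_name <- desc$Package      # the package name
pkg_version <- desc$Version   # the installed version, as a string
pkg_imports <- desc$Imports   # packages it imports (NULL if the field is absent)

print(pkg_name)
print(pkg_version)
print(pkg_imports)
```

Because these fields have a fixed format, R can work out which other packages need to be installed alongside the one you asked for.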

However, of course, with great power comes ~great responsibility~ some hassle. R is regularly updated, and so are R packages. Because R packages can themselves use functions from other R packages, it can happen that an update in R or in another package causes ‘downstream’ packages to break. It also means that if you don’t keep your R version up-to-date, you may at some point not be able to install or update a package. Of course, updating R is a minor inconvenience, and usually something you’ll want to do anyway, but it is a hiccup that is good to keep in mind.
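If a package refuses to install or update, a quick first check is which versions you are running. The following is a small sketch of such a check; the threshold "3.5.0" is an arbitrary value chosen for illustration, not a requirement of any particular package.

```r
# Which version of R is running?
r_version <- getRversion()
print(r_version)

# Versions can be compared against a string.
# "3.5.0" is an arbitrary threshold used for illustration only.
is_recent_enough <- r_version >= "3.5.0"
print(is_recent_enough)

# Which version of an installed package do I have?
stats_version <- utils::packageVersion("stats")
print(stats_version)

# Updating installed packages needs an internet connection,
# so it is commented out here:
# update.packages(ask = FALSE)
```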

15.2 Namespaces

15.3 Objects

15.3.1 Dataframes

15.3.2 Alternatives to dataframes

More recently, alternatives to dataframes have been developed, such as tibbles and data.tables.
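As a brief illustration, the sketch below creates a regular dataframe and converts it to a tibble and to a data.table. The column names and values are made up, and the conversions only run if the tibble and data.table packages happen to be installed, so the example also works on a bare R installation.

```r
# A regular dataframe with made-up example data.
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
print(class(df))

# Convert to a tibble, but only if the 'tibble' package is installed.
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::as_tibble(df)
  print(class(tb))
}

# Convert to a data.table, but only if 'data.table' is installed.
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::as.data.table(df)
  print(class(dt))
}
```

In both cases the printed classes include "data.frame", which is why most functions that expect a dataframe also accept these alternatives.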

One important alternative is the data.tree. A data.tree does for hierarchical data what a dataframe does for tabular data.
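To make this a bit more concrete, here is a minimal sketch of building a small hierarchy with the data.tree package. The node names (a study with sites and participants) are invented for illustration, and the code only runs if data.tree is installed.

```r
# Stays NULL if the 'data.tree' package is not installed.
node_names <- NULL

if (requireNamespace("data.tree", quietly = TRUE)) {
  # The root of the hierarchy (all names here are made up).
  study <- data.tree::Node$new("study")

  # AddChild() attaches a new node and returns it,
  # so children can get children of their own.
  site_a <- study$AddChild("site_a")
  site_a$AddChild("participant_1")
  site_a$AddChild("participant_2")
  study$AddChild("site_b")

  # Printing the root shows the whole tree.
  print(study)

  node_names <- c(study$name, site_a$name)
}
```

Where a dataframe has rows and columns, this structure has parents and children, which suits nested data such as participants within sites.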