Chapter 11 Git

Git is an open source version control system, enabling minimum-effort documentation of changes over time. It can be downloaded and installed from https://git-scm.com/. A collection of directories and files that is tracked by Git is called a repository, or repo for short.

Although Git is decentralized, in simple projects, it is usually used in a centralized fashion. This means that one of the Git repositories is treated as the central repository, and all collaborators sync their local clones of that repository with that central repo.
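The centralized pattern described above can be sketched in Git Bash as follows. This is a minimal illustration under stated assumptions, not a recommended project setup: the central repo is simulated by a local bare repository (in practice it would live on a server such as GitLab), and the names central.git, alice, bob, and manuscript.txt are hypothetical.

```shell
# Work in a scratch directory so nothing outside it is touched.
cd "$(mktemp -d)"

# A bare repository stands in for the central repo (hypothetical name).
git init --bare central.git

# The first collaborator clones the (still empty) central repo.
git clone central.git alice
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"

# She records a change locally and pushes it to the central repo.
echo "First paragraph" > manuscript.txt
git add manuscript.txt
git commit -m "Add first paragraph of manuscript"
git push origin HEAD

# A second collaborator clones the central repo and receives that change.
cd ..
git clone central.git bob

# The first collaborator makes and pushes another change...
cd alice
echo "Second paragraph" >> manuscript.txt
git commit -am "Add second paragraph"
git push origin HEAD

# ...and the second collaborator syncs his clone with the central repo.
cd ../bob
git pull --ff-only
cat manuscript.txt
```

Note that the two clones never talk to each other directly: every change travels through the central repo by an explicit push and pull.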

Unlike cloud synchronization services such as Dropbox and Sync, Git does not automatically synchronize changes. This may seem tedious but it is in fact an important feature: it facilitates a workflow where documenting what you do and why eventually becomes an automatic process.
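To make this concrete, here is a small sketch of what recording a change by hand looks like. The repository name notes, the file name, the author details, and the commit messages are all made up for illustration.

```shell
# Create a scratch repository (hypothetical name) to experiment in.
cd "$(mktemp -d)"
git init notes
cd notes
git config user.name "Example Author"
git config user.email "author@example.com"

# Creating a file does not record anything yet; Git only notices it.
echo "Introduction" > manuscript.txt
git status --short    # lists manuscript.txt as untracked

# Recording the change takes two deliberate steps: stage, then commit.
git add manuscript.txt
git commit -m "Start manuscript with an introduction heading"

# Later edits are likewise only recorded once you commit them,
# together with a message explaining what you did and why.
echo "Methods" >> manuscript.txt
git status --short    # lists manuscript.txt as modified
git commit -am "Add methods heading to outline the next section"
```

The commit message is where the 'what and why' gets documented, which is exactly the habit that the manual workflow encourages.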

Also unlike cloud synchronization services, Git keeps a full history of all changes in the project. This means that it’s relatively easy to find out when a paragraph in a manuscript was last changed, who made that change, and how it looked before that change. It is also easy to see how many lines were added or removed and by whom, and what the reasons were for those changes. This also makes it possible to rewind the project to a specific point in time, and even to let collaborators work in parallel versions of a project that are merged together at a later time.
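A few commands for inspecting that history are sketched below. The example first builds a tiny repository with two commits so that there is something to look at; the repository name, file contents, and author details are hypothetical.

```shell
# Build a small scratch repository with two commits.
cd "$(mktemp -d)"
git init history-demo
cd history-demo
git config user.name "Example Author"
git config user.email "author@example.com"

printf 'Line one\nLine two\n' > manuscript.txt
git add manuscript.txt
git commit -m "Add first two lines"

printf 'Line one\nLine two, revised\n' > manuscript.txt
git commit -am "Revise the second line for clarity"

# One line per commit, newest first.
git log --oneline

# The same history, with the number of lines added and removed per file.
git log --stat

# For each line of the file: who last changed it, when, and in which commit.
git blame manuscript.txt

# The file as it looked one commit ago, without changing the working copy.
git show HEAD~1:manuscript.txt
```

Here HEAD~1 means 'one commit before the current one'; any commit identifier shown by git log can be used in its place.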

In other words, whereas cloud synchronization services were in principle designed as convenient tools for, well, file synchronization, Git was designed for collaboration and version control. If you collaborate intensively on a set of files using Sync, Dropbox, Google Drive, OneDrive, or iCloud, it is easy to end up with conflicting versions of files; Git avoids this as much as possible. With Git, if two people edit the same file, instead of just saving one copy as a ‘conflicting version’, Git merges their changes in a line-by-line fashion. If it turns out that both people edited the same lines in the file, Git presents a so-called ‘merge conflict’, allowing you to choose which changes to retain.
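Both situations can be reproduced in a scratch repository. The sketch below uses branches to stand in for the parallel work of two collaborators; the branch names, file contents, and author details are hypothetical, and real projects will typically merge through pull and push rather than local branches only.

```shell
cd "$(mktemp -d)"
git init merge-demo
cd merge-demo
git config user.name "Example Author"
git config user.email "author@example.com"

# A shared starting point.
printf 'Title\n\nMethods\n\nResults\n' > manuscript.txt
git add manuscript.txt
git commit -m "Add manuscript skeleton"

# One person changes the first line...
git checkout -b edit-title
printf 'A Better Title\n\nMethods\n\nResults\n' > manuscript.txt
git commit -am "Improve the title"

# ...while another, starting from the same skeleton, changes the last line.
git checkout -b edit-results edit-title~1
printf 'Title\n\nMethods\n\nResults (preliminary)\n' > manuscript.txt
git commit -am "Mark results as preliminary"

# Different lines were edited, so Git combines both changes by itself.
git merge edit-title -m "Merge title edit into results edit"
cat manuscript.txt

# A third version changes the same line as edit-title did.
git checkout -b another-title edit-title~1
printf 'Another Title\n\nMethods\n\nResults\n' > manuscript.txt
git commit -am "Try a different title"

# Now Git cannot decide, stops the merge, and reports a merge conflict.
git merge edit-title || echo "Merge conflict: edit the file to choose what to keep"
git status --short
cat manuscript.txt    # both candidate titles, separated by conflict markers
```

To finish a conflicted merge you would edit the file so that only the desired text remains, then run git add and git commit; git merge --abort returns to the state before the merge.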

Note that this Chapter introduces Git as a tool, providing the necessary information to start using it from the Git Bash command line. The ‘how-to’ guides relating to workflow are located in Part III of this book. Also note that the author is only a novice user of Git. This means that this Chapter probably oversimplifies Git. For a more thorough explanation, just search the internet: there are many excellent free resources, tutorials, books, and videos. If you are familiar with R already, Jenny Bryan wrote an awesome Open Access book called Happy Git With R, hosted at https://happygitwithr.com. Danielle Navarro has a great concise slide deck that in itself pretty much explains the basics, available at https://djnavarro.github.io/chdss2018/day2/git-slides.html (use the space bar or the arrow keys to navigate).

11.1 Git Bash

Git Bash is a Windows application that provides an environment that’s very similar to what you’d get if you ran Git in the Bash environment provided with the *nix operating systems (such as macOS and Linux). Depending on the options you choose when you install Git, you can also access Git from the standard Windows command line interface, but that usually doesn’t come with the pretty colors that Git Bash has available.

Also, depending on the options you choose during install, you can normally right-click a directory and choose the option “Git Bash Here” from the context menu that pops up. That’s very convenient, because it allows you to quickly interact with Git for a given repository.

And finally, RStudio offers you direct access to Git Bash through the Terminal tab in the bottom-left pane. Note that RStudio also adds a dedicated Git tab to the top-right pane if it detects that the project you opened is a Git repository. How to interact with Git from RStudio is discussed in Chapter 16.
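Whichever of these routes you use to open a terminal, a quick way to confirm that Git is reachable from it is to ask for its version. This is only a sanity check; the version number printed depends on your installation.

```shell
# Print the installed Git version. An error such as "command not found"
# suggests that Git is not installed or not on the PATH of this terminal.
git --version
```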

11.2 Rights

Before you can interact with a Git repository, you need to have the authorization to do so. Some Git repositories are public (most of mine are; see https://gitlab.com/matherion), which means everybody is authorized to clone the repository. However, of course that doesn’t mean that everybody can also change my files. In a Git repository manager such as GitLab (see Chapter 12), the repository’s owner can add other users who have the right to interact with the Git repository. Git projects that are not set to public aren’t even visible unless you’ve been added to that list of members. Once somebody has access to a Git repository, what they can do depends on the rights they have been given.

11.3 Operational chapters about Git

The following chapters discuss operations that involve Git: