A Working in groups

During the course you have been allocated to groups. You are expected to solve the exercises and write the project reports in these groups. Before you start, it is a good idea to agree on a set of group rules. First, agree on a coding convention. Most people in the R community use snake case, but camel case is also okay. Next, set up rules on when to meet and how you will organize the work. For instance, it is a good idea for everyone to try to solve some of the exercises before you meet, so that you can then discuss the answers, problems, etc. Finally, it is a good idea to have a common place for your code. You have different options:

  1. Use a cloud storage service such as Dropbox, OneDrive or Google Drive.
  2. Use a version control system such as Git together with GitHub. GitHub is a code sharing and publishing service and may be seen as a social networking site for programmers.
  3. If you use Posit Cloud then one person in the group can create a shared workspace with projects:
    • First create a new workspace named e.g. Shared.
    • Press Members and add the group members as moderators.
    • Now go back to Projects in the Tools for Analytics workspace and move one project to the shared workspace. Rename it to e.g. Group Project. Members will now have access to this project, where you can share code. Note that you cannot work on the same file simultaneously. That is, only one member can change a file at a time! Hence it is a good idea to have your own private project to work on and use the shared project as a place where you can share code. If you want to download a project to your laptop, press the export button.

The benefit of a cloud storage service is that it is well known to you and easy to set up. The drawback is that you cannot work on the same file simultaneously. The benefit of Git and GitHub is that they manage the evolution of a set of files – called a repository – in a sane, highly structured way. If you have no idea what I’m talking about, think of it as the “Track Changes” feature from Microsoft Word on steroids. Here you can work on files simultaneously. Moreover, Git can be used from within RStudio. The drawback is that it is harder to set up and learn. For a detailed description see Why Git? Why GitHub? The Using Git together with GitHub section gives a tutorial on how to set up Git and GitHub. Skip it if you use a cloud storage service.

Using Git together with GitHub

Git is a version control system. Git manages the evolution of a set of files – called a repository – in a sane, highly structured way. If you have no idea what I’m talking about, think of it as the “Track Changes” feature from Microsoft Word on steroids.

GitHub provides a home for your Git-based projects on the internet. If you have no idea what I’m talking about, think of it as Dropbox but much, much better. It allows other people to see your stuff, sync up with you, and perhaps even make changes. Even for private solo projects, it’s a good idea to push your work to a remote location for peace of mind.

To configure your computer, go through the following steps:

Register a free GitHub account

Sign up at GitHub. Some thoughts about your username:

  • Incorporate your actual name! People like to know who they’re dealing with. Also makes your username easier for people to guess or remember.
  • Reuse your username from other contexts, e.g., Twitter or Slack. But, of course, someone with no GitHub activity will probably be squatting on that.
  • Pick a username you will be comfortable revealing to your future boss.
  • Shorter is better than longer.
  • Be as unique as possible in as few characters as possible. In some settings GitHub auto-completes or suggests usernames.
  • Make it timeless. Don’t highlight your current university, employer, or place of residence, e.g. JennyFromTheBlock.
  • Avoid the use of upper vs. lower case to separate words. We highly recommend all lowercase. GitHub treats usernames in a case insensitive way, but using all lowercase is kinder to people doing downstream regular expression work with usernames, in various languages. A better strategy for word separation is to use a hyphen - or underscore _.

Install Git

Find installation instructions below for your operating system.

Windows

Install Git from the web. Windows prefers for Git to be installed below C:/Program Files and this appears to be the default. This implies, for example, that the Git executable on my Windows system is found at C:/Program Files/Git/bin/git.exe. Unless you have specific reasons to do otherwise, follow this convention. If asked about “Adjusting your PATH environment”, make sure to select “Git from the command line and also from 3rd-party software”.

macOS

Option 1 (highly recommended): Install the Xcode command line tools (not all of Xcode), which includes Git.

Go to the shell and enter one of these commands to elicit an offer to install developer command line tools:

git --version
git config

Accept the offer! Click on “Install”.

Here’s another way to request this installation, more directly:

xcode-select --install

We just happen to find this Git-based trigger apropos.

Note also that, after upgrading macOS, you might need to re-do the above and/or re-agree to the Xcode license agreement. We have seen this cause the RStudio Git pane to disappear on a system where it was previously working. Use commands like those above to tickle Xcode into prompting you for what it needs, then restart RStudio.

Option 2 (recommended): Install Git from here: http://git-scm.com/downloads.

  • This arguably sets you up the best for the future. It will certainly get you the latest version of Git of all approaches described here.
  • The GitHub home for the macOS installer is here: https://github.com/timcharper/git_osx_installer.
    • At that link, you can find more info if something goes wrong or you are working on an old version of macOS.

Option 3 (recommended): If you anticipate getting heavily into scientific computing, you’re going to be installing and updating lots of software. You should check out Homebrew, “the missing package manager for OS X”. Among many other things, it can install Git for you. Once you have Homebrew installed, do this in the shell:

brew install git

Linux

Install Git via your distro’s package manager.

Ubuntu or Debian Linux:

sudo apt-get install git

Fedora or RedHat Linux:

sudo yum install git

A comprehensive list for various Linux and Unix package managers: https://git-scm.com/download/linux

Check your installation

Quit and re-launch RStudio if there’s any doubt in your mind about whether you opened RStudio before or after installing Git.
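A quick way to confirm that Git is installed and visible is to ask the shell for it directly. The two commands below are a minimal sketch and work the same on macOS, Linux and Git Bash on Windows; the exact path and version number printed will differ between machines.

```shell
# Print the path of the Git executable that the shell finds first on the PATH.
command -v git

# Print the installed Git version (a line starting with "git version").
git --version
```

If the first command prints nothing, Git is either not installed or not on your PATH.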

You can set your Git user name and email from within R using the usethis package:

## install if needed (do this exactly once):
## install.packages("usethis")

library(usethis)
use_git_config(user.name = "Jane Doe", user.email = "jane@example.org")

What user name should you give to Git? This does not have to be your GitHub user name, although it can be. Another good option is your actual first name and last name. If you commit from different machines, sometimes people work that info into the user name. Your commits will be labelled with this user name, so make it informative to potential collaborators and future you.

What email should you give to Git? This must be the email associated with your GitHub account.

These commands return nothing. You can check that Git understood what you typed by looking at the output of git config --global --list from a shell. An easy way to get into a shell from RStudio is Tools > Terminal or Tools > Shell. If you have any problems, go through Chapters 4-14 on the Happy Git site.
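The same two settings can also be written and inspected from the shell. The sketch below is deliberately cautious: it points Git at a throwaway file through --file (the variable demo_config is a hypothetical name used only here), so your real settings are left alone. To change your actual user-level configuration you would replace --file "$demo_config" with --global.

```shell
# Throwaway config file so the sketch does not modify your real Git settings.
demo_config="$(mktemp)"

# Set the user name and email, mirroring the use_git_config() call above.
git config --file "$demo_config" user.name "Jane Doe"
git config --file "$demo_config" user.email "jane@example.org"

# List everything stored in the file; with --global this would list
# your user-level settings instead.
git config --file "$demo_config" --list
```

The listing should contain one user.name line and one user.email line with the values you entered.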

Setup projects using Git and GitHub

You have different options depending on how you start your project. I will only highlight the preferred one.

New project, GitHub first

Here we create a project with the “GitHub first, then RStudio” sequence:

Step 1: Go to GitHub and make sure you are logged in. Click the green “New repository” button. Or, if you are on your own profile page, click on “Repositories”, then click the green “New” button.

  • Repository name: test (or whatever you wish)
  • Public
  • YES Initialize this repository with a README

Click the big green button “Create repository.”

Copy the HTTPS clone URL to your clipboard via the green “Clone or Download” button (labelled “Code” in the current GitHub interface).

Step 2: In RStudio, start a new Project:

  • File > New Project > Version Control > Git. In the “repository URL” field, paste the URL of your new GitHub repository. It will be something like this: https://github.com/[your-username]/test.git.
  • Be intentional about where you create this Project.
  • Suggest you “Open in new session”.
  • Click “Create Project” to create a new directory, which will be all of these things:
    • a directory or “folder” on your computer
    • a Git repository, linked to a remote GitHub repository
    • an RStudio Project
  • In the absence of other constraints, I suggest that all of your R projects have exactly this set-up.

This should download the README.md file that we created on GitHub in the previous step. Look in RStudio’s file browser pane for the README.md file.

There’s a big advantage to the “GitHub first, then RStudio” workflow: the remote GitHub repo is added as a remote for your local repo and your local master branch is now tracking master on GitHub (on newer repositories the default branch is usually named main instead of master). This is a technical but important point about Git. The practical implication is that you are now set up to push and pull. No need to fanny around setting up Git remotes and tracking branches on the command line.
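To see what “added as a remote” and “tracking” mean in practice, the shell sketch below clones a repository and inspects the result. It is an illustration under stated assumptions, not the RStudio procedure itself: a local bare repository stands in for GitHub so that the example runs offline, and the names sandbox, seed, remote.git and test are hypothetical. With a real project you would pass the HTTPS URL from Step 1 to git clone.

```shell
# Sandbox directory so nothing outside it is touched.
sandbox="$(mktemp -d)"

# Build the stand-in for GitHub: a repository holding a README, cloned as bare.
git init --quiet "$sandbox/seed"
cd "$sandbox/seed"
git config user.name "Jane Doe"            # identity for this sandbox only
git config user.email "jane@example.org"
echo "# test" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$sandbox/seed" "$sandbox/remote.git"

# Shell counterpart of File > New Project > Version Control > Git.
git clone --quiet "$sandbox/remote.git" "$sandbox/test"
cd "$sandbox/test"

# The clone registers the source repository as a remote named "origin".
git remote -v

# The local branch tracks the branch of the same name on origin.
git --no-pager branch -vv
```

Because the clone sets up both the remote and the tracking branch, git pull and git push work afterwards without further arguments.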

Step 3: Make local changes, save, commit.

Do this every time you finish a valuable chunk of work, probably many times a day.

From RStudio, modify the README.md file, e.g., by adding the line “This is a line from RStudio”. Save your changes.

Commit these changes to your local repo. How?

  • Click the “Git” tab in upper right pane
  • Check “Staged” box for any files whose existence or modifications you want to commit.
    • To see more detail on what’s changed in a file since the last commit, click on “Diff” for a Git pop-up
  • If you’re not already in the Git pop-up, click “Commit”
  • Type a message in “Commit message”, such as “Commit from RStudio”.
  • Click “Commit”
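The clicks above correspond roughly to a handful of Git commands. The following is a minimal sketch, assuming a throwaway repository created just for the example (the variable repo and the identity values are hypothetical); in your real project you would run only the status, add, diff and commit lines.

```shell
# Throwaway repository so the sketch does not touch real work.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Doe"            # identity for this sandbox only
git config user.email "jane@example.org"

# Modify a file, as when editing README.md in RStudio.
echo "This is a line from RStudio" >> README.md

# List changed files (what the Git tab shows).
git status --short

# Stage the file (ticking the "Staged" box).
git add README.md

# Show the staged changes (the "Diff" pop-up).
git --no-pager diff --staged

# Record the commit with a message (the "Commit" button).
git commit --quiet -m "Commit from RStudio"

# Confirm that the commit is now part of the history.
git --no-pager log --oneline
```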

Step 4: Push your local changes to GitHub

Do this a few times a day, but possibly less often than you commit.

You have new work in your local Git repository, but the changes are not online yet.

This will seem counterintuitive, but first let’s stop and pull from GitHub.

Why? Establish this habit for the future! If you make changes to the repo in the browser or from another machine or (one day) a collaborator has pushed, you will be happier if you pull those changes in before you attempt to push.

Click the blue “Pull” button in the “Git” tab in RStudio. I doubt anything will happen, i.e. you’ll get the message “Already up-to-date.” This is just to establish a habit.

Click the green “Push” button to send your local changes to GitHub. You should see some message along these lines.

[master dc671f0] blah
 3 files changed, 22 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 myrepo.Rproj
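The pull-then-push habit of Step 4 can also be tried from the shell. This is a sketch under assumptions: a local bare repository (hypothetical path remote.git inside a temporary sandbox) plays the role of GitHub so that the example runs without a network connection or credentials.

```shell
# Sandbox directory so nothing outside it is touched.
sandbox="$(mktemp -d)"

# Stand-in for GitHub: a bare repository that already contains a README.
git init --quiet "$sandbox/seed"
cd "$sandbox/seed"
git config user.name "Jane Doe"            # identity for this sandbox only
git config user.email "jane@example.org"
echo "# test" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$sandbox/seed" "$sandbox/remote.git"

# Your local project, cloned from the stand-in remote.
git clone --quiet "$sandbox/remote.git" "$sandbox/local"
cd "$sandbox/local"
git config user.name "Jane Doe"
git config user.email "jane@example.org"

# Local work: edit and commit (Step 3).
echo "This is a line from RStudio" >> README.md
git commit --quiet -am "Commit from RStudio"

# Pull first (the blue "Pull" button); nothing new is expected here.
git pull

# Then push the local commit to the remote (the green "Push" button).
git push
```

After the push, the remote and the local repository point at the same commit.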

Step 5: Confirm the local change propagated to the GitHub remote

Go back to the browser. I assume we’re still viewing your new GitHub repo.

Refresh.

You should see the new “This is a line from RStudio” in the README.

If you click on “commits,” you should see one with the message “Commit from RStudio”.

Step 6: Make a change on GitHub

Click on README.md in the file listing on GitHub.

In the upper right corner, click on the pencil for “Edit this file”.

Add a line to this file, such as “Line added from GitHub.”

Edit the commit message in “Commit changes” or accept the default.

Click the big green button “Commit changes.”

Step 7: Pull from GitHub

Back in RStudio locally …

Inspect your README.md. It should NOT have the line “Line added from GitHub”. It should be as you left it. Verify that.

Click the blue Pull button.

Look at README.md again. You should now see the new line there.

The end

Now just repeat these operations when you do group work. Do work somewhere. Commit it. Push it or pull it depending on where you did it, but get local and remote “synced up”. Repeat.

Note that in general (and especially in future when collaborating with other developers) you will usually need to pull changes from the remote (GitHub) before pushing the local changes you have made. For this reason, it’s a good idea to try and get into the habit of pulling before you attempt to push.
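The sketch below illustrates why pulling first matters once a collaborator is involved. It rests on several assumptions: a local bare repository stands in for GitHub, the two clones you and collaborator are hypothetical working copies, and the pull uses an explicit merge (--no-rebase), which is only one of the ways Git can reconcile diverging histories.

```shell
# Sandbox directory so nothing outside it is touched.
sandbox="$(mktemp -d)"

# Stand-in for GitHub: a bare repository that already contains a README.
git init --quiet "$sandbox/seed"
cd "$sandbox/seed"
git config user.name "Jane Doe"            # identity for this sandbox only
git config user.email "jane@example.org"
echo "# test" > README.md
git add README.md
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$sandbox/seed" "$sandbox/remote.git"

# Two working copies of the same remote.
git clone --quiet "$sandbox/remote.git" "$sandbox/you"
git clone --quiet "$sandbox/remote.git" "$sandbox/collaborator"

# The collaborator commits a new file and pushes first.
cd "$sandbox/collaborator"
git config user.name "John Roe"
git config user.email "john@example.org"
echo "x <- 1" > analysis.R
git add analysis.R
git commit --quiet -m "Add analysis script"
git push --quiet

# Meanwhile you commit a change to the README without pulling.
cd "$sandbox/you"
git config user.name "Jane Doe"
git config user.email "jane@example.org"
echo "This is a line from RStudio" >> README.md
git commit --quiet -am "Commit from RStudio"

# The remote has a commit you do not have, so this push is refused.
if git push --quiet; then
  echo "push accepted"
else
  echo "push rejected: pull first"
fi

# Pull merges the collaborator's commit into your branch; then the push succeeds.
git pull --quiet --no-rebase --no-edit
git push --quiet
```

The two commits touch different files, so the merge completes without conflicts. When both people edit the same lines, Git stops and asks you to resolve the conflict before you can push.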

If you have to type in your password over and over again, this can be avoided. Have a look at Chapter 10 of Happy Git.

Existing project, GitHub first

See details in Chapter 16 of Happy Git.

Existing project, GitHub last

See details in Chapter 17 of Happy Git.