Configuring git

Sara Mortara and Andrea Sánchez-Tapia
2022-07-06

Introduction

Git is the most widely used version control software today. It lets you track the different versions of your files over time. Each version that the user decides to record is called a commit. Unlike working in Dropbox (for example), each commit is a discrete moment chosen by the user; there is no continuous update. This lets you decide which changes are relevant and separate the work into stages.

A git repository is a folder where the latest version of each file is visible, while the entire commit history of the files remains available and can be explored or rolled back. With git the user knows what was added, modified, or deleted in each commit, and therefore does not need to create duplicate versions of files or rename them along the way.

Git works locally, but it also allows you to establish remote repositories. This makes it possible to work from different computers and with different users, and to keep a backup remotely. In this sense, git is said to be a distributed version control system, where the loss of a “central” computer or user does not imply the loss of the entire work.

Currently, GitHub (www.github.com) is the most popular hosting service for git repositories. However, institutions can set up their own servers to act as remotes, and there are other similar services such as GitLab (www.gitlab.com, our favorite <3) and Bitbucket (www.bitbucket.com).

Git can be used in any folder on the computer and is independent of the R workflow. It was developed by Linus Torvalds to make it possible to collaborate with the many Linux developers and to work offline (between commits). In this tutorial we will configure the computer so that data analysis projects can take advantage of git and workflows are more organized.

This tutorial is inspired by the R course from Page Piccinini.

Git configuration on computer

First let’s do the git configuration on the computer. To do this, open a terminal window in RStudio.

Identification

Every git command in the terminal starts with git ;) Let’s enter name and email for identification:

First check whether they are already set. Type:

git config --global user.name
git config --global user.email

The first time nothing should appear; if something appears, the value has already been set.

If nothing is returned, or if the command returns an error, run:

git config --global user.name [your name]
git config --global user.email [your github email!]

For example:

git config --global user.name "Andrea Sánchez-Tapia"
git config --global user.email katori@gmail.com

The quotes around the name tell git that the full name, spaces included, is a single value for user.name.

To check, type again:

git config user.name
git config user.email

The values you entered should appear.

At this point git is configured on the computer and it knows who you are.
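The check described above can also be written as a small loop. This is only a sketch: it reads the global settings and reports them, without changing anything. It relies on `git config --get` returning a non-zero exit code when a key is unset, which is why the fallback `|| true` is there.

```shell
# Report whether a global identity is already configured.
# `git config --get` exits with an error when the key is unset,
# so `|| true` keeps the loop going and leaves $value empty.
for key in user.name user.email; do
  value="$(git config --global --get "$key" || true)"
  if [ -n "$value" ]; then
    echo "$key is set to: $value"
  else
    echo "$key is not set yet"
  fi
done
```

If either line says "not set yet", run the corresponding `git config --global` command shown earlier.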

Creating a git and GitHub repository

There are several ways to create a local git repository that can communicate remotely with GitHub, GitLab or Bitbucket. In this case, we already have a local folder, so we just need to initialize git locally, create a remote repository, and add that remote to the local repository.

In other workflows, you may want to create the repository directly on GitHub and clone it to your computer, and only add content later.

In general, read the instructions available on the hosting services :) The GitHub, GitLab and Bitbucket (Atlassian) help pages are very useful.

locally

  1. Always check: git status

this is not yet a git repository:

fatal: not a git repository (or any of the parent directories): .git

  2. In the terminal: git init

Initialized empty Git repository in /Users/andreasancheztapia/Desktop/project_work_area/.git/

  3. See if there are any remotes for this folder: git remote -v

Nothing, right?
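The three local steps above can be tried safely in a scratch folder before touching a real project. The sketch below assumes a temporary directory created with `mktemp -d` as a stand-in for your project folder; the variable name `work_dir` is just illustrative.

```shell
# Scratch folder standing in for your project folder (hypothetical location).
work_dir="$(mktemp -d)"
cd "$work_dir" || exit 1

# Before `git init`, git reports that this is not a repository.
git status 2>/dev/null || echo "not a git repository yet"

# Turn the folder into a repository; this creates the hidden .git directory.
git init --quiet

# No remote has been added yet, so this prints nothing.
git remote -v

# Now `git status` works and reports the current branch.
git status
```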

Let’s create a repository on GitHub and add it as a remote

Always remember to check:

git status

At this point the message in the terminal should be:

On branch master
No commits yet
Untracked files:...

Working on the repo

Let’s make a modification to the README.md, stage the changes (add), commit them (commit) and push them (push).

  1. Edit your README.md in an interesting and meaningful way.

  2. Add your README.md: this means git will start monitoring this file.

git add README.md

Always run git status between steps to understand what is happening.

  3. Let’s commit the file that was added. The commit requires you to write a message explaining why you made the changes:

git commit -m "I made the changes because it felt good"

[master b9cdaf7] I made the changes because it felt good
 1 file changed, 1 insertion(+)
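Here is the edit, add, commit cycle in one go, as a sketch you can run in a scratch repository. The identity values, the README text, and the commit message are placeholders; the identity is set only inside this scratch repository so that the example does not depend on your global settings.

```shell
# Scratch repository so nothing in a real project is touched.
repo_dir="$(mktemp -d)"
cd "$repo_dir" || exit 1
git init --quiet

# Repository-local identity (hypothetical values, valid only in this repo).
git config user.name "Example User"
git config user.email "user@example.com"

# 1. Edit the file.
echo "# My project" > README.md
git status --short    # README.md is listed as untracked (??)

# 2. Add it, so git starts tracking the file.
git add README.md
git status --short    # README.md is now staged (A)

# 3. Commit with a message explaining the change.
git commit --quiet -m "Add a README describing the project"

# The history now contains one commit.
git log --oneline
```

Running git status between the steps, as recommended above, is what shows the file moving from untracked to staged to committed.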

Let’s connect the computer with GitHub

YOU ONLY NEED TO DO THIS ONCE ON EACH COMPUTER

To do this, we generate a security key that identifies the computer and copy it to GitHub.

This key belongs to one individual computer. You can have a single GitHub account and work on different computers; each computer will have its own key.

in RStudio: create RSA key

  1. In the RStudio options (Tools > Global Options, or Preferences on Mac), look for Git/SVN
  2. Check that git is pointing to the git executable: a git.exe file on Windows, or /usr/bin/git on Mac and Linux
  3. If you have never done this, there should be nothing in the RSA key field; click Create RSA Key. If you already have something, go to the next step.
  4. View the RSA key and copy it. This key identifies your computer, and we will paste it into GitHub.com

on GitHub: paste the RSA key

  1. Log in
  2. Go to Settings > SSH and GPG keys and create a new SSH key
  3. In title: your computer name
  4. Paste the key that had been copied.
  5. Add, OK.

At this point GitHub and your computer can communicate :D This key configuration only needs to be done once on each computer. The rest needs to be run every time you create a repository.
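If you prefer the terminal, a rough command-line counterpart of the key creation step is sketched below. It assumes OpenSSH's `ssh-keygen` is installed, and it writes the key pair to a scratch directory so that an existing key in ~/.ssh is never overwritten. The label `my-laptop` and the empty passphrase are choices made for this demonstration only, and this is not necessarily the exact command RStudio runs.

```shell
# Scratch directory so that an existing key in ~/.ssh is never overwritten.
key_dir="$(mktemp -d)"

# -t rsa     key type
# -b 4096    key size in bits
# -N ""      empty passphrase (for this demonstration only)
# -C         free-text label to recognise the key (hypothetical value)
# -f         where to write the private key; the public key gets a .pub suffix
ssh-keygen -q -t rsa -b 4096 -N "" -C "my-laptop" -f "$key_dir/id_rsa"

# The .pub file is the public half: this is the text to paste into GitHub.
cat "$key_dir/id_rsa.pub"
```

Only the public key (the .pub file) is ever pasted into GitHub; the private key stays on the computer. Once a key has been added on GitHub, running `ssh -T git@github.com` is a common way to check that the connection works.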

on GitHub

  1. New repository (green button)

  2. Enter a name, create it as public, with no README because you already have a README locally.

  3. An instruction page will open

We are going to add the remote. Copy the SSH address from the top frame:

git@github.com:AndreaSanchezTapia/blah.git

Go back to the local terminal and add this remote by typing git remote add origin followed by the pasted address (ctrl + v):

git remote add origin git@github.com:AndreaSanchezTapia/blah.git

You just added the remote you created on GitHub

Check if it exists:

git remote -v

The response should be something similar to this:

origin git@github.com:AndreaSanchezTapia/blah.git (fetch)
origin git@github.com:AndreaSanchezTapia/blah.git (push)

At this point you have a remote and a local repository.
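Adding a remote only records an address in the local repository; nothing is contacted over the network at that moment. The sketch below shows this in a scratch repository, using the example address from this tutorial. It also shows `git remote set-url`, which is handy if the address was pasted incorrectly.

```shell
# Scratch repository; adding a remote does not need a network connection.
repo_dir="$(mktemp -d)"
cd "$repo_dir" || exit 1
git init --quiet

# The example address used in this tutorial; replace it with your own.
remote_url="git@github.com:AndreaSanchezTapia/blah.git"

# Register the address under the conventional name "origin".
git remote add origin "$remote_url"

# List the remotes: one line for fetch and one for push.
git remote -v

# If the address was mistyped, it can be corrected without removing the remote.
git remote set-url origin "$remote_url"
```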

  4. Now all you need to do is push the commit

git push -u origin master

The -u sets the “upstream”: it links the local branch to the branch on the remote, so that later git push and git pull commands know where to send and retrieve changes.

The push message should look like this:

Warning: Permanently added the RSA host key for IP address '18.228.52.138' to the list of known hosts.
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Writing objects: 100% (3/3), 313 bytes | 313.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To github.com:AndreaSanchezTapia/blah.git
   af45751..b9cdaf7  master -> master
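To see what push -u does without needing a GitHub account or a network connection, the sketch below uses a local bare repository as a stand-in for the remote. This is an assumption made for the demonstration: in the tutorial the remote is the GitHub address, not a local folder. The identity values and file contents are placeholders, and the branch name is read from git rather than assumed, since the default may be master or main depending on your configuration.

```shell
# A bare repository in a scratch folder plays the role of the remote
# (a stand-in for GitHub, so the example works offline).
remote_dir="$(mktemp -d)"
git init --quiet --bare "$remote_dir"

# Local scratch repository with one commit.
repo_dir="$(mktemp -d)"
cd "$repo_dir" || exit 1
git init --quiet
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"   # hypothetical identity
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"

# Add the stand-in remote and push, setting the upstream with -u.
git remote add origin "$remote_dir"
branch="$(git rev-parse --abbrev-ref HEAD)"   # master or main, depending on your git
git push --quiet -u origin "$branch"

# Show which remote branch the local branch now tracks.
git rev-parse --abbrev-ref "@{u}"
```

Because the upstream is now recorded, later pushes from this branch can be written simply as git push.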


Make one more edit to the README.md and repeat steps 2 to 4: add, commit, push


Let’s go to GitHub: what’s changed?