Git is a powerful version control system that lets you record, access, and restore the history of projects.
Once remotes are set up on the internet or another network, Git also becomes a mighty collaboration tool.
In this workshop, we will use the popular online Git repository hosting site GitHub to practice a collaboration workflow typical of many research teams.
1 - Properly configured Git
These minimum configurations should be set properly:
- your user name
- your email address
- your preferred text editor
- the end of line formatting matching your operating system
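As a rough sketch, the four settings above map onto the git config keys shown below. The values (name, email, nano, input) are placeholders, and the sketch writes them to a throwaway repository so that it can be run without touching your real setup.

```shell
set -e
cd "$(mktemp -d)"                          # scratch directory for this sketch
git init --quiet

git config user.name "Your Name"           # placeholder: your user name
git config user.email "you@example.com"    # placeholder: your email address
git config core.editor "nano"              # placeholder: any editor command you like
git config core.autocrlf input             # common choice on Linux/macOS; on Windows: true

git config --list --local                  # review the settings of this repository
```

For the workshop itself, you would typically add --global right after git config (for instance git config --global user.name "Your Name") so that the settings apply to all your repositories.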
2 - GitHub account
A free GitHub account.
(Optional) If you don't want to type your credentials all the time, set up SSH for your account.
Basic knowledge of Git:
- familiarity with the concept of staging area,
- experience with staging and committing.
Git is a great tool for version control. But how can it be used to collaborate on projects?
To collaborate, you need a syncing hub
Git projects are local and self-contained: except for a few configuration files, everything lives in the root of your project. To work jointly with your collaborators, however, everybody needs access to the project.
Git has a great solution for this: remotes.
What are remotes, really?
Remotes are copies of a project that reside outside of it and are connected to it so that data can be synced back and forth. "Outside" can be anywhere, including on an external drive, or even on the same machine. If you want your remotes to serve as backups, you want them outside your machine. And if you want your remotes to allow for collaboration, you want them on a network your collaborators have access to. One option, of course, is the internet.
A project can have several remotes. An address (or a path if they are local) specifies their location.
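To make this concrete, here is a minimal sketch of a remote that lives on the same machine and is located by a path. The remote name backup and all paths are assumptions made for the illustration; the identity set with git config is a placeholder.

```shell
set -e
work="$(mktemp -d)"                        # scratch area; any writable path works
git init --quiet "$work/myproject"
cd "$work/myproject"
git config user.name "Example User"        # placeholder identity for this sketch
git config user.email "user@example.com"
echo "This is our great project" > README
git add README
git commit --quiet -m "Initial commit: add README"

# A bare repository (history only, no working files) plays the role of the remote.
git init --quiet --bare "$work/backup.git"

# The remote is identified by a name (backup) and located by a path.
git remote add backup "$work/backup.git"
git push --quiet backup HEAD               # copy the current branch to the remote

git remote -v                              # lists backup with its fetch and push paths
```

A remote on GitHub works the same way, except that its location is an SSH or HTTPS address instead of a path.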
Collaborating on projects through Git and GitHub
There are multiple scenarios:
- You created a project on your machine, put it under version control, and now you want others to contribute to it.
- You want to contribute to a project started by others.
  ◦ You do not have write access to the GitHub repository.
  ◦ You were granted write access to it.
Let's go over these scenarios.
You create a project and want others to contribute to it
Let's quickly create a project:
$ cd /location/of/new/project
$ mkdir myproject
$ cd myproject
$ echo "This is our great project" > README
This is the content of our project:
$ ls -a
. .. README
Then, let's put it under version control with Git:
$ git init
You can see that this is now a Git repository:
$ ls -a
. .. .git README
Let's create a first commit:
$ git add README
$ git commit -m "Initial commit: add README"
Now, you need to create a remote on GitHub.
First, you need to create a new GitHub repository.
Creating an empty repository on GitHub
From there, select the repositories tab, then click the green button to create a new repository.
Enter the name you want for your repo, without spaces. It can be the same name you have for your project on your computer (it would be sensible and make things less confusing), but it doesn't have to be.
You can make your repository public or private. Choose the private option if your research contains sensitive data or you do not want to share your project with the world. If you want to develop open source projects, of course, you want to make them public.
You now have an empty repository on GitHub, but it is not yet connected to your local repository.
Adding the new GitHub repo as a remote
Click on the green drop-down button, select SSH (if you have set up SSH for your GitHub account) or HTTPS (if you haven't), and copy the address.
Then go back to your command line, cd into your project if you aren't already there, and add your remote.
You add a remote with:
git remote add <remote-name> <remote-address>
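<remote-name> is a name of your choosing that identifies the remote. Because Git automatically names the remote origin when you clone a repo, it is common practice to use origin as the name for the first remote.
<remote-address> is the address of your remote, in the HTTPS form or, if you have set up SSH for your GitHub account, in the SSH form.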
Example (using an SSH address):
git remote add origin git@github.com:<user>/<repo>.git
In our case:
$ git remote add origin git@github.com:<user>/myproject.git
Example (using an HTTPS address):
git remote add origin https://github.com/<user>/<repo>.git
In our case:
$ git remote add origin https://github.com/<user>/myproject.git
(In practice: type git remote add origin, then paste the address you have just copied from GitHub.)
Finally, if you want to grant your collaborators write access to the project, you need to add them to it. You don't have to give them write access: we will see later how one can contribute to a project without it. But if you are involved in a serious collaboration with others on a project, letting them edit the project directly makes the process easier.
Inviting collaborators to a GitHub repo
- Go to your GitHub project page
- Click on the Settings tab
- Click on the collaborators section on the left-hand side (you will be prompted for your GitHub password)
- Click on the green button to add people
- Invite your collaborators using their GitHub user name, their email address, or their full name
Getting information on remotes
To list remotes, run:
git remote
To list the remotes with their addresses:
git remote -v
You can see that your local project now has a remote called origin and that it has the address of your GitHub repo.
To get yet more information about a particular remote, you can run:
git remote show <remote-name>
For instance, to inspect your new remote, run:
git remote show origin
You rename a remote with:
git remote rename <old-remote-name> <new-remote-name>
And you delete a remote with:
git remote remove <remote-name>
You can change the URL of the remote with:
git remote set-url <remote-name> <new-url> [<old-url>]
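The commands above can be tried safely on a throwaway repository. In this sketch, the two bare repositories stand in for remote locations, and the names hub and scratch are arbitrary choices made for the illustration.

```shell
set -e
work="$(mktemp -d)"
git init --quiet --bare "$work/first.git"    # two stand-in remote locations
git init --quiet --bare "$work/second.git"
git init --quiet "$work/myproject"
cd "$work/myproject"

git remote add origin "$work/first.git"
git remote add scratch "$work/second.git"
git remote                                   # lists the remote names: origin, scratch
git remote -v                                # same, with fetch and push addresses

git remote rename origin hub                 # origin is now called hub
git remote set-url hub "$work/second.git"    # hub now points at the second location
git remote remove scratch                    # drop the remote we no longer need

git remote -v                                # only hub remains
```

None of these commands touch your commits or files: they only edit the list of remotes stored in the repository's configuration.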
Working with remotes
Downloading data from the remote
If you collaborate on your project through the GitHub remote, you will have to download data added by your teammates to keep your local project up to date.
To download new data from the remote, you have 2 options: git fetch and git pull.
Fetching downloads the data from your remote that you don't already have in your local version of the project.
git fetch <remote-name>
The branches on the remote are now accessible locally as <remote-name>/<branch>. You can inspect them or you can merge them into your local branches.
Example: To fetch from your new GitHub remote, you would run:
git fetch origin
Pulling does 2 things: it fetches the data (as we just saw) and then merges the changes into your current local branch.
git pull <remote-name> <branch>
Example: To pull the master branch from your GitHub remote, you would run:
git pull origin master
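The two download options can be compared side by side on a small simulation. This is only a sketch: a bare repository on the local disk stands in for GitHub, a second clone plays the teammate, and the helper teammate_adds, the file names, and the identity are all hypothetical.

```shell
set -e
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Stand-in for the GitHub repo, with master as its default branch.
git init --quiet --bare "$work/hub.git"
git -C "$work/hub.git" symbolic-ref HEAD refs/heads/master

# You: create the project and publish it.
git init --quiet "$work/you"
cd "$work/you"
echo "This is our great project" > README
git add README
git commit --quiet -m "Initial commit: add README"
git branch -M master                         # make sure the branch is called master
git remote add origin "$work/hub.git"
git push --quiet origin master

# Your teammate gets their own copy.
git clone --quiet "$work/hub.git" "$work/teammate"

# Hypothetical helper: the teammate adds a file and pushes it.
teammate_adds() {
  ( cd "$work/teammate"
    echo "content" > "$1"
    git add "$1"
    git commit --quiet -m "Add $1"
    git push --quiet origin master )
}

cd "$work/you"

# Option 1: fetch, inspect, then merge yourself.
teammate_adds notes.txt
git fetch --quiet origin
git log --oneline master..origin/master      # commits on the remote that you lack
git merge --quiet origin/master

# Option 2: pull, which fetches and merges in one step.
teammate_adds data.txt
git pull --quiet origin master
```

Fetching first is handy when you want to look at incoming changes before they reach your branch; pulling is the shortcut for when you don't.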
If your branch is already tracking a remote branch (see below), then you simply need to run:
git pull
Now, how do you upload data to the remote?
Pushing to a remote
Uploading data to the remote is called pushing and is done with:
git push <remote-name> <branch-name>
To push your branch master to the remote origin:
git push origin master
You can also set the upstream branch that your local branch tracks with the -u flag:
git push -u origin master
From now on, all you have to run when you are on master is git push: Git knows that your local master branch tracks the upstream branch origin/master.
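Here is a small sketch of pushing with and without an upstream, assuming a bare repository on the local disk as a stand-in for the GitHub remote. Paths and identity are placeholders.

```shell
set -e
work="$(mktemp -d)"
git init --quiet --bare "$work/hub.git"      # stand-in for the GitHub repo
git init --quiet "$work/myproject"
cd "$work/myproject"
git config user.name "Example User"          # placeholder identity for this sketch
git config user.email "user@example.com"
echo "This is our great project" > README
git add README
git commit --quiet -m "Initial commit: add README"
git branch -M master                         # make sure the branch is called master
git remote add origin "$work/hub.git"

# First push: name the remote and the branch, and record the upstream with -u.
git push --quiet -u origin master
git rev-parse --abbrev-ref 'master@{upstream}'   # prints: origin/master

# Later pushes from master need no arguments.
echo "More details" >> README
git commit --quiet -am "Expand README"
git push --quiet
```

The same upstream setting is what lets a plain git pull work on that branch.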
You want to contribute to a project created by someone else
When you want to contribute to someone's project, you are in one of two scenarios: either you have write access to the project or you don't.
Read access only
If you do not have write access to the remote, you cannot push to it and you need to submit a pull request (PR).
Let's try it using this project.
Here is how to set things up:
- Fork the project.
- Clone your fork on your machine (this will automatically set your fork as a remote to your new local project and that remote is automatically called origin).
- Add a second remote, this one pointing to the initial project. Usually, people call that remote upstream.
Fork the repo
First, go to GitHub and fork the project by clicking on the Fork button in the top right corner.
Clone your fork
Then, navigate to the directory in which you want to clone the project and clone your fork:
$ cd /location/of/new/project
There are 2 ways to clone a project. If you have set up SSH for your account, the command is:
git clone git@github.com:<user>/<repo>.git
In our case:
$ git clone git@github.com:<user>/git_practice.git
If you haven't set up SSH for your account, use the HTTPS address and enter your GitHub user name and, in place of a password, a personal access token when prompted (GitHub no longer accepts account passwords for Git operations over HTTPS). The general command looks like this:
git clone https://github.com/<user>/<repo>.git
In our case:
$ git clone https://github.com/<user>/git_practice.git
Note that, if you want to give your copy of the project a different name, you can clone it with either of:
git clone git@github.com:<user>/<repo>.git <name>
git clone https://github.com/<user>/<repo>.git <name>
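To see what cloning sets up for you, the sketch below clones a small local repository under a different directory name. The local bare repository stands in for a fork on GitHub, and my_copy is a hypothetical name.

```shell
set -e
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Build a small repository to clone from (stand-in for a fork on GitHub).
git init --quiet "$work/seed"
cd "$work/seed"
echo "practice" > README
git add README
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$work/seed" "$work/git_practice.git"

# Clone it under a different directory name.
cd "$work"
git clone --quiet "$work/git_practice.git" my_copy
cd my_copy

git remote -v      # origin is already set to the location you cloned from
```

The name you give only affects the local directory: the remote is still called origin.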
Add the initial project as upstream
We already saw how to add a remote:
$ git remote add upstream git@github.com:razoumov/git_practice.git
- Make sure to use 'upstream' and not 'origin' for the name of this remote
- Make sure to use the address of the initial project and not your fork
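The resulting two-remote layout can be rehearsed offline. In this sketch, forking is only imitated by a bare copy of a local repository, and all paths and the identity are placeholders; on GitHub, the fork is created with the button on the project page.

```shell
set -e
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Stand-in for the initial project (on GitHub: someone else's repository).
git init --quiet "$work/seed"
cd "$work/seed"
echo "practice" > README
git add README
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$work/seed" "$work/original.git"

# Forking is imitated here by a bare copy of the initial project.
git clone --quiet --bare "$work/original.git" "$work/fork.git"

# Clone your fork: the remote origin is created automatically.
git clone --quiet "$work/fork.git" "$work/git_practice"
cd "$work/git_practice"

# Add the initial project as a second remote called upstream.
git remote add upstream "$work/original.git"

git remote -v      # origin -> your fork, upstream -> the initial project
```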
From there on, you can:
- Pull from upstream (the repo to which you do not have write access and to which you want to contribute). This allows you to keep your fork up-to-date.
- Push to and pull from origin (this is your fork, to which you have read and write access).
You are now ready to submit pull requests.
Here is the workflow:
- Pull from upstream to make sure that your contributions are made on an up-to-date version of the project
- Create and check out a new branch
- Make and commit your changes on that branch
- Push that branch to your fork (i.e. origin; remember that you do not have write access on upstream)
- Go to the original project's GitHub page and open a pull request from your fork. Note that after you have pushed your branch to origin, GitHub will automatically offer to open one.
The maintainer of the original project may accept or decline the PR. They may also make comments and ask you to make changes. If so, make new changes and push additional commits to that branch.
Once the PR is merged by the maintainer, you can delete the branch on your fork and pull from upstream to update your local copy with the recently accepted changes.
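The command-line part of this workflow can be sketched on local stand-ins. Everything here is an assumption made for the rehearsal: the bare repositories imitate the original project and your fork, fix-readme is a hypothetical branch name, and the maintainer's merge is faked with a direct push (something you could not do without write access). Opening the pull request itself happens on GitHub and is not scripted.

```shell
set -e
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

# Setup: original project, fork, and your clone with both remotes.
git init --quiet "$work/seed"
cd "$work/seed"
echo "practice" > README
git add README
git commit --quiet -m "Initial commit"
git clone --quiet --bare "$work/seed" "$work/original.git"
git clone --quiet --bare "$work/original.git" "$work/fork.git"
git clone --quiet "$work/fork.git" "$work/git_practice"
cd "$work/git_practice"
git remote add upstream "$work/original.git"
branch="$(git symbolic-ref --short HEAD)"    # name of the main branch in this clone

# 1. Start from an up-to-date version of the project.
git pull --quiet upstream "$branch"
# 2. Create and check out a new branch.
git checkout --quiet -b fix-readme
# 3. Make and commit your changes on that branch.
echo "A clarification" >> README
git commit --quiet -am "Clarify README"
# 4. Push that branch to your fork.
git push --quiet origin fix-readme
# 5. Open the pull request on GitHub (not scripted here).

# Imitation of the maintainer merging the PR into the original project.
git push --quiet "$work/original.git" "fix-readme:$branch"

# After the merge: delete the branch on your fork and update your local copy.
git push --quiet origin --delete fix-readme
git checkout --quiet "$branch"
git pull --quiet upstream "$branch"
git branch --quiet -d fix-readme
```

Working on a dedicated branch keeps your main branch identical to upstream, which is what makes the final pull a simple update.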
Let's try this with our new project.
Write access
If you have write access to the project, you can clone the project directly, without forking it, and push changes to it.
This is a very simple setup: the copy on GitHub is the central copy—the one allowing various team members to work jointly on the same project. You now have a copy of it (as well as its entire history) on your machine and you push and pull freely to/from it. Your collaborators have their own copies on their own machines and can also freely push/pull to the same remote.