GitHub Tutorial

The following walks you through:

  1. Creating a new account.
  2. Creating a new repository.
  3. Downloading a repository from GitHub: clone.
  4. Saving snapshots and uploading changes: commit and push.
  5. Updating with changes from GitHub: pull and merge.
  6. Undoing private mistakes.
  7. Undoing public mistakes.
  8. More commands.
  9. Troubleshooting.

Creating a new account

  1. First, go to github.com.
  2. Next, look for a form that looks like the following:
    github login form
  3. Choose a username, enter your email address, and enter a password. Be sure to remember your password!
    github login form, filled
  4. Click the “Sign up for GitHub” button.
  5. You will be asked whether you want the free or paid account. All of your work in this course needs to be private, but Williams College will supply you with private repositories for this course. I recommend that you select the “Unlimited public repositories for free” option.
    github repositories
  6. I suggest that you leave the “Help me set up an organization” and “Send me updates” blank.
    github options
  7. Click the “Continue” button.
  8. Unless you want to fill out the survey on the next page, click the “skip this step” button.
  9. Check your email. You should have an email asking you to verify your account. It is important that you click the “Verify email address” link in order for your account to work correctly.
    github verification email

You now have a GitHub account. If you are following along for CS334, be sure to write your GitHub username on your homework. For example, for this account, I would write shazzameter.

Creating a new repository

In GitHub lingo, a source code project is referred to as a “repository”. This section shows you how to create a new repository.

Note that if you are a student in CS334, a private repository will be automatically assigned to you for homework submissions.

  1. First, look for a plus button at the top right of your GitHub web browser window. Click on the plus and select “New repository”.
    github new repo action
  2. Do you see a message that asks you to please verify your email address? If so, you probably got really excited and missed a step when creating your account. Make sure to follow GitHub’s instructions to verify your email address. If you don’t see this, good; go to the next step.
    github you really do need to verify
  3. Enter the name of the repository you want to create. Remember, this is the name of your project, and by default it is public, so you may want to choose a name that is not embarrassing.
    github repository name
  4. If the name is available, you will see a green check mark next to the repository name.
    github repository name not taken
  5. Choose any of the other options that interest you, provide a description, and click the “Create repository” button. I recommend selecting the “Initialize this repository with a README” option. If you know which programming language you plan to use, select the language from the “Add .gitignore” dropdown; this will instruct git to ignore files you probably don’t want to commit (e.g., Java .class files).
  6. If you chose to add a README, you should see something like the following. If you didn’t, you will see something different. Don’t worry! You still have a repository, but it will be empty.
    github empty repo

Downloading a repository from GitHub: clone

The rest of this tutorial will need to be performed on the command line.

GitHub is a service that “hosts” your repository, meaning that they store your files on their servers for you. However, the tool you will use to interact with your repository is a different tool, called git.

(For an interesting history about why git was created, read this.)

We will assume for the purposes of this tutorial that you already have git installed on your computer. If you are using one of the UNIX lab machines, you already have git. If not, I recommend using either Homebrew (macOS) or Cygwin (Windows) to install git.

Before you can work with a git repository, you must first download a copy from your git host, in this case GitHub. This operation is called cloning.

  1. Every git repository has an address. You can find the git address of your GitHub repository by looking for a button labeled “Clone or download”.
    github clone button
  2. For now, we will select the default clone method, “Clone with HTTPS”. Just a warning: this method requires that you enter your GitHub username and password frequently. When you eventually tire of this, I suggest using SSH instead. For now, read on.
    github clone method https
  3. Click the little clipboard button. This copies the address into your clipboard.
    github clone clipboard icon
  4. Now, open your command line application.
    1. On lab UNIX machines, look for the Terminal application.
    2. On macOS, look for the Terminal application.
    3. In Cygwin on Windows, look for the Cygwin application.
  5. Navigate to the place you want to put your repository, e.g., /Users/dbarowy/Documents.
   MyComputer:~ dbarowy$ cd ~/Documents
   MyComputer:Documents dbarowy$

(note that your prompts may look a little different than mine)

  6. Now, we use git to clone the repository. Type git clone <repository address> and press <ENTER>, e.g.,
   MyComputer:Documents dbarowy$ git clone https://github.com/shazzameter/hula-hoop-o-matic.git
   Cloning into 'hula-hoop-o-matic'...
   remote: Counting objects: 4, done.
   remote: Compressing objects: 100% (3/3), done.
   remote: Total 4 (delta 0), reused 0 (delta 0), pack-reused 0
   Unpacking objects: 100% (4/4), done.
   MyComputer:Documents dbarowy$ 

  7. You’ve now made a local copy of your git repository. cd into it to see your files. If you ls your files, you should see that they are the same as those listed on the website.
   MyComputer:Documents dbarowy$ cd hula-hoop-o-matic/
   MyComputer:hula-hoop-o-matic dbarowy$ ls -al
   total 16
   drwxr-xr-x   5 dbarowy  staff  160 Jan 31 17:32 .
   drwx------+ 22 dbarowy  staff  704 Jan 31 17:32 ..
   drwxr-xr-x  12 dbarowy  staff  384 Jan 31 17:32 .git
   -rw-r--r--   1 dbarowy  staff  272 Jan 31 17:32 .gitignore
   -rw-r--r--   1 dbarowy  staff   40 Jan 31 17:32 README.md
   MyComputer:hula-hoop-o-matic dbarowy$ 

Now you’re ready to work with your repository.
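If you would like to practice cloning without touching GitHub, the steps above can be sketched offline. This is a minimal sketch, assuming git is installed: a throwaway bare repository in a temporary directory stands in for GitHub, and the directory name my-copy is a placeholder of my own. The last part shows one way to point an existing clone at the SSH form of the example address used in this tutorial, for when you tire of HTTPS.

```shell
# Sketch only: a throwaway bare repository stands in for GitHub.
work=$(mktemp -d)
git init -q --bare "$work/hula-hoop-o-matic.git"

# Clone it just as you would clone an HTTPS address.
# (git warns that the repository is empty; that is expected here.)
cd "$work"
git clone -q "$work/hula-hoop-o-matic.git" my-copy
cd my-copy

# Every clone remembers where it came from under the remote name "origin".
git remote -v

# To switch to SSH later, point "origin" at the SSH form of the address.
git remote set-url origin git@github.com:shazzameter/hula-hoop-o-matic.git
origin_url=$(git remote get-url origin)
echo "origin is now $origin_url"
```

Nothing is contacted over the network here; set-url only records the new address, which git uses the next time you push or pull.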

Saving snapshots and uploading changes: commit and push

git is what is referred to as a “distributed version control system”. This means that every person who works with a given repository (e.g., hula-hoop-o-matic) gets their very own copy. You obtain your copy by cloning as described above. Everything you do to your clone of the repository is private until you share your changes.

Before we discuss sharing, let’s talk about what a version control repository does. The purpose of version control is to save “snapshots” of your code. Each snapshot, called a commit in git lingo, is a complete saved copy of your code along with comments about that code and a timestamp. New versions are related to old versions. In fact, the “version history” of your code is a directed graph of these snapshots (commits).

directed graph of commits

You’ve probably made a complete copy of your code before so that you could undo whatever possibly-ill-advised code experiment you’ve wanted to try. With git, you don’t need to do any of that. Instead, you can “revert” your code back to any previous commit without having to worry about saving a separate copy. git does the hard work for you.

reverted commit

All you need to remember to do in order to use this handy facility is to commit your changes periodically. Most developers commit their changes anytime they make an important change. Let’s try committing.

  1. First, make sure the working directory of your shell is inside your repository (e.g., run pwd) and then create a file. Run touch testfile.txt at the command line. If you feel like adding text to your file, fire up emacs and add some text to it.
  2. git needs to be informed which files you plan to commit. It does not track files automatically.
   MyComputer:hula-hoop-o-matic dbarowy$ git add testfile.txt

  3. Once you’ve added a file, git will save a snapshot of that file the next time you commit. You can ask git at any time which files will be updated by using the git status command:
   MyComputer:hula-hoop-o-matic dbarowy$ git status
   On branch master
   Your branch is up-to-date with 'origin/master'.
   
   Changes to be committed:
	   (use "git reset HEAD <file>..." to unstage)
   
	       new file:   testfile.txt
   
   MyComputer:hula-hoop-o-matic dbarowy$

Here you can see that git marked testfile.txt as a new file.

  4. Let’s commit our changes.
   MyComputer:hula-hoop-o-matic dbarowy$ git commit -m "my first commit"
   [master 5b796b6] my first commit
    1 file changed, 0 insertions(+), 0 deletions(-)
	create mode 100644 testfile.txt
   MyComputer:hula-hoop-o-matic dbarowy$

Note that the -m "my first commit" part of the command specifies a commit message. If you do not supply a commit message, git will open up an editor for you. The default editor for most UNIX machines is vi, and given that we encourage you to learn emacs in this department, vi will likely seem quite foreign to you. I recommend supplying a commit message to circumvent this problem; you may also change your default editor by changing the EDITOR environment variable. If you do find yourself in a vi session, remember: to quit without saving, press ESC then type :q!; to save and then quit, press ESC then type :wq.
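Besides the EDITOR environment variable, git has its own core.editor setting. The following is a minimal sketch, assuming emacs is the editor you want; it uses a throwaway repository so that none of your real settings change, and "emacs -nw" is just an example value.

```shell
# Sketch only: work in a throwaway repository so no global settings change.
unset GIT_EDITOR   # an exported GIT_EDITOR would override the setting below
repo=$(mktemp -d)
cd "$repo"
git init -q

# Tell git (for this repository only) to open emacs for commit messages.
git config core.editor "emacs -nw"

# Ask git which editor it would launch.
chosen_editor=$(git var GIT_EDITOR)
echo "$chosen_editor"
```

Adding --global to the git config command applies the setting to all of your repositories instead of just this one.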

  5. Now if we run git status again, we should see that all changes have been committed.
   MyComputer:hula-hoop-o-matic dbarowy$ git status
   On branch master
   Your branch is ahead of 'origin/master' by 1 commit.
	   (use "git push" to publish your local commits)
	   
   nothing to commit, working tree clean
   MyComputer:hula-hoop-o-matic dbarowy$

  6. We can also view our commit history by typing git log.
   MyComputer:hula-hoop-o-matic dbarowy$ git log
   commit 5b796b6d901b0b77b22e55afd7c35bd94a598a77 (HEAD -> master)
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 10:16:46 2018 -0500
   
	   my first commit
   
   commit 72a90420a759b172e4f8aebaf1d259e58e913cb4 (origin/master, origin/HEAD)
   Author: shazzameter <shazz@barowy.net>
   Date:   Wed Jan 31 16:52:53 2018 -0500
   
	   Initial commit
   MyComputer:hula-hoop-o-matic dbarowy$

The history shows two commits, my first commit, which we just did, and Initial commit. Wait, didn’t I only just commit once? What gives? Remember that we cloned an existing repository from GitHub. It turns out that when we initially created the repository, the first thing that GitHub did was to add a README.md file and then commit it. When we cloned the repository, we inherited all of the commits up to that point. Therefore, we see two commits.

  7. At this point, we are 1 commit ahead of the repository on GitHub. Remember the git status command we ran earlier? Notice that it said Your branch is ahead of 'origin/master' by 1 commit. Pictorially,
    the local branch, on MyComputer:
    github local branch

    vs the remote branch, on GitHub:
    github remote branch
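  8. In order to share our changes back to GitHub, we need to push the repository. Pushing will make the remote repository look like the local repository. We push using the git push command. Note that, at this point, git will prompt you for your GitHub username and password. (GitHub no longer accepts account passwords for git operations over HTTPS; enter a personal access token at the password prompt instead.)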
   MyComputer:hula-hoop-o-matic dbarowy$ git push
   Username for 'https://github.com': shazzameter
   Password for 'https://shazzameter@github.com': 
   Counting objects: 3, done.
   Delta compression using up to 4 threads.
   Compressing objects: 100% (2/2), done.
   Writing objects: 100% (3/3), 322 bytes | 322.00 KiB/s, done.
   Total 3 (delta 0), reused 0 (delta 0)
   To https://github.com/shazzameter/hula-hoop-o-matic.git
	   72a9042..5b796b6  master -> master
   MyComputer:hula-hoop-o-matic dbarowy$

  9. If we now visit our GitHub repository using our web browser, we can see that testfile.txt now appears.
    changes appear in github
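The whole add, commit, and push cycle above can be rehearsed offline. This is a minimal sketch under stated assumptions: a local bare repository plays the role of GitHub, and the user name and email address are placeholders. It ends by checking that the "remote" holds the same commit as the local branch, which is what a successful push means.

```shell
# Sketch only: a local bare repository plays the role of GitHub.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
# (git warns that the cloned repository is empty; that is expected here.)
git clone -q "$work/remote.git" "$work/local"
cd "$work/local"

# Identify the committer for this repository only (placeholder values).
git config user.name "Example User"
git config user.email "user@example.com"

# Create a file, stage it, and commit it.
touch testfile.txt
git add testfile.txt
git commit -q -m "my first commit"

# Publish the commit. Naming the branch explicitly works whether the
# default branch is called master or main.
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch"

# The remote now holds the same commit as the local branch.
local_id=$(git rev-parse HEAD)
remote_id=$(git --git-dir="$work/remote.git" rev-parse "$branch")
echo "local:  $local_id"
echo "remote: $remote_id"
```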

Updating with changes from GitHub: pull and merge

When two or more collaborators use git, since each collaborator maintains their own separate copy, a fundamental problem arises: merge conflicts. To illustrate, imagine the GitHub repository, User A, and User B all have git repositories in the following states:

GitHub:

github remote branch

User A:

github local branch for User A

User B:

github local branch for User B

No matter whether User A or User B pushes to GitHub first, an important thing will happen: the other person will not be able to push because that user’s local history will not agree with the history on GitHub. For example, suppose that User B pushes first. GitHub’s history will look like:

github local branch for User B

while User A’s history will look like:

github local branch for User A

User A’s changes cannot be mechanically incorporated into the changes in the GitHub repository because the two histories diverge: User A’s “snazzy design” is based on “my first commit”, but User B’s “new layout” is also based on “my first commit”. When git encounters this situation, you will see the following message when trying to git push:

MyComputer:hula-hoop-o-matic dbarowy$ git push
To https://github.com/shazzameter/hula-hoop-o-matic.git
 ! [rejected]        master -> master (fetch first)
error: failed to push some refs to 'https://github.com/shazzameter/hula-hoop-o-matic.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
MyComputer:hula-hoop-o-matic dbarowy$

It is tempting to panic at this point. But you shouldn’t, because this is an everyday occurrence when using version control. What is the message telling us? Essentially, that “Updates were rejected because the remote contains work that you do not have locally.” It even suggests what you should do: “integrate the remote changes (e.g., 'git pull ...') before pushing again.”

The solution is for User A to pull changes from GitHub to their own repository, merge those changes, commit, and then push. Let’s go through each of those steps.

  1. First, git pull the changes from GitHub:
   MyComputer:hula-hoop-o-matic dbarowy$ git pull
   remote: Counting objects: 3, done.
   remote: Compressing objects: 100% (3/3), done.
   remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
   Unpacking objects: 100% (3/3), done.
   From https://github.com/shazzameter/hula-hoop-o-matic
	   5b796b6..457aa8b  master     -> origin/master
   Auto-merging README.md
   CONFLICT (content): Merge conflict in README.md
   Automatic merge failed; fix conflicts and then commit the result.
   MyComputer:hula-hoop-o-matic dbarowy$

git seems to be saying scary things here because of the word CONFLICT in angry capitals, but it’s just trying to be helpful, reminding you that you will need to manually incorporate changes.

  2. We can see which files git thinks need our attention, referred to as conflicts, by running git status.
   MyComputer:hula-hoop-o-matic dbarowy$ git status
   On branch master
   Your branch and 'origin/master' have diverged,
   and have 1 and 1 different commits each, respectively.
	   (use "git pull" to merge the remote branch into yours)
   
   You have unmerged paths.
	   (fix conflicts and run "git commit")
	   (use "git merge --abort" to abort the merge)
   
   Unmerged paths:
	   (use "git add <file>..." to mark resolution)
	   
		   both modified:   README.md
		   
   no changes added to commit (use "git add" and/or "git commit -a")
   MyComputer:hula-hoop-o-matic dbarowy$

  3. git says that we both modified the file called README.md. Let’s open that file up in our editor and take a look.
   #hula-hoop-o-matic
   You know, for kids.
   
   <<<<<<< HEAD
   Snazzy design.
   =======
   And stuff.
   >>>>>>> 457aa8b909bd15a77e55cda20d8c34a317c839c6

Aha. Our local repository, referred to as HEAD here, conflicts with the commit (helpfully?) called 457aa8b909bd15a77e55cda20d8c34a317c839c6 on GitHub. The difference is that User A wrote Snazzy design whereas User B wrote And stuff. The conflicting region is delimited by the <<<<<<<, =======, and >>>>>>> markers. We need to manually merge the two regions and remove these markers before git allows us to proceed.

  4. I’ll change the file to the following:
   #hula-hoop-o-matic
   You know, for kids.
   
   Snazzy design and stuff.

I have “manually merged” the two sets of changes.

  5. Now, I can git add the resolved file and git commit those changes.
   MyComputer:hula-hoop-o-matic dbarowy$ git add README.md 
   MyComputer:hula-hoop-o-matic dbarowy$ git commit -m "merge"
   [master 15b90e0] merge

  6. Finally, I am now allowed to git push.
   MyComputer:hula-hoop-o-matic dbarowy$ git push
   Counting objects: 6, done.
   Delta compression using up to 4 threads.
   Compressing objects: 100% (6/6), done.
   Writing objects: 100% (6/6), 765 bytes | 765.00 KiB/s, done.
   Total 6 (delta 0), reused 0 (delta 0)
   To https://github.com/shazzameter/hula-hoop-o-matic.git
	   457aa8b..15b90e0  master -> master
   MyComputer:hula-hoop-o-matic dbarowy$

  7. git log shows the current history,
   MyComputer:hula-hoop-o-matic dbarowy$ git log
   commit 15b90e01ab92c80b501ef3a235d20d36156dd17d (HEAD -> master, origin/master, origin/HEAD)
   Merge: 9c05f64 457aa8b
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 11:47:04 2018 -0500
   
	   merge
	   
   commit 9c05f6451c78817160e20b3cdd91b454298b4436
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 11:30:02 2018 -0500
   
	   snazzy design
	   
   commit 457aa8b909bd15a77e55cda20d8c34a317c839c6
   Author: hoopsucker <hoop@barowy.net>
   Date:   Thu Feb 1 11:29:26 2018 -0500
   
	   new layout
	   
   commit 5b796b6d901b0b77b22e55afd7c35bd94a598a77
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 10:16:46 2018 -0500
   
	   my first commit
	   
   commit 72a90420a759b172e4f8aebaf1d259e58e913cb4
   Author: shazzameter <shazz@barowy.net>
   Date:   Wed Jan 31 16:52:53 2018 -0500
   
	   Initial commit
   MyComputer:hula-hoop-o-matic dbarowy$

which is the same as

merged git history
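The entire conflict scenario can be reproduced safely on one machine. This is a minimal sketch under stated assumptions: a local bare repository plays GitHub, two clones play User A and User B, and the names and email addresses are placeholders. One difference from the transcript above: recent versions of git ask you to choose how to reconcile diverged branches, so the sketch passes --no-rebase to git pull, which selects the merge behavior described in this section.

```shell
# Sketch only: a local bare repository plays GitHub; two clones play the users.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# User A creates the shared starting point and pushes it.
git clone -q "$work/remote.git" "$work/userA"
cd "$work/userA"
git config user.name "User A"
git config user.email "a@example.com"
printf 'You know, for kids.\n' > README.md
git add README.md
git commit -q -m "my first commit"
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch"

# User B clones, edits README.md, and pushes first.
git clone -q "$work/remote.git" "$work/userB"
cd "$work/userB"
git config user.name "User B"
git config user.email "b@example.com"
printf 'And stuff.\n' >> README.md
git commit -q -a -m "new layout"
git push -q origin "$branch"

# User A edits the same region, commits, and is rejected on push.
cd "$work/userA"
printf 'Snazzy design.\n' >> README.md
git commit -q -a -m "snazzy design"
if git push -q origin "$branch" 2>/dev/null; then
  push_result="accepted"
else
  push_result="rejected"
fi
echo "first push: $push_result"

# Pull with an explicit merge; the overlapping edits produce a conflict.
git pull -q --no-rebase origin "$branch" || true
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Resolve by hand (here: overwrite with the merged text), then commit and push.
printf 'You know, for kids.\nSnazzy design and stuff.\n' > README.md
git add README.md
git commit -q -m "merge"
git push -q origin "$branch"

# The history is now a graph whose newest commit has two parents.
git log --graph --oneline
parents=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
```

The final git log --graph --oneline draws the same kind of picture as the diagram above: two lines of history that split after “my first commit” and rejoin at “merge”.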

Undoing private mistakes

Occasionally you want to revert a repository back to a state that it was in earlier. This is the specific scenario I described earlier, where you want to undo a change. Suppose you have a repository with the following history

directed graph of commits

and you want to undo the commit called “experiment” so that it looks like

reverted commit

This process is called reverting a repository. You need to be aware that there is a small wrinkle, depending on whether you’ve pushed the “experiment” commit to GitHub or not. If you have not pushed it, reverting a repository is trivial. If you have pushed it, see Undoing public mistakes instead.

Let’s assume that you have not pushed your “experiment” commit to GitHub. In other words, only your local repository knows about the “experiment” commit.

  1. First, determine how many commits you want to undo by running git log.

    MyComputer:hula-hoop-o-matic dbarowy$ git log
    commit 1d8fbf3cfd54a40745afe4b7f032625e1d8fbf3c
    Author: shazzameter <shazz@barowy.net>
    Date:   Thu Feb 1 11:30:02 2018 -0500
    
    	experiment
    
    commit 417249d510519449393c55756200f127417249d5
    Author: shazzameter <shazz@barowy.net>
    Date:   Thu Feb 1 11:29:26 2018 -0500
    
    	added button
    
    commit 4d3d7e62db2123ac94c2a144d084e9a14d3d7e62
    Author: shazzameter <shazz@barowy.net>
    Date:   Thu Feb 1 10:16:46 2018 -0500
    
    	added form
    
    commit c9756fbd2778520c337fe3a7b7ce35aec9756fbd
    Author: shazzameter <shazz@barowy.net>
    Date:   Wed Jan 31 16:52:53 2018 -0500
    
    	new project
    MyComputer:hula-hoop-o-matic dbarowy$
    
  2. We run git reset --soft HEAD~n where n is the number of commits to revert. In this case, we only want to undo 1 commit.

   MyComputer:hula-hoop-o-matic dbarowy$ git reset --soft HEAD~1

Suppose that what we did in the “experiment” commit was to add a file called foo.txt. When you check git status, you will see that the previously committed changes are now shown as pending changes.

MyComputer:hula-hoop-o-matic dbarowy$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

Changes to be committed:
   (use "git reset HEAD <file>..." to unstage)

	 new file:   foo.txt

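which means that foo.txt is staged but no longer committed. If you want git to forget about foo.txt entirely (note that the following command also deletes the file from your working directory), run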

MyComputer:hula-hoop-o-matic dbarowy$ git rm -f foo.txt

Now git status won’t know anything about foo.txt at all.

There are more concise commands for uncommitting changes (e.g., git reset --hard) but I do not show them here because it is easier to make unrecoverable mistakes when using them.
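Here is a minimal sketch of the soft reset in a throwaway repository, assuming placeholder names for the user and files. Instead of git rm -f, it finishes with the unstage command that git status itself suggests (git reset HEAD <file>), which makes git forget the file while leaving it on disk.

```shell
# Sketch only: a throwaway repository with two commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

touch README.md
git add README.md
git commit -q -m "added button"

touch foo.txt
git add foo.txt
git commit -q -m "experiment"

# Undo the most recent commit but keep its changes staged.
git reset --soft HEAD~1
top_message=$(git log -1 --format=%s)
echo "latest commit: $top_message"
git status --short

# Unstage foo.txt without deleting it from disk.
git reset -q HEAD -- foo.txt
git status --short
```

After the last command, foo.txt is reported as an untracked file: git no longer plans to commit it, but your work is still there if you change your mind.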

Undoing public mistakes

Sometimes you want to undo a commit that you’ve already pushed to a remote GitHub repository. Seasoned git developers often rewrite history in this case. Rewriting history is a bad idea for a couple of reasons:

  1. Collaborators will all need the rewritten history, otherwise their normal workflows will break.
  2. Rewriting history breaks the contract that you can always undo changes you don’t like. Rewriting breaks undos.

There are some scenarios where rewriting is appropriate. For example, when you commit and push a file containing a difficult-to-change password to GitHub, you should probably rewrite history. GitHub repositories are often public. You probably do not want your passwords accessible to the general public.

(Fun digression: I once accidentally committed to GitHub using my password as my username. This was the result of late-night coding and a tight deadline. Even normal history rewriting did not work in this case because of extra information tracked by GitHub itself, outside of git. I ended up speaking with a technical support person at GitHub who incredulously told me that he didn’t think fixing such a stupid mistake was even possible. After he was done berating me, I figured out how to remove it using the GitHub API.)

A better approach is to check out the files as they were in the commit you want to revert to and then to commit them as if they were new changes. I caution you not to use the git revert command. Despite the name, git revert behaves in subtle ways that are hard to appreciate until you’ve been working with git for a while.

Let’s try this approach instead.

  1. Suppose we want to revert the repository to the commit “snazzy design”. Run git log to find the commit’s ID.
   MyComputer:hula-hoop-o-matic dbarowy$ git log
   commit ce9d029b7b7202b93da264f9bf0f39406c6037d4 (HEAD -> master, origin/master, origin/HEAD)
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 14:18:49 2018 -0500

	   experiment

   commit 15b90e01ab92c80b501ef3a235d20d36156dd17d
   Merge: 9c05f64 457aa8b
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 11:47:04 2018 -0500

	   merge

   commit 9c05f6451c78817160e20b3cdd91b454298b4436
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 11:30:02 2018 -0500

	   snazzy design

   commit 457aa8b909bd15a77e55cda20d8c34a317c839c6
   Author: hoopsucker <hoop@barowy.net>
   Date:   Thu Feb 1 11:29:26 2018 -0500

	   new layout

   commit 5b796b6d901b0b77b22e55afd7c35bd94a598a77
   Author: shazzameter <shazz@barowy.net>
   Date:   Thu Feb 1 10:16:46 2018 -0500

	   my first commit

   commit 72a90420a759b172e4f8aebaf1d259e58e913cb4
   Author: shazzameter <shazz@barowy.net>
   Date:   Wed Jan 31 16:52:53 2018 -0500

	   Initial commit
   MyComputer:hula-hoop-o-matic dbarowy$

  2. The commit ID for “snazzy design” is 9c05f6451c78817160e20b3cdd91b454298b4436. Now run the following somewhat-magical incantation:

    MyComputer:hula-hoop-o-matic dbarowy$ git checkout -f 9c05f6451c78817160e20b3cdd91b454298b4436 -- .
    

  3. git status shows that this will change the README.md file:

    MyComputer:hula-hoop-o-matic dbarowy$ git status
    On branch master
    Your branch is up-to-date with 'origin/master'.
    
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
    
    	modified:   README.md
    
    
  4. We now commit and push.

    MyComputer:hula-hoop-o-matic dbarowy$ git commit -m "Revert to snazzy design."
    [master 8e75b4e] Revert to snazzy design.
     1 file changed, 1 insertion(+), 1 deletion(-)
    MyComputer:hula-hoop-o-matic dbarowy$ git push
    Counting objects: 3, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (3/3), done.
    Writing objects: 100% (3/3), 336 bytes | 336.00 KiB/s, done.
    Total 3 (delta 1), reused 0 (delta 0)
    remote: Resolving deltas: 100% (1/1), completed with 1 local object.
    To https://github.com/shazzameter/hula-hoop-o-matic.git
       ce9d029..8e75b4e  master -> master
    MyComputer:hula-hoop-o-matic dbarowy$
    

  5. Checking GitHub for the change should confirm that the repository was reverted but that all history was retained.
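The same checkout-then-commit approach can be tried in a throwaway repository. This is a minimal sketch, assuming placeholder user details and file contents; the push is left out because nothing here depends on a remote. It confirms the two claims made above: the file contents return to the earlier state, and no commits disappear from the history.

```shell
# Sketch only: a throwaway repository with a good commit and a bad one.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'Snazzy design and stuff.\n' > README.md
git add README.md
git commit -q -m "snazzy design"
good_commit=$(git rev-parse HEAD)

printf 'An experiment gone wrong.\n' > README.md
git commit -q -a -m "experiment"

# Restore the files to their state in the good commit and stage the result...
git checkout -f "$good_commit" -- .
# ...then record that state as a brand-new commit on top of the history.
git commit -q -m "Revert to snazzy design."

# All three commits are still in the history; only the contents went back.
commit_count=$(git rev-list --count HEAD)
restored_text=$(cat README.md)
echo "$commit_count commits, README.md says: $restored_text"
```

One caveat worth knowing: this form of git checkout restores files that exist in the target commit, but it does not delete files that were first added in later commits. Those would need a separate git rm before committing.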

More commands

You can do a lot more with git than what was discussed here. For example, it is common for developers to work with multiple branches, “squash” commits, and stash changes. There are many good online references, but a good, concise reference is the following.
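To give a taste of two of these, here is a minimal sketch of branching and stashing in a throwaway repository. The branch name experiment, the file notes.txt, and the user details are placeholders of my own; treat this as an illustration rather than a recommended workflow.

```shell
# Sketch only: branches and stashes in a throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'v1\n' > notes.txt
git add notes.txt
git commit -q -m "first version"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Start a branch for an experiment and commit to it.
git checkout -q -b experiment
printf 'v2\n' > notes.txt
git commit -q -a -m "try something"

# Begin another edit, then shelve it so the working tree is clean again.
printf 'half-finished\n' >> notes.txt
git stash -q

# The original branch is untouched by the experiment.
git checkout -q "$main_branch"
main_text=$(cat notes.txt)
echo "on $main_branch: $main_text"

# Return to the experiment and pick the shelved edit back up.
git checkout -q experiment
git stash pop -q
last_line=$(tail -n 1 notes.txt)
echo "on experiment, last line: $last_line"
```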

Troubleshooting

git is a very powerful tool, but it is famously unfriendly to new users, inspiring lots of jokes. Still, git is worth learning because you should get into the habit of using version control, and if you’re going to do that, you might as well use the most popular one. I use git for all of my collaborative projects, but I prefer Mercurial for personal projects.

The important thing to remember is that you probably aren’t the first person to encounter such a situation. There are lots of great resources online to help you understand why git behaves the way it does. Atlassian (a GitHub competitor) has an excellent set of git tutorials, including this great section on collaborating. As always, StackOverflow is also an excellent resource for specific questions.

If you find yourself in a tough spot, you can also ask the TAs or instructors for help—between the bunch of us, we’ve probably gotten ourselves into and out of most of the sticky situations you might find yourself in.