
Git Manual


IMPORTANT:
Some of the content here is a personal summary/abbreviation of content from the official Git guide. Feel free to refer to the official site if you think some of the sections written here are not clear.


Git Intro

  • Version Control
    • Version control is a system that records and manages changes to a file or a set of files, so that you can later revert those changes if needed
  • Why Version Control
    • Many people’s version-control method of choice is simply to copy files into another directory and rename them somehow to keep track of the changes
    • However, this is inconvenient and error-prone
  • Solutions
    • One of the most popular VCS tools was a system called RCS, which is still distributed with many computers today. RCS works by keeping patch sets (that is, the differences between files) in a special format on disk; it can then re-create what any file looked like at any point in time by adding up all the patches.
      • Problem
        • The next major issue that people encounter is that they need to collaborate with developers on other systems.
    • Centralized Version Control Systems (CVCSs) were developed as a solution to the problem above. These systems (such as CVS, Subversion, and Perforce) have a single server that contains all the versioned files, and a number of clients that check out files from that central place. For many years, this has been the standard for version control.
      • Advantages
        • everyone knows to a certain degree what everyone else on the project is doing. Administrators have fine-grained control over who can do what, and it’s far easier to administer a CVCS than it is to deal with local databases on every client.
      • Problems
        • If that server goes down for an hour, then during that hour nobody can collaborate at all or save versioned changes to anything they’re working on. If the hard disk the central database is on becomes corrupted, and proper backups haven’t been kept, you lose absolutely everything — the entire history of the project except whatever single snapshots people happen to have on their local machines.
    • Here Distributed Version Control Systems (DVCSs) step in. In a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don’t just check out the latest snapshot of the files; rather, they fully mirror the repository, including its full history. Thus, if any server dies, and these systems were collaborating via that server, any of the client repositories can be copied back up to the server to restore it. Every clone is really a full backup of all the data.
      • Advantages
        • you can collaborate with different groups of people in different ways simultaneously within the same project. This allows you to set up several types of workflows that aren’t possible in centralized systems, such as hierarchical models.

Git Setup

Please follow the official documentation for setup and installation.

Getting Help in Git

To get more information about a command in Git, use any of the three below:

$ git help <verb>
$ git <verb> --help
$ man git-<verb>

git <verb> --help is recommended because it follows the usual command-line convention. If you only need a quick refresher on the available options for a Git command, use git <verb> -h, which prints a concise summary instead of the full manual page.

  • Notice
    • Those commands are accessible even offline

Git Basics

  • Getting a Git Repository
    This can be done in two ways

    1. take a local directory that is currently not under version control and turn it into a Git repository
    2. clone an existing Git repository from elsewhere

    Then you will end up with a Git repo on your local machine

  1. To achieve the first option, first go to the project directory:

    $ cd your_path

    Then type:

    $ git init

    This creates a new subdirectory named .git that contains all of your necessary repository files — a Git repository skeleton. At this point, nothing in your project is tracked yet.

    To start version-controlling, git add the files you want to track, and then git commit:

    $ git add yourFile1
    $ git add yourFile2
    $ git commit -m 'Initial project version'
  2. To achieve the second option (for example, to get a project you’d like to contribute to), the command you need is git clone. Git then receives a full copy of nearly all the data that the server has:

    $ git clone <url>

    For example:

    $ git clone https://github.com/libgit2/libgit2
  • Notice

    • The example clones libgit2, a linkable library implementation of Git; any other Git repository URL can be cloned the same way.
    • The above command creates a directory named libgit2 under the current directory, initializes a .git directory inside it, pulls down all the data for that repository, and checks out a working copy of the latest version.
    • Git has a number of different transfer protocols you can use. The previous example uses the https:// protocol, but you may also see git:// or user@server:path/to/repo.git, which uses the SSH transfer protocol.

    If you want to clone the repository into a directory named something other than libgit2, you can specify the new directory name as an additional argument:

    $ git clone https://github.com/libgit2/libgit2 customizedName

Recording Changes

Typically, you’ll want to start making changes and committing snapshots of those changes (i.e. recording/saving those changes) into your repository each time the project reaches a state you want to record.

As you edit files, Git sees them as modified, because you’ve changed them since your last commit. As you work, you selectively stage these modified files and then commit/save all those staged changes, and the cycle repeats.

  • Note
    • Not every file under the directory is tracked automatically; Git only knows about tracked files
    • Once you git add a file, Git starts tracking it. The file is then snapshotted the next time you call git commit

Checking File Status

The main tool you use to determine which files are in which state is the git status command.

For example, if you see:

$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean

A clean working directory means none of your tracked files are modified.

  • Note
    • this command can also show untracked files under the current directory
    • if you want to start tracking those, you can call git add fileNeedsToBeTracked

Modifying a Tracked File

There are three states that a tracked file can be in after being modified:

  1. Unstaged and uncommitted
    • This happens right after you have modified a tracked file. Until you call git add thatFile again, the staging area still holds the version from the previous checkpoint.
      Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
      modified: firstFile.txt
  2. Staged but uncommitted
    • This happens when you have modified a tracked file and called git add thatFile. Now Git knows about the change, and it is placed in the staging area
      Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
      new file: LICENSE
  3. Staged and committed
    • This happens when you call git commit, so that the changes move from the staging area into your local repository
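
The three states above can be watched in a throwaway repository. This is a minimal sketch, assuming a POSIX shell with mktemp and Git available; the identity Example User is hypothetical, and git status --porcelain is used here as the script-stable form of the short two-column status layout.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

echo "hello" > firstFile.txt
git add firstFile.txt
git commit -q -m "first commit"

echo "more" >> firstFile.txt
unstaged=$(git status --porcelain)     # modified, not staged: M in the second column

git add firstFile.txt
staged=$(git status --porcelain)       # staged: M in the first column

git commit -q -m "second commit"
committed=$(git status --porcelain)    # clean working tree: no output

printf 'unstaged:  [%s]\nstaged:    [%s]\ncommitted: [%s]\n' "$unstaged" "$staged" "$committed"
```

The position of the M (second column, then first column, then gone) follows the file through the three states.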

Difference between git add and git commit

  • Q: why add and commit as two functions? They seem to be two repeated checkpoint procedures.

The git add command adds a change in the working directory to the staging area. It tells Git that you want to include updates to a particular file in the next commit. However, git add doesn’t really affect the repository in any significant way—changes are not actually recorded until you run git commit.

The command git commit takes all changes in the Staging Area, wraps them together and puts them in your Local Repository. A commit is simply a checkpoint telling git to track all changes that have occurred up to this point using our last commit as a comparison. After committing, your Staging Area will be empty.

Abbreviated version for git status

To quickly view which file is in which state, you can execute git status -s, which lists each changed or untracked file under the current directory with its status

For example, you could see

MM LICENSE.txt
AM README.md
M firstFile.txt
A justTracked.txt
?? unTracked.txt
  • A stands for new files that have been added to the staging area (newly tracked files)
  • M stands for files being modified
    • a green M (at position 0) means it is modified and staged
    • a red M (at position 1, if there is) means it is modified but unstaged
  • ?? stands for files not being tracked by Git
  • Note
    • Although the colors are not shown above, in a terminal the first letter is green, and the second letter, if there is one, is red.

Ignoring Files

A gitignore file specifies intentionally untracked files that Git should ignore. Files already tracked by Git are not affected.

Each line in a gitignore file specifies a pattern. Git checks gitignore patterns from several sources (command-line options, .gitignore files in the working tree, .git/info/exclude, and the file named by core.excludesFile); the steps below cover the most common one, a .gitignore file in the repository.

  1. First, you need to have/create a .gitignore file under the repository. This can be simply done by touch .gitignore
  2. Then, add text to that file to specify the rules for ignoring files. This can be done with cat >> .gitignore, where >> means append. Press Enter, type the rules one per line, and when you finish, press Ctrl-D to save.
    $ cat >> .gitignore
    unTracked.txt
    *.[oa]
    *~
    (alternatively, you could use > instead of >>, which overwrites the existing content)
    Now, since unTracked.txt was never tracked, it is ignored and no longer appears when you execute git status
  • Note
    • * matches anything except a slash
    • ? matches any one character except “/”
    • [a-zA-Z] is the range notation, which can be used to match one of the characters in a range.
    • ! negates a pattern. Put a backslash \ in front of the first ! for patterns that begin with a literal !, for example, \!important.txt
    • **/someDirectory matches a file or directory named [someDirectory] at any depth. For example, **/foo matches file or directory “foo” anywhere
    • someDirectory/** matches everything inside [someDirectory]. For example, abc/** matches all files inside directory [abc], relative to the location of the [.gitignore] file
    • someDirectory/**/otherDirectory matches zero or more nested directories between [someDirectory] and [otherDirectory]
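
To check which paths a set of rules actually catches, git check-ignore can be asked about each path. The sketch below assumes a POSIX shell with mktemp and Git available; it reuses the three patterns from the example above, and the file names other than unTracked.txt are made up for the illustration.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q

# Same three rules as in the .gitignore example above.
printf '%s\n' 'unTracked.txt' '*.[oa]' '*~' > .gitignore
touch unTracked.txt main.o lib.a 'notes.txt~' main.c   # hypothetical files

# git check-ignore -q exits 0 when the path is ignored and 1 when it is not.
ignored=""
for f in unTracked.txt main.o lib.a 'notes.txt~' main.c; do
  if git check-ignore -q "$f"; then
    ignored="$ignored $f"
  fi
done
echo "ignored:$ignored"
```

Only main.c is left out of the list, since none of the three patterns matches it.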

Knowing the Modifications

  1. Running git diff with no arguments compares what is in your working directory with what is in your staging area. The result tells you the changes you’ve made that you haven’t yet staged.

    For example:

    $ git diff
    warning: LF will be replaced by CRLF in firstFile.txt.
    The file will have its original line endings in your working directory
    diff --git a/firstFile.txt b/firstFile.txt
    index 640bb73..caada99 100644
    --- a/firstFile.txt
    +++ b/firstFile.txt
    @@ -1,3 +1,5 @@
    hello
    hello
    hello
    +
    +aa

    The + and +aa lines at the end are the lines added to the file that have not been staged yet.

  2. Running git diff --staged compares the staged changes with the last commit, i.e. it shows what will go into the next commit.

    For example:

    $ git diff --staged
    diff --git a/firstFile.txt b/firstFile.txt
    index e69de29..640bb73 100644
    --- a/firstFile.txt
    +++ b/firstFile.txt
    @@ -0,0 +1,3 @@
    +hello
    +hello
    +hello

    Notice that the last two lines, + and +aa, are not present, because they have not been staged.
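
The two comparisons can be put side by side in one small experiment. This is a sketch under assumptions: a POSIX shell with mktemp, grep and Git available, a hypothetical identity, and made-up marker lines staged-A and working-B so that each diff is easy to tell apart.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

printf 'hello\nhello\nhello\n' > firstFile.txt
git add firstFile.txt
git commit -q -m "first commit"

echo "staged-A" >> firstFile.txt
git add firstFile.txt                  # staged-A is now in the staging area
echo "working-B" >> firstFile.txt      # working-B exists only in the working directory

# Keep only the added lines ("+..."), skipping the "+++ b/file" header line.
unstaged_added=$(git diff | grep '^+[^+]')
staged_added=$(git diff --staged | grep '^+[^+]')

echo "git diff:          $unstaged_added"
echo "git diff --staged: $staged_added"
```

git diff reports only the line that has not been staged, while git diff --staged reports only the line that would go into the next commit.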

Committing Changes

Now, if you want to commit those changes, you can execute git commit. However, this only commits changes that are staged, so modified yet unstaged changes will not be recorded.

Executing git commit brings up your editor of choice (set through the core.editor configuration, typically when you set up Git) so that you can type the commit message.

For example

$ git commit
[master 2031fb7] first commit made
4 files changed, 11 insertions(+)
create mode 100644 LICENSE.txt
create mode 100644 README.md
create mode 100644 justTracked.txt
  • Note
    • the first line in the editor is intentionally left blank, so that you can describe what has been changed.
    • you can use the command git commit -v to also have the difference shown in the editor as well, so you know exactly what you have done
    • for a quick commit, you can do git commit -m "yourMessageHere", which finishes the commit without even opening the editor
    • you also see which branch you committed to (master), what SHA-1 checksum the commit has (2031fb7)

After you have made a commit, the committed files no longer appear when you execute git status -s, as long as they have not been modified again.

Committing and Skipping the Staging

If you want to skip the staging area, Git provides a simple shortcut. Adding the -a option to the git commit command makes Git automatically stage every file that is already tracked before doing the commit, letting you skip the git add part.

This is convenient, but be careful: sometimes this flag will cause you to include unwanted changes.
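
A short experiment makes the scope of -a concrete: it picks up modifications to files that are already tracked, but it does not start tracking new files. The sketch assumes a POSIX shell with mktemp and Git available; the file names and identity are hypothetical.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "add tracked.txt"

echo "v2" >> tracked.txt               # change to an already tracked file
echo "new" > untracked.txt             # brand-new file, never added

git commit -q -a -m "commit with -a"   # no git add beforehand
after=$(git status --porcelain)
echo "$after"
```

After the commit, tracked.txt is clean while untracked.txt is still listed with ??.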

Removing Tracked File/ Untracking File

  • If you want to remove a tracked file, meaning you want to delete the file, you need to execute git rm yourFile. This will:

    1. remove the file from the current working directory
    2. remove the file from the staging area

    Now, the next time you commit, the file will no longer be tracked by Git, since it has been removed completely.

  • If you want to untrack a tracked file, execute git rm --cached yourFile. This keeps the file in your directory but removes it from Git’s index.

    As before, you still need to execute git commit to complete the removal from Git.
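
The difference between the two forms can be checked by comparing what is left on disk with what Git still tracks. This is a minimal sketch, assuming a POSIX shell with mktemp and Git available; the file names deleteMe.txt and keepMe.txt and the identity are made up for the example.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

echo "a" > deleteMe.txt
echo "b" > keepMe.txt
git add deleteMe.txt keepMe.txt
git commit -q -m "track two files"

git rm -q deleteMe.txt                 # removed from the working directory and the index
git rm -q --cached keepMe.txt          # removed from the index only
git commit -q -m "remove one file, untrack the other"

on_disk=$(ls)                          # files still present in the directory
tracked=$(git ls-files)                # files Git still tracks
status=$(git status --porcelain)
echo "on disk: $on_disk"
echo "tracked: $tracked"
echo "status:  $status"
```

keepMe.txt survives on disk but shows up as untracked, while deleteMe.txt is gone from both places.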

Renaming a File

If you rename a file and want Git to keep tracking it properly, you should run
git mv myFile myNewFile

Now, if you call git status, you will see:

$ git mv README.md README
$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)

renamed: README.md -> README

This is equivalent to running these three commands separately:

$ mv README.md README
$ git rm README.md
$ git add README

Viewing Commits

You can see all commits in a repo by:

$ git log

By default, with no arguments, git log lists the commits made in that repository in reverse chronological order

  • Useful options for git log

    • git log -p -2 displays the 2 most recent commits and also shows the patches/changes made in each

    • git log --stat helps you quickly browse what happened during a series of commits that a collaborator has added. It will show additional information such as: a list of modified files, how many files were changed, and how many lines in those files were added and removed. It also puts a summary of the information at the end.

      $ git log --stat
      commit cae63d011d113c1d45818de4bec434216e10e9ef (HEAD -> master)
      Author: Xiao Yu <jasonyux17@gmail.com>
      Date: Wed May 20 10:59:55 2020 +0800

      tested git file removal and untracking

      firstFile.txt | 6 ------
      justTracked.txt | 0
      2 files changed, 6 deletions(-)

      commit 0e6428a6389d7748a0da2f0798be94999ce8c248
      Author: Xiao Yu <jasonyux17@gmail.com>
      Date: Wed May 20 10:34:31 2020 +0800

      second commit made

      firstFile.txt | 1 +
      1 file changed, 1 insertion(+)

      commit 2031fb7fd0a90cf62d1c13d7cfb972e1ce10ad30
      Author: Xiao Yu <jasonyux17@gmail.com>
      Date: Wed May 20 10:28:14 2020 +0800

      first commit made

      LICENSE.txt | 5 +++++
    • git log --pretty=oneline This option changes the log output to formats other than the default. A few prebuilt options are available for you to use. The oneline option prints each commit on a single line, which is useful if you’re looking at a lot of commits.

    • git log --pretty=format:"%h - %an - %ar, commit: %s" This customizes the format of the commit output. For example,

      $ git log --pretty=format:"%h - %an - %ar, commit: %s"
      cae63d0 - Xiao Yu - 23 minutes ago, commit: tested git file removal and untracking
      0e6428a - Xiao Yu - 49 minutes ago, commit: second commit made
      2031fb7 - Xiao Yu - 55 minutes ago, commit: first commit made
      75d2c61 - Xiao Yu - 18 hours ago, commit: Initial project version

      So we can have options such as:

      • %h Abbreviated commit hash
      • %an author name
      • %ar author date, relative to now
      • %s subject/commit message
    • git log --since="2008-10-01" Limit the commits to those made after the specified date. You can also put “2.weeks” or “2 years 1 day 3 minutes ago”

    • git log --before="2008-10-01" Limit the commits to those made before the specified date.

    • git log --author="Xiao Yu" Only show commits in which the author entry matches the specified string.

    • git log --committer="Xiao Yu" Only show commits in which the committer entry matches the specified string.

    • git log --no-merges to prevent the display of merge commits cluttering up your log history, which usually are not informative

Changing your Staging

  1. If you forget to do some small changes, but you have just committed, you can easily “re-commit” by

    • first make those additional changes you forgot
    • then stage them
    • then commit again using git commit --amend to automatically include the new changes in the previous commit
  2. If you want to unstage a file (for example, to make additional changes before staging it again), use git restore --staged <file>. In fact, this is advised by the command git status as well:

    $ git status
    On branch master
    Changes to be committed:
    (use "git restore --staged <file>..." to unstage)
    modified: README.md

    Now, the file [README.md] is unstaged and you can add changes to it and restage.
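
The amend workflow from item 1 can be verified by counting commits before and after. This is a sketch, assuming a POSIX shell with mktemp and Git available; forgotten.txt and the identity are hypothetical, and --no-edit is used so that the existing commit message is reused without opening an editor.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

echo "one" > README.md
git add README.md
git commit -q -m "first commit"

echo "two" > forgotten.txt             # the change that was forgotten
git add forgotten.txt
git commit -q --amend --no-edit        # fold it into the previous commit

count=$(git rev-list --count HEAD)     # number of commits reachable from HEAD
if git cat-file -e HEAD:forgotten.txt; then included=yes; else included=no; fi
echo "commits: $count, forgotten.txt in last commit: $included"
```

There is still a single commit, and it now contains the forgotten file. Amending replaces the old commit with a new one, so it is best kept to commits that have not been pushed yet.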

Changing/Undoing your Commit

If you want to return a file or your project to a previously committed state, you have several options. (Only committed snapshots are kept in history; a state that was merely staged and later overwritten cannot be recovered this way.) You could:

  1. Run git restore <file>. In fact, this is also advised by the command git status if the file has unstaged changes. It is usually used for restoring files in the working tree from either the index or another commit. This command does not update your branch.

  2. Run git reset <commit-id>. This updates your branch, moving its tip in order to add or remove commits from the branch.

  3. Run git revert <commit-id>, which creates a new commit that undoes the changes from a previous commit. This command adds new history to the project (it doesn’t modify existing history).

  4. Run git checkout <commit-id>. This is the most conservative option and the one I prefer. It moves HEAD to that commit, so that you can view the project at that point. However, this leaves HEAD detached: it points directly at a commit instead of at a branch. Git does not create a branch for you, so commits made in this state belong to no branch and become hard to reach once you check out another branch. Therefore, if you want to keep working from here, run git checkout -b <yourBranchName> to create a branch at this commit and attach HEAD to it. Now everything is saved as you move around and switch between branches.

    Note:

    • If you use the restore command, any local changes you made to that file are gone — Git just replaced that file with the most recently-committed version. Don’t ever use this command unless you absolutely know that you don’t want those unsaved local changes.

    If you would like to keep the changes you’ve made to that file but still need to get it out of the way for now, we’ll go over stashing and branching in Git Branching; these are generally better ways to go.
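
The detached HEAD behaviour of option 4 can be observed directly. This is a minimal sketch, assuming a POSIX shell with mktemp and Git available; the branch name rescue, the file name and the identity are made up for the illustration.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"
echo "two" >> file.txt
git commit -q -am "second commit"

git checkout -q HEAD~1                 # move HEAD to the first commit

# git symbolic-ref -q HEAD fails (exit 1) when HEAD is not attached to a branch.
if git symbolic-ref -q HEAD >/dev/null; then detached=no; else detached=yes; fi

git checkout -q -b rescue              # create a branch here and attach HEAD to it
branch=$(git symbolic-ref --short HEAD)
echo "detached after checkout: $detached, now on branch: $branch"
```

Once the branch exists, commits made from this point stay reachable through its name.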

Working with Remotes

To be able to collaborate on any Git project, you need to know how to manage your remote repositories. Remote repositories are versions of your project that are hosted on the Internet or network somewhere. Managing these remote repositories involves pushing and pulling data to and from them when you need to share work.

  • To show the remote servers configured for the repo in the current directory, you can execute git remote, or git remote -v to see the stored URLs as well.
  • To add a remote server, you can execute git remote add <shortname> <url>, which will configure the shortname/nickname for that repo. Then you can use that shortname to access that repo.

    Note:

    • You can only add remote repo in an existing repo
    • Executing git clone <url> will give you a default shortname of origin

Difference between git clone and git remote add

  • git remote add just creates an entry in your git config that specifies a name for a particular URL. You must have an existing git repo to use this. You will not see a folder or any file from that repo. To get them, you need to use git pull <shortname> <branch>

  • git clone creates a new git repository by copying an existing one located at the URL you specify. The files are actually downloaded to your disk.

Fetching and Pulling from Remote Repos

  1. git fetch <remote> This command goes out to that remote project and pulls down all the data from that remote project that you don’t have yet. After you do this, you should have references to all the branches from that remote, which you can merge in or inspect at any time (merge operations will be discussed in Git Branching).
  2. If you cloned a repository, the command git fetch origin fetches any new work that has been pushed to that server since you cloned (or last fetched from) it

    Note:

    • git fetch command only downloads the data to your local repository — it doesn’t automatically merge it with any of your work or modify what you’re currently working on. You have to merge it manually into your work when you’re ready
    • git pull will fetch and merge the remote branches into your current branch. More covered in Git Branching

Viewing Changes After Fetching

After you have fetched from your repo with git fetch <remote>, Git automatically updates (or creates) all the so-called “remote-tracking branches” for that remote (if you only want to fetch a specific branch, run git fetch <remote> <branchName>).

Now, to see what has been changed, you can use the ^ or .. operator. For example, If you want to see what remote master has compared to your local master, you ask Git to show you exactly this:

$ git log <remote>/master ^master

which means all commits reachable from <remote>/master that are not reachable from master. Alternatively:

$ git log master..<remote>/master
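
Both notations can be tried without a real server by letting a local directory play the remote. This is a sketch under assumptions: a POSIX shell with mktemp and Git available, a hypothetical identity, directory names upstream and work chosen for the example, and the branch name pinned to master so the commands match the text.

```shell
set -e
base=$(mktemp -d)                      # throwaway parent directory

# "upstream" plays the role of the remote server (a local directory here).
git init -q "$base/upstream"
cd "$base/upstream"
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example User"       # hypothetical identity, local to this repo
git config user.email "user@example.com"
echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"

# Clone it, then add a commit upstream that the clone does not have yet.
git clone -q "$base/upstream" "$base/work"
echo "two" >> file.txt
git commit -q -am "second commit"

cd "$base/work"
git fetch -q origin                    # updates origin/master, leaves local master alone
caret=$(git log --format=%s origin/master ^master)
dotdot=$(git log --format=%s master..origin/master)
echo "new on origin/master: $dotdot"
```

Both forms list the same commit, the one that exists on origin/master but has not been merged into the local master.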

Pushing/Sharing to Your Remote

When you have your project at a point that you want to share, you have to push it upstream. The command for this is simple: git push <remote> <branch>. If you want to push your master branch to your origin server (again, cloning generally sets up both of those names for you automatically), then you can run this to push any commits you’ve done back up to the server:

$ git push origin master

Note:

  • This command works only if you cloned from a server to which you have write access and if nobody has pushed in the meantime. If someone else has pushed before you, your push will be rejected, and you will have to fetch their work first and incorporate it into yours before you’ll be allowed to push.

Renaming or Removing Remote Repo

  1. If you want to rename a shorthand for a remote repo, you can execute git remote rename <originalName> <newName> to change a remote’s shortname.
  2. If you want to remove a remote for some reason — you’ve moved the server or are no longer using a particular mirror, or perhaps a contributor isn’t contributing anymore, you can use git remote remove <shortname>

Inspecting a Remote for Pushing and Pulling Information

If you want to see more information about a particular remote, you can use the git remote show <remote> command

$ git remote show pb
* remote pb
Fetch URL: https://github.com/paulboone/ticgit
Push URL: https://github.com/paulboone/ticgit
HEAD branch: master
Remote branches:
master tracked
ticgit tracked
Local ref configured for 'git push':
master pushes to master (local out of date)

It lists the URL for the remote repository as well as the tracking branch information. The command helpfully tells you that if you’re on the master branch and you run git pull, it will automatically merge in the master branch on the remote after it fetches all the remote references. It also lists all the remote references it has pulled down.

In general, it will show you:

  • which branch is automatically pushed to when you run git push while on certain branches
  • which remote branches on the server you don’t yet have
  • which remote branches you have but have been removed from the server
  • multiple local branches that are able to merge automatically with their remote-tracking branch when you run git pull

Tagging

Git has the ability to tag specific points in a repository’s history as being important. Typically, people use this functionality to mark release points (v1.0, v2.0 and so on).

Listing the Tags of a Repo

You can list all the tags by calling:

$ git tag

Note:

  • this command lists the tags in alphabetical order; the order in which they are displayed has no real importance.

You can also filter the tags by using the optional -l or --list, for example

$ git tag -l "v1.8.5*"

This will list all the tags that start with v1.8.5

Creating Tags

You can have two types of tags:

  1. a lightweight tag

    • it’s just a pointer to a specific commit and has no information of its own attached to it. (Running git show on it displays the checksum and author information of the commit it points to.) It is often used as a temporary tag.
  2. an annotated tag

    • stored as full objects in the Git database. They’re checksummed; contain the tagger name, email, and date; have a tagging message; and can be signed and verified with GNU Privacy Guard (GPG). It’s generally recommended that you create annotated tags so you can have all this information

Note:

  • By default, a new tag is attached to the latest commit (HEAD) of your repo

Creating a Lightweight Tag

This is basically the commit checksum stored in a file — no other information is kept. To create a lightweight tag, don’t supply any of the -a, -s, or -m options, just provide a tag name.

$ git tag <tag-id>

Creating an Annotated Tag

This is created by

$ git tag -a <tag-id> -m "yourMessage"

  • If -a, -s, or -u <keyid> is passed, the command creates a tag object, and requires a tag message. Unless -m <msg> or -F <file> is given, an editor is started for the user to type in the tag message.

Then to see the message in that tag, use

$ git show <tag-id>

Note:

  • The command is git show <tag-id>, not git tag show <tag-id>! The latter would create a new lightweight tag named show.
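
The difference between the two kinds of tag shows up when Git is asked what type of object each name refers to. This is a minimal sketch, assuming a POSIX shell with mktemp and Git available; the tag names v1.0-lw and v1.0, the message and the identity are hypothetical.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "first commit"

git tag v1.0-lw                        # lightweight: just a name for the commit
git tag -a v1.0 -m "first release"     # annotated: a separate tag object

lw_type=$(git cat-file -t v1.0-lw)     # type of the object the name refers to
ann_type=$(git cat-file -t v1.0)
ann_msg=$(git for-each-ref --format='%(contents:subject)' refs/tags/v1.0)
echo "v1.0-lw -> $lw_type"
echo "v1.0    -> $ann_type ($ann_msg)"
```

The lightweight tag resolves straight to a commit, whereas the annotated tag resolves to a tag object that carries its own message.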

Tagging Previous Commits

To tag any specific commit, you specify the commit checksum (or part of it) at the end of the command:

$ git tag -a <tag-id> -m "yourMessage" <commit-checksum>

Sharing Tags

By default, the git push command doesn’t transfer tags to remote servers. You will have to explicitly push tags to a shared server after you have created them. This process is just like sharing remote branches — you can run

$ git push <repo-shortname> <tagname>

To push all the tags, you run

$ git push <repo-shortname> --tags

Deleting Tags

You can use

$ git tag -d <tag-id>

to delete a specific tag. However, this does not remove the tag from the remote servers.

To delete the tag from the server, you could either use:

$ git push origin --delete <tag-id>

or, the slightly less intuitive command

$ git push origin :refs/tags/<tag-id>

which means the null value before the colon is being pushed to the remote tag name, effectively deleting it.

Viewing Files at a Different State using Tag

You can view the different versions of files by using its tag. This is done by a git checkout of that tag, although this puts your repository in “detached HEAD” state, (which has some ill side effects)

  • in a detached-head state, if you make changes and then create a commit, the tag will stay the same, but your new commit won’t belong to any branch and will be unreachable, except by the exact commit hash. Thus, if you need to make changes — say you’re fixing a bug on an older version, for instance — you will generally want to create a branch (covered more in Git Branching)

Q: What does git checkout mean?

If you call git log, you will see something called HEAD. HEAD is a label noting where you are in the commit tree. It moves with you when you move from one commit to another. git checkout <commit> is the basic mechanism for moving around in the commit tree, moving your focus (HEAD) to the specified commit.

The commit can be specified by any of a number of ways, commit hash, branch name, tag name, the relative syntax (HEAD^, HEAD~1, etc.) and so on. (It is often useful to consider a checkout to be changing branches)

Git Alias

Aliases can be understood as shortcuts for executing commands. They can be set up using the git config command.

For example:

$ git config --global alias.co checkout
$ git config --global alias.br branch
$ git config --global alias.ci commit
$ git config --global alias.st status

This means that

  • git checkout becomes git co
  • git branch becomes git br
  • git commit becomes git ci
  • git status becomes git st

Note:

  • global means access for one user (yourself) in all the repos. In comparison, local means for one user in this specific repo. system means for all users in all repos.

Some useful aliases include:

$ git config --global alias.last 'log -1 HEAD'

So you can see the last commit easily:

$ git last
commit e946cadfbd3b378582b0be5e5385a7702d1d10da (HEAD -> master, tag: v1.0)
Author: Xiao Yu <jasonyux17@gmail.com>
Date: Wed May 20 10:59:55 2020 +0800

amended: edited AND deleted file removal and untracking
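
To try an alias without touching your global configuration, it can be defined at repository scope instead. The sketch below assumes a POSIX shell with mktemp and Git available; it omits --global on purpose so that the alias lives only in the throwaway repository, and the identity and file name are hypothetical.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Example User"    # hypothetical identity, local to this repo
git config user.email "user@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "first commit"
echo "two" >> file.txt
git commit -q -am "second commit"

git config alias.last 'log -1 HEAD'    # no --global: stored in this repo's .git/config
subject=$(git last --format=%s)        # extra arguments are appended to the alias
stored=$(git config --local --get alias.last)
echo "git last -> $subject (alias expands to: $stored)"
```

Arguments given after the alias are passed on to the underlying command, so git last --format=%s prints only the subject of the latest commit.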

Resetting the Alias

If you decide that an alias you set up earlier was a bad idea, you can remove it with

$ git config --global --unset alias.<yourAlias>

You can also directly edit in the config file to change the alias by:

$ git config --global --edit

which will open the default text editor and then you can edit your alias directly (delete it if not needed).

Git Branching

Git doesn’t store data as a series of changesets or differences, but instead as a series of snapshots.

When you make a commit, Git stores a commit object that contains a pointer to the snapshot of the content you staged. This object also contains the author’s name and email address, the message that you typed, and pointers to the commit or commits that directly came before this commit (its parent or parents): zero parents for the initial commit, one parent for a normal commit, and multiple parents for a commit that results from a merge of two or more branches.

  1. When you staged a file/files
    • Staging the files computes a checksum for each one (the SHA-1 hash we mentioned in Getting Started)
    • then stores that version of the file in the Git repository (Git refers to them as blobs)
    • finally adds that checksum to the staging area
  2. When you commit by running git commit
    • Git checksums each subdirectory (in this case, just the root project directory)
    • stores them as a tree object in the Git repository.
    • Git then creates a commit object that has the metadata and a pointer to the root project tree so it can re-create that snapshot when needed.

If you stage three files and commit them, you get five objects in total: three blobs (each representing the contents of one of the three files), one tree that lists the contents of the directory and specifies which file names are stored as which blobs, and one commit with the pointer to that root tree and all the commit metadata.
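The five-object count can be checked on a throwaway repo. This is only a sketch: the temporary directory, the file names and the demo identity are made up for illustration, and the three files deliberately have different contents so that each one gets its own blob.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"

# Three files with different contents, so each gets its own blob.
echo "readme"  > README
echo "license" > LICENSE
echo "code"    > test.rb

git add README LICENSE test.rb
git commit -q -m "Initial commit"

# Everything reachable from HEAD: 1 commit + 1 tree + 3 blobs.
git rev-list --objects HEAD
object_count=$(git rev-list --objects HEAD | wc -l | tr -d ' ')
echo "objects: $object_count"

# The root tree maps each file name to the checksum of its blob.
git cat-file -p 'HEAD^{tree}'
```

If two of the files had identical contents, they would share one blob and the count would be lower, which is a nice way to see that blobs are stored by content rather than by name.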

If you make some changes and commit again, the next commit object stores a (parent) pointer to the commit object that came immediately before it. Essentially it is like a linked list of commit objects.

A branch in Git is simply a lightweight movable pointer to one of these commit objects. The default branch name in Git is master (default when you run git init). As you start making commits, you’re given a master branch that points to the last commit you made. Every time you commit, the master branch pointer moves forward automatically.

Create a New Branch

Creating a new branch means creating a new pointer for you to move around. Let’s say you want to create a new branch called testing. You do this with the git branch command:

$ git branch <newBranchName>

Note:

  • by default, the new branch points to the commit you are currently on (the one HEAD refers to). git branch only creates the branch; it does not switch to it.
  • you can see the commit that HEAD points to by running git log
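To see that a new branch is just a second pointer to the same commit, you can compare what both names resolve to. A minimal sketch on a throwaway repo (the directory, file name and identity are assumptions for the demo):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "Initial project version"

# Create the branch; this only adds a pointer, HEAD stays on master.
git branch testing

master_commit=$(git rev-parse master)
testing_commit=$(git rev-parse testing)
current_branch=$(git rev-parse --abbrev-ref HEAD)

echo "master  -> $master_commit"
echo "testing -> $testing_commit"
echo "HEAD is on: $current_branch"
```

Both names print the same checksum, and HEAD is still on master because nothing was checked out.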

Switching Branches

To switch to an existing branch, you run the git checkout command.

$ git checkout <branchName>

For example, if you have another branch called testing, you switch to that by git checkout testing.

This moves HEAD to the branch you switched to. This means that now, if you make modifications and commit, you create a new commit object, and HEAD and your current branch move forward to point to it, while the other branch still points to the original commit object.

Note:

  • Now, if you switch back to master and run git log, you will not see the new commit object that was made on the testing branch. By default, git log only shows the commit history reachable from the branch you’ve checked out.
  • In general, if you move backward in the tree, you will not be able to see future commit objects by running git log
  • To show the commit history for a desired branch you have to specify it explicitly: git log <branchName>. To show all of the branches, run git log --all.
  • You can create a new branch and switch to it at the same time, in one operation, with git checkout -b <newBranchName>

However, if you have made some changes in this new branch and switched back to the old branch, your working directory will be reverted to the snapshot that the old branch points to. Now, if you make some changes on the old branch and commit them, the two branches will diverge:

  • the new branch does not know about the change made in the old branch
  • the old branch can no longer simply move forward to the new branch, because it now has a commit of its own that the new branch does not contain
  • basically, you will have two different commit objects, one for the old branch and another for the new branch, both linking to the same parent.

But don’t worry, in the end you can merge them together when you’re ready.


Making a Disaster:

  1. You can create a branch that points to the current head position, and then switch to the master branch. Then you can check out the first commit by its checksum:
    $ git checkout <checkSum>
    Now HEAD points directly at the initial commit (a “detached HEAD”), and the working directory looks like the first version of the project. Note that this does not move the master pointer itself; moving master would need git reset --hard <checkSum>. If your friend looks at this working directory, they may be dismayed, thinking that somehow all the progress is lost, but in fact every commit is still there, kept by master and by the other branch that you just created.

Viewing the Diversion Tree

The entire tree for all branches can be seen by running

$ git log --oneline --graph --all

For example, it can look like this:

$ git log --oneline --graph --all
* ae182e6 (HEAD, testing) first commit made now more with the new testing branch
| * e946cad (tag: v1.0, master) amended: edited AND deleted file removal and untracking
| * 0e6428a second commit made
| * 2031fb7 (tag: v0.1) first commit made
|/
* 75d2c61 Initial project version

So we see that the two branches diverged at the initial commit object (75d2c61).

Merging Different Branches

There are basically two cases:

  1. You have two diverged branches, but they changed different files; in other words, any file they share is identical in both. In this case, it is simple for Git to figure out how to merge (basically just combining the changes from both sides), and you run:

    $ git checkout master
    $ git merge testing

    Note:

    • This means that you are merging the changes from the testing branch into the master branch
    • Git refuses to start the merge if you have uncommitted changes in the working directory that the merge would overwrite, so commit your work first
  2. You have two diverged branches, and they share a file that is modified differently in both branches. Now, if you want to merge them, Git will show you:

    $ git merge testing
    Auto-merging firstFile.txt
    CONFLICT (content): Merge conflict in firstFile.txt
    Automatic merge failed; fix conflicts and then commit the result.

    In this example, firstFile.txt was modified differently on both branches, so Git cannot decide which version to keep. The merge pauses, Git writes conflict markers (<<<<<<<, =======, >>>>>>>) into that file, and you need to edit it manually.

    When you finish editing that file, you run git add <thatFile> to mark it as resolved, and then call git commit to complete the merge.
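The second case can be reproduced end to end on a throwaway repo. This is a sketch under made-up assumptions (temporary directory, demo identity, and a resolution that simply overwrites the file with the text you want to keep):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"

echo "original line" > firstFile.txt
git add firstFile.txt
git commit -q -m "Initial project version"

# Change the same line differently on each branch.
git checkout -q -b testing
echo "line from testing" > firstFile.txt
git commit -q -am "Edit on testing"

git checkout -q master
echo "line from master" > firstFile.txt
git commit -q -am "Edit on master"

# The merge stops with a conflict; "|| true" keeps the script going.
git merge testing || true

# Files that are still marked as unmerged.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Resolve by writing the content you want, then stage and commit.
echo "line from master and testing" > firstFile.txt
git add firstFile.txt
git commit -q -m "Merge branch 'testing'"

# A merge commit has two parents (first word is the commit itself).
parent_count=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
parent_count=$((parent_count - 1))
echo "parents: $parent_count"
```

In real work you would open the file, pick the parts you want between the conflict markers, and remove the markers by hand instead of overwriting the whole file.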

In the end, when you have finished all the work, you can delete the branch you don’t need by:

$ git branch -d <branchName>

Branch Management

There are a handful of git branch commands that are useful for branch management:

  • git branch shows you a list of the branches that you have, with an asterisk * in front of the branch you are currently on.
  • git branch -v shows you a list of branches with their last commit message
  • git branch --merged shows you branches that have been merged into the current branch
  • git branch --no-merged shows you branches that have not been merged into the current branch

Note

  • The --merged and --no-merged options show information relative to the current branch. To ask about a different branch, name it explicitly, for example git branch --no-merged testing (to get information relative to the testing branch)
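A quick way to see what --merged and --no-merged report is to set up one branch of each kind. A sketch on a throwaway repo; the branch names hotfix and testing and the demo identity are made up, and --format is only used to get plain branch names without the asterisk:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial project version"

git branch hotfix              # same commit as master, so already merged
git checkout -q -b testing     # gets one commit that master lacks
echo "work" > testing.txt
git add testing.txt
git commit -q -m "Work that is not merged yet"
git checkout -q master

merged=$(git branch --format='%(refname:short)' --merged)
not_merged=$(git branch --format='%(refname:short)' --no-merged)
# Relative to testing instead of the current branch:
relative=$(git branch --format='%(refname:short)' --no-merged testing)

echo "merged into master:      $merged"
echo "not merged into master:  $not_merged"
echo "not merged into testing: $relative"
```

The last line comes out empty because both master and hotfix are already contained in testing.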

Typical WorkFlow Using Git Branches

Usually, people tend to put code that is entirely stable in their master branch — possibly only code that has been or will be released. They have another parallel branch named develop or next that they work from or use to test stability — it isn’t necessarily always stable, but whenever it gets to a stable state, it can be merged into master.

In general, for some larger projects, you will want your branches to be at various levels of stability, ordered from the most stable top branch (master) to the least stable bottom branch (topic, for example); when they reach a more stable level, they’re merged into the branch above them.

Note:

  • It’s important to remember when you’re doing all this that these branches are completely local. When you’re branching and merging, everything is being done only in your Git repository — there is no communication with the server, for now.

Remote Branches

Remote-tracking branches are references to the state of remote branches. They’re local references that you can’t move; Git moves them for you whenever you do any network communication, to make sure they accurately represent the state of the remote repository. Think of them as bookmarks, to remind you where the branches in your remote repositories were the last time you connected to them.

Remote-tracking branch names take the form <remoteShortName>/<branch>. For instance, if you wanted to see what the master branch on your origin remote looked like as of the last time you communicated with it, you would check the origin/master branch. This means that, if you were working on an issue with a partner and they pushed up an iss53 branch, but you also have your own local iss53 branch, the branch on the server would be represented by the remote-tracking branch origin/iss53, which reflects your partner’s version.

Now, to synchronize your work with a given remote, you run

$ git fetch <remoteShortName>

This command looks up which server the remote short name refers to (for example, origin might have been cloned from git.ourcompany.com), fetches any data from it that you don’t yet have, and updates your local database, moving your origin/master pointer to its new, more up-to-date position. Your own master pointer stays where it was.

Collaborating with Multiple Repos

Say you have cloned a main repo, with the shortname origin, from git.ourcompany.com, and another team is working on that project as well, with their repo at git.team1.ourcompany.com

You can add their work to your current repo by

$ git remote add teamone git://git.team1.ourcompany.com

Now, you have their repo with the shortname teamone. You can fetch their progress to your local machine by:

$ git fetch teamone

In this case, since teamone didn’t have anything more than the origin server had, no new data will be fetched, but you will still get a remote-tracking branch teamone/master that marks their progress.

Note:

  • At this point, if all you have called is fetch, you have their data locally but no local, editable branch for it: teamone/master is just a pointer that you cannot move. To work on it, you can either merge it into your current branch with git merge teamone/master, or run git checkout -b teamone teamone/master, which creates a new local branch called teamone starting at teamone/master and switches to it.
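The steps above can be tried without any server by using a second local repo as a stand-in for the other team. A sketch only: the directory names team1 and mine, the file names and the demo identity are made up, and a local path is used where the text uses git.team1.ourcompany.com.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the other team's repository.
git -c init.defaultBranch=master init -q team1
cd team1
git config user.name "Team One"           # made-up identity for the demo
git config user.email "team1@example.com"
echo "team work" > team.txt
git add team.txt
git commit -q -m "Work from team one"
cd ..

# Your own repository.
git -c init.defaultBranch=master init -q mine
cd mine
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "my work" > mine.txt
git add mine.txt
git commit -q -m "My own work"

# Short name first, location second.
git remote add teamone "$work/team1"
git fetch -q teamone

# fetch only gives you the remote-tracking branch ...
remote_branches=$(git branch -r --format='%(refname:short)')
echo "remote-tracking: $remote_branches"

# ... a local branch you can commit on has to be created explicitly.
git checkout -q -b teamone teamone/master
current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "now on: $current_branch"
```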

Pushing Your Branch to the Server

Your local branches aren’t automatically synchronized to the remotes you write to — you have to explicitly push the branches you want to share. That way, you can use private branches for work you don’t want to share, and push up only the topic branches you want to collaborate on.

To do this, you run:

$ git push <remoteShortName> <yourBranchName>

For example, to push a branch serverfix to the server with shortname origin, you run:

$ git push origin serverfix
Counting objects: 24, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (24/24), 1.91 KiB | 0 bytes/s, done.
Total 24 (delta 2), reused 0 (delta 0)
To https://github.com/schacon/simplegit
* [new branch] serverfix -> serverfix

Note:

  • The command git push origin serverfix is a bit of a shortcut. The slightly more complete version is git push origin serverfix:serverfix, which means push the local serverfix branch (left side of the colon) to the serverfix branch on the server (right side of the colon). This also means you can push to a different name on the server with git push origin serverfix:anotherNameOnServer
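Both forms of the push can be tried against a local bare repo standing in for the server. A minimal sketch; the paths, the file name and the demo identity are assumptions made for illustration:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A local bare repo plays the role of the server.
git -c init.defaultBranch=master init -q --bare server.git

git -c init.defaultBranch=master init -q local
cd local
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"
git remote add origin "$work/server.git"
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Server fix"
git checkout -q -b serverfix

# Short form: same branch name on both sides.
git push -q origin serverfix
# Long form: local name on the left, name on the server on the right.
git push -q origin serverfix:anotherNameOnServer

server_branches=$(git ls-remote --heads origin | awk '{print $2}')
echo "$server_branches"
```

Note that master never shows up on the server here, because it was never pushed.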

Tracking Branches

Checking out a local branch from a remote-tracking branch automatically creates what is called a “tracking branch” (and the branch it tracks is called an “upstream branch”). Tracking branches are local branches that have a direct relationship to a remote branch. If you’re on a tracking branch and type git pull, Git automatically knows which server to fetch from and which branch to merge in.

For example, when you made a clone from the server with shorthand origin, you made a tracking branch with a master branch that tracks the origin/master branch.

In general, when you do

$ git checkout -b <branchName> <remoteShortName>/<branchName>

You create a tracking branch <branchName> that tracks <remoteShortName>/<branchName>.

In fact, this is so common that Git provides the --track shorthand for it:

$ git checkout --track origin/serverfix
Branch serverfix set up to track remote branch serverfix from origin.
Switched to a new branch 'serverfix'

But this gives the local branch the same name as the branch on the remote. If you want a different local branch name, use the first form: git checkout -b <branchName> <remoteShortName>/<branchName>
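You can confirm what each form sets up by asking Git for a branch’s upstream. A sketch using a local bare repo as the server; the paths, the local name sf and the demo identity are made up:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in server with a master and a serverfix branch.
git -c init.defaultBranch=master init -q --bare server.git
git -c init.defaultBranch=master init -q seed
cd seed
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial project version"
git branch serverfix
git push -q "$work/server.git" master serverfix
cd ..

git clone -q "$work/server.git" clone
cd clone

# Shorthand: the local branch gets the same name as the remote one.
git checkout -q --track origin/serverfix
upstream=$(git rev-parse --abbrev-ref 'serverfix@{upstream}')

# Long form: choose your own local name.
git checkout -q -b sf origin/serverfix
sf_upstream=$(git rev-parse --abbrev-ref 'sf@{upstream}')

echo "serverfix tracks $upstream"
echo "sf tracks $sf_upstream"
```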

Difference between git fetch and git pull

git fetch downloads their work into your local Git database and updates the remote-tracking branches, but it does not touch your own branches or your working directory. git pull downloads the same data and also merges it into your current branch, so the changes show up in your working files.

git pull is essentially a git fetch immediately followed by a git merge in most cases. If you have a tracking branch set up as demonstrated in the last section, either by explicitly setting it or by having it created for you by the clone or checkout commands, git pull will look up what server and branch your current branch is tracking, fetch from that server and then try to merge in that remote branch for you.
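The difference is easy to see by watching where master points before and after each step. In this sketch, alice and bob are hypothetical collaborators sharing a local bare repo; all paths and identities are made up:

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare server.git

# alice publishes the first commit.
git -c init.defaultBranch=master init -q alice
cd alice
git config user.name "Alice"              # made-up identity for the demo
git config user.email "alice@example.com"
git remote add origin "$work/server.git"
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
git push -q origin master
cd ..

# bob clones, then alice publishes a second commit.
git clone -q "$work/server.git" bob
cd alice
echo "two" >> file.txt
git commit -q -am "Second commit"
git push -q origin master
cd ../bob

before=$(git rev-parse master)
git fetch -q origin                       # moves origin/master only
after_fetch=$(git rev-parse master)
remote_tip=$(git rev-parse origin/master)

# The second half of what git pull does: merge the fetched branch.
git merge -q origin/master
after_merge=$(git rev-parse master)

echo "master before fetch: $before"
echo "master after fetch:  $after_fetch"
echo "origin/master:       $remote_tip"
echo "master after merge:  $after_merge"
```

Running git pull in bob’s repo instead of the fetch and the merge would end in the same state in this simple case.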

Viewing the Current Tracking Branch

You can view the current tracking branches by:

$ git fetch --all
$ git branch -vv
iss53 7e424c3 [origin/iss53: ahead 2] Add forgotten brackets
master 1ae2a45 [origin/master] Deploy index fix
* serverfix f8674d9 [teamone/server-fix-good: ahead 3, behind 1] This should do it
testing 5ea463a Try something new

So here we can see that:

  • iss53 branch is tracking origin/iss53 and is “ahead” by two, meaning that we have two commits locally that are not pushed to the server.
  • master branch is tracking origin/master and is up to date.
  • serverfix branch is tracking the server-fix-good branch on our teamone server and is ahead by three and behind by one, meaning that there is one commit on the server we haven’t merged in yet and three commits locally that we haven’t pushed.
  • testing branch is not tracking any remote branch.

    Note:

    • git branch -vv only compares against the data cached from the last time you contacted each server; it does not contact the servers itself. That is why we ran git fetch --all first, to make sure the local data about the servers is up-to-date.

Switching a Tracking Branch

If you have a local branch that you want to track another branch on the server, you can use the -u or --set-upstream-to option to git branch to explicitly set your current branch at any time:

$ git branch -u <remoteShortName>/<branchName>

Deleting a Remote Branch

Suppose you’re done with a remote branch: say you and your collaborators are finished with a feature and have merged it into your remote’s master branch (or whatever branch your stable codeline is in). You can delete a remote branch using the --delete option to git push:

$ git push <remoteShortName> --delete <remoteBranchName>

Note:

  • Use the branch name as it exists on the server (for example serverfix), not the remote-tracking name <remoteShortName>/<remoteBranchName>; the remote is already given as the first argument, so repeating it is not needed
  • This operation does not remove the data from the server immediately. All it does is remove the pointer from the server. The Git server will generally keep the data there for a while until a garbage collection runs, so if it was accidentally deleted, it’s often easy to recover.
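A small check that the pointer really disappears from the server, again with a local bare repo as a stand-in (paths, file name and identity are made up for the demo):

```shell
set -e
work=$(mktemp -d)
cd "$work"
git -c init.defaultBranch=master init -q --bare server.git

git -c init.defaultBranch=master init -q local
cd local
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"
git remote add origin "$work/server.git"
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial project version"
git branch serverfix
git push -q origin master serverfix

# Plain branch name here, not origin/serverfix.
git push -q origin --delete serverfix

remaining=$(git ls-remote --heads origin | awk '{print $2}')
echo "branches left on server: $remaining"

# Your local serverfix branch is untouched by this.
local_branches=$(git branch --format='%(refname:short)')
echo "local branches: $local_branches"
```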

Rebasing

In Git, there are two main ways to integrate changes from one branch into another: the merge and the rebase.

The second operation, rebase, exists because merging two diverged branches always performs a three-way merge and creates an extra merge commit, even if the two branches modified completely different files.

With rebase, you instead take the patch of the change that was introduced in C4 (the commit on the experiment branch in this example) and reapply it on top of C3 (the latest commit on master). With the rebase command, you can take all the changes that were committed on one branch and replay them on a different branch.

To place experiment on top of master, you run:

$ git checkout experiment
$ git rebase master

Replaying the changes means that the rebase operation:

  • goes to the common ancestor of the two branches (the one you’re on and the one you’re rebasing onto)
  • gets the diff introduced by each commit of the branch you’re on, saves those diffs to temporary files, resets the current branch to the same commit as the branch you are rebasing onto, and finally applies each change in turn

Finally, to move master up to the tip of experiment, you need to switch back to master and merge. This merge cannot conflict: after the rebase, master is a direct ancestor of experiment, so Git only has to move the master pointer forward (a fast-forward).

Note:

  • The end product from merge and rebase will be the same, but rebasing makes for a cleaner history. If you examine the log of a rebased branch, it looks like a linear history: it appears that all the work happened in series, even when it originally happened in parallel.
  • Of course, do not rebase commits that exist outside your repository and that people may have based work on. Rebasing replaces those commits with new ones, so anyone who built on the originals has to redo or re-merge their work.
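The whole rebase-then-merge sequence can be run on a throwaway repo to check that the history ends up linear. A sketch only: the commit messages reuse the C1, C3 and C4 labels from the text, while the file names and the demo identity are made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "C1"

git checkout -q -b experiment
echo "experiment" > experiment.txt
git add experiment.txt
git commit -q -m "C4"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "C3"

# Replay experiment's commit on top of master.
git checkout -q experiment
git rebase -q master

# master is now an ancestor of experiment, so this is a fast-forward.
git checkout -q master
git merge -q experiment

merge_commits=$(git rev-list --merges --count HEAD)
commit_count=$(git rev-list --count HEAD)
git log --oneline --graph --all
```

The graph shows a single straight line of three commits, with master and experiment on the same commit and no merge commit anywhere.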

A More Complicated Rebase

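Consider the case where a server branch was created from master, and a client branch was then created from server.

Suppose you want to rebase the client branch so that it starts from the master branch instead. This can be done using one line:

$ git rebase --onto master server client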

This means to take the client branch, figure out the patches since it diverged from the server branch, and replay these patches in the client branch as if it was based directly off the master branch instead.
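To check that only the client commits get moved, you can build a small version of that history. A sketch under made-up assumptions: commit_file is a hypothetical helper defined here, and the commit names are invented for the demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"

# Hypothetical helper: create one file and commit it under that name.
commit_file() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit_file base
git checkout -q -b server
commit_file server-1
git checkout -q -b client                 # client starts from server
commit_file client-1
commit_file client-2
git checkout -q master
commit_file master-2

# Take what client has that server lacks, and replay it onto master.
git rebase -q --onto master server client

client_commits=$(git rev-list --count master..client)
if git merge-base --is-ancestor server client; then
    contains_server=yes
else
    contains_server=no
fi
echo "commits on client after master: $client_commits"
echo "client still contains server work: $contains_server"
```

The server branch itself is left exactly where it was; only client has been rewritten.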


Making a Disaster


A Disastrous Case

Consider the case where the server side rebased their master branch after your work was already based on it, and you then did a pull, which automatically merges, so your history now contains both the old commits and their rebased copies.

However, since we know that C4 and C6 were abandoned on the server when they were rebased, we would also want C4 and C6 removed from our history. This can be done with git pull --rebase instead of git pull, which in most cases figures out what is uniquely yours and applies it back on top of the new branch.

Note:

  • If you are using git pull and want to make --rebase the default, you can set the pull.rebase config value with something like git config --global pull.rebase true.

Running Git On Server

Running a Git server is fairly straightforward. First, you choose which protocols you want your server to support.

A remote repository is generally a bare repository — a Git repository that has no working directory. Because the repository is only used as a collaboration point, there is no reason to have a snapshot checked out on disk; it’s just the Git data. In the simplest terms, a bare repository is the contents of your project’s .git directory and nothing else.

Basically, a bare repository has no checked-out copy of the files and code that you would edit on your local machine. However, when you git clone from it, Git rebuilds the files on your local machine from the data stored in the bare repo. Every time you push changes to the remote bare repo, only Git data (commits and the objects they refer to) is transferred, so there are still no checked-out code files in the remote bare repo

The Protocols for Git Data Transfer

Git can use four distinct protocols to transfer data: Local, HTTP, Secure Shell (SSH) and Git.

  1. Local Protocol

    • The most basic is the Local protocol, in which the remote repository is in another directory on the same host. This is often used if everyone on your team has access to a shared filesystem such as an NFS mount, or in the less likely case that everyone logs in to the same computer.

    • Then to clone a local bare repository, you run

      $ git clone /your/gitrepo/directory.git

      Or you can run

      $ git clone file:///your/gitrepo/directory.git

      Note:

      • Doing it the first way is faster, because Git uses hardlinks or directly copies the files it needs.
      • Doing it the second way will fire up the processes that Git normally uses to transfer data over a network, which is generally much less efficient. The main reason to specify the file:// prefix is if you want a clean copy of the repository with extraneous references or objects left out
  2. HTTP Protocols

    • Smart HTTP
      • Smart HTTP operates very similarly to the SSH or Git protocols but runs over standard HTTPS ports and can use various HTTP authentication mechanisms, meaning it’s often easier on the user than something like SSH, since you can use things like username/password authentication rather than having to set up SSH keys.
      • This has become the most popular way to use Git now, and in fact, for services like GitHub, the URL you use to view the repository online (for example, https://github.com/schacon/simplegit) is the same URL you can use to clone and, if you have access, push over.
    • Dumb HTTP
      • If the server does not respond with a Git HTTP smart service, the Git client will try to fall back to the simpler Dumb HTTP protocol. The Dumb protocol expects the bare Git repository to be served like normal files from the web server. The beauty of Dumb HTTP is the simplicity of setting it up. Basically, all you have to do is put a bare Git repository under your HTTP document root and set up a specific post-update hook, and you’re done (See Git Hooks).
  3. SSH Protocol

    • A common transport protocol for Git when self-hosting is over SSH. This is because SSH access to servers is already set up in most places — and if it isn’t, it’s easy to do. SSH is also an authenticated network protocol and, because it’s ubiquitous, it’s generally easy to set up and use.

    • To clone a Git repository over SSH, you can specify an ssh:// URL like this:

      $ git clone ssh://[user@]server/project.git

      Or you can use the shorter scp-like syntax for the SSH protocol:

      $ git clone [user@]server:project.git

      In both cases above, if you don’t specify the optional user, Git assumes the user you’re currently logged in as.

      Note

      • A negative aspect of SSH is that it doesn’t support anonymous access to your Git repository. If you’re using SSH, people must have SSH access to your machine, even in a read-only capacity, which doesn’t make SSH conducive to open source projects for which people might simply want to clone your repository to examine it.
  4. Git Protocol

    • Finally, we have the Git protocol. This is a special daemon that comes packaged with Git; it listens on a dedicated port (9418) and provides a service similar to the SSH protocol, but with absolutely no authentication. In order for a repository to be served over the Git protocol, you must create a git-daemon-export-ok file (the daemon won’t serve a repository without that file in it), but, other than that, there is no security. Either the Git repository is available for everyone to clone, or it isn’t. This means that it will be readable by either everyone or no one.

    • You can also set up push access but, given the lack of authentication, the repository will then be pushable by everyone or no one on the internet.

      Note:

      • It is the fastest network transfer protocol available currently. It uses the same data-transfer mechanism as the SSH protocol but without the encryption and authentication overhead.
      • It’s generally undesirable for the Git protocol to be the only access to your project. Generally, you’ll pair it with SSH or HTTPS access for the few developers who have push (write) access and have everyone else use git:// for read-only access. It’s also probably the most difficult protocol to set up. It must run its own daemon, which requires xinetd or systemd configuration or the like, which isn’t always a walk in the park.

Setting up Git on a Server

Setting up an Existing Git Repo on Github

  1. First you need to decide which protocol to use. GitHub provides two protocols, HTTPS and SSH; you can get the URL for either from the Quick Setup section of a new repo, or from the Clone or Download button inside an existing repo.

  2. Then, you navigate to your existing repo in your Git Bash, and add the remote repo by:

    $ git remote add <remoteShortName> <remoteURLYouJustCopied>
  3. Then you can push any of your branch in your local repo to the remote repo by:

    $ git push <remoteShortName> <localBranchName:remoteBranchName>

    If <remoteBranchName> is the same as your <localBranchName>, then you can just run

    $ git push <remoteShortName> <localBranchName>

Distributed WorkFlows

The aim for these flows is how to contribute code successfully to a project and make it as easy on you and the project maintainer as possible, and also how to maintain a project successfully with a number of developers contributing.

In general, due to the flexibility of Git, it presents a vast range of workflow possibilities for your project and/or your team, so we’ll cover a few common paradigms that take advantage of this flexibility.

  1. Centralized Workflow

    One central hub, or repository, accepts code, and everyone synchronizes their work with it. This means that if two developers clone from the hub and both make changes, the first developer to push their changes back up can do so with no problems. The second developer must merge in the first one’s work before pushing changes up, so as not to overwrite the first developer’s changes.

  2. Integration-Manager Workflow
    Because Git allows you to have multiple remote repositories, it’s possible to have a workflow where each developer has write access to their own public repository and read access to everyone else’s.

    This scenario often includes a canonical repository that represents the “official” project. To contribute to that project, you create your own public clone of the project and push your changes to it. Then, you can send a request to the maintainer of the main project to pull in your changes. The maintainer can then add your repository as a remote, test your changes locally, merge them into their branch, and push back to their repository.

    1. The project maintainer/manager pushes to their public/blessed repository.
    2. A contributor/developer clones that repository and makes changes.
    3. The contributor/developer pushes to their own public repository.
    4. The contributor/developer sends the maintainer an email asking them to pull changes.
    5. The maintainer/manager adds the contributor/developer’s repository as a remote and merges locally.
    6. The maintainer/manager pushes merged changes to the main repository.

Commit Guidelines

Having a good guideline for creating commits and sticking to it makes working with Git and collaborating with others a lot easier. The Git project provides a document that lays out a number of good tips for creating commits from which to submit patches — you can read it in the Git source code in the Documentation/SubmittingPatches file

  • Do not commit whitespace errors (i.e. adding meaningless whitespace into the program and committing it as a change). These can be easily detected before you commit by running

    $ git diff --check
  • Try to make each commit a logically separate changeset. If you can, try to make your changes digestible — don’t code for a whole weekend on five different issues and then submit them all as one massive commit on Monday. Even if you don’t commit during the weekend, use the staging area on Monday to split your work into at least one commit per issue, with a useful message per commit.

  • Creating quality commit messages makes using and collaborating with Git a lot easier. As a general rule, your messages should start with a single line that’s no more than about 50 characters and that describes the changeset concisely, followed by a blank line, followed by a more detailed explanation. (The Git project requires that the more detailed explanation include your motivation for the change and contrast its implementation with previous behavior — this is a good guideline to follow. Write your commit message in the imperative: “Fix bug” and not “Fixed bug” or “Fixes bug.”)

    CAPITALIZED, short (50 chars or less) summary

    More detailed explanatory text, if necessary. Wrap it to about 72
    characters or so. In some contexts, the first line is treated as the
    subject of an email and the rest of the text as the body. The blank
    line separating the summary from the body is critical (unless you omit
    the body entirely); tools like rebase can get confused if you run the
    two together.

    Write your commit message in the imperative: "Fix bug" and not "Fixed bug"
    or "Fixes bug." This convention matches up with commit messages generated
    by commands like git merge and git revert.

    Further paragraphs come after blank lines.

    - Bullet points are okay, too

    - Typically a hyphen or asterisk is used for the bullet, followed by a
    single space, with blank lines in between, but conventions vary here

    - Use a hanging indent

Branch Naming

Sometimes, the maintainer of the Git project tends to namespace these branches — such as sc/ruby_client, where sc is short for the person who contributed the work, and ruby_client is the branch the person worked on.

Therefore, it is sometimes a good idea to format your branchname to:

$ git checkout -b <yourNickName/descriptiveBranchName>

Forking a Project

Sometimes you want to work with a project that you do not have open push access to, but you might have the fork option, which allows you to clone the repo at least.

In this case, if you want to add some feature to the project, say featureA, you could:

  1. clone the project to your local machine
  2. create a new branch featureA and work on it
  3. create your own remote repo of the same project by forking that project
  4. push your changes (preferably just the branch you worked on, because if the change is cherry-picked or rejected, you don’t have to rewind your master branch) to the remote repo that you forked
  5. Now you need to notify the maintainers of the original project that you have work you’d like them to merge. This is often called a pull request, and you typically generate such a request either via the website — GitHub has its own “Pull Request” mechanism — or you can run the git request-pull command and email the subsequent output to the project maintainer manually.
    $ git request-pull <targetRemoteShortName>/<targetBranch> <myRemote>

    Note:

    • To keep your fork up-to-date, you can add the original project as another remote called upstream in your local repo. Then, to update your local master branch with what the server has, run git fetch upstream, check out your local master, and run git merge upstream/master to merge in the new changes

Difference between git merge --squash and git rebase

Sometimes, when there were quite a lot of updates in the server’s master branch, and you would like your new changes to be based on the latest commit made there, you can either rebase your featureA branch so that it comes directly off the latest <remoteRepo>/master branch, or you can do a merge --squash.

  • doing a rebase produces a cleaner history by actually moving your branch so that it continues from the <remoteRepo>/master branch. The original diverged commits are replaced by new copies placed on top of it.
  • doing a merge --squash does not touch your original diverged branch or its history. It works by squashing all the changes from the commit objects of your divergent branch into a single set of changes, which it applies on top of the branch you are currently on (for example, a new branch created from the latest <remoteRepo>/master).

Note:

  • After you run git merge --squash <targetBranch>, you need to run git commit manually to finish the change.
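The squash behaviour, including the manual commit, can be seen on a throwaway repo. A sketch with made-up file names, commit messages and identity; featureA is the branch name used in the text:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo User"          # made-up identity for the demo
git config user.email "demo@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial project version"

# Two separate commits on the feature branch.
git checkout -q -b featureA
echo "part 1" > feature.txt
git add feature.txt
git commit -q -m "featureA: part 1"
echo "part 2" >> feature.txt
git commit -q -am "featureA: part 2"

# Squash them onto master: the changes are staged, nothing is committed.
git checkout -q master
git merge -q --squash featureA
git commit -q -m "Add featureA as a single commit"

master_count=$(git rev-list --count master)
feature_count=$(git rev-list --count featureA)
parent_count=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
parent_count=$((parent_count - 1))

echo "commits on master:   $master_count"
echo "commits on featureA: $feature_count"
echo "parents of new commit: $parent_count"
```

The new commit on master has a single parent, so it is an ordinary commit and not a merge commit, and featureA still holds its two original commits.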

Notifying Changes via Email

Sometimes, some projects might accept patches via a developer mailing list instead of pull requests. In this case, you need to mail your changes by:

$ git format-patch -M <remoteShortName>/master

For example

$ git format-patch -M testRepo/master
0001-TESTING-EMAIL-PATCH.patch
  • The git format-patch is used to generate the mbox-formatted files that you can email to the list.

This will create a patch file under your current directory, which will look like this:

$ cat 0001-TESTING-EMAIL-PATCH.patch
From 41d19b4ea8be07b026f28cd4052bfdbc4a2f38b0 Mon Sep 17 00:00:00 2001
From: Xiao Yu <jasonyux17@gmail.com>
Date: Thu, 21 May 2020 17:53:36 +0800
Subject: [PATCH] TESTING EMAIL PATCH

Created a new file for testing to send an email patch.
---
newFileTestingEmailPatch.txt | 0
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 newFileTestingEmailPatch.txt

diff --git a/newFileTestingEmailPatch.txt b/newFileTestingEmailPatch.txt
new file mode 100644
index 0000000..e69de29
--
2.26.2.windows.1

Note:

  • You can also add text between the --- line and the beginning of the patch (the diff --git line). The developers can read it, but it is ignored by the patching process, so it will not show up in the commit message.

To email this to a mailing list, you can either paste the file into your email program or send it via a command-line program. Pasting the text often causes formatting issues, especially with “smarter” clients that don’t preserve newlines and other whitespace appropriately.

  1. Sending it via commandline through an IMAP server

    • You will need to open your ~/.gitconfig file and add these lines at the end of the file:

      [imap]
      folder = "[Gmail]/Drafts"
      host = imaps://imap.gmail.com
      user = user@gmail.com
      pass = YX]8g76G_2^sFbd
      port = 993
      sslverify = false

      If your IMAP server doesn’t use SSL, the last two lines probably aren’t necessary, and the host value will be imap:// instead of imaps://. When that is set up, you can use git imap-send to place the patch series in the Drafts folder of the specified IMAP server:

      $ cat *.patch | git imap-send
      Resolving imap.gmail.com... ok
      Connecting to [74.125.142.109]:993... ok
      Logging in...
      sending 2 messages
      100% (2/2) done
      At this point, you should be able to go to your Drafts folder, change the To field to the mailing list you’re sending the patch to, possibly CC the maintainer or person responsible for that section, and send it off.
  2. Sending through a SMTP server

    • Similarly, you will need to edit the ~/.gitconfig file and add at the end:

      [sendemail]
      smtpencryption = tls
      smtpserver = smtp.gmail.com
      smtpuser = user@gmail.com
      smtpserverport = 587

      After this is done, you can use git send-email to send your patches:

      $ git send-email *.patch
      0001-add-limit-to-log-function.patch
      0002-increase-log-output-to-30-from-25.patch
      Who should the emails appear to be from? [Jessica Smith <jessica@example.com>]
      Emails will be sent from: Jessica Smith <jessica@example.com>
      Who should the emails be sent to? jessica@example.com
      Message-ID to be used as In-Reply-To for the first email? y

Maintaining a Project

Basically, your aim is to accept work in a way that is clearest for other contributors and sustainable by you over the long run.

Maintaining using patches from email

  • Basically, you run git apply for a patch generated with git diff, or git am for a patch generated with git format-patch.

Using git apply

If you received the patch from someone who generated it with git diff or some variation of the Unix diff command (which is not recommended; see the next section), you can apply it with the git apply command. Assuming you saved the patch at /tmp/patch-ruby-client.patch, you can apply the patch like this:

$ git apply /tmp/patch-ruby-client.patch

This will also handle file adds, deletes, and renames if they’re described in the git diff format, which patch won’t do. Finally, git apply is an “apply all or abort all” model where either everything is applied or nothing is, whereas patch can partially apply patch files, leaving your working directory in a weird state. git apply is overall much more conservative than patch.

Before applying a patch for real, you can dry-run it with git apply --check, for example:

$ git apply --check 0001-see-if-this-helps-the-gem.patch

This checks whether the patch applies cleanly to your working directory. If there is no output, then the patch should apply cleanly.
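
The check-then-apply sequence can be sketched end to end in a scratch repository. The file names notes.txt and change.patch are hypothetical, and the patch is produced locally with git diff to stand in for one received from a contributor.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line 1\nline 2\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Produce a plain diff, then undo the edit so the patch has something to do.
printf 'line 1\nline 2 (edited)\n' > notes.txt
git diff > "$work/change.patch"
git checkout -q -- notes.txt

# Dry run: no output and exit status 0 mean the patch applies cleanly.
git apply --check "$work/change.patch"

# Apply for real; the working tree changes but nothing is committed.
git apply "$work/change.patch"
git status --short
```

Note that git apply only touches the working directory; staging and committing the result is still up to you.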

Using git am

To apply a patch generated by format-patch, you use git am (the command is named am as it is used to “apply a series of patches from a mailbox”).

  • if someone emailed you the patch properly using git send-email (so that the commit subject becomes the mail subject and the patch is the mail body), and you saved that mail in mbox format, then you can point git am to that mbox file, and it will start applying all the patches it sees.

  • if someone uploaded a patch file generated via git format-patch to a ticketing system or something similar, you can save the file locally and then pass that file saved on your disk to git am to apply it
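
The second case above (a patch file saved to disk) can be sketched as a round trip between two local repositories. The repository names maintainer and contributor, the patches directory, and the commit content are all made up for illustration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Maintainer's repository.
git init -q maintainer
cd maintainer
git symbolic-ref HEAD refs/heads/master
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo hello > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
cd "$work"

# Contributor clones, commits, and exports the commit as a patch file.
git clone -q maintainer contributor
cd contributor
git config user.name "Contributor"
git config user.email "contributor@example.com"
echo "more" >> hello.txt
git commit -q -am "Extend hello.txt"
git format-patch -M -o "$work/patches" origin/master > /dev/null

# Maintainer applies the saved patch file with git am.
cd "$work/maintainer"
git am -q "$work"/patches/*.patch

git log -1 --format='%an: %s'
```

Unlike git apply, git am creates the commit for you and keeps the author and message recorded in the patch.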

However, it’s possible that the patch won’t apply cleanly. Perhaps your main branch has diverged too far from the branch the patch was built from, or the patch depends on another patch you haven’t applied yet. In that case, the git am process will fail and ask you what you want to do:

$ git am 0001-see-if-this-helps-the-gem.patch
Applying: See if this helps the gem
error: patch failed: ticgit.gemspec:1
error: ticgit.gemspec: patch does not apply
Patch failed at 0001.
When you have resolved this problem run "git am --resolved".
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".

This command puts conflict markers in any files it has issues with, much like a conflicted merge or rebase operation. You solve this issue much the same way — edit the file to resolve the conflict, stage the new file, and then run git am --resolved to continue to the next patch in the series of patches that you haven’t yet completed:

$ (fix the file)
$ git add ticgit.gemspec
$ git am --resolved
Applying: See if this helps the gem

Managing Using the Remote Branches

If your contribution came from a Git user who set up their own repository, pushed a number of changes into it, and then sent you the URL to the repository and the name of the remote branch the changes are in, you can add them as a remote and do merges locally.

For example, if a developer named Jessica sends you an email saying that she has a great new feature in the ruby-client branch of her repository, you can test it by adding the remote and checking out that branch locally:

$ git remote add jessica git://github.com/jessica/myproject.git
$ git fetch jessica
$ git checkout -b rubyclient jessica/ruby-client

Note:

  • You do a fetch here instead of a direct pull because you might not want to automatically merge her work into any of your branches. You might want to see if her work functions properly before you run merge.

However, if you aren’t working with a person consistently and only need a one-time pull, you can pass the URL of their remote repository to the git pull command. This does a one-time pull and doesn’t save the URL as a remote reference:

$ git pull https://github.com/onetimeguy/project
From https://github.com/onetimeguy/project
* branch HEAD -> FETCH_HEAD
Merge made by the 'recursive' strategy.

Note:

  • Here the command git pull https://github.com/onetimeguy/project fetches the remote repository’s default branch (its HEAD) and merges it into your current branch. If you want to pull a specific branch, add its name at the end, like this: git pull https://github.com/onetimeguy/project <remoteBranchName>
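
A one-time pull of a specific branch can be tried locally, as a rough sketch. A local path stands in for the URL, the repository names are hypothetical, and --no-rebase is added only so that the command behaves the same regardless of your pull configuration.

```shell
set -e
work=$(mktemp -d)
cd "$work"

git init -q maintainer
cd maintainer
git symbolic-ref HEAD refs/heads/master
git config user.name "Maintainer"
git config user.email "maintainer@example.com"
echo hello > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
cd "$work"

# A contributor's repository with work on a branch.
git clone -q maintainer contributor
cd contributor
git config user.name "Jessica"
git config user.email "jessica@example.com"
git checkout -q -b ruby-client
echo client > client.rb
git add client.rb
git commit -q -m "Add ruby client"

# One-time pull of a specific branch by URL (a local path here).
cd "$work/maintainer"
git pull -q --no-rebase "$work/contributor" ruby-client

git log --format=%s -n 1
git remote          # prints nothing: the URL was not saved as a remote
```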

Determining What is Changed

Now you have a topic branch that contains contributed work. At this point, you can determine what you’d like to do with it. This section revisits a couple of commands so you can see exactly what you’ll be introducing if you merge this into your main branch.

  1. to see only commits that are different between the topic branch and your master branch (for example), you could run

    $ git log <branch1> --not <branch2>

    In this example:

    $ git log topic --not master
    commit 5b6235bd297351589efc4d73316f0a68d484f118
    Author: Scott Chacon <schacon@gmail.com>
    Date: Fri Oct 24 09:53:59 2008 -0700

    See if this helps the gem

    commit 7482e0d16d04bea79d0dba8988cc78df655f16a0
    Author: Scott Chacon <schacon@gmail.com>
    Date: Mon Oct 22 19:38:36 2008 -0700

    Update gemspec to hopefully work better

    Note

    • To see what changes each commit introduces, remember that you can pass the -p option to git log and it will append the diff introduced to each commit.
  2. The more useful case might be seeing a full diff of what would happen if you merge the topic branch. So you might want to run:

    $ git diff <branch1>...<branch2>

    In this example:

    $ git diff master...topic

    This command shows you only the work your current topic branch has introduced since its common ancestor with master.

    The triple-dot syntax is shorthand for the following two steps (shown here for a topic branch named contrib that is currently checked out):

    $ git merge-base contrib master
    36c7dba2c95e6bbb78dfa822519ecfec6e1ca649
    $ git diff 36c7db

    or, more concisely:

    $ git diff $(git merge-base contrib master)

    Note:

    • You might think there is an easier way by doing git diff master. This might work, but not always. This is because this command gives you a diff by comparing the last commit of the topic branch you’re on and the last commit on the master branch. This means that if your master branch has moved forward since you created the topic branch from it, then you’ll get seemingly strange results. For example, if you’ve added a line in a file on the master branch, a direct comparison of the snapshots will look like the topic branch is going to remove that line.
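
The difference between the plain diff and the triple-dot diff described above can be reproduced in a scratch repository. This is a minimal sketch with made-up file names; master gains a line after topic was created, which is exactly the situation the note warns about.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

printf 'a\n' > file.txt
git add file.txt
git commit -q -m "Base"

git checkout -q -b topic
echo "topic work" > topic.txt
git add topic.txt
git commit -q -m "Topic work"

git checkout -q master
echo "b" >> file.txt
git commit -q -am "Master moves ahead"
git checkout -q topic

# Comparing against the tip of master also reports master's new line in
# file.txt, as if topic were going to remove it.
git diff --stat master

# Three dots: only what topic introduced since the common ancestor.
git diff --stat master...topic

# The same thing spelled out with merge-base.
git diff --stat "$(git merge-base master topic)" topic
```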

Determining your Merging Workflow

Basically, you need to decide how to manage your merge and rebase actions so that the project history stays easy to track and understand. There are several example styles that you could use for managing your project merges:

  1. Merging Workflow for small projects
  2. Merging Workflow for large projects
  3. Rebasing and Cherry-Picking Workflows

Small Merging Workflow

The simplest way is to always git merge contributed work into your master branch. This can leave a history that is unpleasant to review, but it is often manageable for small projects. Typically you will have a main track with two branches, master and develop, where master is updated only when a very stable release is cut and all new code is integrated into the develop branch.

You typically first merge topic branches into develop before deciding whether the work is to be released on master.

If you do decide to release, you then bring master up to develop (a fast-forward, since master has no work of its own).
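
The two-step flow can be sketched with commands in a throwaway repository. The branch name topic, the file names, and the commit messages are hypothetical.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Stable release"
git branch develop

# Contributed work arrives on a topic branch.
git checkout -q -b topic develop
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Step 1: integrate the topic into develop.
git checkout -q develop
git merge -q --no-ff -m "Merge topic into develop" topic

# Step 2: when develop is stable enough to release, fast-forward master.
git checkout -q master
git merge -q --ff-only develop

git log --format=%s -n 1 master
```

Using --ff-only in step 2 is a design choice in this sketch: it makes the command fail instead of creating a merge commit if master has somehow diverged from develop.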

Large Merging Workflow

This workflow has four branches on the main track: master, next, pu, and maint.

  • master branch will still be the one with the most stable version of the system
  • next branch collects the topic/feature branches from your collaborators’ repos that you have tested and found to work properly, while you consider whether to include them in master
  • pu branch means proposed updates; it holds topic branches with interesting work that still needs more work
  • maint branch usually stores the project at the state of the last release, so that it can be forked off to provide backported patches in case a maintenance release is required.

For example, you might have topic branches from your collaborators’ repos like this:

And you organize/maintain your main branches (maint not shown here) like this:

Rebasing and Cherry-Picking Workflow

Other maintainers prefer to rebase or cherry-pick contributed work on top of their master branch, rather than merging it in, to keep a mostly linear history. When you have work in a topic branch and have determined that you want to integrate it, you move to that branch and run the rebase command to rebuild the changes on top of your current master (or develop, and so on) branch. If that works well, you can fast-forward your master branch, and you’ll end up with a linear project history.

Or you could do a cherry-pick, which leaves the original branch in place and re-applies the commits you choose on top of the current branch (so you can incorporate only some of the commits in that branch rather than all of them).

For example, if you are given this:

      A---B---C topic
     /
D---E---F---G master
  • Then a rebase command:

    $ git rebase master topic

    You get:

                  A'--B'--C' topic
                 /
    D---E---F---G master
  • Then run git cherry-pick <from-commit-id>^..<to-commit-id>:

    $ git checkout -b topic_new master
    $ git cherry-pick A^..C

    You get:

          A---B---C topic
         /
    D---E---F---G master
                 \
                  A'--B'--C' topic_new
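
Both variants can be reproduced in a scratch repository. This is only a sketch: the helper make_commit is a hypothetical convenience function, and the commit letters mirror the diagrams above. topic~3..topic selects the same commits as A^..C.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one commit that adds a file named after the commit.
make_commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

make_commit D
make_commit E
git checkout -q -b topic
make_commit A
make_commit B
make_commit C
git checkout -q master
make_commit F
make_commit G

# Cherry-pick: copy A..C onto a new branch from master; topic is untouched.
git checkout -q -b topic_new master
git cherry-pick topic~3..topic > /dev/null
git log --format=%s master..topic_new   # newest first

# Rebase: replay topic itself on top of master.
git rebase -q master topic
git log --format=%s master..topic
```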

Rerere in Git

Rerere stands for “reuse recorded resolution” — it’s a way of shortcutting manual conflict resolution. When rerere is enabled, Git will keep a set of pre- and post-images from successful merges, and if it notices that there’s a conflict that looks exactly like one you’ve already fixed, it’ll just use the fix from last time, without bothering you with it.

This feature comes in two parts: a configuration setting and a command. The configuration setting is rerere.enabled, and it’s handy enough to put in your global config:

$ git config --global rerere.enabled true

Now, whenever you do a merge that resolves conflicts, the resolution will be recorded in the cache in case you need it in the future.

If you need to, you can interact with the rerere cache using the git rerere command. When it’s invoked alone, Git checks its database of resolutions and tries to find a match with any current merge conflicts and resolve them (although this is done automatically if rerere.enabled is set to true).
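
The following sketch shows rerere replaying a resolution, under the assumption that the same conflict occurs twice. The setting is enabled only for the scratch repository (not globally), and the file content and branch names are made up.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
git config rerere.enabled true

echo "original" > file.txt
git add file.txt
git commit -q -m "Base"

git checkout -q -b topic
echo "topic version" > file.txt
git commit -q -am "Topic change"

git checkout -q master
echo "master version" > file.txt
git commit -q -am "Master change"

# First merge conflicts; resolve by hand and commit. rerere records the fix.
git merge topic > /dev/null 2>&1 || true
echo "resolved version" > file.txt
git add file.txt
git commit -q -m "Merge topic"

# Undo the merge and repeat it: the same conflict appears, and rerere
# re-applies the recorded resolution to the working tree.
git reset -q --hard HEAD~1
git merge topic > /dev/null 2>&1 || true
cat file.txt
```

The second merge still stops and reports a conflict; rerere only fills in the file content, so you still review the result, stage it, and commit.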

Generating a Build Number

If you run git describe, Git generates a string consisting of the name of the most recent tag earlier than that commit, followed by the number of commits since that tag, followed finally by a partial SHA-1 value of the commit being described (prefixed with the letter “g” meaning Git):

$ git describe
v1.0-3-gca2e64e

Note

  • By default, the git describe command requires annotated tags (tags created with the -a or -s flag); if you want to take advantage of lightweight (non-annotated) tags as well, add the --tags option to the command.
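
The effect of annotated versus lightweight tags on git describe can be checked with a short sketch. The tag names and commit messages are hypothetical, and empty commits are used only to keep the example short.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "Release commit"
git tag -a v1.0 -m "Version 1.0"        # annotated tag
git commit -q --allow-empty -m "Fix 1"
git commit -q --allow-empty -m "Fix 2"

# Two commits after v1.0: prints v1.0-2-g<abbreviated hash>.
git describe

# A lightweight tag on the latest commit is ignored by default...
git tag v1.1-light
git describe

# ...but is used when --tags is given.
git describe --tags
```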

Preparing a Release Using a Zip/Gzip

Now you want to release a build. One of the things you’ll want to do is create an archive of the latest snapshot of your code for those poor souls who don’t use Git. The command to do this is git archive:

  • To create a GZip file with the name being the build number generated from the previous section, you do:

    $ git archive master --prefix='projectArchive/' | gzip > `git describe master`.tar.gz

    $ ls *.tar.gz
    v1.0-3-gca2e64e.tar.gz

    where --prefix='projectArchive/' (note the trailing slash) means that if you open that Gzip file, you will find a directory called projectArchive in which all your files are saved.

  • You can also create a zip archive in much the same way, but by passing the --format=zip option to git archive:

    $ git archive master --format=zip > `git describe master`.zip

    Here all the files sit directly at the top level of the zip, since no --prefix was given.
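
The effect of --prefix can be verified by listing the archive contents. This is a minimal sketch in a scratch repository; it assumes tar and gzip are installed, and the tag v1.0 and the file README are made up.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README
git add README
git commit -q -m "Initial commit"
git tag -a v1.0 -m "Version 1.0"

# master sits exactly on the tag, so the build name is just "v1.0".
name=$(git describe master)

git archive --prefix='projectArchive/' master | gzip > "$work/$name.tar.gz"
git archive --format=zip master > "$work/$name.zip"

# List the tarball: every path is under projectArchive/.
tar -tzf "$work/$name.tar.gz"
```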

Shortlog

The git shortlog command summarizes all the commits in the range you give it; for example, the following gives you a summary of all the commits since your last release, if your last release was named v0.1:

$ git shortlog --no-merges master --not v0.1
Xiao Yu (4):
second commit made
amended: edited AND deleted file removal and untracking
first commit made now more with the new testing branch
testing rebase, added a new file independent for the branch master
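
A self-contained way to try this is sketched below. The author name, tag, and commit messages are hypothetical, and empty commits are used only for brevity.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "First release work"
git tag v0.1
git commit -q --allow-empty -m "Add limit to log function"
git commit -q --allow-empty -m "Increase log output"

# Commits on master since v0.1, grouped by author.
git shortlog --no-merges master --not v0.1

# Just the per-author counts.
git shortlog -s --no-merges master --not v0.1
```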

Git Commit Selection Tools

Now you’ll explore a number of very powerful things that Git can do that you may not necessarily use on a day-to-day basis but that you may need at some point.

| Revision Selection Type | Command | Outcome/Example | Note |
| --- | --- | --- | --- |
| Showing a single commit using a short SHA-1 | git show <commit-sha-1> | git show ae1934jf | - |
| Showing a single commit at the top of a branch | git show <branchName> | git show topic1 gives the top commit of branch topic1 | You can use git rev-parse <branchName> to get the SHA-1 of the top commit of a branch |
| Showing a single commit using a reflog | git show <reflog-data-of-HEAD> | git show HEAD@{5} | The reflog is data that keeps track of your HEAD position, which you can get from running git reflog |
| Showing a single commit using an ancestry reference with ^ (the parent of something) | git show <commit-id>^ | git show HEAD^ shows the parent of HEAD | You can also put a number after ^ to choose among the parents of a merge commit: d921970^2 means “the second parent of d921970.” To go back several generations, use ~ instead (for example HEAD~2). |
| Showing multiple commits using the difference of two branches | git log <branch1>..<branch2>, meaning “all commits on <branch2> that are not reachable from <branch1>” | git log master..topic1 | The order of the two branches matters in this command. |
| Showing multiple commits that are on either branch but not both | git log <branch1>...<branch2>, which selects all the commits that are reachable from either of the two references but not from both of them (all the divergent commits) | git log master...topic1 | A common switch to use with the log command in this case is --left-right, which shows you which side of the range each commit is on: $ git log --left-right master...experiment |

Note:

  • Whenever you want to see multiple commits, git log is a better choice than git show: git log --oneline gives a compact one-line-per-commit listing, whereas git show is meant for inspecting a specific commit.
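
Several of the selectors from the table can be tried together in a scratch repository. This is a sketch only: make_commit is a hypothetical helper, and the branch name topic1 follows the examples in the table.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: one commit that adds a file named after the commit.
make_commit() {
  echo "$1" > "$1.txt"
  git add "$1.txt"
  git commit -q -m "$1"
}

make_commit base
git checkout -q -b topic1
make_commit t1
make_commit t2
git checkout -q master
make_commit m1

git rev-parse topic1                  # SHA-1 of the tip of topic1
git show -s --format=%s topic1^       # parent of the tip: t1
git log --format=%s master..topic1    # on topic1 but not on master
git log --format=%s topic1..master    # on master but not on topic1

# Three dots with --left-right: "<" marks master's side, ">" topic1's side.
git log --format='%m %s' --left-right master...topic1
```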

Git Tools for Interactive Staging

These can help you craft your commits to include only certain combinations and parts of files. These tools are helpful if you modify a number of files extensively, then decide that you want those changes to be partitioned into several focused commits rather than one big messy commit.

Basically, interactive staging is initiated if you call git add -i:

$ git add -i
staged unstaged path
1: unchanged +0/-1 TODO
2: unchanged +1/-1 index.html
3: unchanged +5/-1 lib/simplegit.rb

*** Commands ***
1: [s]tatus 2: [u]pdate 3: [r]evert 4: [a]dd untracked
5: [p]atch 6: [d]iff 7: [q]uit 8: [h]elp
What now>

Here, you will see changes that are staged, changes that are unstaged, and the file/path

| Command | Function | Example |
| --- | --- | --- |
| 1 or s | Shows the same status listing as above, updated to the current state | - |
| 2 or u | First lets you select the file(s) of interest, then stages them | After typing 2 or u you will see the Update>> prompt; type 1,2 to select the first and second files and press Enter, and the selected files are marked with *. Press Enter again to apply the changes. |
| 3 or r | First lets you select the file(s) of interest, then undoes their staging | Similar to above |
| 5 or p | First lets you select the file(s) of interest, then goes through each change (hunk) in those files and lets you decide whether to stage it | Similar to above, but for each change you are prompted with Stage this hunk [y,n,a,d,/,j,J,g,e,?]; typing ? shows what each letter means |
| 6 or d | First lets you select staged file(s) of interest, then shows the staged diff | Similar to above |

Git Tools for Stashing

When things are in a messy state and you want to switch branches for a bit to work on something else, but you don’t want to do a commit of half-done work just so you can get back to this point later, you can use git stash or git stash push

For example, you have some modifications done but need to switch urgently to something else:

$ git status
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)

modified: index.html

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

modified: lib/simplegit.rb

Then you do:

$ git stash push
Saved working directory and index state \
"WIP on master: 049d078 Create index file"
HEAD is now at 049d078 Create index file
(To restore them type "git stash apply")

$ git status
# On branch master
nothing to commit, working directory clean

At this point, you can switch branches and do work elsewhere; your changes are stored on your stack. To see which stashes you’ve stored, you can use git stash list:

$ git stash list
stash@{0}: WIP on master: 049d078 Create index file
stash@{1}: WIP on master: c264051 Revert "Add file_size"
stash@{2}: WIP on master: 21d80a5 Add number to log

Then, when you have finished your urgent work, you can reapply the stash with git stash apply stash@{<number>}, which reapplies the stashed changes on top of whatever branch you are currently on.

For example:

$ git stash apply --index stash@{2}

The --index option is often added because, by default, reapplying a stash restores the changes to your files but does not restage the files you had staged before. To also restage them, run git stash apply with the --index option, which tells the command to try to reapply the staged changes.

Note:

  • Similar to rebase or merge, there could be conflicts as changes might not re-apply cleanly, if the files you edited in your stash were edited again. To resolve the conflicts, the same mechanism for merge conflicts will pop up for you to deal with.
  • After you have applied a stash, remember to drop it (i.e. delete it), because by default git stash apply will only apply the changes, and you have to delete that resolved stash manually by git stash drop stash@{<number>}
  • By default, git stash will stash only modified and staged tracked files. If you specify --include-untracked or -u, you can stash untracked files that are not ignored. To even include the ignored files, use --all or -a instead.
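
The stash, apply, and drop cycle from the notes above can be sketched in a scratch repository. The file names are hypothetical; one change is staged and one is left unstaged so that the effect of --index is visible.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo one > index.html
echo one > simplegit.rb
git add index.html simplegit.rb
git commit -q -m "Create files"

echo two >> index.html
git add index.html            # staged change
echo two >> simplegit.rb      # unstaged change

# Stash both changes; the working directory is clean afterwards.
git stash push -q
after_stash=$(git status --short)

# Reapply, restoring the staged/unstaged split, then delete the stash.
git stash apply -q --index
git stash drop -q

git status --short
git stash list                # prints nothing: the stash was dropped
```

Without --index, both files would come back as unstaged modifications.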

You can also stash interactively by running git stash --patch (not -i). This will prompt you, hunk by hunk, for which changes you would like to stash and which you would like to keep in your working directory.

$ git stash --patch
diff --git a/lib/simplegit.rb b/lib/simplegit.rb
index 66d332e..8bb5674 100644
--- a/lib/simplegit.rb
+++ b/lib/simplegit.rb
@@ -16,6 +16,10 @@ class SimpleGit
return `#{git_cmd} 2>&1`.chomp
end
end
+
+ def show(treeish = 'master')
+ command("git show #{treeish}")
+ end

end
test
Stash this hunk [y,n,q,a,d,/,e,?]? y

Saved working directory and index state WIP on master: 1b65b17 added the index file

A last option is to create a new branch for recovering your stashed work. This is done with git stash branch <newBranchName>, which creates a new branch with the name you chose, checks out the commit you were on when you stashed your work, reapplies your work there, and then drops the stash if it applies successfully:

$ git stash branch testchanges
M index.html
M lib/simplegit.rb
Switched to a new branch 'testchanges'
On branch testchanges
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)

modified: index.html

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

modified: lib/simplegit.rb

Dropped refs/stash@{0} (29d385a81d163dfd45a452a2ce816487a6b8b014)

Git Tools for Cleaning Up Your Workspace

You can use git clean for deleting files that are not tracked. However, you’ll want to be pretty careful with this command, since if you change your mind, there is often no retrieving the content of those files.

Therefore, often you might want to run git clean --dry-run or git clean -n to see what will happen before you run git clean

$ git clean --dry-run
Would remove v1.0-3-gca2e64e.tar.gz
Would remove v1.0-3-gca2e64e.zip

However, these do not include ignored files. To include those files for removal, you need to add the option -x

$ git clean --dry-run -x
Would remove .gitignore
Would remove unTracked.txt
Would remove v1.0-3-gca2e64e.tar.gz
Would remove v1.0-3-gca2e64e.zip

Note

  • Unless the Git configuration variable clean.requireForce is explicitly set to false, git clean refuses to delete anything without the -f flag, so you will usually need to run git clean -f (the -n dry run works without it).
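
The interplay of -n, -x, and -f can be tried safely in a scratch repository. In this sketch the file names are made up, and unlike the output above, .gitignore is committed, so it is never a candidate for removal.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "*.log" > .gitignore
git add .gitignore
git commit -q -m "Ignore log files"

touch scratch.txt debug.log   # one untracked file, one ignored file

git clean -n          # dry run: would remove scratch.txt only
git clean -n -x       # dry run including ignored files: debug.log as well
git clean -f -q       # actually delete untracked, non-ignored files

ls                    # debug.log is still there
```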

Git Tools for Signing a Commit

In more recent versions of Git (v1.7.9 and above), you can now also sign individual commits. If you’re interested in signing commits directly instead of just the tags, all you need to do is add a -S to your git commit command.

However, first you need to get your GPG configured and your personal key installed.

If you don’t have a key installed, you can generate one with gpg --gen-key.

$ gpg --gen-key

Then you can check it with:

$ gpg --list-keys
/Users/schacon/.gnupg/pubring.gpg
---------------------------------
pub 2048R/0A46826A 2014-06-04
uid Scott Chacon (Git signing key) <schacon@gmail.com>
sub 2048R/874529A9 2014-06-04

Once you have a private key to sign with, you can configure Git to use it for signing things by setting the user.signingkey config setting.

$ git config --global user.signingkey 0A46826A
Now Git will use your key by default to sign tags and commits if you want.

Now you can sign your commit with -S, which will prompt you for the passphrase you chose when you created the key.

$ git commit -a -S -m 'Signed commit'

You need a passphrase to unlock the secret key for
user: "Scott Chacon (Git signing key) <schacon@gmail.com>"
2048-bit RSA key, ID 0A46826A, created 2014-06-04

[master 5c3386c] Signed commit
4 files changed, 4 insertions(+), 24 deletions(-)
rewrite Rakefile (100%)
create mode 100644 lib/git.rb