IMPORTANT:
Some of the content here is a personal summary or abbreviation of the official Git guide. Feel free to refer to the official site if any section written here is unclear.
Git Intro
- Version Control
- Version control is a system that records and manages changes to a file or a set of files, so that you can later revert those changes if needed
- Why Version Control
- Many people's version-control method of choice is to simply copy files into another directory, renaming them somehow to keep track of the changes
- However, this is inconvenient and error-prone
- Solutions
- One of the most popular VCS tools was a system called RCS, which is still distributed with many computers today. RCS works by keeping patch sets (that is, the differences between files) in a special format on disk; it can then re-create what any file looked like at any point in time by adding up all the patches.
- Problem
- The next major issue that people encounter is that they need to collaborate with developers on other systems.
- Solution
- Centralized Version Control Systems (CVCSs) were developed as a solution to the problem above. These systems (such as CVS, Subversion, and Perforce) have a single server that contains all the versioned files, and a number of clients that check out files from that central place. For many years, this has been the standard for version control.
- Advantages
- everyone knows to a certain degree what everyone else on the project is doing. Administrators have fine-grained control over who can do what, and it’s far easier to administer a CVCS than it is to deal with local databases on every client.
- Problems
- If that server goes down for an hour, then during that hour nobody can collaborate at all or save versioned changes to anything they’re working on. If the hard disk the central database is on becomes corrupted, and proper backups haven’t been kept, you lose absolutely everything — the entire history of the project except whatever single snapshots people happen to have on their local machines.
- Solution
- Here Distributed Version Control Systems (DVCSs) step in. In a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don’t just check out the latest snapshot of the files; rather, they fully mirror the repository, including its full history. Thus, if any server dies, and these systems were collaborating via that server, any of the client repositories can be copied back up to the server to restore it. Every clone is really a full backup of all the data.
- Advantages
- you can collaborate with different groups of people in different ways simultaneously within the same project. This allows you to set up several types of workflows that aren’t possible in centralized systems, such as hierarchical models.
Git Setup
Please follow the official documentation for setup and installation.
Getting Help in Git
To get more information about a command in Git, use any of the three below:
$ git help <verb>
$ git <verb> --help
$ man git-<verb>
It is recommended to use git <verb> --help because it sticks to the usual pattern of command-line tools. If you only need a quick refresher on the available options for a Git command, use git <verb> -h instead.
- Notice
- Those commands are accessible even offline
Git Basics
Getting a Git Repository
This can be done in two ways:
- take a local directory that is currently not under version control and turn it into a Git repository
- clone an existing Git repository from elsewhere
Either way, you end up with a Git repo on your local machine.
To achieve the first option, first go to the project directory (for example, on Windows):
$ cd your_path
Then type:
$ git init
This creates a new subdirectory named .git that contains all of your necessary repository files — a Git repository skeleton. At this point, nothing in your project is tracked yet.
To start version-controlling, you need to git add the files you want to track, and then git commit:
$ git add yourFile1
$ git add yourFile2
$ git commit -m 'Initial project version'
To achieve the second option (for example, for a project you’d like to contribute to), the command you need is git clone, so that Git receives a full copy of nearly all data that the server has:
$ git clone <url>
For example:
$ git clone https://github.com/libgit2/libgit2
Notice
- libgit2 (a linkable Git library) is only the example used here; the same command works for any Git repository URL
- The above command creates a directory named libgit2 under the current directory, initializes a .git directory inside it, pulls down all the data for that repository, and checks out a working copy of the latest version.
- Git has a number of different transfer protocols you can use. The previous example uses the https:// protocol, but you may also see git:// or user@server:path/to/repo.git, which uses the SSH transfer protocol.
If you want to clone the repository into a directory named something other than libgit2, you can specify the new directory name as an additional argument:
$ git clone https://github.com/libgit2/libgit2 customizedName
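The first option can be tried safely in a throwaway directory. The following is a minimal sketch, assuming Git is installed and a POSIX shell is used; the scratch directory, the file name and the identity are made up for illustration.

```shell
# Create a scratch directory and turn it into a Git repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Identity used only inside this demo repository (hypothetical values).
git config user.name "Demo User"
git config user.email "demo@example.com"

# Nothing is tracked until a file is added and committed.
echo "hello" > yourFile1
git add yourFile1
git commit -q -m 'Initial project version'

# Count the commits reachable from the current branch.
commits=$(git rev-list --count HEAD)
echo "commits so far: $commits"
```

After this runs, the scratch directory contains a .git subdirectory and a single commit.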
Recording Changes
Typically, you’ll want to start making changes and committing snapshots of those changes (i.e. recording/saving those changes) into your repository each time the project reaches a state you want to record.
As you edit files, Git sees them as modified, because you’ve changed them since your last commit. As you work, you selectively stage these modified files and then commit/save all those staged changes, and the cycle repeats.
- Note
- Not every file under the directory is tracked, and Git only knows about tracked files
- A file starts being tracked once you git add it. It is then included in the snapshot the next time you call git commit
Checking File Status
The main tool you use to determine which files are in which state is the git status command.
For example, you could run:
$ git status
If it reports a clean working directory (“nothing to commit, working tree clean”), none of your tracked files are modified.
- Note
- this command also shows untracked files under the current directory
- if you want to start tracking those, you can call git add fileNeedsToBeTracked
Modifying a Tracked File
There are three states that a tracked file can be in after being modified:
- Unstaged and uncommitted
- This happens right after you have modified a tracked file. At this point, if you have not called git add thatFile again, Git only knows about the previous checkpoint.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified: firstFile.txt
- Staged but uncommitted
- This happens when you have modified a tracked file and called git add thatFile. Now Git knows about the change and it is placed in the staging area.
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file: LICENSE
- Staged and committed
- This happens when you call git commit, so that the changes move from the staging area into your local repository
Difference between git add and git commit
- Q: Why are add and commit two separate commands? They seem to be two repeated checkpoint procedures.
The git add command adds a change in the working directory to the staging area. It tells Git that you want to include updates to a particular file in the next commit. However, git add doesn’t really affect the repository in any significant way: changes are not actually recorded until you run git commit.
The command git commit takes all changes in the staging area, wraps them together, and puts them in your local repository. A commit is simply a checkpoint telling Git to record all changes that have occurred up to this point, using the last commit as a comparison. After committing, your staging area will be empty.
Abbreviated version for git status
To quickly view which file is in which state, you can execute git status -s to see each changed or untracked file under the current directory with its status
For example, you could see
MM LICENSE.txt
- A stands for files newly added to the staging area
- M stands for files that have been modified
- a green M (at position 0) means it is modified and staged
- a red M (at position 1, if there is one) means it is modified but unstaged
- ?? stands for files not being tracked by Git
- Note
- Although it is not shown above, the first letter will be green, and if there is a second letter, it will be red.
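The status codes can be reproduced in a scratch repository. This is a minimal sketch, assuming Git is installed; the file names a.txt, b.txt and c.txt and the identity are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# Commit a first version of a.txt.
echo "one" > a.txt
git add a.txt
git commit -q -m "first commit"

echo "two" >> a.txt      # modify ...
git add a.txt            # ... and stage it (left-hand M)
echo "three" >> a.txt    # modify again without staging (right-hand M)

echo "new" > b.txt
git add b.txt            # newly added to the staging area (A)

echo "scratch" > c.txt   # never added, so untracked (??)

short=$(git status -s)
printf '%s\n' "$short"
```

With default settings this should list a.txt as MM, b.txt as A and c.txt as ??.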
Ignoring Files
A gitignore file specifies intentionally untracked files that Git should ignore. Files already tracked by Git are not affected.
Each line in a gitignore file specifies a pattern. Git can read gitignore patterns from several sources; the steps below use a .gitignore file inside the repository.
- First, you need to have/create a .gitignore file under the repository. This can be simply done by touch .gitignore
- Then, you need to add text to that file to specify rules for ignoring files. This can be done by cat >> .gitignore, where >> means append. Then you press Enter and start typing rules, and when you finish, press Ctrl-D to save.
$ cat >> .gitignore
unTracked.txt
*.[oa]
*~
(alternatively, you could use > instead of >>, which will overwrite the existing content)
Now, since unTracked.txt was not tracked, it will be ignored and will not appear if you execute git status
- Note
- * matches anything except a slash
- ? matches any one character except “/”
- [a-zA-Z] is the range notation, which can be used to match one of the characters in a range.
- ! negates a pattern. Put a backslash \ in front of the first ! for patterns that begin with a literal !, for example, \!important.txt
- **/someDirectory matches [someDirectory] in all directories. For example, **/foo matches file or directory “foo” anywhere
- someDirectory/** matches everything inside [someDirectory]. For example, abc/** matches all files inside directory [abc], relative to the location of the [.gitignore] file
- someDirectory/**/otherDirectory matches nested directories between [someDirectory] and [otherDirectory]
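To check which paths a set of rules really ignores, git check-ignore can be used. The sketch below assumes Git is installed; the file names main.o, main.c and notes.txt~ are hypothetical examples chosen to hit or miss the patterns from the notes.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

# Write the rules from the notes (> creates or overwrites the file).
cat > .gitignore <<'EOF'
unTracked.txt
*.[oa]
*~
EOF

touch unTracked.txt main.o main.c notes.txt~

# git check-ignore exits with status 0 when a path is ignored.
for f in unTracked.txt main.o main.c notes.txt~; do
    if git check-ignore -q "$f"; then
        echo "$f: ignored"
    else
        echo "$f: not ignored"
    fi
done

# Only files that are not ignored show up as untracked.
untracked=$(git status -s)
printf '%s\n' "$untracked"
```

Only main.c (and the .gitignore file itself) should be reported as untracked; the other three files match a rule.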
Knowing the Modifications
Executing git diff with no arguments compares what is in your working directory with what is in your staging area. The result tells you the changes you’ve made that you haven’t yet staged.
For example:
$ git diff
warning: LF will be replaced by CRLF in firstFile.txt.
The file will have its original line endings in your working directory
diff --git a/firstFile.txt b/firstFile.txt
index 640bb73..caada99 100644
--- a/firstFile.txt
+++ b/firstFile.txt
@@ -1,3 +1,5 @@
 hello
 hello
 hello
+
+aa
The + and +aa lines at the end are the lines added to the file but not yet staged.
Executing git diff --staged compares your staged changes with your last commit.
For example:
$ git diff --staged
diff --git a/firstFile.txt b/firstFile.txt
index e69de29..640bb73 100644
--- a/firstFile.txt
+++ b/firstFile.txt
@@ -0,0 +1,3 @@
+hello
+hello
+hello
Notice that the last two lines + and +aa are not present.
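The two comparisons can be seen side by side in a scratch repository. This is a minimal sketch, assuming Git is installed; the line contents and the identity are made up.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

printf 'hello\n' > firstFile.txt
git add firstFile.txt
git commit -q -m "first commit"

echo "staged line" >> firstFile.txt
git add firstFile.txt                    # this change is now staged
echo "unstaged line" >> firstFile.txt    # this one is not

unstaged=$(git diff)             # working directory vs. staging area
staged=$(git diff --staged)      # staging area vs. last commit
printf '%s\n' "--- git diff ---" "$unstaged" "--- git diff --staged ---" "$staged"
```

The first diff should only add "unstaged line", and the second should only add "staged line".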
Committing Changes
Now if you want to commit those changes, you can execute git commit. However, this will only commit changes that are staged, so modified yet unstaged changes will not be recorded.
Executing git commit brings up your editor of choice, which can be chosen when you set up Git.
For example
$ git commit
- Note
- the first line is intentionally left blank, so that you can describe what has been changed.
- you can use the command git commit -v to also have the difference shown in the editor, so you know exactly what you have done
- for a quick commit, you can do git commit -m "yourMessageHere", which finishes the commit without even opening the editor
- after committing, the output also shows which branch you committed to (master) and what SHA-1 checksum the commit has (2031fb7)
After you have made a commit, tracked files without further changes will not appear if you execute git status -s
Committing and Skipping the Staging
If you want to skip the staging area, Git provides a simple shortcut. Adding the -a option to the git commit command makes Git automatically stage every file that is already tracked before doing the commit, letting you skip the git add part.
This is convenient, but be careful; sometimes this flag will cause you to include unwanted changes.
Removing Tracked File/ Untracking File
If you want to remove a tracked file, meaning you want to delete the file, you need to execute git rm yourFile. This will:
- remove the file from the current working directory
- remove the file from the staging area
Now, the next time you commit, the file will no longer be tracked by Git, since it has been completely removed.
If you want to untrack a tracked file, you execute git rm --cached yourFile. This keeps the file in your directory but removes it from Git.
Same as before, you will still need to execute git commit to complete the removal from Git
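The difference between the two forms can be checked in a scratch repository. This is a minimal sketch, assuming Git is installed; removeMe.txt, keepMe.txt and the identity are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "a" > removeMe.txt
echo "b" > keepMe.txt
git add removeMe.txt keepMe.txt
git commit -q -m "track two files"

git rm -q removeMe.txt           # deletes the file and stages the removal
git rm -q --cached keepMe.txt    # stages the removal but keeps the file on disk
git commit -q -m "remove one file, untrack the other"

tracked=$(git ls-files)          # list of files Git still tracks
ls                               # files still present in the directory
```

Afterwards Git tracks neither file, but keepMe.txt is still present on disk.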
Renaming a File
If you rename a file and want Git to keep tracking it properly, you should do git mv myFile myNewFile
Now, if you call git status, you will see the file reported as renamed:
$ git mv README.md README
$ git status
This is equivalent to running these three commands separately:
$ mv README.md README
$ git rm README.md
$ git add README
Viewing Commits
You can see all commits in a repo by:
$ git log
By default, with no arguments, git log lists the commits made in that repository in reverse chronological order
Useful options for git log
- git log -p -2 displays the 2 most recent commits, along with the patches/changes made
- git log --stat helps you quickly browse what happened during a series of commits that a collaborator has added. It shows additional information such as a list of modified files, how many files were changed, and how many lines in those files were added and removed. It also puts a summary of the information at the end.
$ git log --stat
commit cae63d011d113c1d45818de4bec434216e10e9ef (HEAD -> master)
Author: Xiao Yu <jasonyux17@gmail.com>
Date: Wed May 20 10:59:55 2020 +0800
tested git file removal and untracking
firstFile.txt | 6 ------
justTracked.txt | 0
2 files changed, 6 deletions(-)
commit 0e6428a6389d7748a0da2f0798be94999ce8c248
Author: Xiao Yu <jasonyux17@gmail.com>
Date: Wed May 20 10:34:31 2020 +0800
second commit made
firstFile.txt | 1 +
1 file changed, 1 insertion(+)
commit 2031fb7fd0a90cf62d1c13d7cfb972e1ce10ad30
Author: Xiao Yu <jasonyux17@gmail.com>
Date: Wed May 20 10:28:14 2020 +0800
first commit made
LICENSE.txt | 5 +++++
- git log --pretty=oneline changes the log output to a format other than the default. A few prebuilt options are available for you to use. The oneline option prints each commit on a single line, which is useful if you’re looking at a lot of commits.
- git log --pretty=format:"%h - %an, %ar : %s" customizes the format of the commit output. For example,
$ git log --pretty=format:"%h - %an - %ar, commit: %s"
cae63d0 - Xiao Yu - 23 minutes ago, commit: tested git file removal and untracking
0e6428a - Xiao Yu - 49 minutes ago, commit: second commit made
2031fb7 - Xiao Yu - 55 minutes ago, commit: first commit made
75d2c61 - Xiao Yu - 18 hours ago, commit: Initial project version
So we can have options such as:
- %h abbreviated commit hash
- %an author name
- %ar author date, relative to now
- %s subject/commit message
- git log --since="2008-10-01" limits the commits to those made after the specified date. You can also put “2.weeks” or “2 years 1 day 3 minutes ago”
- git log --before="2008-10-01" limits the commits to those made before the specified date.
- git log --author="Xiao Yu" only shows commits in which the author entry matches the specified string.
- git log --committer="Xiao Yu" only shows commits in which the committer entry matches the specified string.
- git log --no-merges prevents merge commits, which usually are not informative, from cluttering up your log history
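A custom format can be tried on a small scratch history. This is a minimal sketch, assuming Git is installed; the commit messages, file name and identity are made up.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "1" > notes.txt
git add notes.txt
git commit -q -m "first commit made"
echo "2" >> notes.txt
git commit -q -a -m "second commit made"

# One line per commit: abbreviated hash, author name, subject.
history=$(git log --pretty=format:"%h - %an : %s")
printf '%s\n' "$history"

# Subject of the most recent commit only.
latest=$(git log -1 --pretty=format:"%s")
echo "latest: $latest"
```

The newest commit is listed first, matching the reverse chronological order described above.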
Changing your Staging
If you forgot some small changes but have just committed, you can easily “re-commit” by
- first making the additional changes you forgot
- then staging them
- then committing again using git commit --amend, which folds the new changes into the previous commit
If you want to unstage a file (for example, to make additional changes before staging it again), you use git restore --staged <file>. In fact, this is advised by the command git status as well:
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified: README.md
After running git restore --staged README.md, the file [README.md] is unstaged and you can add changes to it and restage.
Changing/Undoing your Commit
If you want to revert a file to its previous committed state (you cannot revert to a state that was only staged), you have several options:
- Run git restore <file>. In fact, this is also advised by the command git status if the file is in the unstaged area. It is usually used for restoring files in the working tree from either the index or another commit. This command does not update your branch.
- Run git reset <commit-id>, which updates your branch by moving its tip in order to add or remove commits from the branch.
- Run git revert <commit-id>, which creates a new commit that undoes the changes from a previous commit. This command adds new history to the project (it doesn’t modify existing history).
- Run git checkout <commit-id>. This is the most conservative option and the one I prefer. It moves HEAD to that commit, so that you can view the files at that point. However, this leaves HEAD detached, because it points directly at a commit instead of following a branch. Commits you make in this state do not belong to any branch and are hard to find again once you checkout another branch. Therefore, if you want to keep working from here, run git checkout -b <yourBranchName> to create a branch at this commit and switch to it. After that, everything is saved as you move around and switch between branches.
Note:
- If you use the restore command, any local changes you made to that file are gone: Git just replaces that file with the most recently committed version. Don’t ever use this command unless you absolutely know that you don’t want those unsaved local changes.
- If you would like to keep the changes you’ve made to that file but still need to get it out of the way for now, we’ll go over stashing and branching in Git Branching; these are generally better ways to go.
Working with Remotes
To be able to collaborate on any Git project, you need to know how to manage your remote repositories. Remote repositories are versions of your project that are hosted on the Internet or network somewhere. Managing these remote repositories involves pushing and pulling data to and from them when you need to share work.
- To show the remote servers for the repo in the current directory, you can execute git remote, or git remote -v to see the stored URLs as well.
- To add a remote server, you can execute git remote add <shortname> <url>, which configures the shortname/nickname for that repo. Then you can use that shortname to access that repo.
Note:
- You can only add a remote repo from inside an existing repo
- Executing git clone <url> gives the remote a default shortname of origin
Difference between git clone and git remote add
- git remote add just creates an entry in your Git config that specifies a name for a particular URL. You must have an existing Git repo to use this. You will not see a folder or any file from that repo. To get them, you need to use git pull <shortname> <branch>
- git clone creates a new Git repository by copying an existing one located at the URL you specify, so the files are actually downloaded to your machine.
Fetching and Pulling from Remote Repos
git fetch <remote>
This command goes out to that remote project and pulls down all the data from that remote project that you don’t have yet. After you do this, you should have references to all the branches from that remote, which you can merge in or inspect at any time (merge operations will be discussed in Git Branching).
- If you cloned a repository, the command git fetch origin fetches any new work that has been pushed to that server since you cloned (or last fetched from) it
Note:
- The git fetch command only downloads the data to your local repository; it doesn’t automatically merge it with any of your work or modify what you’re currently working on. You have to merge it manually into your work when you’re ready
- git pull will fetch and merge the remote branch into your current branch. More covered in Git Branching
Viewing Changes After Fetching
After you have fetched from your repo with git fetch <remote>, it will automatically update (or create) all the so-called “remote-tracking branches” for that remote (if you only want to fetch a specific branch, you need to run git fetch <remote> <branchName>).
Now, to see what has been changed, you can use the ^ or .. operator. For example, if you want to see what the remote master has compared to your local master, you ask Git to show you exactly this:
$ git log <remote>/master ^master
which means all commits reachable from <remote>/master, excluding commits reachable from master
or, alternatively
$ git log master..<remote>/master
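Fetching and comparing can be tried without a real server. In this minimal sketch, a local directory named upstream stands in for the remote; that directory, the file name and the identity are hypothetical, and the branch name is read from Git because it may be master or main depending on configuration.

```shell
work=$(mktemp -d)

# "upstream" plays the role of the remote server (a local path, for illustration).
git init -q "$work/upstream"
cd "$work/upstream" || exit 1
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "first commit"
branch=$(git symbolic-ref --short HEAD)

# Cloning gives the remote the default shortname "origin".
git clone -q "$work/upstream" "$work/local"

# New work appears on the remote after the clone.
echo "v2" >> file.txt
git commit -q -a -m "second commit"

# Fetch downloads the new commit but does not merge it.
cd "$work/local" || exit 1
git fetch -q origin
incoming=$(git log --oneline "$branch..origin/$branch")
same=$(git log --oneline "origin/$branch" "^$branch")
printf '%s\n' "$incoming"
```

Both spellings list the same single commit, and file.txt in the clone is unchanged until the fetched work is merged.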
Pushing/Sharing to Your Remote
When you have your project at a point that you want to share, you have to push it upstream. The command for this is simple: git push <remote> <branch>. If you want to push your master branch to your origin server (again, cloning generally sets up both of those names for you automatically), then you can run this to push any commits you’ve done back up to the server:
$ git push origin master
Note:
- This command works only if you cloned from a server to which you have write access and if nobody has pushed in the meantime. If someone else has pushed first, your push will be rejected and you will have to fetch their work and incorporate it into yours before you’ll be allowed to push.
Renaming or Removing Remote Repo
- If you want to rename the shorthand for a remote repo, you can execute git remote rename <originalName> <newName>.
- If you want to remove a remote for some reason (you’ve moved the server, are no longer using a particular mirror, or perhaps a contributor isn’t contributing anymore), you can use git remote remove <shortname>
Inspecting a Remote for Pushing and Pulling Information
If you want to see more information about a particular remote, you can use the git remote show <remote> command
$ git remote show pb
It lists the URL for the remote repository as well as the tracking branch information. The command helpfully tells you that if you’re on the master branch and you run git pull, it will automatically merge in the master branch on the remote after it fetches all the remote references. It also lists all the remote references it has pulled down.
In general, it will show you:
- which branch is automatically pushed to when you run git push while on certain branches
- which remote branches on the server you don’t yet have
- which remote branches you have but have been removed from the server
- multiple local branches that are able to merge automatically with their remote-tracking branch when you run git pull
Tagging
Git has the ability to tag specific points in a repository’s history as being important. Typically, people use this functionality to mark release points (v1.0, v2.0 and so on).
Listing the Tags of a Repo
You can list all the tags by calling:
$ git tag
Note:
- this command lists the tags in alphabetical order; the order in which they are displayed has no real importance.
You can also filter the tags by using the optional -l or --list, for example
$ git tag -l "v1.8.5*"
This will list all the tags that start with v1.8.5
Creating Tags
You can have two types of tags:
- a lightweight tag
- it’s just a pointer to a specific commit. It does not have much information attached to it (showing it only gives the checksum and author information of the commit it points to). It is often used as a temporary tag.
- an annotated tag
- stored as full objects in the Git database. They’re checksummed; contain the tagger name, email, and date; have a tagging message; and can be signed and verified with GNU Privacy Guard (GPG). It’s generally recommended that you create annotated tags so you can have all this information
Note:
- By default, a new tag is attached to the latest commit of your repo
Creating a Lightweight Tag
This is basically the commit checksum stored in a file; no other information is kept. To create a lightweight tag, don’t supply any of the -a, -s, or -m options, just provide a tag name.
$ git tag <tag-id>
Creating an Annotated Tag
This is created by
$ git tag -a <tag-id> -m "yourMessage"
If -a, -s, or -u <keyid> is passed, the command creates a tag object and requires a tag message. Unless -m <msg> or -F <file> is given, an editor is started for the user to type in the tag message.
Then to see the message in that tag, use
$ git show <tag-id>
Note:
- Don’t mistakenly execute git tag show <tag-id>!
Tagging Previous Commits
To tag any specific commit, you specify the commit checksum (or part of it) at the end of the command:
$ git tag -a <tag-id> -m "yourMessage" <commit-checksum>
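The difference between the two tag types shows up in the type of object each tag name resolves to. This is a minimal sketch, assuming Git is installed; the tag names v1.0-lw and v1.0, the message and the identity are made up.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo User"          # hypothetical identity (also used as tagger)
git config user.email "demo@example.com"

echo "1" > file.txt
git add file.txt
git commit -q -m "first commit"

git tag v1.0-lw                        # lightweight: just a name for the commit
git tag -a v1.0 -m "first release"     # annotated: a separate tag object

lw_type=$(git cat-file -t v1.0-lw)     # object type behind the lightweight tag
an_type=$(git cat-file -t v1.0)        # object type behind the annotated tag
echo "v1.0-lw -> $lw_type, v1.0 -> $an_type"

tags=$(git tag -l "v1.0*")             # both tags match the pattern
```

The lightweight tag resolves directly to a commit, while the annotated tag resolves to a tag object that carries the tagger and message.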
Sharing Tags
By default, the git push command doesn’t transfer tags to remote servers. You will have to explicitly push tags to a shared server after you have created them. This process is just like sharing remote branches; you can run
$ git push <repo-shortname> <tagname>
To push all the tags, you run
$ git push <repo-shortname> --tags
Deleting Tags
You can use
$ git tag -d <tag-id>
to delete a specific tag. However, this does not remove the tag from the remote servers.
To delete the tag from the server, you could either use:
$ git push origin --delete <tag-id>
or, the slightly less intuitive command
$ git push origin :refs/tags/v1.4-lw
which means the null value before the colon is being pushed to the remote tag name, effectively deleting it.
Viewing Files at a Different State using Tag
You can view the different versions of files by using a tag. This is done by a git checkout of that tag, although this puts your repository in “detached HEAD” state, which has some ill side effects:
- in a detached-HEAD state, if you make changes and then create a commit, the tag will stay the same, but your new commit won’t belong to any branch and will be unreachable, except by the exact commit hash. Thus, if you need to make changes (say you’re fixing a bug on an older version, for instance), you will generally want to create a branch (covered more in Git Branching)
Q: What does git checkout mean?
If you call git log, you will see something called HEAD. HEAD is a label noting where you are in the commit tree. It moves with you when you move from one commit to another. git checkout <commit> is the basic mechanism for moving around in the commit tree, moving your focus (HEAD) to the specified commit.
The commit can be specified in a number of ways: commit hash, branch name, tag name, the relative syntax (HEAD^, HEAD~1, etc.) and so on. (It is often useful to think of a checkout as changing branches.)
Git Alias
Aliases can be understood as shortcuts for executing commands. They can be set up by using the git config command.
For example:
$ git config --global alias.co checkout
$ git config --global alias.br branch
$ git config --global alias.ci commit
$ git config --global alias.st status
This means that
- git checkout becomes git co
- git branch becomes git br
- git commit becomes git ci
- git status becomes git st
Note:
- global means access for one user (yourself) in all the repos. In comparison, local means for one user in this specific repo, and system means for all users in all repos.
Some useful aliases include:
$ git config --global alias.last 'log -1 HEAD'
So you can see the last commit easily:
$ git last
Resetting the Alias
If you find that an alias you set up before was a bad idea, you can do
$ git config --global --unset alias.<yourAlias>
You can also directly edit the config file to change the alias by:
$ git config --global --edit
which will open the default text editor and then you can edit your alias directly (delete it if not needed).
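Setting and unsetting an alias can be rehearsed without touching your real configuration. The sketch below is a minimal example that uses --local inside a scratch repository instead of --global; that substitution, the commit message and the identity are assumptions made for the demo.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

echo "1" > file.txt
git add file.txt
git commit -q -m "only commit"

# --local keeps the alias inside this scratch repository.
git config --local alias.last 'log -1 HEAD'
last_subject=$(git last --pretty=format:%s)
echo "git last shows: $last_subject"

# Remove the alias again and confirm that it is gone.
git config --local --unset alias.last
if git config --local --get alias.last >/dev/null; then
    removed=no
else
    removed=yes
fi
echo "alias removed: $removed"
```

Replacing --local with --global would make the alias available in every repository for the current user.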
Git Branching
Git doesn’t store data as a series of changesets or differences, but instead as a series of snapshots.
When you make a commit, Git stores a commit object that contains a pointer to the snapshot of the content you staged. This object also contains the author’s name and email address, the message that you typed, and pointers to the commit or commits that directly came before this commit (its parent or parents): zero parents for the initial commit, one parent for a normal commit, and multiple parents for a commit that results from a merge of two or more branches.
- When you stage a file/files
- Staging the files computes a checksum for each one (the SHA-1 hash we mentioned in Getting Started)
- then stores that version of the file in the Git repository (Git refers to them as blobs)
- finally adds that checksum to the staging area
- When you commit by running git commit
- Git checksums each subdirectory (in this case, just the root project directory)
- stores them as a tree object in the Git repository.
- Git then creates a commit object that has the metadata and a pointer to the root project tree so it can re-create that snapshot when needed.
If you had 3 files that were staged, and you committed them, you will get in total five objects: three blobs (each representing the contents of one of the three files), one tree that lists the contents of the directory and specifies which file names are stored as which blobs, and one commit with the pointer to that root tree and all the commit metadata.
If you make some changes and commit again, the next commit object stores a (parent) pointer to the commit object that came immediately before it. Essentially it is like a linked list of commit objects.
A branch in Git is simply a lightweight movable pointer to one of these commit objects. The default branch name in Git is master (the default when you run git init). As you start making commits, you’re given a master branch that points to the last commit you made. Every time you commit, the master branch pointer moves forward automatically.
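The five-object count can be checked in a fresh scratch repository. This is a minimal sketch, assuming Git is installed; the three file names, their contents and the identity are made up, and the files need different contents so that each one gets its own blob.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"

# Three files with different contents, staged and committed together.
echo "readme" > README
echo "test" > test.rb
echo "license" > LICENSE
git add README test.rb LICENSE
git commit -q -m "Initial commit"

# Number of loose objects stored so far: three blobs, one tree, one commit.
objects=$(git count-objects -v | sed -n 's/^count: //p')
echo "loose objects: $objects"

head_type=$(git cat-file -t HEAD)            # the commit object
tree_type=$(git cat-file -t 'HEAD^{tree}')   # the root tree it points to
git ls-tree HEAD                             # the blobs, listed with their file names
```

git ls-tree prints one blob entry per file, which is the mapping from file names to blobs that the tree object stores.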
Create a New Branch
Creating a new branch means creating a new pointer for you to move around. Let’s say you want to create a new branch called testing. You do this with the git branch command:
$ git branch <newBranchName>
Note:
- this new branch will by default point to the commit object you are currently on (the HEAD).
- you can see the HEAD commit object by running git log
Switching Branches
To switch to an existing branch, you run the git checkout
command.
1 | $ git checkout <branchName> |
For example, if you have another branch called testing
, you switch to that by git checkout testing
.
This will move the head to the new branch you are at. This means that now, if you make modifications and commit, you will create a new commit object that the head/your current branch points to, but the other branch still points to the original commit object.
Note:
- Now, if you are on another branch (for example master) and run git log, you will not see the new commit object made on the testing branch. By default, git log only shows the commit history below the branch you’ve checked out.
- In general, if you move backward in the tree, git log will not show the later commit objects.
- To show the commit history for a particular branch, you have to specify it explicitly: git log <branchName>. To show all of the branches, run git log --all.
- You can create a new branch and switch to it in one operation with git checkout -b <newBranchName>
However, if you have made some changes in this new branch and switch back to the old branch, your working directory is reverted to the snapshot that the old branch points to. If you then commit changes on the old branch, the two branches diverge:
- the new branch does not know about the change made on the old branch
- the old branch can no longer be fast-forwarded to the new branch, because each branch now has a commit that the other lacks
- basically, you have two different commit objects, one for the old branch and one for the new branch, both linking to the same parent.
But don’t worry, you can merge them together when you’re ready.
Making a Disaster:
- You can create a branch that points to the current HEAD position, switch to the master branch, and then check out the first commit by its checksum:
$ git checkout <checkSum>
- Now HEAD points at the initial commit (in a detached HEAD state) and the working directory shows that first snapshot, so anyone looking at it may be dismayed and think that all the progress is lost. In fact nothing is lost: the later commits are still reachable from master and from the other branch that you just created.
Viewing the Diversion Tree
The entire tree for all branches can be seen by running
$ git log --oneline --graph --all
For the example above, the graph shows that the two branches diverged at the first commit object.
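To see such a graph yourself, the sequence of branching, switching, and committing described in this section can be replayed in a scratch repository. This is only a sketch; the directory, file names, and commit messages are made up for the example.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > shared.txt
git add shared.txt
git commit -q -m "Initial commit"
base=$(git rev-parse HEAD)

git branch testing           # new pointer at the current commit
git checkout -q testing      # HEAD now follows testing
echo "testing work" > testing.txt
git add testing.txt
git commit -q -m "Work on testing"

git checkout -q master       # working directory returns to master's snapshot
echo "master work" > master.txt
git add master.txt
git commit -q -m "Work on master"

# Both branch tips now share the initial commit as their parent.
git --no-pager log --oneline --graph --all
fork_point=$(git merge-base master testing)
```

After switching back to master, testing.txt is no longer in the working directory, because it only exists in the snapshot that testing points to.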
Merging Different Branches
There are basically two cases:
You have two diverged branches, but they changed different files, or, to put it another way, the files they share are unchanged between them. In this case, it is simple for Git to figure out how to merge (basically just putting all the changes together), and you run:
$ git checkout master
$ git merge testing
Note:
- This means that you are merging the changes from the testing branch into the master branch
- Git refuses to merge if uncommitted changes in your working directory would be overwritten, so commit (or stash) your work first
You have two diverged branches, and they share a file that is modified differently in both branches. Now, if you want to merge them, Git will show you:
$ git merge testing
Auto-merging firstFile.txt
CONFLICT (content): Merge conflict in firstFile.txt
Automatic merge failed; fix conflicts and then commit the result.
In this example, firstFile.txt was modified on both branches, so the merge cannot be completed automatically and you need to edit that file manually.
When you finish editing that file, you run git add <thatFile> to mark it as resolved, and then call git commit to complete the merge.
In the end, when you have finished all the work, you can delete the branch you don’t need by:
$ git branch -d <branchName>
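The conflict case can be reproduced end to end in a scratch repository. The sketch below assumes a made-up resolution (keeping both lines); in practice you would edit the conflict markers in the file by hand.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "original line" > firstFile.txt
git add firstFile.txt
git commit -q -m "Add firstFile.txt"

git checkout -q -b testing
echo "line from testing" > firstFile.txt
git commit -q -am "Edit on testing"

git checkout -q master
echo "line from master" > firstFile.txt
git commit -q -am "Edit on master"

# Both branches changed the same line, so a non-zero exit status is expected.
if git merge testing; then
    merge_status="clean"
else
    merge_status="conflict"
fi

# Resolve by hand; this sketch simply keeps both lines.
printf 'line from master\nline from testing\n' > firstFile.txt
git add firstFile.txt          # marks the conflict as resolved
git commit -q --no-edit        # completes the merge with the default message

# The merge commit lists its own id followed by its two parents.
parent_count=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
git branch -d testing          # safe to delete now that it is merged
```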
Branch Management
There are a handful of git branch commands that are useful for branch management:
- git branch: shows a list of the branches that you have, with an asterisk (*) in front of the branch you are currently on.
- git branch -v: shows a list of branches with their last commit message
- git branch --merged: shows the branches that have been merged into the current branch
- git branch --no-merged: shows the branches that have not been merged into the current branch
Note
- Each of the commands above shows information relative to the current branch. To ask about a different branch, pass it as an argument, for example git branch --no-merged testing (to get information relative to the testing branch)
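A small sketch of the merged and not-merged listings, using two hypothetical branch names (finished and in-progress) in a scratch repository:

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git branch finished                # points at a commit master already contains
git checkout -q -b in-progress     # gets one extra commit master does not have
echo "extra" > extra.txt
git add extra.txt
git commit -q -m "Unmerged work"
git checkout -q master

git branch -v                      # every branch with its last commit
merged=$(git branch --merged | tr -d ' *')
not_merged=$(git branch --no-merged | tr -d ' ')
echo "merged into master: $merged"
echo "not merged into master: $not_merged"
```

Here git branch -d finished would succeed, while git branch -d in-progress would be refused because that branch still holds unmerged work.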
Typical WorkFlow Using Git Branches
Usually, people tend to put code that is entirely stable in their master branch — possibly only code that has been or will be released. They have another parallel branch named develop or next that they work from or use to test stability — it isn’t necessarily always stable, but whenever it gets to a stable state, it can be merged into master.
In general, for larger projects, you will want your branches to be at various levels of stability, ordered from the most stable branch at the top (master) to the least stable at the bottom (topic branches, for example); when they reach a more stable level, they’re merged into the branch above them.
Note:
- It’s important to remember when you’re doing all this that these branches are completely local. When you’re branching and merging, everything is being done only in your Git repository — there is no communication with the server, for now.
Remote Branches
Remote-tracking branches are references to the state of remote branches. They’re local references that you can’t move; Git moves them for you whenever you do any network communication, to make sure they accurately represent the state of the remote repository. Think of them as bookmarks, to remind you where the branches in your remote repositories were the last time you connected to them.
Remote-tracking branch names take the form <remoteShortName>/<branch>. For instance, if you wanted to see what the master branch on your origin remote looked like as of the last time you communicated with it, you would check the origin/master branch. This means that, if you were working on an issue with a partner and they pushed up an iss53 branch, but you also have your own local iss53 branch, the branch on the server would be represented by the remote-tracking branch origin/iss53, updated with your partner’s version.
Now, to synchronize your work with a given remote, you run
$ git fetch <remoteShortName>
With origin as the remote, this command looks up which server “origin” is (for example, it might be git.ourcompany.com, where you cloned from), fetches any data from it that you don’t yet have, and updates your local database, moving your origin/master pointer to its new, more up-to-date position. Your own master pointer stays where it was.
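The effect of a fetch on the two pointers can be observed with a local directory standing in for the server. This is a sketch under that assumption; a real remote would normally be reached over a network URL.

```shell
set -e
work=$(mktemp -d)                          # scratch directory (assumption)
cd "$work"

# "server" stands in for the remote repository.
git init -q server
cd server
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "v1" > file.txt
git add file.txt
git commit -q -m "First commit"
cd ..

git clone -q server local                  # the clone names its remote "origin"

# Someone else adds a commit on the server after we cloned.
cd server
echo "v2" >> file.txt
git commit -q -am "Second commit"
server_tip=$(git rev-parse master)
cd ../local

before=$(git rev-parse master)
git fetch -q origin
after=$(git rev-parse master)
remote_tracking=$(git rev-parse origin/master)
echo "local master:  $after"
echo "origin/master: $remote_tracking"
```

Only origin/master moved; merging it into the local master is a separate step.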
Collaborating with Multiple Repos
Say you have a main repo that you have cloned, with the shortname origin, at git.ourcompany.com, and another team is working on that project as well, with their repo at git.team1.ourcompany.com.
You can add their repo as a second remote with
$ git remote add teamone git://git.team1.ourcompany.com
Now you have their repo under the shortname teamone. You can fetch their progress to your local machine with:
$ git fetch teamone
In this case, since teamone doesn’t have anything more than your origin server has, nothing new is fetched, but you still get a remote-tracking branch teamone/master that marks their progress.
Note:
- At this point, if all you have called is fetch, you do not have a local, editable branch for their work, just a remote-tracking pointer that you cannot move. To work on it, either merge it into your current branch with git merge teamone/master, or run git checkout -b teamone teamone/master, which creates and switches to a new local branch called teamone that starts from where teamone/master is.
Pushing Your Branch to the Server
Your local branches aren’t automatically synchronized to the remotes you write to — you have to explicitly push the branches you want to share. That way, you can use private branches for work you don’t want to share, and push up only the topic branches you want to collaborate on.
To do this, you run:
$ git push <remoteShortName> <yourBranchName>
For example, to push a branch serverfix to the server with shortname origin, you run:
$ git push origin serverfix
Note:
- The command git push origin serverfix is a bit of a shortcut. The slightly more complete version is git push origin serverfix:serverfix, which means to push the serverfix branch of the local machine (left side of the colon) to the serverfix branch on the server (right side of the colon). This means you can also push to a differently named branch with git push origin serverfix:anotherNameOnServer
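Both forms of the push can be tried against a local bare repository standing in for the server. This is a sketch; the paths are temporary and the remote is an assumption made for the example.

```shell
set -e
work=$(mktemp -d)                          # scratch directory (assumption)
cd "$work"
git init -q --bare server.git              # stand-in for the remote repository

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git remote add origin "$work/server.git"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Initial commit"
git checkout -q -b serverfix
echo "more" >> fix.txt
git commit -q -am "Server fix"

git push -q origin serverfix                        # short for serverfix:serverfix
git push -q origin serverfix:anotherNameOnServer    # same commits, different remote name

# master was never pushed, so it stays private to this machine.
git ls-remote --heads origin
```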
Tracking Branches
Checking out a local branch from a remote-tracking branch automatically creates what is called a “tracking branch” (and the branch it tracks is called an “upstream branch”). Tracking branches are local branches that have a direct relationship to a remote branch. If you’re on a tracking branch and type git pull, Git automatically knows which server to fetch from and which branch to merge in.
For example, when you clone from a server with shorthand origin, Git creates a local master branch that tracks origin/master.
In general, when you do
$ git checkout -b <branchName> <remoteShortName>/<branchName>
you create a tracking branch <branchName> that tracks <remoteShortName>/<branchName>.
In fact, this is so common that Git provides the --track shorthand for it:
$ git checkout --track origin/serverfix
But this sets up a local branch with the same name as the branch on the remote. If you want a different local branch name, use the first form, git checkout -b <branchName> <remoteShortName>/<branchName>
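The two ways of creating a tracking branch can be compared by asking Git for each branch's upstream. This sketch uses a local directory as the server, and the local name sf is just an example.

```shell
set -e
work=$(mktemp -d)                          # scratch directory (assumption)
cd "$work"

git init -q server                         # stands in for the remote
cd server
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git branch serverfix                       # a branch that exists on the server
cd ..

git clone -q server local
cd local
git checkout -q --track origin/serverfix   # local "serverfix" tracks origin/serverfix
git checkout -q -b sf origin/serverfix     # same upstream, local name "sf"

upstream_serverfix=$(git rev-parse --abbrev-ref 'serverfix@{upstream}')
upstream_sf=$(git rev-parse --abbrev-ref 'sf@{upstream}')
upstream_master=$(git rev-parse --abbrev-ref 'master@{upstream}')
git branch -vv                             # shows each branch with its upstream
```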
Difference between git fetch and git pull
git fetch downloads their progress into your local repository and updates the remote-tracking branches, but it does not change your local branches or your working directory. git pull does the same download and then also merges their work into the branch you are currently on.
git pull is essentially a git fetch immediately followed by a git merge in most cases. If you have a tracking branch set up as demonstrated in the last section, either by explicitly setting it or by having it created for you by the clone or checkout commands, git pull will look up what server and branch your current branch is tracking, fetch from that server and then try to merge in that remote branch for you.
Viewing the Current Tracking Branch
You can view the current tracking branches by:
$ git fetch --all
$ git branch -vv
For example, the output might show that:
- the iss53 branch is tracking origin/iss53 and is “ahead” by two, meaning that we have two commits locally that are not pushed to the server.
- the master branch is tracking origin/master and is up to date.
- the serverfix branch is tracking the server-fix-good branch on our teamone server and is ahead by three and behind by one, meaning that there is one commit on the server we haven’t merged in yet and three commits locally that we haven’t pushed.
- the testing branch is not tracking any remote branch.
Note:
- We ran git fetch --all first to make sure the data we have locally about the server is up to date. git branch -vv itself only compares against the data cached from the last fetch; it does not contact the server to check for changes.
Switching a Tracking Branch
If you have a local branch that you want to track another branch on the server, you can use the -u or --set-upstream-to option to git branch to explicitly set the upstream of your current branch at any time:
$ git branch -u <remoteShortName>/<branchName>
Deleting a Remote Branch
Suppose you’re done with a remote branch, say because you and your collaborators are finished with a feature and have merged it into your remote’s master branch (or whatever branch your stable codeline is in). You can delete a remote branch using the --delete option to git push:
$ git push <remoteShortName> --delete <remoteBranchName>
Note:
- You don’t need to write --delete <remoteShortName>/<remoteBranchName>, because the remote is already named in the command
- This operation does not remove the data from the server immediately. All it does is remove the pointer from the server. The Git server will generally keep the data there for a while until a garbage collection runs, so if it was accidentally deleted, it’s often easy to recover.
Rebasing
In Git, there are two main ways to integrate changes from one branch into another: the merge and the rebase.
The motivation for rebase is that merge performs a three-way merge and records a new merge commit whenever the two branches have diverged, even if they modified completely different files.
With a rebase, you instead take the patch of the change that was introduced in C4 and reapply it on top of C3 (in the official guide’s figure, C4 is the commit on experiment and C3 is the tip of master). With the rebase command, you can take all the changes that were committed on one branch and replay them on a different branch.
To place experiment on top of master, you run:
$ git checkout experiment
$ git rebase master
Replaying the changes means that the rebase operation works by:
- going to the common ancestor of the two branches (the one you’re on and the one you’re rebasing onto)
- getting the diff introduced by each commit of the branch you’re on, saving those diffs to temporary files, resetting the current branch to the same commit as the branch you are rebasing onto, and finally applying each change in turn
Finally, to bring master up to the tip of the rebased experiment branch, you switch back to master and merge. This is a simple fast-forward with no conflicts, because after the replay master is a direct ancestor of experiment and the branches no longer diverge.
Note:
- The end product from merge and rebase will be the same, but rebasing makes for a cleaner history. If you examine the log of a rebased branch, it looks like a linear history: it appears that all the work happened in series, even when it originally happened in parallel.
- Of course, do not rebase commits that exist outside your repository and that people may have based work on, because rebasing replaces those commits with new ones and can make a mess of their work.
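The rebase followed by a fast-forward merge can be sketched in a scratch repository. The branch name experiment follows the notes above; the files and commit messages are invented for the example.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b experiment
echo "exp" > experiment.txt
git add experiment.txt
git commit -q -m "Work on experiment"

git checkout -q master
echo "main" > master.txt
git add master.txt
git commit -q -m "Work on master"
master_tip=$(git rev-parse master)

git checkout -q experiment
git rebase -q master               # replay experiment's commit on top of master
new_base=$(git merge-base master experiment)

git checkout -q master
git merge -q experiment            # fast-forward: master just moves up to experiment
merge_commits=$(git rev-list --merges --count master)
git --no-pager log --oneline --graph
```

The log is a single straight line of three commits, with no merge commit recorded.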
A More Complicated Rebase
Consider the case where a server branch was created from master, and a client branch was then created from server:
You want to rebase just the client work so that it starts from the master branch. This can be done in one line:
$ git rebase --onto master server client
This means to take the client branch, figure out the patches since it diverged from the server branch, and replay these patches in the client branch as if it was based directly off the master branch instead.
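A sketch of that three-branch situation in a scratch repository. The branch names master, server, and client follow the text; the file names are assumptions.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b server                  # server forks from master
echo "server" > server.txt
git add server.txt
git commit -q -m "Server work"

git checkout -q -b client                  # client forks from server
echo "client" > client.txt
git add client.txt
git commit -q -m "Client work"

git checkout -q master
echo "main" > master.txt
git add master.txt
git commit -q -m "Master work"

# Replay only the commits that are on client but not on server, on top of master.
git rebase -q --onto master server client
current_branch=$(git rev-parse --abbrev-ref HEAD)
git --no-pager log --oneline --graph --all
```

After the rebase, client is checked out and contains the master and client work but none of the server commits.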
Making a Disaster
A Disastrous Case
Consider the case where the server side actually rebased their master branch, your work depends on it, and you did a pull, which automatically merges, so that your history now contains both the old commits and the rebased ones.
However, since we know that C4 and C6 were replaced by the rebase, we would also want C4 and C6 removed from our history. This can be done with git pull --rebase instead of git pull, which can often work out what is uniquely yours and replay it on top of the new branch.
Note:
- If you are using git pull and want to make --rebase the default, you can set the pull.rebase config value with something like git config --global pull.rebase true.
Running Git On Server
Running a Git server is fairly straightforward. First, you choose which protocols you want your server to support.
A remote repository is generally a bare repository — a Git repository that has no working directory. Because the repository is only used as a collaboration point, there is no reason to have a snapshot checked out on disk; it’s just the Git data. In the simplest terms, a bare repository is the contents of your project’s .git directory and nothing else.
Basically, a bare repository does not contain a checked-out copy of the files and code that you edit on your local machine. However, when you git clone from it, Git rebuilds the files from the history stored in the bare repo and checks them out on your local machine. Every time you push changes to the remote bare repo, only the history data (commits and the objects they refer to) is transferred; there are still no checked-out code files in the remote bare repo.
The Protocols for Git Data Transfer
Git can use four distinct protocols to transfer data: Local, HTTP, Secure Shell (SSH) and Git.
Local Protocol
The most basic is the Local protocol, in which the remote repository is in another directory on the same host. This is often used if everyone on your team has access to a shared filesystem such as an NFS mount, or in the less likely case that everyone logs in to the same computer.
Then to clone a local bare repository, you run
$ git clone /your/gitrepo/directory.git
Or you can run
$ git clone file:///your/gitrepo/directory.git
Note:
- Doing it the first way is faster, because Git tries to use hardlinks or directly copy the files it needs.
- Doing it the second way fires up the processes that Git normally uses to transfer data over a network, which is generally much less efficient. The main reason to specify the file:// prefix is if you want a clean copy of the repository with extraneous references or objects left out
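Both local clone forms, and the fact that a bare repository holds no checked-out files, can be tried in a temporary directory. This is a sketch; the repository names seed, project.git, working-copy, and via-file-url are made up for the example.

```shell
set -e
work=$(mktemp -d)                          # scratch directory (assumption)
cd "$work"

# A small ordinary repository to seed the bare one.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..

git clone -q --bare seed project.git       # bare copy: Git data only
is_bare=$(git --git-dir=project.git rev-parse --is-bare-repository)
ls project.git                             # HEAD, config, objects, refs, ... no README.txt

git clone -q project.git working-copy                     # plain path
git clone -q "file://$work/project.git" via-file-url      # file:// URL
```

Both clones end up with README.txt checked out, while project.git itself never contains it as a file.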
HTTP Protocols
- Smart HTTP
- Smart HTTP operates very similarly to the SSH or Git protocols but runs over standard HTTPS ports and can use various HTTP authentication mechanisms, meaning it’s often easier on the user than something like SSH, since you can use things like username/password authentication rather than having to set up SSH keys.
- This has become the most popular way to use Git, and in fact, for services like GitHub, the URL you use to view the repository online (for example, https://github.com/schacon/simplegit) is the same URL you can use to clone and, if you have access, push to.
- Dumb HTTP
- If the server does not respond with a Git HTTP smart service, the Git client will try to fall back to the simpler Dumb HTTP protocol. The Dumb protocol expects the bare Git repository to be served like normal files from the web server. The beauty of Dumb HTTP is the simplicity of setting it up. Basically, all you have to do is put a bare Git repository under your HTTP document root and set up a specific post-update hook, and you’re done (See Git Hooks).
SSH Protocol
A common transport protocol for Git when self-hosting is over SSH. This is because SSH access to servers is already set up in most places — and if it isn’t, it’s easy to do. SSH is also an authenticated network protocol and, because it’s ubiquitous, it’s generally easy to set up and use.
To clone a Git repository over SSH, you can specify an ssh:// URL like this:
$ git clone ssh://[user@]server/project.git
Or you can use the shorter scp-like syntax for the SSH protocol:
$ git clone [user@]server:project.git
In both cases above, if you don’t specify the optional username, Git assumes the user you’re currently logged in as.
Note
- A negative aspect of SSH is that it doesn’t support anonymous access to your Git repository. If you’re using SSH, people must have SSH access to your machine, even in a read-only capacity, which doesn’t make SSH conducive to open source projects for which people might simply want to clone your repository to examine it.
Git Protocol
Finally, we have the Git protocol. This is a special daemon that comes packaged with Git; it listens on a dedicated port (9418) that provides a service similar to the SSH protocol, but with absolutely no authentication. In order for a repository to be served over the Git protocol, you must create a git-daemon-export-ok file (the daemon won’t serve a repository without that file in it), but, other than that, there is no security. Either the Git repository is available for everyone to clone, or it isn’t. This means that it is read-only for either everyone or no one. You can also set up push access, but given the lack of authentication, the repository would then be pushable by anyone on the internet.
Note:
- It is often the fastest network transfer protocol available. It uses the same data-transfer mechanism as the SSH protocol but without the encryption and authentication overhead.
- It’s generally undesirable for the Git protocol to be the only access to your project. Generally, you’ll pair it with SSH or HTTPS access for the few developers who have push (write) access and have everyone else use git:// for read-only access. It’s also probably the most difficult protocol to set up. It must run its own daemon, which requires xinetd or systemd configuration or the like, which isn’t always a walk in the park.
Setting up Git on a Server
Setting up an Existing Git Repo on Github
First you need to decide which protocol to use. GitHub provides two protocols, HTTPS and SSH, whose URLs you can find in the Quick Setup tab or, when you are inside the repo, under the Clone or Download button.
Then, you navigate to your existing repo in Git Bash, and add the remote repo with:
$ git remote add <remoteShortName> <remoteURLYouJustCopied>
Then you can push any branch of your local repo to the remote repo with:
$ git push <remoteShortName> <localBranchName>:<remoteBranchName>
If <remoteBranchName> is the same as your <localBranchName>, then you can just run
$ git push <remoteShortName> <localBranchName>
Distributed WorkFlows
The aim of these workflows is to contribute code successfully to a project while making it as easy as possible on you and the project maintainer, and also to maintain a project successfully with a number of developers contributing.
In general, due to the flexibility of Git, it presents a vast range of workflow possibilities for your project and/or your team, so we’ll cover a few common paradigms that take advantage of this flexibility.
Centralized Workflow
This means that if two developers clone from the hub and both make changes, the first developer to push their changes back up can do so with no problems. The second developer must merge in the first one’s work before pushing changes up, so as not to overwrite the first developer’s changes.
Integration-Manager Workflow
Because Git allows you to have multiple remote repositories, it’s possible to have a workflow where each developer has write access to their own public repository and read access to everyone else’s. This scenario often includes a canonical repository that represents the “official” project. To contribute to that project, you create your own public clone of the project and push your changes to it. Then, you can send a request to the maintainer of the main project to pull in your changes. The maintainer can then add your repository as a remote, test your changes locally, merge them into their branch, and push back to their repository.
- The project maintainer/manager pushes to their public/blessed repository.
- A contributor/developer clones that repository and makes changes.
- The contributor/developer pushes to their own public repository.
- The contributor/developer sends the maintainer an email asking them to pull changes.
- The maintainer/manager adds the contributor/developer’s repository as a remote and merges locally.
- The maintainer/manager pushes merged changes to the main repository.
Commit Guidelines
Having a good guideline for creating commits and sticking to it makes working with Git and collaborating with others a lot easier. The Git project provides a document that lays out a number of good tips for creating commits from which to submit patches; you can read it in the Git source code in the Documentation/SubmittingPatches file.
Do not commit whitespace changes (i.e. adding meaningless whitespace to the program and committing it as a change). Whitespace errors can be detected before you commit by running
$ git diff --check
Try to make each commit a logically separate changeset. If you can, try to make your changes digestible — don’t code for a whole weekend on five different issues and then submit them all as one massive commit on Monday. Even if you don’t commit during the weekend, use the staging area on Monday to split your work into at least one commit per issue, with a useful message per commit.
Creating quality commit messages makes using and collaborating with Git a lot easier. As a general rule, your messages should start with a single line that’s no more than about 50 characters and that describes the changeset concisely, followed by a blank line, followed by a more detailed explanation. (The Git project requires that the more detailed explanation include your motivation for the change and contrast its implementation with previous behavior — this is a good guideline to follow. Write your commit message in the imperative: “Fix bug” and not “Fixed bug” or “Fixes bug.”)
Capitalized, short (50 chars or less) summary

More detailed explanatory text, if necessary. Wrap it to about 72
characters or so. In some contexts, the first line is treated as the
subject of an email and the rest of the text as the body. The blank
line separating the summary from the body is critical (unless you omit
the body entirely); tools like rebase can get confused if you run the
two together.

Write your commit message in the imperative: "Fix bug" and not "Fixed bug"
or "Fixes bug." This convention matches up with commit messages generated
by commands like git merge and git revert.

Further paragraphs come after blank lines.

- Bullet points are okay, too

- Typically a hyphen or asterisk is used for the bullet, followed by a
  single space, with blank lines in between, but conventions vary here

- Use a hanging indent
Branch Naming
Sometimes, the maintainer of the Git project namespaces these branches, such as sc/ruby_client, where sc is short for the person who contributed the work, and ruby_client is the topic the person worked on.
Therefore, it is sometimes a good idea to name your branch like this:
$ git checkout -b <yourNickName>/<descriptiveBranchName>
Forking a Project
Sometimes you want to work with a project that you do not have push access to, but you might have the fork option, which at least gives you your own copy of the repo to push to.
In this case, if you want to add some feature to the project, say featureA, you could:
- clone the project to your local machine
- create a new branch featureA and work on it
- create your own remote repo of the same project by forking that project
- push your changes to your own forked remote repo (preferably just the branch you edited, because if the change is cherry-picked or rejected, you don’t need to rewind your master)
- notify the maintainers of the original project that you have work you’d like them to merge. This is often called a pull request, and you typically generate such a request either via the website (GitHub has its own “Pull Request” mechanism) or by running the git request-pull command and emailing the output to the project maintainer manually.
$ git request-pull <targetRemoteShortName>/<targetBranch> <myRemote>
Note:
- To keep your forked project up to date, you could add the original project as another remote named upstream in your local clone of the fork. Then, to update your local master branch with what the server has, run git fetch upstream, check out your local master, and run git merge upstream/master to merge in the new changes
Difference between git merge --squash and git rebase
Sometimes, when there were quite a lot of updates on the server’s master branch and you would like your new changes to be based on the latest commit, you could either rebase your featureA branch so that it starts directly from the latest <remoteRepo>/master branch, or you could do a merge --squash.
- doing a rebase produces a cleaner history by actually moving your branch to continue from the <remoteRepo>/master branch. The original diverged commits are replaced by new copies placed on top of it.
- doing a merge --squash leaves your original diverged branch and its history untouched. It works by squashing all the changes from the commit objects on your divergent branch into a single changeset, which you then commit on top of the branch you are on (for example, a new branch started from the latest <remoteRepo>/master).
Note:
- After you do a git merge --squash <targetBranch>, you need to run git commit manually to finish the change.
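A sketch of the squash merge in a scratch repository. For brevity it squashes featureA directly onto a local master rather than onto a branch started from a remote; the file names and messages are assumptions.

```shell
set -e
repo=$(mktemp -d)                          # scratch directory (assumption)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b featureA
echo "step 1" > a.txt
git add a.txt
git commit -q -m "featureA step 1"
echo "step 2" >> a.txt
git commit -q -am "featureA step 2"

git checkout -q master
echo "main" > master.txt
git add master.txt
git commit -q -m "New work on master"

git merge --squash featureA                # stages the combined changes, makes no commit
git commit -q -m "Add featureA as one commit"

# An ordinary commit: its own id plus exactly one parent.
parents=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
git --no-pager log --oneline --graph --all
```

The featureA branch keeps its two original commits, and master gains a single commit that carries their combined changes.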
Notifying Changes via Email
Sometimes, some projects might accept patches via a developer mailing list instead of pull requests. In this case, you prepare your changes for mailing with:
$ git format-patch -M <remoteShortName>/master
For example
$ git format-patch -M testRepo/master
- git format-patch is used to generate the mbox-formatted files that you can email to the list.
This will create one patch file per commit in your current directory, which you can inspect with:
$ cat 0001-TESTING-EMAIL-PATCH.patch
Note:
- You can also add text between the --- line and the beginning of the patch (the diff --git line). The developers can read it, but that content is ignored by the patching process, so it will not show up in the commit message.
To email this to a mailing list, you can either paste the file into your email program or send it via a command-line program. Pasting the text often causes formatting issues, especially with “smarter” clients that don’t preserve newlines and other whitespace appropriately.
Sending it via the command line through an IMAP server
You will need to open your ~/.gitconfig file and add these lines at the end of the file:
[imap]
  folder = "[Gmail]/Drafts"
  host = imaps://imap.gmail.com
  user = user@gmail.com
  pass = YX]8g76G_2^sFbd
  port = 993
  sslverify = false
If your IMAP server doesn’t use SSL, the last two lines probably aren’t necessary, and the host value will be imap:// instead of imaps://. When that is set up, you can use git imap-send to place the patch series in the Drafts folder of the specified IMAP server:
$ cat *.patch | git imap-send
Resolving imap.gmail.com... ok
Connecting to [74.125.142.109]:993... ok
Logging in...
sending 2 messages
100% (2/2) done
At this point, you should be able to go to your Drafts folder, change the To field to the mailing list you’re sending the patch to, possibly CC the maintainer or person responsible for that section, and send it off.
Sending through an SMTP server
Similarly, you will need to edit the ~/.gitconfig file and add at the end:
[sendemail]
  smtpencryption = tls
  smtpserver = smtp.gmail.com
  smtpuser = user@gmail.com
  smtpserverport = 587
After this is done, you can use git send-email to send your patches:
$ git send-email *.patch
0001-add-limit-to-log-function.patch
0002-increase-log-output-to-30-from-25.patch
Who should the emails appear to be from? [Jessica Smith <jessica@example.com>]
Emails will be sent from: Jessica Smith <jessica@example.com>
Who should the emails be sent to? jessica@example.com
Message-ID to be used as In-Reply-To for the first email? y
Maintaining a Project
Basically, your aim is to accept work in a way that is clearest for other contributors and sustainable by you over the long run.
Maintaining using patches from email
- basically, you need to run git apply for a patch generated with git diff, or git am for a patch generated with git format-patch.
Using git apply
If you received the patch from someone who generated it with git diff or some variation of the Unix diff command (which is not recommended; see the next section), you can apply it with the git apply command. Assuming you saved the patch at /tmp/patch-ruby-client.patch, you can apply the patch like this:
$ git apply /tmp/patch-ruby-client.patch
This will also handle file adds, deletes, and renames if they’re described in the git diff format, which the Unix patch command won’t do. Finally, git apply follows an “apply all or abort all” model where either everything is applied or nothing is, whereas patch can partially apply patch files, leaving your working directory in a weird state. git apply is overall much more conservative than patch.
You can also use git apply --check to see whether a patch applies cleanly to your working directory before you actually apply it, for example:
$ git apply --check 0001-see-if-this-helps-the-gem.patch
If there is no output, then the patch should apply cleanly.
Using git am
To apply a patch generated by format-patch, you use git am (the command is named am as it is used to “apply a series of patches from a mailbox”).
- If someone emailed you the patch properly using git send-email (that is, the mail subject and body are the patch itself), and you download that into an mbox format, then you can point git am to that mbox file, and it will start applying all the patches it sees.
- If someone uploaded a patch file generated via git format-patch to a ticketing system or something similar, you can save the file locally and then pass that file saved on your disk to git am to apply it.
However, it’s possible that the patch won’t apply cleanly. Perhaps your main branch has diverged too far from the branch the patch was built from, or the patch depends on another patch you haven’t applied yet. In that case, the git am process will fail and ask you what you want to do:
$ git am 0001-see-if-this-helps-the-gem.patch
This command puts conflict markers in any files it has issues with, much like a conflicted merge or rebase operation. You solve this issue much the same way: edit the file to resolve the conflict, stage the new file, and then run git am --resolved to continue to the next patch in the series of patches that you haven’t yet completed:
$ (fix the file)
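To see the whole round trip in one place, here is a minimal sketch in which a contributor exports a commit with git format-patch and a maintainer applies it with git am. Both repositories live in a temporary directory, and every name in it (contributor, maintainer, lib.txt, the identities) is an assumption made for the demo.

```shell
set -eu
base=$(mktemp -d)                         # scratch area holding both repositories

# The contributor's repository with one starting commit.
git init -q "$base/contributor"
cd "$base/contributor"
git config user.name "Demo Contributor"
git config user.email "contributor@example.com"
printf 'v1\n' > lib.txt
git add lib.txt
git commit -q -m "Initial commit"

# The maintainer's copy starts from the same history.
git clone -q "$base/contributor" "$base/maintainer"

# The contributor adds a commit and exports it as a mailbox-style patch file.
printf 'v2\n' >> lib.txt
git commit -q -am "Add v2 line"
mkdir "$base/patches"
git format-patch -q -1 -o "$base/patches"

# The maintainer applies the patch; git am creates the commit directly.
cd "$base/maintainer"
git config user.name "Demo Maintainer"
git config user.email "maintainer@example.com"
git am -q "$base/patches"/*.patch
subject=$(git log -1 --format=%s)
author=$(git log -1 --format=%an)
echo "$subject (author: $author)"
```

Note that git am keeps the contributor as the author of the new commit, while the maintainer is recorded as the committer.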
Managing Using the Remote Branches
If your contribution came from a Git user who set up their own repository, pushed a number of changes into it, and then sent you the URL to the repository and the name of the remote branch the changes are in, you can add them as a remote and do merges locally.
For example, if a developer named Jessica sends you an email saying that she has a great new feature in the ruby-client branch of her repository, you can test it by adding the remote and checking out that branch locally:
$ git remote add jessica git://github.com/jessica/myproject.git
Note:
- After adding the remote, you run git fetch instead of a direct git pull because you might not want to automatically merge her work into any of your branches. You might want to see if her work functions properly before you run merge.
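The add-remote, fetch, and check-out sequence can be sketched end to end as follows. A local directory stands in for Jessica’s published repository, and the local branch name rubyclient and the file names are assumptions for the demo.

```shell
set -eu
base=$(mktemp -d)

# Stand-in for Jessica's repository (a local path instead of a URL).
git init -q "$base/jessica"
cd "$base/jessica"
git symbolic-ref HEAD refs/heads/master   # call the first branch master
git config user.name "Jessica Smith"
git config user.email "jessica@example.com"
printf 'base\n' > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Your repository, cloned before she started the new feature.
git clone -q "$base/jessica" "$base/mine"

# Jessica builds her feature on the ruby-client branch.
git checkout -q -b ruby-client
printf 'client\n' > client.txt
git add client.txt
git commit -q -m "Add ruby client"

# Back in your repository: add her remote, fetch, and inspect the branch.
cd "$base/mine"
git remote add jessica "$base/jessica"
git fetch -q jessica                      # downloads her branches, merges nothing
git checkout -q -b rubyclient jessica/ruby-client
current=$(git symbolic-ref --short HEAD)
echo "now on $current"
```

Your master branch is untouched at this point; her work only reaches it if you later decide to merge rubyclient.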
However, if you aren’t working with a person consistently and just want to do a one-time pull, you can pass the URL of their remote repository to the git pull command. This does a one-time pull and doesn’t save the URL as a remote reference:
$ git pull https://github.com/onetimeguy/project
Note:
- Here the command git pull https://github.com/onetimeguy/project fetches the remote repository’s default branch (its HEAD) and merges it into your current branch. If you want to pull a specific branch, add its name like this: git pull https://github.com/onetimeguy/project <remoteBranchName>
Determining What is Changed
Now you have a topic branch that contains contributed work. At this point, you can determine what you’d like to do with it. This section revisits a couple of commands so you can see exactly what you’ll be introducing if you merge this into your main branch.
To see only the commits that are on the topic branch but not on your master branch (for example), you could run:
$ git log <branch1> --not <branch2>
In this example:
$ git log topic --not master
commit 5b6235bd297351589efc4d73316f0a68d484f118
Author: Scott Chacon <schacon@gmail.com>
Date:   Fri Oct 24 09:53:59 2008 -0700
    See if this helps the gem
commit 7482e0d16d04bea79d0dba8988cc78df655f16a0
Author: Scott Chacon <schacon@gmail.com>
Date:   Mon Oct 22 19:38:36 2008 -0700
    Update gemspec to hopefully work better
Note
- To see what changes each commit introduces, remember that you can pass the -p option to git log and it will append the diff introduced by each commit.
The more useful case might be seeing a full diff of what would happen if you merge the topic branch. So you might want to run:
$ git diff <branch1>...<branch2>
In this example:
$ git diff master...topic
This command shows you only the work your current topic branch has introduced since its common ancestor with master. The triple-dot syntax is shorthand for the following, run while you are on the topic branch:
$ git merge-base topic master
36c7dba2c95e6bbb78dfa822519ecfec6e1ca649
$ git diff 36c7db
or, more concisely:
$ git diff $(git merge-base topic master)
Note:
- You might think there is an easier way by doing git diff master. This might work, but not always. This is because this command gives you a diff by comparing the last commit of the topic branch you’re on and the last commit on the master branch. This means that if your master branch has moved forward since you created the topic branch from it, then you’ll get seemingly strange results. For example, if you’ve added a line in a file on the master branch, a direct comparison of the snapshots will look like the topic branch is going to remove that line.
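The difference between these commands is easiest to see on a tiny history. The sketch below (branch and file names are assumptions) creates a topic branch, lets master move on afterwards, and then compares the three views.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'shared\n' > shared.txt
git add shared.txt
git commit -q -m "Initial commit"

git checkout -q -b topic
printf 'topic work\n' > topic.txt
git add topic.txt
git commit -q -m "Add topic work"

# master moves forward after topic branched off.
git checkout -q master
printf 'master work\n' > master.txt
git add master.txt
git commit -q -m "Add master work"

git checkout -q topic
only_on_topic=$(git log --format=%s topic --not master)
three_dot=$(git diff --name-only master...topic)
tip_to_tip=$(git diff --name-only master)
echo "commits only on topic: $only_on_topic"
echo "three-dot diff touches: $three_dot"
echo "tip-to-tip diff touches:" $tip_to_tip
```

The three-dot diff lists only topic.txt, while the plain git diff master also lists master.txt, which is the “topic seems to remove master’s work” effect described in the note.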
Determining your Merging Workflow
Basically, you need to decide how to manage all your merge and rebase actions, so that changes are easy to track and the history is easy to understand. There are several example styles that you could use for managing your project merges:
- Merging Workflow for small projects
- Merging Workflow for large projects
- Rebasing and Cherry-Picking Workflows
Small Merging Workflow
The simplest way is to git merge contributed work into your master branch all the time. This can produce a history that is unpleasant to review, but it is often manageable for small projects. Typically you will have a main track with two branches, master and develop, in which master is updated only when a very stable release is cut and all new code is integrated into the develop branch.
You therefore merge a topic branch into develop first, before deciding whether it is ready to be released on master. If you then decide to release it, you fast-forward master to wherever the now-stable develop branch is.
Large Merging Workflow
This workflow has four branches on the main track: master, next, pu, and maint.
- The master branch is still the one with the most stable version of the system.
- The next branch collects the topic/feature branches that you got from your collaborators’ repos and have tested to work properly, while you consider whether to include them.
- The pu branch means proposed updates; it holds branches with work that is interesting but needs more work.
- The maint branch usually stores the project at the state of the last release, so that it can be forked off to provide backported patches in case a maintenance release is required.
For example, you might have several topic branches from your collaborators’ repos, which you then organize and maintain by merging them into your main branches (maint is not covered here).
Rebasing and Cherry-Picking Workflow
Other maintainers prefer to rebase or cherry-pick contributed work on top of their master branch, rather than merging it in, to keep a mostly linear history. When you have work in a topic branch and have determined that you want to integrate it, you move to that branch and run the rebase command to rebuild the changes on top of your current master (or develop, and so on) branch. If that works well, you can fast-forward your master branch, and you’ll end up with a linear project history.
Alternatively, you can cherry-pick, which reapplies the changes introduced by one or more commits as new commits on top of your current branch. The original branch is left untouched, and you can choose which commits to incorporate (for example, only a few commits from that branch rather than all of them).
For example, if you are given this:
      A---B---C topic
     /
D---E---F---G master
Then a rebase command:
$ git rebase master topic
You get:
              A'--B'--C' topic
             /
D---E---F---G master
Or, starting again from the original history, run cherry-pick <from-commit-id>^..<to-commit-id> on a new branch created from master:
$ git checkout -b topic_new master
$ git cherry-pick A^..C
You get:
      A---B---C topic
     /
D---E---F---G master
             \
              A'--B'--C' topic_new
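The two diagrams can be reproduced with a short script. This is only a sketch: the add_commit helper is hypothetical, the commits are named after the letters in the diagrams, and the cherry-pick is done first so that topic still holds its original commits at that point.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds a file named after its message.
add_commit() {
    printf '%s\n' "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

add_commit D; add_commit E
git checkout -q -b topic
add_commit A; add_commit B; add_commit C
git checkout -q master
add_commit F; add_commit G

# Cherry-pick: copy A..C onto a new branch at master; topic is untouched.
first=$(git rev-list master..topic | tail -n 1)   # oldest commit unique to topic (A)
git checkout -q -b topic_new master
git cherry-pick "$first^..topic" > /dev/null
picked=$(git rev-list --count master..topic_new)

# Rebase: replay topic's commits onto master, moving the topic branch itself.
git rebase -q master topic
fork_point=$(git merge-base master topic)

# With a linear history, master can simply fast-forward to topic.
git checkout -q master
git merge -q --ff-only topic
echo "picked $picked commits; master is now at $(git log -1 --format=%s)"
```

After the rebase, the common ancestor of master and topic is the old tip of master (G), which is why the final merge needs no merge commit.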
Rerere in Git
Rerere stands for “reuse recorded resolution” — it’s a way of shortcutting manual conflict resolution. When rerere is enabled, Git will keep a set of pre- and post-images from successful merges, and if it notices that there’s a conflict that looks exactly like one you’ve already fixed, it’ll just use the fix from last time, without bothering you with it.
This feature comes in two parts: a configuration setting and a command. The configuration setting is rerere.enabled, and it’s handy enough to put in your global config:
$ git config --global rerere.enabled true
Now, whenever you do a merge that resolves conflicts, the resolution will be recorded in the cache in case you need it in the future.
If you need to, you can interact with the rerere cache using the git rerere command. When it’s invoked alone, Git checks its database of resolutions and tries to find a match with any current merge conflicts and resolve them (although this is done automatically if rerere.enabled is set to true).
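A small experiment makes the recorded resolution visible. In this sketch (file contents and branch names are assumptions, and rerere is enabled only in the demo repository rather than globally), the same conflicting merge is done twice; the second time Git reuses the earlier fix.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"
git config rerere.enabled true            # local to this demo repository

printf 'hello\n' > greeting.txt
git add greeting.txt
git commit -q -m "Initial commit"

# Two branches change the same line, which guarantees a conflict.
git checkout -q -b topic
printf 'hola\n' > greeting.txt
git commit -q -am "Topic greeting"
git checkout -q master
printf 'bonjour\n' > greeting.txt
git commit -q -am "Master greeting"

# First merge conflicts; resolve it by hand and commit so rerere records the fix.
git merge topic > /dev/null 2>&1 || true
printf 'hola, bonjour\n' > greeting.txt
git add greeting.txt
git commit -q -m "Merge topic"

# Undo the merge and repeat it: the file comes back already resolved.
git reset -q --hard HEAD^
git merge topic > /dev/null 2>&1 || true
replayed=$(cat greeting.txt)
echo "$replayed"
```

The second merge still stops and reports a conflict, so you review the reused resolution, stage the file, and commit as usual.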
Generating a Build Number
If you run git describe on a commit, Git generates a string consisting of the name of the most recent tag earlier than that commit, followed by the number of commits since that tag, followed finally by a partial SHA-1 value of the commit being described (prefixed with the letter “g” meaning Git):
$ git describe
Note
- By default, the git describe command requires annotated tags (tags created with the -a or -s flag); if you want to take advantage of lightweight (non-annotated) tags as well, add the --tags option to the command.
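As an illustration of the three parts of that string, the sketch below tags a first commit as v1.0 (an assumed tag name), adds two more commits, and then asks for a description of the current commit.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'one\n' > file.txt
git add file.txt
git commit -q -m "First commit"
git tag -a v1.0 -m "Release 1.0"          # annotated tag, as git describe expects

printf 'two\n' >> file.txt
git commit -q -am "Second commit"
printf 'three\n' >> file.txt
git commit -q -am "Third commit"

build=$(git describe)
echo "$build"                             # has the form v1.0-2-g<abbreviated hash>
```

If the described commit is exactly the tagged one, git describe prints just the tag name, without the count and hash suffix.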
Preparing a Release Using a Zip/Gzip
Now you want to release a build. One of the things you’ll want to do is create an archive of the latest snapshot of your code for those poor souls who don’t use Git. The command to do this is git archive.
To create a gzipped tarball whose name is the build number generated in the previous section, you do:
$ git archive master --prefix='projectArchive/' | gzip > `git describe master`.tar.gz
$ ls *.tar.gz
v1.0-3-gca2e64e.tar.gz
where --prefix='projectArchive/' (note the trailing slash) means that if you unpack that archive, you will find a directory called projectArchive in which all your files are saved.
You can also create a zip archive in much the same way, by passing the --format=zip option to git archive:
$ git archive master --format=zip > `git describe master`.zip
Because no --prefix is given here, all the files sit directly at the top level of that zip, without an enclosing directory.
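To check what the prefix does, you can list the contents of the generated tarball. The following sketch uses a one-file repository tagged v1.0 (both assumptions for the demo) and the same git archive pipeline as above.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'hello\n' > README
git add README
git commit -q -m "Initial commit"
git tag -a v1.0 -m "Release 1.0"

name=$(git describe master)               # v1.0 here: master sits exactly on the tag
git archive master --prefix='projectArchive/' | gzip > "$name.tar.gz"

# List the archive without unpacking it.
listing=$(gzip -dc "$name.tar.gz" | tar -tf -)
echo "$listing"
```

Every path in the listing starts with projectArchive/, so unpacking the tarball never scatters files into the current directory.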
Shortlog
The git shortlog command summarizes all the commits in the range you give it, grouped by author; for example, the following gives you a summary of all the commits since your last release, if your last release was named v0.1:
$ git shortlog --no-merges master --not v0.1
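A quick way to see the shape of that summary is to run it on a small history. In this sketch the author name, the tag v0.1, and the commit messages are all made up.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'a\n' > a.txt
git add a.txt
git commit -q -m "Initial release"
git tag -a v0.1 -m "Release 0.1"

printf 'b\n' > b.txt
git add b.txt
git commit -q -m "Add b"
printf 'c\n' > c.txt
git add c.txt
git commit -q -m "Add c"

# Only the two commits made after v0.1 are summarized.
summary=$(git shortlog --no-merges master --not v0.1)
echo "$summary"
```

The output starts with the author name and a commit count in parentheses, followed by one indented line per commit subject, which makes it a convenient starting point for release notes.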
Git Commit Selection Tools
Now you’ll explore a number of very powerful things that Git can do that you may not necessarily use on a day-to-day basis but that you may need at some point.
Revision Selection Type | Command | Outcome/Example | Note |
---|---|---|---|
Showing a single commit using a short SHA-1 | git show <commit-sha-1> | git show ae1934f | - |
Showing a single commit at the top of a branch | git show <branchName> | git show topic1 shows the top commit of branch topic1 | You can use git rev-parse <branchName> to get the SHA-1 of the top commit of a branch |
Showing a single commit using the reflog | git show <reflog-data-of-HEAD> | git show HEAD@{5} | The reflog is a local record of where your HEAD has pointed, which you can list by running git reflog |
Showing a single commit using an ancestry reference with ^ (meaning the parent of something) | git show <commit-id>^ | git show HEAD^ shows the parent of HEAD | A number after ^ picks among the parents of a merge commit: d921970^2 means “the second parent of d921970”. To go back several generations, use ~ instead: HEAD~2 is the grandparent of HEAD |
Showing multiple commits using the difference of two branches | git log <branch1>..<branch2>, which means “all commits on <branch2> that are not reachable from <branch1>” | git log master..topic1 | The order of the two <branch> names matters in this command. |
Showing multiple commits using the symmetric difference of two branches | git log <branch1>...<branch2>, which selects all the commits that are reachable from either of the two references but not from both of them (all the divergent commits) | git log master...topic1 | A common switch to use with the log command in this case is --left-right, which shows you which side of the range each commit is on. This helps make the output more useful: $ git log --left-right master...experiment |
Note:
- Whenever you want to see multiple commits, git log is a better choice than git show, because git log --oneline gives a compact listing with one line per commit, whereas git show prints the full diff of each commit by default and is meant for looking at a specific commit.
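The selectors in the table can be tried out on a small diverged history. This is a sketch only: the add_commit helper, the branch name topic1, and the commit names are assumptions.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch master
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: one commit that adds a file named after its message.
add_commit() {
    printf '%s\n' "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

add_commit base
git checkout -q -b topic1
add_commit topic-1; add_commit topic-2
git checkout -q master
add_commit master-1

parent=$(git rev-parse topic1^)            # first parent of the tip of topic1
grandparent=$(git rev-parse topic1~2)      # two generations back
only_topic=$(git rev-list --count master..topic1)
only_master=$(git rev-list --count topic1..master)
divergent=$(git rev-list --count master...topic1)
echo "only on topic1: $only_topic, only on master: $only_master, divergent: $divergent"

# Each line is marked < (master side) or > (topic1 side).
git log --oneline --left-right master...topic1
```

Swapping the names in the double-dot range changes which side is counted, while the triple-dot range is symmetric and always covers both sides.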
Git Tools for Interactive Staging
These can help you craft your commits to include only certain combinations and parts of files. These tools are helpful if you modify a number of files extensively, then decide that you want those changes to be partitioned into several focused commits rather than one big messy commit.
Basically, interactive staging is initiated if you call git add -i:
$ git add -i
Here, you will see, for each file, the changes that are staged, the changes that are unstaged, and the file path.
Command | Function | Example |
---|---|---|
1 or s | Shows the same content as you see above, with the updated status | - |
2 or u | First lets you select file(s) of interest, and then stages them | When you type 2 or u, you will see the Update>> prompt; type 1,2 to select the first and second files and press Enter, and the selected files are marked with *. Then press Enter again on the empty prompt to stage them. |
3 or r | First lets you select file(s) of interest, and then undoes their staging | Similar to the above |
5 or p | First lets you select file(s) of interest, and then goes through each of the changes in those files and lets you decide whether to stage it | Similar to the above, but for each change you are prompted with Stage this hunk [y,n,a,d,/,j,J,g,e,?]; typing ? shows what each letter means |
6 or d | First lets you select staged file(s) of interest, and then shows you the staged diff | Similar to the above |
Git Tools for Stashing
When things are in a messy state and you want to switch branches for a bit to work on something else, but you don’t want to do a commit of half-done work just so you can get back to this point later, you can use git stash or git stash push.
For example, you have some modifications done but need to switch urgently to something else:
$ git status
Then you do:
$ git stash push
At this point, you can switch branches and do work elsewhere; your changes are stored on your stack. To see which stashes you’ve stored, you can use git stash list:
$ git stash list
Then, when you have finished your urgent work, you can reapply a stash with git stash apply stash@{<number>}, which reapplies the stashed changes on top of whatever branch you currently have checked out (not necessarily the branch you stashed from).
For example:
$ git stash apply --index stash@{2}
The --index option is often added because, by default, reapplying a stash restores the changes to your files but does not restage the files you had staged. To also restage what you had staged, you must run the git stash apply command with the --index option, which tells the command to try to reapply the staged changes.
Note:
- Similar to rebase or merge, there could be conflicts, because the changes might not reapply cleanly if the files you edited in your stash were edited again. To resolve the conflicts, the same mechanism as for merge conflicts will pop up for you to deal with.
- After you have applied a stash, remember to drop it (i.e. delete it), because by default git stash apply only applies the changes, and you have to delete the applied stash manually with git stash drop stash@{<number>}.
- By default, git stash will stash only modified and staged tracked files. If you specify --include-untracked or -u, you can also stash untracked files that are not ignored. To include even the ignored files, use --all or -a instead.
You can also use the interactive stash by running git stash --patch (not -i). This will prompt you interactively for which of the changes you would like to stash and which you would like to keep in your working directory.
$ git stash --patch
A last option is to create a new branch for recovering your stashed work. This is done with git stash branch <newBranchname>, which creates a new branch with the name you chose, checks out the commit you were on when you stashed your work, reapplies your work there, and then drops the stash if it applies successfully:
$ git stash branch testchanges
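The stash, apply, and drop cycle described above can be sketched as one script. The file names are assumptions, and --include-untracked is used so that the untracked file is stashed as well.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'one\n' > tracked.txt
git add tracked.txt
git commit -q -m "Initial commit"

# Half-done work: one staged change and one untracked file.
printf 'two\n' >> tracked.txt
git add tracked.txt
printf 'scratch\n' > untracked.txt

git stash push -q --include-untracked     # the working directory is clean again
clean=$(git status --porcelain)
count=$(git stash list | wc -l | tr -d ' ')

# ... switch branches and do the urgent work here ...

git stash apply -q --index "stash@{0}"    # restore the work and restage what was staged
restored=$(git status --porcelain)
git stash drop -q "stash@{0}"             # apply keeps the stash, so drop it by hand
remaining=$(git stash list)
echo "$restored"
```

Without --index, the change to tracked.txt would come back as an unstaged modification instead of a staged one.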
Git Tools for Cleaning Up Your Workspace
You can use git clean to delete files that are not tracked. However, you’ll want to be pretty careful with this command, since if you change your mind, there is often no retrieving the content of those files.
Therefore, you will often want to run git clean --dry-run or git clean -n first, to see what would be removed before you run git clean for real:
$ git clean --dry-run
However, these do not include ignored files. To include those files for removal, you need to add the option -x:
$ git clean --dry-run -x
Note
- You will usually need to add the -f flag to actually run git clean, because it is required unless the Git configuration variable clean.requireForce is explicitly set to false.
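The effect of -n, -x, and -f can be compared on a scratch repository. In this sketch the ignore pattern *.log and the file names are assumptions.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
printf '*.log\n' > .gitignore
git add .gitignore
git commit -q -m "Ignore log files"

printf 'tmp\n' > scratch.txt              # untracked
printf 'log\n' > debug.log                # untracked and ignored

preview=$(git clean -n)                   # dry run: ignored files are left out
preview_x=$(git clean -n -x)              # dry run that also covers ignored files
echo "$preview"
echo "$preview_x"

git clean -f -q                           # really delete untracked, non-ignored files
ls -A
```

After the real run, scratch.txt is gone while debug.log survives, because the ignored file would only be removed with -x.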
Git Tools for Signing a Commit
In more recent versions of Git (v1.7.9 and above), you can also sign individual commits. If you’re interested in signing commits directly instead of just the tags, all you need to do is add -S to your git commit command.
However, first you need to get GPG configured and your personal key installed.
If you don’t have a key installed, you can generate one with gpg --gen-key.
$ gpg --gen-key
Then you can check it with:
$ gpg --list-keys
Once you have a private key to sign with, you can configure Git to use it for signing things by setting the user.signingkey config setting.
$ git config --global user.signingkey 0A46826A
Now you can sign your commit with -S. GPG will prompt you for the passphrase you chose when you created the key; how the prompt appears depends on your GPG setup.
$ git commit -a -S -m 'Signed commit'