git is a (the!) version control system. In this article I'll try to show
briefly how to improve your usage of it.
I assume you already know git clone, git commit, git push, git pull and
git branch.
Why Version Control
When you write software, you usually can't write it from beginning to end in one go. Instead, you work in different directions. One day you work on feature A; the next day you have to fix something, but you have already changed part of the code for A without finishing it. Hence you need to go back.
Or you are not sure if the thing you're trying to do actually improves things. So you want to save the current state to be able to go back.
In the past, people did so by copy-and-paste. But then you end up with
directories in your file system like awesome-project, awesome-project-copy,
awesome-project-1990-04-28, awesome-project-feature-a, ...
So you want to be able to manage that properly and jump back and compare different versions.
Change steps
- Working directory: git add to add it to staging
- Staging: git commit to move it to the local repository
- Local repository: git push to move it to a remote repository, git fetch to get it from remote.
- Remote repository
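The four stages above can be walked through in a throwaway repository. This is only a minimal sketch: the temporary directories, the file name notes.txt and the identity "Example Dev" are made up for illustration, and a local bare repository stands in for the remote server.

```shell
# Scratch area with a bare repository that plays the role of the remote.
workdir=$(mktemp -d)
git init -q --bare "$workdir/remote.git"
git init -q "$workdir/local"
cd "$workdir/local"
git config user.name "Example Dev"          # hypothetical identity, local to this repo
git config user.email "dev@example.com"
git remote add origin "$workdir/remote.git"

echo "first note" > notes.txt               # working directory: the file is untracked
git add notes.txt                           # working directory -> staging
staged=$(git status --short)                # "A  notes.txt" means: staged as new file

git commit -q -m "ENH: Add notes"           # staging -> local repository
branch=$(git rev-parse --abbrev-ref HEAD)   # name of the current branch
git push -q origin HEAD                     # local repository -> remote repository
git fetch -q origin                         # remote repository -> local repository

local_head=$(git rev-parse HEAD)
remote_head=$(git rev-parse "origin/$branch")
```

After the push, local_head and remote_head hold the same hash, which is how you can tell that the remote has caught up with your local repository.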
What it's not for
In general, I would not use git for things where you can't reasonably make a diff. This includes:
- Binary or hard-to-diff files: compiled code, IPython notebooks (technically JSON, but with embedded outputs that diff poorly)
- Big files (> 5 MB): Data.
- Auto-generated files.
Of course, there are exceptions. Just keep it practical.
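One way to keep such files out of a repository is a .gitignore file. The sketch below is only an illustration: the patterns and the file names (module.pyc, data/, build/, main.py) are assumptions, not a recommendation for every project.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Hypothetical ignore rules: compiled code, raw data and generated output.
cat > .gitignore <<'EOF'
*.pyc
data/
build/
EOF

mkdir -p data build
touch module.pyc data/measurements.csv build/report.html main.py

# git check-ignore exits with 0 if the path is ignored and with 1 otherwise.
git check-ignore -q module.pyc && pyc_ignored=yes || pyc_ignored=no
git check-ignore -q main.py && main_ignored=yes || main_ignored=no

# Ignored files do not show up as untracked; only .gitignore and main.py do.
untracked=$(git status --short | wc -l)
```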
Commit Messages
Commit messages are intended for developers to understand what was changed.
I like the SciPy commit message guide. They start every message with a prefix. Copied from this page, here is what their messages look like:
ENH: add functionality X to numpy.<submodule>.

The first line of the commit message starts with a capitalized acronym
(options listed below) indicating what type of commit this is. Then a blank
line, then more text if needed. Lines shouldn't be longer than 72
characters. If the commit is related to a ticket, indicate that with
"See #3456", "See ticket 3456", "Closes #3456" or similar.
The prefixes I suggest are:
API: an (incompatible) API change
BUG: bug fix
DEP: deprecate something, or remove a deprecated object
DEV: development tool or utility
DOC: documentation
ENH: enhancement
MAINT: maintenance commit (refactoring, typos, etc.)
REV: revert an earlier commit
STY: style fix (whitespace, PEP8)
TST: addition or modification of tests
REL: related to releasing the project, e.g. change of a version number
I also suggest keeping each line of those messages shorter than 80 characters.
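If you want to check these conventions automatically, a small shell function is enough. This is just a sketch: the name check_subject is made up, and it only looks at the first line of a message (prefix from the list above, shorter than 80 characters).

```shell
# Returns 0 if the subject line starts with one of the prefixes listed above
# and is shorter than 80 characters, 1 otherwise.
check_subject() {
    subject=$1
    case "$subject" in
        "API: "*|"BUG: "*|"DEP: "*|"DEV: "*|"DOC: "*|"ENH: "*) ;;
        "MAINT: "*|"REV: "*|"STY: "*|"TST: "*|"REL: "*) ;;
        *) return 1 ;;
    esac
    [ "${#subject}" -lt 80 ]
}

check_subject "ENH: Add prime number generator" && echo "ok"
check_subject "added some stuff" || echo "rejected"
```

Such a function could be called from a git hook or from a CI job; where you run it is a matter of taste.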
Services like GitHub can directly connect commits with issues if you reference
the issues like issue #123 in the commit message.
See also:
- WhatTheCommit.com: Funny commit messages
git log
git log shows you the commit messages and other logged changes:
commit e44ed1e8e5bdcada8534369c5135e77677253de1 (HEAD -> master, origin/master, origin/HEAD)
Author: Martin Thoma <[email protected]>
Date:   Fri Jul 6 07:21:10 2018 +0200

    DOC: Fix documentation of string return values

commit 8cf8c8d4b168ef3b0a32617917195a352f4ce723
Author: Martin Thoma <[email protected]>
Date:   Fri Jul 6 07:15:32 2018 +0200

    REL: v0.9.0

commit de71db384b5e08c65067937744bfe26a822796b1
Author: Martin Thoma <[email protected]>
Date:   Fri Jul 6 07:13:33 2018 +0200

    ENH: Add string.str2bool_or_none, str2float_or_none, str2int_or_none
    Closes #31

commit 31bd14cbe7e317226ff7be15ec7d68df93665688
Author: Martin Thoma <[email protected]>
Date:   Wed Jul 4 22:46:16 2018 +0200

    ENH: Add prime number generator

commit bb56e82fa402827b3de8140783e0c4fc95e3699d
Author: Martin Thoma <[email protected]>
Date:   Wed Jul 4 07:21:02 2018 +0200

    DOC: document mpu.path

commit 9d929238ef04ecb866775b3e687d782090c484bf
Author: Martin Thoma <[email protected]>
Date:   Wed Jul 4 06:51:40 2018 +0200

    STY: Fix issues discovered by codacy

commit c31f575d3835049c65b8ab6c7fe909191779a912 (tag: v0.8.0)
Author: Martin Thoma <[email protected]>
Date:   Tue Jul 3 23:22:37 2018 +0200

    REL: Release v0.8.0

commit f3f7f1192a7b495f74156977bda1bd220a87e4eb
Author: Martin Thoma <[email protected]>
Date:   Tue Jul 3 23:21:01 2018 +0200

    ENH: Add mpu.path.get_all_files
    See issue #14
You can also adjust the output to your needs:
$ git log --pretty=oneline
e44ed1e8e5bdcada8534369c5135e77677253de1 (HEAD -> master, origin/master, origin/HEAD) DOC: Fix documentation of string return values
8cf8c8d4b168ef3b0a32617917195a352f4ce723 REL: v0.9.0
de71db384b5e08c65067937744bfe26a822796b1 ENH: Add string.str2bool_or_none, str2float_or_none, str2int_or_none
31bd14cbe7e317226ff7be15ec7d68df93665688 ENH: Add prime number generator
bb56e82fa402827b3de8140783e0c4fc95e3699d DOC: document mpu.path
9d929238ef04ecb866775b3e687d782090c484bf STY: Fix issues discovered by codacy
c31f575d3835049c65b8ab6c7fe909191779a912 (tag: v0.8.0) REL: Release v0.8.0
f3f7f1192a7b495f74156977bda1bd220a87e4eb ENH: Add mpu.path.get_all_files
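Other formatting options exist as well. As a small sketch in a throwaway repository (the commit messages and the identity are made up), --oneline abbreviates the hashes and --pretty=format: lets you pick individual fields:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

# Empty commits are enough to have some history to look at.
git commit -q --allow-empty -m "ENH: Add feature A"
git commit -q --allow-empty -m "DOC: Document feature A"

git log --oneline                        # abbreviated hash and subject
git log --pretty=format:'%h %an: %s'     # abbreviated hash, author name, subject
git log -n 1 --pretty=format:'%s'        # only the subject of the latest commit

latest_subject=$(git log -n 1 --pretty=format:'%s')
entries=$(git log --oneline | wc -l)
```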
git diff
git diff compares two commits, e.g.
$ git diff e44ed1e8e5bdcada8534369c5135e77677253de1..8cf8c8d4b168ef3b0a32617917195a352f4ce723
Looking up the git hashes (e44ed1... and 8cf8c8d...) is cumbersome, so
there are shortcuts. For example, HEAD is the commit you currently have
checked out, usually the latest one on your branch. And ^ gives you the
commit before it (its parent). Hence
$ git diff HEAD^..HEAD
compares the current version with the one before it.
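To go back more than one step, HEAD~2 refers to the commit two steps before HEAD. Here is a sketch with a made-up file.txt in a throwaway repository:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "ENH: Add file"
echo "version 2" > file.txt
git commit -q -am "ENH: Update file"
echo "version 3" > file.txt
git commit -q -am "ENH: Update file again"

git diff HEAD^..HEAD             # latest commit compared with its parent
git diff HEAD^ HEAD              # the same comparison, written with a space
git diff --stat HEAD~2..HEAD     # two commits back compared with now, summary only

changed=$(git diff --name-only HEAD~2..HEAD)
removed=$(git diff HEAD^..HEAD | grep -c '^-version 2')
```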
git blame
git blame gives you an overview of who last edited which line. It can be used
to "blame" somebody for a change. Or, better, to find the person to ask why
something was done the way it currently is in the repository.
For a part of filters.py in SciPy's ndimage module it looks like this:
e2fbe76393 scipy/ndimage/filters.py (Jaime Fernandez 2018-03-07 01:15:11 +0100 298) @_ni_docstrings.docfiller
a1a629221f scipy/ndimage/filters.py (Tim Leslie 2013-04-12 11:36:45 +0000 299) def prewitt(input, axis=-1, output=None, mode="reflect", cval=0.0):
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield 2006-03-18 13:52:58 +0000 300) """Calculate a Prewitt filter.
a4eba7aeaf scipy/ndimage/filters.py (Matthew Brett 2008-12-14 11:51:37 +0000 301)
a4eba7aeaf scipy/ndimage/filters.py (Matthew Brett 2008-12-14 11:51:37 +0000 302) Parameters
a4eba7aeaf scipy/ndimage/filters.py (Matthew Brett 2008-12-14 11:51:37 +0000 303) ----------
a4eba7aeaf scipy/ndimage/filters.py (Matthew Brett 2008-12-14 11:51:37 +0000 304) %(input)s
a4eba7aeaf scipy/ndimage/filters.py (Matthew Brett 2008-12-14 11:51:37 +0000 305) %(axis)s
a4eba7aeaf scipy/ndimage/filters.py (Matthew Brett 2008-12-14 11:51:37 +0000 306) %(output)s
7fb0a279ac scipy/ndimage/filters.py (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000 307) %(mode_multiple)s
a4eba7aeaf scipy/ndimage/filters.py (Matthew Brett 2008-12-14 11:51:37 +0000 308) %(cval)s
d197708c0c scipy/ndimage/filters.py (Martin Thoma 2016-06-12 17:18:59 +0200 309)
d197708c0c scipy/ndimage/filters.py (Martin Thoma 2016-06-12 17:18:59 +0200 310) Examples
d197708c0c scipy/ndimage/filters.py (Martin Thoma 2016-06-12 17:18:59 +0200 311) --------
d197708c0c scipy/ndimage/filters.py (Martin Thoma 2016-06-12 17:18:59 +0200 312) >>> from scipy import ndimage, misc
d197708c0c scipy/ndimage/filters.py (Martin Thoma 2016-06-12 17:18:59 +0200 313) >>> import matplotlib.pyplot as plt
f0f55115ff scipy/ndimage/filters.py (Martin Thoma 2016-11-13 12:14:38 +0100 314) >>> fig = plt.figure()
f0f55115ff scipy/ndimage/filters.py (Martin Thoma 2016-11-13 12:14:38 +0100 315) >>> plt.gray() # show the filtered result in grayscale
f0f55115ff scipy/ndimage/filters.py (Martin Thoma 2016-11-13 12:14:38 +0100 316) >>> ax1 = fig.add_subplot(121) # left side
f0f55115ff scipy/ndimage/filters.py (Martin Thoma 2016-11-13 12:14:38 +0100 317) >>> ax2 = fig.add_subplot(122) # right side
d197708c0c scipy/ndimage/filters.py (Martin Thoma 2016-06-12 17:18:59 +0200 318) >>> ascent = misc.ascent()
d197708c0c scipy/ndimage/filters.py (Martin Thoma 2016-06-12 17:18:59 +0200 319) >>> result = ndimage.prewitt(ascent)
f0f55115ff scipy/ndimage/filters.py (Martin Thoma 2016-11-13 12:14:38 +0100 320) >>> ax1.imshow(ascent)
f0f55115ff scipy/ndimage/filters.py (Martin Thoma 2016-11-13 12:14:38 +0100 321) >>> ax2.imshow(result)
f0f55115ff scipy/ndimage/filters.py (Martin Thoma 2016-11-13 12:14:38 +0100 322) >>> plt.show()
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield 2006-03-18 13:52:58 +0000 323) """
ee4db9c301 Lib/ndimage/filters.py (Robert Kern 2006-09-24 09:05:13 +0000 324) input = numpy.asarray(input)
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield 2006-03-18 13:52:58 +0000 325) axis = _ni_support._check_axis(axis, input.ndim)
6f3089c43b scipy/ndimage/filters.py (Jaime Fernandez 2018-02-22 00:47:21 +0100 326) output = _ni_support._get_output(output, input)
7fb0a279ac scipy/ndimage/filters.py (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000 327) modes = _ni_support._normalize_sequence(mode, input.ndim)
7fb0a279ac scipy/ndimage/filters.py (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000 328) correlate1d(input, [-1, 0, 1], axis, output, modes[axis], cval, 0)
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield 2006-03-18 13:52:58 +0000 329) axes = [ii for ii in range(input.ndim) if ii != axis]
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield 2006-03-18 13:52:58 +0000 330) for ii in axes:
7fb0a279ac scipy/ndimage/filters.py (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000 331) correlate1d(output, [1, 1, 1], ii, output, modes[ii], cval, 0,)
6f3089c43b scipy/ndimage/filters.py (Jaime Fernandez 2018-02-22 00:47:21 +0100 332) return output
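For long files you usually only care about a few lines. The -L option restricts the output to a line range. A sketch with a made-up poem.txt and two hypothetical authors:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.email "dev@example.com"

# Alice writes the file, Bob later changes the second line.
printf 'line one\nline two\nline three\n' > poem.txt
git add poem.txt
git -c user.name="Alice Example" commit -q -m "ENH: Add poem"

printf 'line one\nline 2\nline three\n' > poem.txt
git -c user.name="Bob Example" commit -q -am "STY: Use a digit"

git blame poem.txt               # who last touched each line
git blame -L 2,3 poem.txt        # only lines 2 to 3

# --porcelain gives a machine-readable format with one "author <name>" line.
line1_author=$(git blame -L 1,1 --porcelain poem.txt | sed -n 's/^author //p')
line2_author=$(git blame -L 2,2 --porcelain poem.txt | sed -n 's/^author //p')
```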
git squash
The git commit history should be kept clean. If you develop a new feature you
might want to push the latest state - even if something is not working - from
time to time to the server so that your co-workers can have a look at it. But in
the master branch it would be nice if every single commit was a single new
feature.
Commit squashing allows you to combine multiple commits into a single one. There is no git squash command as such; squashing is typically done with git rebase -i or git merge --squash. This rewrites the version history. If the original commits were already pushed, you need to force-push, which will cause problems for everybody who had the version you were changing.
So as a general guide:
- Never force-push on master
- Avoid force-push on branches where multiple people work
- Squash commits if you are the only one working on the branch and if it makes it easier to understand what you did.
Similar, but simpler, is git commit --amend. It lets you add something to the
latest commit in the current branch and edit the commit message.
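Interactive rebase (git rebase -i) opens an editor, so it is hard to show in a script. As a non-interactive alternative, the sketch below squashes with git reset --soft and then uses git commit --amend. The file feature.txt and all commit messages are made up.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

git commit -q --allow-empty -m "ENH: Initial commit"
for step in 1 2 3; do
    echo "work in progress $step" >> feature.txt
    git add feature.txt
    git commit -q -m "WIP: step $step"
done

# Move the branch back three commits but keep all changes staged ...
git reset -q --soft HEAD~3
# ... and record them as one single commit.
git commit -q -m "ENH: Add feature"

# Amend the latest commit: add a forgotten line and reword the message.
echo "forgotten line" >> feature.txt
git add feature.txt
git commit -q --amend -m "ENH: Add feature with the forgotten line"

commit_count=$(git rev-list --count HEAD)
latest_subject=$(git log -n 1 --pretty=format:'%s')
```

Both steps create new commits with new hashes, so the same warning about force-pushing applies if the old commits were already on the server.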
Merging
When you are in the middle of a merge, git will tell you from time to time that
something cannot be auto-merged. Then you can execute git mergetool, which
will start whichever tool you configured. In my case, it is meld.
See my Software Versioning Cheat Sheet.
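To see what such a situation looks like, the following sketch provokes a conflict on purpose. The file settings.txt and the branch feature-a are made up; merge.tool is the configuration key that git mergetool reads. The sketch does not start meld, it only lists the conflicted file and aborts the merge.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
git config merge.tool meld               # tool that git mergetool would start

echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "ENH: Add settings"
main_branch=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b feature-a
echo "color = blue" > settings.txt
git commit -q -am "ENH: Use blue"

git checkout -q "$main_branch"
echo "color = green" > settings.txt
git commit -q -am "ENH: Use green"

# Both branches changed the same line, so the merge stops with a conflict.
git merge feature-a > /dev/null 2>&1 || echo "automatic merge failed"
conflicted=$(git diff --name-only --diff-filter=U)

# This is the point where you would run git mergetool. The sketch gives up instead.
git merge --abort
```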
Tagging
Git gives your versions a hash. This is OK most of the time, but sometimes you
actually want to note that your software has a specific version. You can use
git tag -a v1.4 (see book) for that.
GitHub marks those tags as releases. See my mpu releases, for example.
You can also show diffs / logs with the tags, e.g.:
$ git diff v0.2.0..v0.3.0
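Put together, a tagging workflow could look like the following sketch. The version numbers and commit messages are made up and the commits are empty to keep it short.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"

git commit -q --allow-empty -m "ENH: First feature"
git tag -a v0.1.0 -m "Release v0.1.0"    # annotated tag on the current commit

git commit -q --allow-empty -m "ENH: Second feature"
git commit -q --allow-empty -m "BUG: Fix second feature"
git tag -a v0.2.0 -m "Release v0.2.0"

git tag --list                           # all tags
git log --oneline v0.1.0..v0.2.0         # commits made between the two tags
git describe --tags                      # nearest tag reachable from HEAD

between=$(git rev-list --count v0.1.0..v0.2.0)
current=$(git describe --tags)
```

Keep in mind that a plain git push does not send tags. You have to push them explicitly, for example with git push origin v0.2.0 or git push --tags.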
Git Hooks
Software hooks are a plugin technique.
Hooks allow users to alter and extend the software. In git, there are many
different hooks. Many of them follow the naming schema [pre/post]-[event]. Here
are some with possible use cases.
pre-commit:
- Apply a linter to check for coding standards
- Execute (fast) tests to ensure correctness
commit-msg:
- Check the commit message for spelling errors (pre-commit runs before the message is written, so it cannot see it)
pre-receive:
- Enforce project coding standards.
- Enforce working tests
post-commit:
- Notify team members of a new commit (e.g. via e-mail or Slack)
post-receive:
- Push the code to production
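A hook is simply an executable file in .git/hooks that is named after the hook. As a minimal sketch of a pre-commit hook, the script below uses git's built-in whitespace check as a stand-in for a real linter. The file names and messages are made up.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Dev"       # hypothetical identity
git config user.email "dev@example.com"
git commit -q --allow-empty -m "ENH: Initial commit"

# Install the hook. A non-zero exit code of the hook aborts the commit.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Abort the commit if the staged changes contain whitespace errors.
if ! git diff --cached --check; then
    echo "pre-commit: please fix the whitespace errors above" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# A line with trailing whitespace is rejected by the hook ...
printf 'trailing space \n' > example.txt
git add example.txt
git commit -q -m "ENH: Add example" > /dev/null 2>&1 && first=accepted || first=rejected

# ... and the cleaned-up version goes through.
printf 'no trailing space\n' > example.txt
git add example.txt
git commit -q -m "ENH: Add example" > /dev/null 2>&1 && second=accepted || second=rejected
```

Hooks in .git/hooks are not part of the repository content, so every clone has to install them again. Client-side hooks can also be skipped with git commit --no-verify, which is why server-side hooks like pre-receive are the place for rules that must hold.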
See also: githooks.com