
Git - Version Control Done Right

Contents

  • Why Version Control
  • Change steps
  • What it's not for
  • Commit Messages
  • git log
  • git diff
  • git blame
  • git squash
  • Merging
  • Tagging
  • Git Hooks
  • Workflows
  • See also

git is a (the!) version control system. In this article I'll briefly show how to use it better.

I assume you already know git clone, git commit, git push, git pull and git branch.

Why Version Control

When you write software, you usually can't write it from beginning to end in one go. Instead, you work in several directions. One day you work on feature A; the next day you have to fix a bug, but you have already changed part of the code for A without finishing it. Hence you need to go back to the state before those changes.

Or you are not sure if the thing you're trying to do actually improves things. So you want to save the current state to be able to go back.

In the past, people did so by copy-and-paste. But then you end up with directories like awesome-project, awesome-project-copy, awesome-project-1990-04-28, awesome-project-feature-a, ... in your file system.

So you want to be able to manage that properly and jump back and compare different versions.

Change steps

  • Working directory: git add moves changes to staging
  • Staging: git commit moves the staged changes to the local repository
  • Local repository: git push sends commits to a remote repository, git fetch gets commits from it
  • Remote repository
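
The steps above can be sketched in a throwaway setup. This is only an illustration: the file name, the commit message, and the local bare repository that stands in for a remote server are assumptions, not part of any real project.

```shell
set -e
work=$(mktemp -d)                        # scratch area for this demo
git init -q --bare "$work/remote.git"    # stands in for a remote server
git init -q "$work/local"
cd "$work/local"
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"
git remote add origin "$work/remote.git"

echo "hello" > notes.txt                 # change in the working directory
git status --short                       # "?? notes.txt" (untracked)

git add notes.txt                        # working directory -> staging
git status --short                       # "A  notes.txt" (staged)

git commit -q -m "ENH: add notes"        # staging -> local repository
git push -q origin HEAD                  # local repository -> remote repository
git fetch -q origin                      # remote repository -> local repository

remote_commits=$(git --git-dir="$work/remote.git" rev-list --count --all)
echo "commits on remote: $remote_commits"
```

git status --short is a compact way to see in which of the stages a change currently sits.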

What it's not for

In general, I would not use git for things where you can't reasonably make a diff. This includes:

  • Binary files and files that diff poorly: compiled code, IPython notebooks (technically JSON, but the embedded outputs make diffs hard to read)
  • Big files (> 5 MB): Data.
  • Auto-generated files.

Of course, there are exceptions. Just keep it practical.
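
One common way to keep such files out of a repository is a .gitignore file, which is not covered above. The following is a minimal sketch; the patterns and file names are made-up examples and need to be adapted to your project.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical patterns; adjust them to your project.
cat > .gitignore <<'EOF'
# compiled code
*.pyc
build/
# large data files
data/*.csv
EOF

mkdir -p build data
touch module.pyc build/app.bin data/train.csv main.py

git status --short     # lists only .gitignore and main.py as untracked

# git check-ignore prints the paths that match an ignore pattern
ignored_count=$(git check-ignore module.pyc build/app.bin data/train.csv main.py | wc -l)
echo "ignored files: $ignored_count"
```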

Commit Messages

Commit messages are intended for developers to understand what was changed.

I like the SciPy commit message guide. They start every message with a prefix. Copied from that guide, here is what their messages look like:

ENH: add functionality X to numpy.<submodule>.

The first line of the commit message starts with a capitalized acronym
(options listed below) indicating what type of commit this is.  Then a blank
line, then more text if needed.  Lines shouldn't be longer than 72
characters.  If the commit is related to a ticket, indicate that with
"See #3456", "See ticket 3456", "Closes #3456" or similar.

The prefixes I suggest are:

API: an (incompatible) API change
BUG: bug fix
DEP: deprecate something, or remove a deprecated object
DEV: development tool or utility
DOC: documentation
ENH: enhancement
MAINT: maintenance commit (refactoring, typos, etc.)
REV: revert an earlier commit
STY: style fix (whitespace, PEP8)
TST: addition or modification of tests
REL: related to releasing the project, e.g. change of a version number

I also suggest keeping the lines of those messages shorter than 80 characters.

Services like GitHub can directly connect commits with issues if you reference the issue in the commit message, e.g. issue #123.
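
As a small sketch of the message layout described above (prefix in the first line, blank line, then more text): the file, the message and the issue number are invented for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "def str2bool(s): ..." > strings.py
git add strings.py

# Each -m becomes its own paragraph, so the second one ends up after a blank line.
git commit -q -m "ENH: add str2bool helper" -m "Closes #31"

subject=$(git log -1 --format=%s)    # first line of the message
body=$(git log -1 --format=%b)       # everything after the blank line
echo "subject: $subject"
echo "body:    $body"
```

For longer messages it is usually more comfortable to run git commit without -m and write the message in your editor.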

See also:

  • WhatTheCommit.com: Funny commit messages

git log

git log shows you the commit history: the hash, author, date and message of each commit, newest first:

commit e44ed1e8e5bdcada8534369c5135e77677253de1 (HEAD -> master, origin/master, origin/HEAD)
Author: Martin Thoma <[email protected]>
Date:   Fri Jul 6 07:21:10 2018 +0200

    DOC: Fix documentation of string return values

commit 8cf8c8d4b168ef3b0a32617917195a352f4ce723
Author: Martin Thoma <[email protected]>
Date:   Fri Jul 6 07:15:32 2018 +0200

    REL: v0.9.0

commit de71db384b5e08c65067937744bfe26a822796b1
Author: Martin Thoma <[email protected]>
Date:   Fri Jul 6 07:13:33 2018 +0200

    ENH: Add string.str2bool_or_none, str2float_or_none, str2int_or_none

    Closes #31

commit 31bd14cbe7e317226ff7be15ec7d68df93665688
Author: Martin Thoma <[email protected]>
Date:   Wed Jul 4 22:46:16 2018 +0200

    ENH: Add prime number generator

commit bb56e82fa402827b3de8140783e0c4fc95e3699d
Author: Martin Thoma <[email protected]>
Date:   Wed Jul 4 07:21:02 2018 +0200

    DOC: document mpu.path

commit 9d929238ef04ecb866775b3e687d782090c484bf
Author: Martin Thoma <[email protected]>
Date:   Wed Jul 4 06:51:40 2018 +0200

    STY: Fix issues discovered by codacy

commit c31f575d3835049c65b8ab6c7fe909191779a912 (tag: v0.8.0)
Author: Martin Thoma <[email protected]>
Date:   Tue Jul 3 23:22:37 2018 +0200

    REL: Release v0.8.0

commit f3f7f1192a7b495f74156977bda1bd220a87e4eb
Author: Martin Thoma <[email protected]>
Date:   Tue Jul 3 23:21:01 2018 +0200

    ENH: Add mpu.path.get_all_files

    See issue #14

You can also adjust the output to your needs:

$ git log --pretty=oneline

e44ed1e8e5bdcada8534369c5135e77677253de1 (HEAD -> master, origin/master, origin/HEAD) DOC: Fix documentation of string return values
8cf8c8d4b168ef3b0a32617917195a352f4ce723 REL: v0.9.0
de71db384b5e08c65067937744bfe26a822796b1 ENH: Add string.str2bool_or_none, str2float_or_none, str2int_or_none
31bd14cbe7e317226ff7be15ec7d68df93665688 ENH: Add prime number generator
bb56e82fa402827b3de8140783e0c4fc95e3699d DOC: document mpu.path
9d929238ef04ecb866775b3e687d782090c484bf STY: Fix issues discovered by codacy
c31f575d3835049c65b8ab6c7fe909191779a912 (tag: v0.8.0) REL: Release v0.8.0
f3f7f1192a7b495f74156977bda1bd220a87e4eb ENH: Add mpu.path.get_all_files
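
A few more ways to adjust the output, sketched on a throwaway repository. The commit messages are invented; the options used (--oneline, --format, --date, --grep) are standard git log options.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

for msg in "ENH: add parser" "DOC: document parser" "BUG: fix parser crash"; do
    echo "$msg" >> changes.txt
    git add changes.txt
    git commit -q -m "$msg"
done

git log --oneline                                     # abbreviated hash + subject
git log -2 --format='%h %an %ad %s' --date=short      # last 2 commits: hash, author, date, subject

# --grep filters by commit message, which works nicely with the prefixes
doc_commits=$(git log --format=%s --grep='^DOC:')
echo "documentation commits: $doc_commits"
```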

git diff

git diff compares two commits, e.g.

$ git diff e44ed1e8e5bdcada8534369c5135e77677253de1..8cf8c8d4b168ef3b0a32617917195a352f4ce723

Looking up the git hashes (e44ed1... and 8cf8c8d...) is cumbersome, so there are shortcuts. For example, HEAD is the commit you currently have checked out, usually the latest one on your branch. And a trailing ^ gives you the parent of a commit. Hence

$ git diff HEAD^..HEAD

compares the current commit with the one before it.
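
Here is a minimal sketch of these shortcuts on a throwaway repository; the file name and its contents are made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "ENH: first version"
echo "version 2" > file.txt
git commit -q -a -m "ENH: second version"

git diff HEAD^ HEAD             # same as HEAD^..HEAD: previous commit vs. current commit
git diff --stat HEAD~1 HEAD     # HEAD~1 is another spelling of HEAD^; --stat prints a summary

echo "version 3" > file.txt
git diff                        # without arguments: uncommitted changes vs. the staging area

changed=$(git diff --name-only HEAD^ HEAD)
echo "changed in the last commit: $changed"
```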

git blame

git blame gives you an overview of who last edited which line. It can be used to "blame" somebody for a change. Or, better, to find the person to ask why something was done the way it currently is in the repository.

For a part of filters.py in SciPy's ndimage module it looks like this:

e2fbe76393 scipy/ndimage/filters.py   (Jaime Fernandez         2018-03-07 01:15:11 +0100  298) @_ni_docstrings.docfiller
a1a629221f scipy/ndimage/filters.py   (Tim Leslie              2013-04-12 11:36:45 +0000  299) def prewitt(input, axis=-1, output=None, mode="reflect", cval=0.0):
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield            2006-03-18 13:52:58 +0000  300)     """Calculate a Prewitt filter.
a4eba7aeaf scipy/ndimage/filters.py   (Matthew Brett           2008-12-14 11:51:37 +0000  301)
a4eba7aeaf scipy/ndimage/filters.py   (Matthew Brett           2008-12-14 11:51:37 +0000  302)     Parameters
a4eba7aeaf scipy/ndimage/filters.py   (Matthew Brett           2008-12-14 11:51:37 +0000  303)     ----------
a4eba7aeaf scipy/ndimage/filters.py   (Matthew Brett           2008-12-14 11:51:37 +0000  304)     %(input)s
a4eba7aeaf scipy/ndimage/filters.py   (Matthew Brett           2008-12-14 11:51:37 +0000  305)     %(axis)s
a4eba7aeaf scipy/ndimage/filters.py   (Matthew Brett           2008-12-14 11:51:37 +0000  306)     %(output)s
7fb0a279ac scipy/ndimage/filters.py   (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000  307)     %(mode_multiple)s
a4eba7aeaf scipy/ndimage/filters.py   (Matthew Brett           2008-12-14 11:51:37 +0000  308)     %(cval)s
d197708c0c scipy/ndimage/filters.py   (Martin Thoma            2016-06-12 17:18:59 +0200  309)
d197708c0c scipy/ndimage/filters.py   (Martin Thoma            2016-06-12 17:18:59 +0200  310)     Examples
d197708c0c scipy/ndimage/filters.py   (Martin Thoma            2016-06-12 17:18:59 +0200  311)     --------
d197708c0c scipy/ndimage/filters.py   (Martin Thoma            2016-06-12 17:18:59 +0200  312)     >>> from scipy import ndimage, misc
d197708c0c scipy/ndimage/filters.py   (Martin Thoma            2016-06-12 17:18:59 +0200  313)     >>> import matplotlib.pyplot as plt
f0f55115ff scipy/ndimage/filters.py   (Martin Thoma            2016-11-13 12:14:38 +0100  314)     >>> fig = plt.figure()
f0f55115ff scipy/ndimage/filters.py   (Martin Thoma            2016-11-13 12:14:38 +0100  315)     >>> plt.gray()  # show the filtered result in grayscale
f0f55115ff scipy/ndimage/filters.py   (Martin Thoma            2016-11-13 12:14:38 +0100  316)     >>> ax1 = fig.add_subplot(121)  # left side
f0f55115ff scipy/ndimage/filters.py   (Martin Thoma            2016-11-13 12:14:38 +0100  317)     >>> ax2 = fig.add_subplot(122)  # right side
d197708c0c scipy/ndimage/filters.py   (Martin Thoma            2016-06-12 17:18:59 +0200  318)     >>> ascent = misc.ascent()
d197708c0c scipy/ndimage/filters.py   (Martin Thoma            2016-06-12 17:18:59 +0200  319)     >>> result = ndimage.prewitt(ascent)
f0f55115ff scipy/ndimage/filters.py   (Martin Thoma            2016-11-13 12:14:38 +0100  320)     >>> ax1.imshow(ascent)
f0f55115ff scipy/ndimage/filters.py   (Martin Thoma            2016-11-13 12:14:38 +0100  321)     >>> ax2.imshow(result)
f0f55115ff scipy/ndimage/filters.py   (Martin Thoma            2016-11-13 12:14:38 +0100  322)     >>> plt.show()
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield            2006-03-18 13:52:58 +0000  323)     """
ee4db9c301 Lib/ndimage/filters.py     (Robert Kern             2006-09-24 09:05:13 +0000  324)     input = numpy.asarray(input)
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield            2006-03-18 13:52:58 +0000  325)     axis = _ni_support._check_axis(axis, input.ndim)
6f3089c43b scipy/ndimage/filters.py   (Jaime Fernandez         2018-02-22 00:47:21 +0100  326)     output = _ni_support._get_output(output, input)
7fb0a279ac scipy/ndimage/filters.py   (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000  327)     modes = _ni_support._normalize_sequence(mode, input.ndim)
7fb0a279ac scipy/ndimage/filters.py   (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000  328)     correlate1d(input, [-1, 0, 1], axis, output, modes[axis], cval, 0)
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield            2006-03-18 13:52:58 +0000  329)     axes = [ii for ii in range(input.ndim) if ii != axis]
ca465a651f Lib/ndimage/Lib/filters.py (Ed Schofield            2006-03-18 13:52:58 +0000  330)     for ii in axes:
7fb0a279ac scipy/ndimage/filters.py   (Alvaro Sanchez-Gonzalez 2016-11-25 10:03:14 +0000  331)         correlate1d(output, [1, 1, 1], ii, output, modes[ii], cval, 0,)
6f3089c43b scipy/ndimage/filters.py   (Jaime Fernandez         2018-02-22 00:47:21 +0100  332)     return output
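
To try this on a small scale, here is a sketch with two invented authors (Alice and Bob) and an invented file. The -L option restricts the output to a line range, and --porcelain gives a machine-readable format.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "user@example.com"   # hypothetical identity

printf 'line one\nline two\n' > poem.txt
git add poem.txt
git -c user.name="Alice" commit -q -m "ENH: add poem"

printf 'line one\nline 2\n' > poem.txt
git -c user.name="Bob" commit -q -a -m "MAINT: reword line two"

git blame poem.txt             # one row per line: commit, author, date, line number, content
git blame -L 2,2 poem.txt      # only line 2

# Extract just the author of line 2 from the porcelain output
last_author=$(git blame -L 2,2 --porcelain poem.txt | sed -n 's/^author //p')
echo "line 2 was last edited by: $last_author"
```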

git squash

The git commit history should be kept clean. If you develop a new feature you might want to push the latest state - even if something is not working - from time to time to the server so that your co-workers can have a look at it. But in the master branch it would be nice if every single commit was a single new feature.

Commit squashing combines multiple commits into a single one. There is no git squash command; squashing is usually done with git rebase -i or git merge --squash. Squashing commits on a branch rewrites its history. If the old commits were already pushed, you need to force-push, which will cause problems for everybody who has based work on the version you changed.

So as a general guide:

  • Never force-push on master
  • Avoid force-push on branches where multiple people work
  • Squash commits if you are the only one working on the branch and if it makes it easier to understand what you did.

Similar, but simpler, is git commit --amend. It lets you add changes to the latest commit on the current branch and edit its commit message. It replaces that commit with a new one, so the same rules apply.
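
A non-interactive way to get a single commit per feature is git merge --squash, sketched below. The branch name feature-a, the file and the messages are assumptions for illustration. Note that this variant does not rewrite the history of the target branch, so it needs no force-push; the git commit --amend at the end does rewrite the latest commit.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "ENH: initial version"
main_branch=$(git rev-parse --abbrev-ref HEAD)   # "master" or "main", depending on your configuration

git checkout -q -b feature-a                     # hypothetical feature branch
echo "step 1" >> app.txt
git commit -q -a -m "WIP: half of feature A"
echo "step 2" >> app.txt
git commit -q -a -m "WIP: rest of feature A"

git checkout -q "$main_branch"
git merge -q --squash feature-a                  # stage the combined changes, but do not commit yet
git commit -q -m "ENH: add feature A"            # one commit for the whole feature

echo "step 3" >> app.txt
git commit -q -a --amend -m "ENH: add feature A (complete)"   # fold a late fix into the latest commit

git log --oneline
commit_count=$(git rev-list --count HEAD)
echo "commits on $main_branch: $commit_count"
```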

Merging

During a merge, git will tell you from time to time that something cannot be auto-merged. Then you can run git mergetool, which will start whichever tool you configured. In my case, it is meld. See my Software Versioning Cheat Sheet.
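
The sketch below provokes such a conflict on purpose and resolves it by hand, which is roughly what a merge tool helps you do. Branch name, file and contents are invented.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "color = red" > settings.txt
git add settings.txt
git commit -q -m "ENH: add settings"
main_branch=$(git rev-parse --abbrev-ref HEAD)

git checkout -q -b feature-a             # hypothetical feature branch
echo "color = blue" > settings.txt
git commit -q -a -m "ENH: use blue"

git checkout -q "$main_branch"
echo "color = green" > settings.txt
git commit -q -a -m "ENH: use green"

# Both branches changed the same line, so git cannot auto-merge.
git merge feature-a || echo "merge stopped with conflicts"

conflicted=$(git diff --name-only --diff-filter=U)   # files that are still unmerged
echo "conflicted: $conflicted"

# Resolve by hand here; git mergetool would open the configured tool instead.
echo "color = blue" > settings.txt
git add settings.txt                     # mark the conflict as resolved
git commit -q -m "MAINT: merge feature-a"
```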

Tagging

Git identifies each commit by a hash. This is OK most of the time, but sometimes you want to mark a commit as a specific version of your software. You can use git tag -a v1.4 (see book) for that.

GitHub shows those tags on the releases page. See my mpu releases, for example.

You can also show diffs / logs with the tags, e.g.:

$ git diff v0.2.0..v0.3.0
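
A minimal sketch of annotated tags on a throwaway repository; the version numbers and the VERSION file are assumptions.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo "0.1.0" > VERSION
git add VERSION
git commit -q -m "REL: v0.1.0"
git tag -a v0.1.0 -m "Release v0.1.0"    # annotated tag on the current commit

echo "0.2.0" > VERSION
git commit -q -a -m "REL: v0.2.0"
git tag -a v0.2.0 -m "Release v0.2.0"

git tag --list                           # v0.1.0 and v0.2.0
git log --oneline v0.1.0..v0.2.0         # commits after v0.1.0 up to v0.2.0

latest=$(git describe --abbrev=0)        # most recent annotated tag reachable from HEAD
echo "latest tag: $latest"

# A plain git push does not push tags; with a remote you would run e.g.
# git push origin v0.2.0
```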

Git Hooks

Software hooks are a plugin technique. Hooks allow users to alter and extend the software. In git, there are many different hooks. Most of them follow the naming scheme [pre/post]-[event]. Here are some with possible use cases.

  • pre-commit:
    • Apply a linter to check for coding standards
    • Execute (fast) tests to ensure correctness
  • commit-msg:
    • Check the commit message for spelling errors
  • pre-receive:
    • Enforce project coding standards.
    • Enforce working tests
  • post-commit:
    • Notify team members of a new commit (e.g. via e-mail or slack)
  • post-receive:
    • Push the code to production
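
As an illustration, here is a sketch of a client-side hook that enforces the commit message prefixes from above. The policy itself is an assumption. The check lives in the commit-msg hook because git passes that hook the path of the file holding the proposed message, while pre-commit runs before the message exists.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

# Hypothetical policy: the first line must start with one of the prefixes.
mkdir -p .git/hooks
cat > .git/hooks/commit-msg <<'EOF'
#!/bin/sh
# $1 is the path of the file that holds the proposed commit message.
if ! head -n 1 "$1" | grep -Eq '^(API|BUG|DEP|DEV|DOC|ENH|MAINT|REV|STY|TST|REL): '; then
    echo "commit message must start with a prefix such as 'ENH: '" >&2
    exit 1          # a non-zero exit status aborts the commit
fi
EOF
chmod +x .git/hooks/commit-msg           # hooks must be executable

echo "hello" > file.txt
git add file.txt

if git commit -q -m "added stuff"; then rejected=no; else rejected=yes; fi
echo "commit without prefix rejected: $rejected"

git commit -q -m "ENH: add file"
accepted=$(git log -1 --format=%s)
echo "accepted: $accepted"
```

Hooks in .git/hooks are local to your clone and are not shared by git push or git clone, so a team-wide rule needs a server-side hook such as pre-receive or some way of distributing the script.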

See also: githooks.com

Workflows

  • Git Flow
  • GitHub Flow

See also

  • Rebellabs git Cheat Sheet
  • GitHub Git Cheat Sheet
  • git LFS

Published

Jul 8, 2018
by Martin Thoma
