Python Style Guide

Contents

  • General Coding
    • PEP8
    • PyFlakes
  • Docstrings
    • NumpyDoc
    • Google Style Docstrings
    • SphinxDocString
  • McCabe code complexity
  • Linters
    • Error Codes
  • Editor Support
  • Notes on Details
    • Maximum Line Length

Having a consistent code style for a project is important because it lets developers write code correctly without thinking too much about it. It makes code easier to read and maintain and, once you are used to the style, also easier to write.

Most of the time, it matters less which standards you follow than that the team decides on a set and follows it consistently. To cite from PEP8:

A style guide is about consistency. Consistency with [PEP8] is important. Consistency within a project is more important. Consistency within one module or function is the most important.

Python has standards for general coding as well as for docstrings.

General Coding

PEP8

PEP8 was posted in July 2001 and was updated in 2013.

PyFlakes

PyFlakes is a very common tool to check Python code for potential errors. I've added the codes to the long table below.
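
To give a feeling for what such a check involves, here is a minimal sketch of an "imported but unused" check (F401 in the table below). It is not how PyFlakes is implemented; the function name `find_unused_imports` is my own invention and the logic is deliberately simplified (it ignores `__all__`, star imports and scoping).

```python
import ast


def find_unused_imports(source):
    """Return imported names that are never referenced in ``source``.

    Simplified illustration only: it ignores ``__all__``, star imports,
    scoping, and names that only appear inside strings.
    """
    tree = ast.parse(source)
    imported = {}
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import os.path" binds the name "os"
                name = alias.asname or alias.name.split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":
                    imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(name for name in imported if name not in used)


example = "import os\nimport sys\n\nprint(sys.argv)\n"
unused = find_unused_imports(example)  # "os" is imported but never used
```

In a real project you would of course run PyFlakes (or flake8) instead of writing this yourself.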

Docstrings

Python packages are usually documented at the function / class / method / package level directly in the code. The content of docs/ is often only for building HTML from the Python code, organizing things (e.g. which package to show first), and a user manual.

PEP257 defines the basics. Building on it, there are two docstring style guides which cannot be combined: NumpyDoc and Google.

Tools like napoleon in combination with Sphinx can automatically create nice docs from either of them.
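
As a rough sketch, enabling napoleon in a Sphinx project comes down to a few lines in `conf.py`. The excerpt below is a hypothetical minimal configuration, not one taken from a real project. Your own `conf.py` will contain more settings.

```python
# conf.py (excerpt): hypothetical minimal Sphinx configuration
extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings out of the Python code
    "sphinx.ext.napoleon",  # parses NumpyDoc and Google style sections
]

# Both styles are enabled by default; listed here only to make them visible.
napoleon_google_docstring = True
napoleon_numpy_docstring = True
```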

NumpyDoc

See GitHub for the guide.

It looks as follows:

def get_meta(filepath, a_number, a_dict):
    """
    Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
    ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    Parameters
    ----------
    filepath : str
        Get metadata from this file
    a_number : int
        Some more details
    a_dict : dict
        Configuration

    Returns
    -------
    meta : dict
        Extracted meta information

    Raises
    ------
    IOError
        File could not be read
    """

Google Style Docstrings

See GitHub for the documentation.

It looks as follows:

def get_meta(filepath, a_number, a_dict):
    """Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
    ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    Args:
        filepath: Get metadata from this file.
        a_number: Some more details.
        a_dict: Configuration.

    Returns:
        Extracted meta information.

    Raises:
        IOError: File could not be read.
    """

SphinxDocString

I find it ugly and hard to read, but this docstring type is also out there: SphinxDocString. It uses reStructuredText:

def get_meta(filepath, a_number, a_dict):
    """
    Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
    ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    :param filepath: Get metadata from this file
    :type filepath: str
    :param a_number: Some more details
    :type a_number: int
    :param a_dict: Configuration
    :type a_dict: dict

    :returns: dict -- Extracted meta information

    :raises: IOError
    """

McCabe code complexity

McCabe complexity (also called cyclomatic complexity) is a proxy for how difficult your code is to read. To quote from Wikipedia:

It is a quantitative measure of the number of linearly independent paths through a program's source code. [...] One of McCabe's original applications was to limit the complexity of routines during program development; he recommended that programmers should count the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10. This practice was adopted by the NIST Structured Testing methodology, with an observation that since McCabe's original publication, the figure of 10 had received substantial corroborating evidence, but that in some circumstances it may be appropriate to relax the restriction and permit modules with a complexity as high as 15.

I think McCabe complexity is one way to find spots where the code could be improved for readability, but I'm not certain how often that actually works.

There is a mccabe pytest plugin.
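
To make the number less abstract, here is a rough sketch that approximates the complexity as "1 + number of decision points" using the standard library's `ast` module. It is an approximation under my own assumptions (the function name `approx_complexity` and the choice of node types are mine), so real tools such as mccabe or radon may report different values for some constructs.

```python
import ast

# Node types that open an additional path through the code.
_BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler,
    ast.IfExp, ast.comprehension,
)


def approx_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra decision points
            complexity += len(node.values) - 1
    return complexity


snippet = '''
def classify(n):
    if n < 0:
        return "negative"
    for digit in str(n):
        if digit == "7" and n % 2:
            return "odd with a seven"
    return "other"
'''
# 1 (base) + if + for + if + one "and" = 5
score = approx_complexity(snippet)
```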

Linters

Linters are tools for static code analysis. Static code analysis is the task of analyzing a computer program without executing it. Analyzing a program while executing it is dynamic code analysis, which is what coverage testing tools do.
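
A small sketch of what "without executing it" means in practice: the snippet below parses source code with the standard library's `ast` module and reports bare `except:` clauses (E722 in the table below). The helper name `find_bare_excepts` is made up for this illustration. Note that the analyzed code calls an undefined function, which does not matter because it is never run.

```python
import ast


def find_bare_excepts(source):
    """Return line numbers of ``except:`` clauses without an exception type."""
    tree = ast.parse(source)  # parses only; the code is never run
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


code = '''
try:
    launch_rockets()
except:
    pass
'''
bare_lines = find_bare_excepts(code)  # the "except:" is on line 4 of the string
```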

Common Python linters are:

  • pycodestyle which replaces pep8
  • pydocstyle
  • flake8
  • pyroma for checking package structure
  • radon: Measuring the code complexity

What you should forget

  • pylama: Only wraps some other tools. Use the pytest-plugins for those tools instead.

Don't forget the Black auto-formatter. It lives under the Python Software Foundation's GitHub organization, works well and has reasonable defaults. It makes you think and discuss less about formatting and solves many of the things linters complain about.

Error Codes

The following error codes come from pycodestyle (E, W), PyFlakes as reported by flake8 (F), pep8-naming (N) and pydocstyle (D). For some of them I added why they exist and whether I think you should enable them (from ✓✓ for a strong YES to ✘✘ for a strong NO). Please also have a look at lintlyci.github.io/Flake8Rules, which gives a lot of good examples for those rules.

There are also two footnotes for some codes:

(*) In the default configuration, the checks E121, E123, E126, E133, E226, E241, E242, E704, W503, W504 and W505 are ignored because they are not rules unanimously accepted, and PEP 8 does not enforce them. Please note that if the option --ignore=errors is used, the default configuration will be overridden and ignore only the check(s) you skip. The check W503 is mutually exclusive with check W504. The check E133 is mutually exclusive with check E123. Use switch --hang-closing to report E133 instead of E123. Use switch --max-doc-length=n to report W505.

(^) These checks can be disabled at the line level using the # noqa special comment. This possibility should be reserved for special cases.

Code Meaning Suggestion
E1 Indentation Indentation carries a meaning in Python - if code is in a block or not.
E101 indentation contains mixed spaces and tabs
Why: When you mix spaces and tabs, you will get different indentation for different editor settings.
✓✓
E111 indentation is not a multiple of four
Why: My guess is that 95% of all projects use 4 spaces - a single space is hard to read and more than four is something you don't want to type that often
✓
E112 expected an indented block
Why: A bug
✓✓
E113 unexpected indentation
Why: A bug
✓✓
E114 indentation is not a multiple of four (comment)
Why: Easily leads to bugs
✓✓
E115 expected an indented block (comment)
Why: Easily leads to bugs
✓✓
E116 unexpected indentation (comment)
Why: Easily leads to bugs
✓✓
E121 (*^) continuation line under-indented for hanging indent
Why: Usual code style
E122 (^) continuation line missing indentation or outdented
Why: Usual code style
E123 (*) closing bracket does not match indentation of opening bracket's line
Why: Readability
✓
E124 (^) closing bracket does not match visual indentation
Why: Readability
✓
E125 (^) continuation line with same indent as next logical line
Why: Readability
✓
E126 (*^) continuation line over-indented for hanging indent
Why: Readability
E127 (^) continuation line over-indented for visual indent
Why: Readability
E128 (^) continuation line under-indented for visual indent
Why: Readability
E129 (^) visually indented line with same indent as next logical line
Why: Readability
E131 (^) continuation line unaligned for hanging indent
Why: Readability
E133 (*) closing bracket is missing indentation
Why: Readability
E2 Whitespace
E201 whitespace after (
Why: Usual code style
E202 whitespace before )
Why: Usual code style
E203 whitespace before ',', ';', or ':'
Why: Usual code style
E211 whitespace before (
E221 multiple spaces before operator
E222 multiple spaces after operator
E223 tab before operator
Why: Try to avoid tabs in Python
✓✓
E224 tab after operator
Why: Try to avoid tabs in Python
✓✓
E225 missing whitespace around operator
E226 (*) missing whitespace around arithmetic operator
E227 missing whitespace around bitwise or shift operator
E228 missing whitespace around modulo operator
E231 missing whitespace after ',', ';', or ':'
E241 (*) multiple spaces after ,
E242 (*) tab after ,
Why: Try to avoid tabs in Python
✓✓
E251 unexpected spaces around keyword / parameter equals
E261 at least two spaces before inline comment
E262 inline comment should start with '# '
E265 block comment should start with '# '
E266 too many leading # for block comment
E271 multiple spaces after keyword
Why: I can see the reason for one space ... but many?
✓✓
E272 multiple spaces before keyword
Why: I can see the reason for one space, but not for multiple
✓✓
E273 tab after keyword
Why: Try to avoid tabs in Python
✓✓
E274 tab before keyword
Why: Try to avoid tabs in Python
✓✓
E275 missing whitespace after keyword
E3 Blank line
E301 expected 1 blank line, found 0
E302 expected 2 blank lines, found 0
E303 too many blank lines (3)
Why: Don't make your code too stretched out. If you want to separate code, make a new module.
✓✓
E304 blank lines found after function decorator
Why: This is confusing. A function decorator changes the function being decorated. If you separate them, I might miss that it is there.
✓✓
E305 expected 2 blank lines after end of function or class
E306 expected 1 blank line before a nested definition
E4 Import
E401 multiple imports on one line
Why: It's more readable to have one import per line, you can structure them more easily and your editor can tell you which one you're not using
✓✓
E402 module level import not at top of file
Why: You should have all your imports at the top of your file. However, there could be other code as well in between imports. For example, setting the seed of random.
✘
E5 Line length
E501 (^) line too long (> 79 characters)
Why: See below.
✓✓
E502 the backslash is redundant between brackets
E7 Statement
E701 multiple statements on one line (colon)
E702 multiple statements on one line (semicolon)
E703 statement ends with a semicolon
Why: Likely unnecessary and due to a C / C++ / Java developer (trying to) write Python code.
✓✓
E704 (*) multiple statements on one line (def)
E711 (^) comparison to None should be if cond is None:
Why: Example
✓✓
E712 (^) comparison to True should be if cond is True: or if cond:
Why: Because if cond is way easier to read
✓✓
E713 test for membership should be not in
E714 test for object identity should be is not
E721 (^) do not compare types, use isinstance()
E722 do not use bare except, specify exception instead
E731 do not assign a lambda expression, use a def
Why: Example, DRY
E741 do not use variables named l, O, or I
Why: Those letters are hard to distinguish in some fonts.
✓✓
E742 do not define classes named l, O, or I
Why: Those letters are hard to distinguish in some fonts.
✓✓
E743 do not define functions named l, O, or I
Why: Those letters are hard to distinguish in some fonts.
✓✓
E9 Runtime
E901 SyntaxError or IndentationError
E902 IOError
W1 Indentation warning
W191 indentation contains tabs ✓✓
W2 Whitespace warning
W291 trailing whitespace
Why: It just adds noise to git diff
✓✓
W292 no newline at end of file
Why: answer
W293 blank line contains whitespace
Why: It just adds noise to git diff
✓✓
W3 Blank line warning
W391 blank line at end of file
W5 Line break warning
W503 (*) line break before binary operator
W504 (*) line break after binary operator
W505 (*^) doc line too long (82 > 79 characters)
W6 Deprecation warning
W601 .has_key() is deprecated, use in ✓✓
W602 deprecated form of raising exception
W603 <> is deprecated, use != ✓✓
W604 backticks are deprecated, use repr() ✓✓
W605 invalid escape sequence '\x'
W606 async and await are reserved keywords starting with Python 3.7
F4 Flake8 module import
F401 module imported but unused
Why: Might keep unnecessary dependencies
F402 import module from line N shadowed by loop variable
Why: Potential bug.
✓✓
F403 from module import * used; unable to detect undefined names
F404 future import(s) name after other statements
F8 Flake8 name errors
F811 redefinition of unused name from line N
Why: Potentially unused code.
F812 list comprehension redefines name from line N
F821 undefined name name
F822 undefined name name in __all__
F823 local variable name ... referenced before assignment
F831 duplicate argument name in function definition
F841 local variable name is assigned to but never used
N8 Naming conventions
N801 class names should use CapWords convention ✓✓
N802 function name should be lowercase ✓✓
N803 argument name should be lowercase ✓✓
N804 first argument of a classmethod should be named cls ✓✓
N805 first argument of a method should be named self ✓✓
N806 variable in function should be lowercase ✓✓
N807 function name should not start or end with __
N811 constant imported as non constant
N812 lowercase imported as non lowercase
N813 camelcase imported as lowercase
N814 camelcase imported as constant
D1 Missing Docstrings
D100 Missing docstring in public module
D101 Missing docstring in public class
D102 Missing docstring in public method
D103 Missing docstring in public function
D104 Missing docstring in public package
D105 Missing docstring in magic method
D2 Whitespace Issues
D200 One-line docstring should fit on one line with quotes
D201 No blank lines allowed before function docstring
D202 No blank lines allowed after function docstring
D203 1 blank line required before class docstring
D204 1 blank line required after class docstring
D205 1 blank line required between summary line and description
D206 Docstring should be indented with spaces, not tabs
D207 Docstring is under-indented
D208 Docstring is over-indented
D209 Multi-line docstring closing quotes should be on a separate line
D210 No whitespaces allowed surrounding docstring text
D211 No blank lines allowed before class docstring
D212 Multi-line docstring summary should start at the first line
D213 Multi-line docstring summary should start at the second line
D3 Quotes Issues
D300 Use """triple double quotes"""
D301 Use r""" if any backslashes in a docstring
D302 Use u""" for Unicode docstrings
D4 Docstring Content Issues
D400 First line should end with a period
D401 First line should be in imperative mood
D402 First line should not be the function's signature
D403 First word of the first line should be properly capitalized
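
Some of the statement rules in the table (E711, E713, E721, E722, E731) are easier to judge when you see them in code. The following is a small illustrative sketch. The function names are made up, and each comment names the rule that the line satisfies.

```python
def describe(value, allowed):
    # E711: compare to None with "is", not "=="
    if value is None:
        return "missing"
    # E721: use isinstance() instead of comparing type() results
    if not isinstance(value, str):
        return "not a string"
    # E713: write "x not in y" rather than "not x in y"
    if value not in allowed:
        return "not allowed"
    return "ok"


# E731: a def instead of "double = lambda x: 2 * x"
def double(x):
    return 2 * x


def safe_int(text):
    # E722: name the exception instead of using a bare "except:"
    try:
        return int(text)
    except ValueError:
        return None
```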

Editor Support

You should let your editor do as many automatic formatting changes as you can.

  • Sublime Text: Python Flake8 Lint (tested, works fine) and Auto PEP8 (not tested)
  • Spyder: Auto PEP8 (not tested)

Notes on Details

Maximum Line Length

You might consider a maximum line length of 80 characters too extreme / outdated.

Well, please have a look at what a 3-way merge looks like on your machine. This is what it looks like on mine:

3-way merge with 80 character lines

And now look at files with 100 character lines:

3-way merge with 100 character lines

Sure, you can still do it. But it is clearly less comfortable.

Let's see how famous projects do it (code on GitHub):

Project      95%-line length   99%-line length   100%-line length
numpy        75                80                589
scipy        76                83                5223
pandas       74                79                801
Pillow       73                80                185
sqlalchemy   73                78                144
requests     77                92                172
cpython      75                81                1182
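
If you want to produce such numbers for your own project, a sketch could look like the following. This is not the script referenced above but my own minimal version. The function names and the nearest-rank percentile definition are assumptions, so the results may differ slightly from the table.

```python
import math
from pathlib import Path


def line_lengths(root):
    """Collect the length of every line in all .py files below ``root``."""
    lengths = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="replace")
        lengths.extend(len(line) for line in text.splitlines())
    return lengths


def percentile(lengths, percent):
    """Smallest length that at least ``percent`` % of the lines do not exceed."""
    ordered = sorted(lengths)
    # Nearest-rank method; other percentile definitions give slightly
    # different numbers.
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


# Toy data instead of a real checkout: 100 "lines" with lengths 1..100.
sample = list(range(1, 101))
summary = {p: percentile(sample, p) for p in (95, 99, 100)}
```

For a real project you would call something like `percentile(line_lengths("path/to/checkout"), 95)`.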

As you can see, all projects try to stay within 79 characters per line. They only break the limit for tests and documentation, not for code. While I admit there are some cases where you will get above the 79 character threshold, in most cases it just means that you should change the way you wrote your code. I've often seen it when you have many nested loops or conditions.
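
As an illustration of the nesting problem, here is a made-up example (the order structure and the function names are hypothetical). In the nested version the `return` line ends up beyond 79 characters purely because of indentation. With early returns the same logic fits comfortably.

```python
def total_price_nested(order):
    # Deeply nested version: every condition pushes the code further right.
    if order is not None:
        if order.get("items"):
            if not order.get("cancelled", False):
                return sum(item["price"] * item["quantity"] for item in order["items"])
    return 0


def total_price_flat(order):
    # Same behaviour with early returns: one level of indentation.
    if order is None or not order.get("items"):
        return 0
    if order.get("cancelled", False):
        return 0
    return sum(item["price"] * item["quantity"] for item in order["items"])
```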

Another argument against longer line lengths is readability. Long lines are just harder to read. Newspapers could also have way longer lines and fewer columns, but they don't. Websites also use columns. Let's look at the number of characters per line for a couple of them:

Website              Category     Characters
Focus.de             News         70
washingtonpost.com   News         99
Sueddeutsche.de      News         68
Medium.com           Blog posts   73

Published

Jul 1, 2018
by Martin Thoma
