Having a consistent code style for a project is important because it lets developers write correct code without thinking too much about it. It makes code easier to read and maintain and, once you are used to the style, also easier to write.
Most of the time it matters less which standard you pick than that the team agrees on one and follows it consistently. To cite from PEP8:
A style guide is about consistency. Consistency with [PEP8] is important. Consistency within a project is more important. Consistency within one module or function is the most important.
Python has standards for general coding as well as for docstrings.
General Coding
PEP8
PEP8 was posted in July 2001 and was updated in 2013.
PyFlakes
PyFlakes is a very common tool to check Python code for potential errors. I've added the codes to the long table below.
Docstrings
Python packages are usually documented at the function / class / method / package level directly in the code. The material in docs/ is often only for building HTML out of the Python code, organizing things (e.g. which package to show first) and a user manual.
There is PEP257, which defines some basics. Building on this, there are two docstring style guides which cannot be combined: NumpyDoc and Google.
Tools like napoleon in combination with Sphinx can automatically create nice docs of both of them.
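Because docstrings live directly in the code, they are available at runtime, which is what documentation tools build on. The following is a minimal sketch using only the standard library; the function body and the returned dictionary are made up for illustration.

```python
import inspect


def get_meta(filepath):
    """
    Get meta-information of an image.

    The longer description goes here.
    """
    # Hypothetical body, only here so the function is complete.
    return {"filepath": filepath}


# The raw docstring keeps the leading newline and the source indentation.
raw = get_meta.__doc__

# inspect.getdoc removes the common indentation and surrounding blank lines.
doc = inspect.getdoc(get_meta)
summary = doc.splitlines()[0]
print(summary)
```

The first line of the cleaned docstring is the summary line that PEP257 asks for.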
NumpyDoc
See GitHub for the guide.
It looks as follows:
    def get_meta(filepath, a_number, a_dict):
        """
        Get meta-information of an image.

        Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
        ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
        parturient montes, nascetur ridiculus mus.

        Parameters
        ----------
        filepath : str
            Get metadata from this file
        a_number : int
            Some more details
        a_dict : dict
            Configuration

        Returns
        -------
        meta : dict
            Extracted meta information

        Raises
        ------
        IOError
            File could not be read
        """
Google Style Docstrings
See GitHub for the documentation.
It looks as follows:
    def get_meta(filepath, a_number, a_dict):
        """Get meta-information of an image.

        Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
        ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
        parturient montes, nascetur ridiculus mus.

        Args:
            filepath: Get metadata from this file.
            a_number: Some more details.
            a_dict: Configuration.

        Returns:
            Extracted meta information.

        Raises:
            IOError: File could not be read.
        """
SphinxDocString
It's super ugly and I find it hard to read, but this docstring type is also out there: SphinxDocString. It uses reStructuredText:
    def get_meta(filepath, a_number, a_dict):
        """
        Get meta-information of an image.

        Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
        ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
        parturient montes, nascetur ridiculus mus.

        :param filepath: Get metadata from this file
        :type filepath: str
        :param a_number: Some more details
        :type a_number: int
        :param a_dict: Configuration
        :type a_dict: dict
        :returns: dict -- Extracted meta information
        :raises: IOError
        """
McCabe code complexity
The McCabe complexity (also called cyclomatic complexity) is used as a proxy for how difficult it is to read your code. To quote from Wikipedia:
It is a quantitative measure of the number of linearly independent paths through a program's source code. [...] One of McCabe's original applications was to limit the complexity of routines during program development; he recommended that programmers should count the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10.[2] This practice was adopted by the NIST Structured Testing methodology, with an observation that since McCabe's original publication, the figure of 10 had received substantial corroborating evidence, but that in some circumstances it may be appropriate to relax the restriction and permit modules with a complexity as high as 15.
I think McCabe complexity is one way to find spots where the code could be improved for readability, but I'm not certain how often that actually works.
There is a mccabe pytest plugin.
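To make the idea of counting paths more concrete, here is a rough sketch that approximates the complexity as one plus the number of decision points in the syntax tree. The set of counted node types is my assumption; real tools such as mccabe or radon use their own, more detailed counting rules and may report different numbers.

```python
import ast

# Node types treated as decision points (an assumption for this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source):
    """Return 1 plus the number of decision points found in source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands, so two extra branches.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = """
def classify(n):
    if n < 0:
        return "negative"
    for divisor in (2, 3):
        if n % divisor == 0 and n > divisor:
            return "composite"
    return "other"
"""

print(approximate_complexity(SAMPLE))
```

The sample has two `if` statements, one `for` loop and one `and`, so the sketch reports 5 for it.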
Linters
Linters are tools for static code analysis. Static code analysis is the task of analyzing a computer program without executing it. Analysis that does execute the program is called dynamic code analysis; coverage testing tools are one example.
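As an illustration of what "without executing it" means, the sketch below parses source code with the standard library `ast` module and reports imported names that are never referenced, similar in spirit to the F401 check. It is a simplified assumption of how such a check could work, not how PyFlakes is implemented; for example, it ignores scopes, `__all__` and star imports.

```python
import ast


def find_unused_imports(source):
    """Return the sorted names that are imported but never referenced."""
    tree = ast.parse(source)  # parses only; the code is never executed
    imported = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds the name `a`; `import a as b` binds `b`.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


SAMPLE = """
import os
import sys
from math import sqrt

print(sqrt(4), sys.argv)
"""

print(find_unused_imports(SAMPLE))
```

Because the source is only parsed, the check also works on code that would fail or have side effects when run.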
Common Python linters are:
- pycodestyle, which replaces pep8
- pydocstyle
- flake8
- pyroma for checking package structure
- radon: Measuring the code complexity
What you should forget
- pylama: Only wraps some other tools. Use the pytest-plugins for those tools instead.
Error Codes
The following error codes are from pycodestyle and pydocstyle. The F codes come from PyFlakes and the N codes from the pep8-naming plugin.
For some of them I added why they exist and a suggestion whether I think you should adopt them (from ✓✓ for a strong YES to ✘✘ for a strong NO). Please also have a look at lintlyci.github.io/Flake8Rules, which gives a lot of good examples for those rules.
There are also two footnotes for some codes:
(*) In the default configuration, the checks E121, E123, E126, E133, E226, E241, E242, E704, W503, W504 and W505 are ignored because they are not rules unanimously accepted, and PEP 8 does not enforce them. Please note that if the option --ignore=errors is used, the default configuration will be overridden and ignore only the check(s) you skip. The check W503 is mutually exclusive with check W504. The check E133 is mutually exclusive with check E123. Use switch --hang-closing to report E133 instead of E123. Use switch --max-doc-length=n to report W505.
(^) These checks can be disabled at the line level using the # noqa special comment. This possibility should be reserved for special cases.
Code | Meaning | Suggestion |
---|---|---|
E1 | Indentation | Indentation carries a meaning in Python - if code is in a block or not. |
E101 | indentation contains mixed spaces and tabs. Why: When you mix spaces and tabs, you will get different indentation for different editor settings. | ✓✓ |
E111 | indentation is not a multiple of four. Why: My guess is that 95% of all projects use 4 spaces; a single space is hard to read and more than four is something you don't want to type that often. | ✓ |
E112 | expected an indented block. Why: A bug. | ✓✓ |
E113 | unexpected indentation. Why: A bug. | ✓✓ |
E114 | indentation is not a multiple of four (comment). Why: Easily leads to bugs. | ✓✓ |
E115 | expected an indented block (comment). Why: Easily leads to bugs. | ✓✓ |
E116 | unexpected indentation (comment). Why: Easily leads to bugs. | ✓✓ |
E121 (*^) | continuation line under-indented for hanging indent. Why: Usual code style. | |
E122 (^) | continuation line missing indentation or outdented. Why: Usual code style. | |
E123 (*) | closing bracket does not match indentation of opening bracket's line. Why: Readability. | ✓ |
E124 (^) | closing bracket does not match visual indentation. Why: Readability. | ✓ |
E125 (^) | continuation line with same indent as next logical line. Why: Readability. | ✓ |
E126 (*^) | continuation line over-indented for hanging indent. Why: Readability. | |
E127 (^) | continuation line over-indented for visual indent. Why: Readability. | |
E128 (^) | continuation line under-indented for visual indent. Why: Readability. | |
E129 (^) | visually indented line with same indent as next logical line. Why: Readability. | |
E131 (^) | continuation line unaligned for hanging indent. Why: Readability. | |
E133 (*) | closing bracket is missing indentation. Why: Readability. | |
E2 | Whitespace | |
E201 | whitespace after '('. Why: Usual code style. | |
E202 | whitespace before ')'. Why: Usual code style. | |
E203 | whitespace before ',', ';', or ':'. Why: Usual code style. | |
E211 | whitespace before '(' | |
E221 | multiple spaces before operator | |
E222 | multiple spaces after operator | |
E223 | tab before operator. Why: Try to avoid tabs in Python. | ✓✓ |
E224 | tab after operator. Why: Try to avoid tabs in Python. | ✓✓ |
E225 | missing whitespace around operator | |
E226 (*) | missing whitespace around arithmetic operator | |
E227 | missing whitespace around bitwise or shift operator | |
E228 | missing whitespace around modulo operator | |
E231 | missing whitespace after ',', ';', or ':' | |
E241 (*) | multiple spaces after ',' | |
E242 (*) | tab after ','. Why: Try to avoid tabs in Python. | ✓✓ |
E251 | unexpected spaces around keyword / parameter equals | |
E261 | at least two spaces before inline comment | |
E262 | inline comment should start with '# ' | |
E265 | block comment should start with '# ' | |
E266 | too many leading '#' for block comment | |
E271 | multiple spaces after keyword. Why: I can see the reason for one space, but many? | ✓✓ |
E272 | multiple spaces before keyword. Why: I can see the reason for one space, but not for multiple. | ✓✓ |
E273 | tab after keyword. Why: Try to avoid tabs in Python. | ✓✓ |
E274 | tab before keyword. Why: Try to avoid tabs in Python. | ✓✓ |
E275 | missing whitespace after keyword | |
E3 | Blank line | |
E301 | expected 1 blank line, found 0 | |
E302 | expected 2 blank lines, found 0 | |
E303 | too many blank lines (3). Why: Don't make your code too stretched out. If you want to separate code, make a new module. | ✓✓ |
E304 | blank lines found after function decorator. Why: This is confusing. A function decorator changes the function being decorated. If you separate them, I might miss that it is there. | ✓✓ |
E305 | expected 2 blank lines after end of function or class | |
E306 | expected 1 blank line before a nested definition | |
E4 | Import | |
E401 | multiple imports on one line. Why: It's more readable to have one import per line, you can structure them more easily and your editor can tell you which one you're not using. | ✓✓ |
E402 | module level import not at top of file. Why: You should have all your imports at the top of your file. However, there could be other code as well in between imports. For example, setting the seed of random. | ✘ |
E5 | Line length | |
E501 (^) | line too long (> 79 characters). Why: See below. | ✓✓ |
E502 | the backslash is redundant between brackets | |
E7 | Statement | |
E701 | multiple statements on one line (colon) | |
E702 | multiple statements on one line (semicolon) | |
E703 | statement ends with a semicolon. Why: Likely unnecessary and due to a C / C++ / Java developer (trying to) write Python code. | ✓✓ |
E704 (*) | multiple statements on one line (def) | |
E711 (^) | comparison to None should be 'if cond is None:'. Why: Example | ✓✓ |
E712 (^) | comparison to True should be 'if cond is True:' or 'if cond:'. Why: Because 'if cond' is way easier to read. | ✓✓ |
E713 | test for membership should be 'not in' | |
E714 | test for object identity should be 'is not' | |
E721 (^) | do not compare types, use 'isinstance()' | |
E722 | do not use bare except, specify exception instead | |
E731 | do not assign a lambda expression, use a def. Why: Example, DRY | |
E741 | do not use variables named 'l', 'O', or 'I'. Why: Those letters are hard to distinguish in some fonts. | ✓✓ |
E742 | do not define classes named 'l', 'O', or 'I'. Why: Those letters are hard to distinguish in some fonts. | ✓✓ |
E743 | do not define functions named 'l', 'O', or 'I'. Why: Those letters are hard to distinguish in some fonts. | ✓✓ |
E9 | Runtime | |
E901 | SyntaxError or IndentationError | |
E902 | IOError | |
W1 | Indentation warning | |
W191 | indentation contains tabs | ✓✓ |
W2 | Whitespace warning | |
W291 | trailing whitespace. Why: It just adds noise to git diff. | ✓✓ |
W292 | no newline at end of file. Why: answer | |
W293 | blank line contains whitespace. Why: It just adds noise to git diff. | ✓✓ |
W3 | Blank line warning | |
W391 | blank line at end of file | |
W5 | Line break warning | |
W503 (*) | line break before binary operator | |
W504 (*) | line break after binary operator | |
W505 (*^) | doc line too long (82 > 79 characters) | |
W6 | Deprecation warning | |
W601 | .has_key() is deprecated, use 'in' | ✓✓ |
W602 | deprecated form of raising exception | |
W603 | '<>' is deprecated, use '!=' | ✓✓ |
W604 | backticks are deprecated, use 'repr()' | ✓✓ |
W605 | invalid escape sequence '\x' | |
W606 | 'async' and 'await' are reserved keywords starting with Python 3.7 | |
F4 | Flake8 module import | |
F401 | module imported but unused. Why: Might keep unnecessary dependencies. | |
F402 | import module from line N shadowed by loop variable. Why: Potential bug. | ✓✓ |
F403 | 'from module import *' used; unable to detect undefined names | |
F404 | future import(s) name after other statements | |
F8 | Flake8 name errors | |
F811 | redefinition of unused name from line N. Why: Potentially unused code. | |
F812 | list comprehension redefines name from line N | |
F821 | undefined name name | |
F822 | undefined name name in __all__ | |
F823 | local variable name ... referenced before assignment | |
F831 | duplicate argument name in function definition | |
F841 | local variable name is assigned to but never used | |
N8 | Naming conventions | |
N801 | class names should use CapWords convention | ✓✓ |
N802 | function name should be lowercase | ✓✓ |
N803 | argument name should be lowercase | ✓✓ |
N804 | first argument of a classmethod should be named 'cls' | ✓✓ |
N805 | first argument of a method should be named 'self' | ✓✓ |
N806 | variable in function should be lowercase | ✓✓ |
N807 | function name should not start or end with '__' | |
N811 | constant imported as non constant | |
N812 | lowercase imported as non lowercase | |
N813 | camelcase imported as lowercase | |
N814 | camelcase imported as constant | |
D1 | Missing Docstrings | |
D100 | Missing docstring in public module | |
D101 | Missing docstring in public class | |
D102 | Missing docstring in public method | |
D103 | Missing docstring in public function | |
D104 | Missing docstring in public package | |
D105 | Missing docstring in magic method | |
D2 | Whitespace Issues | |
D200 | One-line docstring should fit on one line with quotes | |
D201 | No blank lines allowed before function docstring | |
D202 | No blank lines allowed after function docstring | |
D203 | 1 blank line required before class docstring | |
D204 | 1 blank line required after class docstring | |
D205 | 1 blank line required between summary line and description | |
D206 | Docstring should be indented with spaces, not tabs | |
D207 | Docstring is under-indented | |
D208 | Docstring is over-indented | |
D209 | Multi-line docstring closing quotes should be on a separate line | |
D210 | No whitespaces allowed surrounding docstring text | |
D211 | No blank lines allowed before class docstring | |
D212 | Multi-line docstring summary should start at the first line | |
D213 | Multi-line docstring summary should start at the second line | |
D3 | Quotes Issues | |
D300 | Use """triple double quotes""" | |
D301 | Use r""" if any backslashes in a docstring | |
D302 | Use u""" for Unicode docstrings | |
D4 | Docstring Content Issues | |
D400 | First line should end with a period | |
D401 | First line should be in imperative mood | |
D402 | First line should not be the function's signature | |
D403 | First word of the first line should be properly capitalized | |
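Some of the rules in the table are easier to grasp with code. The following sketch shows the form that passes for a handful of the E7 checks; the function and variable names are invented for illustration only.

```python
def first_or_default(values=None, default=0):
    # E711: compare against None with `is`, not `==`.
    if values is None:
        return default
    # E712: rely on truthiness instead of writing `== True` / `== False`.
    if not values:
        return default
    return values[0]


def is_missing(key, mapping):
    # E713: write `key not in mapping`, not `not key in mapping`.
    return key not in mapping


def is_text(value):
    # E721: use isinstance() instead of comparing type() results.
    return isinstance(value, str)


# E731: use a def rather than assigning a lambda expression to a name.
def double(x):
    return 2 * x


# E741: avoid the names l, O and I; pick a descriptive name instead.
line_count = 3
```

Each comment names the check that the flagged variant of that line would trigger.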
Editor Support
You should let your editor do as many automatic formatting changes as you can.
- Sublime Text: Python Flake8 Lint (tested, works fine) and Auto PEP8 (not tested)
- Spyder: Auto PEP8 (not tested)
Notes on Details
Maximum Line Length
You might consider a maximum line length of 80 characters too extreme / outdated.
Well, please have a look at what a 3-way merge looks like on your machine. This is how it looks on mine:
And now look at files with 100 characters:
Sure, you can still do it. But it is certainly less comfortable.
Let's see how famous projects do it (code on GitHub):
Project | 95th percentile line length | 99th percentile line length | Maximum line length |
---|---|---|---|
numpy | 75 | 80 | 589 |
scipy | 76 | 83 | 5223 |
pandas | 74 | 79 | 801 |
Pillow | 73 | 80 | 185 |
sqlalchemy | 73 | 78 | 144 |
requests | 77 | 92 | 172 |
cpython | 75 | 81 | 1182 |
As you can see, all projects try to stay within 79 characters per line. They only break it for tests and documentation, not for code. While I admit there are some cases where you will get above the 79 character threshold, in most cases it just means that you should change the way you wrote your code. I've often seen it when you have many nested loops or conditions.
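If you want to compute such numbers for your own project, a sketch along the following lines might do. It is not the script used for the table above; the nearest-rank percentile definition, the restriction to *.py files and both function names are assumptions of mine.

```python
import math
from pathlib import Path


def line_lengths(root):
    """Collect the length of every line in all *.py files below root."""
    lengths = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="replace")
        lengths.extend(len(line) for line in text.splitlines())
    return lengths


def percentile(lengths, percent):
    """Return the nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(lengths)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]
```

Calling `percentile(line_lengths("path/to/project"), 95)` then gives the length that 95% of the lines do not exceed, and a percent value of 100 gives the longest line.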
Another argument against longer line lengths is readability. Long lines are just harder to read. Newspapers could also have way longer lines and fewer columns. But they don't do that. Websites also make columns. Let's look at the number of characters in a line for a couple:
Website | Category | Characters |
---|---|---|
Focus.de | News | 70 |
washingtonpost.com | News | 99 |
Sueddeutsche.de | News | 68 |
Medium.com | Blog posts | 73 |