CODE: Python Coding Standards

PEP 8 sets naming, spacing, and formatting guidelines to keep Python code predictable and easy to read. Documentation through docstrings and comments preserves intent, reduces ambiguity, and improves long-term code quality.

PEP 8 Essentials (Style Guide)

Naming

  • Modules/packagessnake_case (data_loader.py)
  • Functions/variablessnake_case (load_data)
  • ClassesCapWords (DataLoader)
  • ConstantsUPPER_SNAKE (DEFAULT_TIMEOUT)
  • Avoid single-letter names except for throwaway loop vars.

Layout

  • Line length: 79–99 chars (choose one; 88 is common with Black).
  • Indent: 4 spaces (no tabs).
  • Blank lines: 2 before top-level defs; 1 between class methods.
  • Trailing commas for multi-line literals/calls to keep diffs clean.

Imports

  • One per line; standard → third-party → local, each group separated by a blank line.
  • Absolute imports preferred.

python
import json
from pathlib import Path

import requests

from .utils import normalize

Whitespace

  • Around operators: a + b, not a+b.
  • After commas: func(a, b, c).
  • No extra spaces inside parentheses/brackets/braces.

Booleans/None

  • if x is None: / if x is not None:
  • if items: instead of if len(items) > 0: (when truthiness is intended)

Docstrings (PEP 257) & Documentation

Use triple-quoted strings right after defs/classes/modules.

Docstring content

  • What it does (1-line summary).
  • Why/Notes if non-obvious.
  • Args/Returns/Raises.
  • Give examples for tricky behavior.

Choose one style and stick to it (pick what your team/tools parse).

Google style

python
def read_csv(path: str) -> list[dict]:
    """Load rows from a CSV file.

    Args:
        path: Path to the CSV file.

    Returns:
        List of row dicts.

    Raises:
        FileNotFoundError: If path does not exist.
        UnicodeDecodeError: If encoding is invalid.
    """

NumPy style

python
def normalize(x: list[float]) -> list[float]:
    """
    Normalize a list to unit sum.

    Parameters
    ----------
    x : list of float
        Values to normalize.

    Returns
    -------
    list of float
        Normalized values.
    """

reST (Sphinx)

python
def foo(a: int) -> int:
    """Do foo.

    :param a: Input integer.
    :returns: Processed value.
    """

Project docs

  • Small libs: README.md + examples.
  • Bigger projects: Sphinx (API docs from docstrings) or MkDocs (markdown-first).
  • Keep usage examples tested with doctest or CI snippets to prevent drift.

Types & Contracts (PEP 484/526)

Add type hints for public APIs and tricky internals; check with mypy or pyright.

python
from typing import Iterable, Sized

def top_n(xs: Iterable[int], n: int) -> list[int]:
    """Return the largest n numbers."""
    return sorted(xs, reverse=True)[:n]

Guidelines:

  • Prefer concrete containers in returns (list[int]) unless you promise generality.
  • Use TypedDictProtocol, and dataclasses for structured data.
  • Mark optional: str | None and check is None.

Functions, Classes, and Modules

  • Single responsibility per function/module.
  • Keep functions short (often < 25–40 lines).
  • Prefer pure functions when possible (fewer side effects).
  • For simple data holders, use @dataclass(frozen=True) to get immutability & comparisons.

python
from dataclasses import dataclass

@dataclass(frozen=True)
class User:
    id: int
    email: str


Errors, Logging, and Exceptions

  • Fail loud and early: validate inputs; raise specific exceptions.
  • Don’t swallow exceptions; re-raise with context.
  • Use logging over prints; attach context.

python
import logging
log = logging.getLogger(__name__)

try:
    write_file(p, data)
except OSError as e:
    log.error("Failed to write %s (%s bytes): %s", p, len(data), e)
    raise


Linting, Formatting, Imports (tooling stack)

Pick tools and automate them.

  • Formatterblack (opinionated, 88 cols by default)
  • Linterruff (fast; replaces flake8 + many plugins)
  • Type checkermypy or pyright
  • Import sorterisort (or let ruff handle imports)

pyproject.toml (starter)

toml
[tool.black]
line-length = 88
target-version = ["py312"]

[tool.ruff]
line-length = 88
select = ["E","F","I","B","UP"]  # pyflakes/pycodestyle/imports/bugbear/pyupgrade
ignore = ["E203"]  # if using Black
target-version = "py312"

[tool.ruff.isort]
known-first-party = ["yourpkg"]
combine-as-imports = true

[tool.mypy]
python_version = "3.12"
strict = true
warn_unused_ignores = true

Run locally:

bash
black .
ruff check . --fix
mypy .


Pre-commit Hooks (keep the repo clean)

yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 24.8.0
    hooks: [{id: black}]
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9
    hooks:
      - id: ruff
      - id: ruff-format
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.11.2
    hooks: [{id: mypy}]

bash
pip install pre-commit
pre-commit install        # runs on every commit
pre-commit run --all-files


Comments & TODOs

  • Explain why, not “what the code does”.
  • Keep comments short; update when code changes.
  • Standard TODO form helps searchability:
  • # TODO(amr): handle empty payloads
  • # FIXME: flaky test on Windows
  • # NOTE: using greedy strategy to minimize memory

File/Project Structure (minimal)

bash
project/
├─ src/yourpkg/
│  ├─ __init__.py
│  ├─ core.py
│  └─ io.py
├─ tests/
│  └─ test_core.py
├─ pyproject.toml
├─ README.md
└─ .pre-commit-config.yaml

  • Put code in src/ layout to avoid accidental imports from the repo root.
  • Test names mirror module names.

Practical Checklist (copy into your README)

  •  Formatter (black / ruff format) passes
  •  Linter (ruff check) passes
  •  Types (mypy/pyright) pass on strict mode
  •  Tests (pytest -q) pass locally & in CI
  •  Public functions/classes have docstrings
  •  No prints in libraries; use logging
  •  Clear exceptions; no broad except Exception
  •  Consistent return types & None handling
  •  Imports grouped & sorted; no unused imports
  •  Function length & cyclomatic complexity reasonable

Example: Clean Function Before/After

Before

python
def calc(d, n):
    # n is the max items
    s = 0
    for k in d.keys():
        if d[k] is not None:
            s += d[k]
    return s/n

After

python
from collections.abc import Mapping

def mean_non_null(values: Mapping[str, float | None], limit: int) -> float:
    """Average the first `limit` non-null values."""
    picked = [v for v in values.values() if v is not None][:limit]
    if not picked:
        raise ValueError("No values to average")
    return sum(picked) / len(picked)

  • Clear name, types, docstring, validation, no hidden division by zero.