Commit Hygiene

Why Commits Matter

A commit is not a save point. It's a unit of communication.

Every commit you write will be read by someone — a teammate reviewing your PR, a future engineer debugging a regression, or you six months from now trying to understand why you changed something. The quality of your commits determines whether git history is a useful tool or a wall of noise.

Good commits make git bisect work. They make reverts safe. They make code review efficient. They make onboarding faster because new team members can read the history and understand how the system evolved.

Bad commits ("fix stuff", "WIP", "asdf", giant commits that change 40 files across 3 unrelated features) undermine all of this. The history exists but communicates nothing.

Writing good commits takes marginally more effort than writing bad ones. The return on that effort compounds forever.

Atomic Commits

An atomic commit is a single logical change. It does one thing, it does it completely, and the codebase is in a working state both before and after the commit.

What atomic looks like

Good: "Add email validation to user registration form"
  - Adds validation function
  - Adds validation call in form handler
  - Adds test for validation
  - Adds error message display

  All related to one change. Codebase works before, works after.

Bad: "Update registration and fix header bug and add logging"
  - Changes form validation
  - Fixes unrelated CSS in the header
  - Adds logging to three different services

  Three unrelated changes. If the header fix introduces a regression,
  you can't revert it without also losing the validation and logging.

The test for atomicity

Ask yourself: "If I needed to revert this commit, would I want to revert everything in it?" If the answer is no — if the commit contains changes you'd want to keep alongside changes you'd want to revert — it's not atomic.
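The revert test can be tried out in a throwaway repository. The following is a minimal sketch, assuming placeholder file names (app.txt, validation.txt, header.txt) and a placeholder identity; it shows that when two unrelated changes live in separate commits, reverting one leaves the other untouched.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt && git commit -q -m "Add app skeleton"

# Two unrelated changes, each in its own commit.
echo "validation" > validation.txt
git add validation.txt && git commit -q -m "Add email validation to registration"

echo "header fix" > header.txt
git add header.txt && git commit -q -m "Fix header alignment on mobile"

# Revert only the header fix (HEAD). The validation commit is not affected.
git revert --no-edit HEAD >/dev/null

ls    # header.txt is gone, validation.txt is still there
```

Had both changes been in one commit, the same revert would have removed the validation work as well.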

How to work atomically

The key tool is git add -p (patch mode). It lets you stage individual hunks within a file, so you can separate unrelated changes even if they're in the same file.

# Stage changes interactively, hunk by hunk
git add -p

# For each hunk, you'll be asked:
# y - stage this hunk
# n - skip this hunk
# s - split this hunk into smaller hunks
# e - manually edit the hunk

A typical workflow when you've made multiple unrelated changes:

1. git diff                    # Review all changes
2. git add -p                  # Stage only the hunks related to change A
3. git commit -m "Add email validation to registration"
4. git add -p                  # Stage hunks related to change B
5. git commit -m "Fix header alignment on mobile"
6. git add .                   # Stage remaining changes
7. git commit -m "Add request logging to auth service"

This takes 2 extra minutes and produces a history that's actually useful.

Writing Meaningful Commit Messages

A commit message has two parts: a subject line and an optional body.

The subject line

Format:  <what changed>, optionally prefixed with <type>: (see Conventional Commits below)
Length:  Aim for about 50 characters; treat 72 as the upper limit
Mood:    Imperative ("Add feature" not "Added feature")

The subject line should complete the sentence: "If applied, this commit will ___."

Good subject lines:
  Add rate limiting to API endpoints
  Fix null pointer in user lookup when email is empty
  Refactor payment processing to use strategy pattern
  Remove deprecated v1 API endpoints
  Update PostgreSQL driver to 15.2

Bad subject lines:
  fix stuff
  WIP
  changes
  update
  PR feedback
  Monday work
  asdf
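Some of these rules can be checked mechanically. The function below is a rough sketch, not a git feature: check_subject is a hypothetical helper, the list of vague subjects is taken from the examples above, and the past-tense check is a crude heuristic that only knows a few verbs.

```shell
# check_subject SUBJECT
# Hypothetical helper: prints a reason and returns 1 when the subject
# breaks one of the rules above, returns 0 otherwise.
check_subject() {
  subject=$1

  # Placeholder subjects that say nothing about the change.
  case "$subject" in
    WIP|wip|changes|update|asdf|"fix stuff")
      echo "too vague: $subject"; return 1 ;;
  esac

  # Upper limit of 72 characters.
  if [ "${#subject}" -gt 72 ]; then
    echo "too long (${#subject} characters)"; return 1
  fi

  # Crude imperative check: flag a few common past-tense openers.
  case "$subject" in
    Added\ *|Fixed\ *|Updated\ *|Removed\ *)
      echo "use the imperative: $subject"; return 1 ;;
  esac

  return 0
}

check_subject "Add rate limiting to API endpoints" && echo "ok"
check_subject "fix stuff" || true
check_subject "Added rate limiting" || true
```

One way to use it is from a commit-msg hook, which git calls with the path of the message file as its first argument, so the subject can be read with head -n 1 "$1".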

The body (when you need it)

Not every commit needs a body. Simple, self-explanatory changes don't. But when the "why" isn't obvious from the "what," add a body:

Fix race condition in session cleanup

The session cleanup job was running concurrently with new session
creation, causing a window where a newly created session could be
immediately deleted.

Fix: acquire a lock on the session table before cleanup runs.
The lock timeout is set to 5 seconds to avoid blocking session
creation for too long.

Alternatives considered:
- Soft deletes: rejected because it adds complexity to every session query
- Timestamp-based filtering: rejected because clock skew between
  servers makes this unreliable

This body explains why the change was made, why this approach was chosen over alternatives, and what the tradeoffs are. Six months from now, when someone wonders why there's a lock on session cleanup, they'll find this commit and understand.
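For short bodies you don't need an editor. As a small sketch (the file name session.txt, the shortened body text, and the identity are placeholders), passing -m several times makes the first value the subject and each later value its own body paragraph:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

echo "cleanup" > session.txt
git add session.txt

# First -m is the subject; every further -m becomes a separate paragraph.
git commit -q \
  -m "Fix race condition in session cleanup" \
  -m "The cleanup job ran concurrently with session creation, so a new session could be deleted immediately." \
  -m "Fix: acquire a lock on the session table before cleanup runs."

subject=$(git log -1 --format=%s)   # subject line only
body=$(git log -1 --format=%b)      # body only

echo "$subject"
echo "$body"
```

For longer bodies like the one above, running git commit without -m and writing the message in your editor is usually more comfortable.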

When to explain in the body

Always explain:
  - Bug fixes (what was the bug, how does this fix it)
  - Non-obvious implementation choices
  - Performance optimizations (what was slow, what's the improvement)
  - Breaking changes (what breaks, how to migrate)

Usually skip the body:
  - Trivial changes (typo fixes, formatting)
  - Changes where the diff is self-explanatory
  - Standard pattern implementations

Conventional Commits

Conventional commits add a structured prefix to commit messages:

feat: add user profile page
fix: resolve null pointer in search results
refactor: extract payment logic into service class
test: add integration tests for order workflow
docs: update API documentation for v2 endpoints
chore: update dependencies to latest versions
perf: cache database queries for product listings
ci: add parallel test execution to pipeline

Why use them

Structured prefixes make scanning git history much faster. You can immediately see what kind of change each commit represents without reading the description.

They also enable automation. Tools can generate changelogs, determine semantic version bumps (fix = patch, feat = minor, breaking change = major), and categorize changes, all from commit messages.

Breaking changes

Conventional commits mark breaking changes with ! or a BREAKING CHANGE footer:

feat!: change authentication to require API keys

All API endpoints now require an API key in the Authorization header.
The previous cookie-based authentication is no longer supported.

BREAKING CHANGE: Clients using cookie-based auth must migrate to API
key authentication. See docs/migration-v3.md for details.
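The version-bump rule is simple enough to sketch in a few lines. The function below is a hypothetical helper, not part of any release tool: bump_for reads commit messages on stdin and prints the largest bump they imply, treating ! or a BREAKING CHANGE footer as major, feat as minor, and fix as patch. It matches line by line, so treat it as an approximation.

```shell
# bump_for: read commit messages on stdin, print major, minor, patch, or none.
# Hypothetical helper for illustration only.
bump_for() {
  messages=$(cat)

  # type, optional (scope), then "!:"  or a BREAKING CHANGE footer -> major
  if printf '%s\n' "$messages" | grep -Eq '^[a-z]+(\([^)]*\))?!:|^BREAKING CHANGE:'; then
    echo major
  # feat or feat(scope) -> minor
  elif printf '%s\n' "$messages" | grep -Eq '^feat(\([^)]*\))?:'; then
    echo minor
  # fix or fix(scope) -> patch
  elif printf '%s\n' "$messages" | grep -Eq '^fix(\([^)]*\))?:'; then
    echo patch
  else
    echo none
  fi
}

printf 'fix: resolve null pointer in search results\nfeat: add user profile page\n' | bump_for
printf 'feat!: change authentication to require API keys\n' | bump_for
printf 'docs: update API documentation for v2 endpoints\n' | bump_for
```

In a real repository the input could come from something like git log --format=%B piped into bump_for, limited to the commits since the last release.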

Scoping

Optional scope in parentheses adds context:

feat(auth): add two-factor authentication
fix(api): handle timeout errors from payment provider
refactor(db): migrate from raw SQL to query builder

Don't over-scope. Scopes are useful when a codebase has clear modules (auth, api, db, ui). If everything is in one module, scopes add noise without value.
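One practical payoff of scopes is filtering history. The sketch below builds a throwaway repository with empty commits (the fourth subject and the identity are invented for the demo) and filters with git log --grep, which uses basic regular expressions by default, so the parentheses match literally.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

# Empty commits are enough to demonstrate filtering by message.
for subject in \
  "feat(auth): add two-factor authentication" \
  "fix(api): handle timeout errors from payment provider" \
  "refactor(db): migrate from raw SQL to query builder" \
  "fix(auth): reject expired tokens"
do
  git commit -q --allow-empty -m "$subject"
done

# Everything that touched the auth scope, whatever the type.
auth_commits=$(git log --format=%s --grep='^[a-z]*(auth)')

# Every fix, whatever the scope.
fixes=$(git log --format=%s --grep='^fix')

echo "$auth_commits"
echo "$fixes"
```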

When to Squash

Squashing combines multiple commits into one. It's appropriate in specific situations:

Squash when

- Your branch has "WIP" or "fixup" commits that shouldn't be in main
- You made several attempts at an approach and the intermediate
  commits aren't useful history
- A PR has commits like "fix lint" or "address review feedback" that
  have no standalone value

Don't squash when

- Each commit represents a distinct, meaningful change
- The commit history tells a useful story about how the feature was built
- Commits are already atomic and well-messaged
- Someone might need to bisect within your changes later

The fixup workflow

When you know a commit is just fixing something from a previous commit:

# Make your fix
git add .

# Create a fixup commit targeting the original (abc1234 stands for its hash)
git commit --fixup=abc1234

# Later, before merging, auto-squash fixups
git rebase -i --autosquash main

This is cleaner than manual squashing because it explicitly links the fixup to the commit it fixes.
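The whole round trip can be rehearsed in a throwaway repository. This is a sketch under a few assumptions: the file names and identity are placeholders, and GIT_SEQUENCE_EDITOR=true is used only so the generated todo list is accepted without opening an editor.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main       # name the first branch main on any Git version
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt && git commit -q -m "Add app skeleton"

git checkout -q -b feature
echo "validate" > validation.txt
git add validation.txt && git commit -q -m "Add email validation to registration"
target=$(git rev-parse HEAD)                # the commit we will fix later

echo "header" > header.txt
git add header.txt && git commit -q -m "Fix header alignment on mobile"

# A late correction that belongs to the validation commit.
echo "validate empty input too" >> validation.txt
git add validation.txt
git commit -q --fixup="$target"             # subject becomes "fixup! Add email validation ..."

# Fold the fixup into its target; "true" accepts the todo list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

git log --format=%s main..feature           # two commits, no "fixup!" left
```

Because this rewrites commits, it belongs on a branch that hasn't been shared yet.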

PR merge strategies

Most teams use one of three strategies:

Merge commit:    Preserves all individual commits. Good when commits
                 are already atomic and well-organized.

Squash merge:    Combines all PR commits into one. Good when commits
                 are messy but the PR as a whole is a single logical
                 change.

Rebase merge:    Replays individual commits onto main. Good when
                 commits are clean and you want linear history.

The right choice depends on your team's discipline. If everyone writes atomic commits, merge or rebase preserves useful history. If commits tend to be messy, squash merge produces a cleaner main branch.
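Hosting platforms usually offer these strategies as merge buttons, but the squash variant is easy to reproduce locally to see what it does to history. A minimal sketch, with placeholder file names, commit subjects, and identity:

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main       # name the first branch main on any Git version
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt && git commit -q -m "Add app skeleton"

# A feature branch with messy intermediate commits.
git checkout -q -b feature
echo "validate" > validation.txt
git add validation.txt && git commit -q -m "WIP validation"
echo "validate empty input" >> validation.txt
git add validation.txt && git commit -q -m "fix lint"

# Squash merge: stage the combined result of the branch, then commit once.
git checkout -q main
git merge -q --squash feature >/dev/null 2>&1
git commit -q -m "Add email validation to registration"

git log --oneline                           # base commit plus one squashed commit

# Rough local equivalents of the other two strategies (not run here):
#   git merge --no-ff feature      # merge commit, keeps both branch commits
#   git rebase main feature        # replay branch commits on top of main
```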

The Staging Area as a Tool

Many engineers treat git add . as the only way to stage changes. The staging area is actually a powerful tool for crafting commits.

git add -p                   Stage specific hunks (interactive)
git add file.ts              Stage a specific file
git restore --staged file    Unstage a file (Git 2.23+; on older versions: git reset HEAD file)
git diff --staged            Review what you're about to commit

The workflow: make changes freely without thinking about commits. When you're ready to commit, use git add -p to carefully select what goes into each commit. Review with git diff --staged. Commit. Repeat for the next logical change.

This separates "writing code" from "organizing commits," which means you can work in whatever order feels natural and then arrange the commits logically afterward.
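A short sketch of that loop in a throwaway repository, assuming made-up file names (validation.ts, scratch.ts) and a placeholder identity. It stages everything, takes one file back out, and checks what is left before committing. The older git reset HEAD form is used for unstaging so the sketch runs on any Git version.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

echo "base" > app.ts
git add app.ts && git commit -q -m "Add app skeleton"

# Work freely: one real change and one scratch file.
echo "validation" > validation.ts
echo "debug notes" > scratch.ts

git add .                                   # everything is staged
git reset -q HEAD -- scratch.ts             # take the scratch file back out

staged=$(git diff --staged --name-only)     # review before committing
echo "$staged"

git commit -q -m "Add email validation to registration"
git status --short                          # scratch.ts is still untracked
```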

Reviewing Your Own Commits

Before pushing, review your commits:

# See the last 5 commits with diffs
git log -5 -p

# See just the commit messages
git log -10 --oneline

# See what changed in each commit (files only)
git log -5 --stat

Ask yourself:

- Does each commit message accurately describe the change?
- Is each commit atomic (one logical change)?
- Would someone reading this history understand what happened and why?
- Are there any commits that should be squashed, split, or reworded?

If something's off, fix it before pushing with interactive rebase:

git rebase -i HEAD~5
# Mark commits as: pick, reword, squash, fixup, edit, drop
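When only the most recent commit needs a better message, git commit --amend is a lighter option than a full interactive rebase. A minimal sketch in a throwaway repository (the file name and identity are placeholders):

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity for the demo
git config user.email "dev@example.com"

echo "lookup" > lookup.txt
git add lookup.txt && git commit -q -m "fix"

before=$(git rev-list --count HEAD)

# Replace the message of the latest commit; the content stays the same.
git commit -q --amend -m "Fix null pointer in user lookup when email is empty"

after=$(git rev-list --count HEAD)
git log -1 --format=%s
```

The commit count stays the same because the old commit is replaced rather than followed by a new one, which is also why this is only safe before pushing.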

Common Pitfalls

Giant commits that change everything. If your commit modifies 30 files across multiple features, it's not a commit — it's a code dump. Break it apart.

Meaningless messages driven by laziness. "Fix" tells the next developer nothing. "Fix null pointer in OrderService.calculateTotal when order has no items" takes 10 extra seconds to type and saves the next person 30 minutes of investigation.

Committing generated files, build artifacts, or dependencies. Your .gitignore should prevent this, but check git status before committing to make sure.

Rewriting history on shared branches. git rebase -i and git commit --amend are for local branches that haven't been pushed. Once commits are shared, treat them as immutable and undo mistakes with a new commit (git revert) instead.

Commit frequency extremes. Committing after every line change creates noise. Committing once a day creates monoliths. The right frequency is "whenever you complete one logical change" — typically 3-10 commits per day during active development.
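For the generated-files pitfall, a quick check before committing could look like the sketch below. The directory names (dist, src, node_modules) and the ignore rules are assumptions chosen for the example, not a recommended .gitignore.

```shell
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q

# Example ignore rules for build output and installed dependencies.
printf 'node_modules/\ndist/\n' > .gitignore

mkdir -p dist src
echo "bundle" > dist/app.js                 # build artifact
echo "source" > src/app.ts                  # real source file

git add .
staged=$(git diff --staged --name-only)     # what would be committed
echo "$staged"

# check-ignore exits 0 when a path matches an ignore rule.
git check-ignore -q dist/app.js && echo "dist/app.js is ignored"
```

If a generated file does show up in the staged list, that is the moment to fix .gitignore rather than after the file is in history.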

Key Takeaways

Atomic commits — one logical change per commit — make your git history useful for reviews, reverts, bisects, and understanding how the codebase evolved.

Commit messages are documentation. The subject line says what changed. The body says why. Skip the body only when the diff is self-explanatory.

Use git add -p to stage selectively. This is the tool that makes atomic commits practical even when you're working on multiple things.

Conventional commit prefixes (feat, fix, refactor, etc.) make history scannable and enable changelog automation.

Squash when intermediate commits have no standalone value. Don't squash when each commit represents a meaningful, distinct change. Review your commits before pushing.