Architecture Decision Records

Every significant technical decision your team leaves undocumented is likely to be re-debated, often within months. Someone new joins, sees a choice they disagree with, and reopens a discussion that was already settled. Or the person who made the original decision leaves, and nobody remembers why the system works the way it does. Architecture Decision Records (ADRs) capture the WHY behind technical decisions so that future engineers can understand the context, constraints, and reasoning that led to the current design.

Why ADRs Matter

Code captures WHAT you built. Tests capture HOW it should behave. ADRs capture WHY you built it this way instead of another way.

Without ADRs:
  New engineer: "Why are we using Kafka instead of RabbitMQ?"
  Team: "I think Dave decided that. Dave left 8 months ago."
  New engineer: "This seems wrong. Let's switch to RabbitMQ."
  Team: (spends 3 weeks evaluating, rediscovers the same constraints,
         reaches the same conclusion Dave reached)

With ADRs:
  New engineer: "Why are we using Kafka instead of RabbitMQ?"
  Team: "Check ADR-007."
  New engineer reads ADR-007:
    "We chose Kafka because we need message replay for audit compliance,
     and RabbitMQ does not support replay without additional infrastructure.
     We considered RabbitMQ with a separate event store but rejected it
     because of the operational complexity of maintaining two systems."
  New engineer: "Got it. That makes sense given the compliance requirement."

The ADR saved 3 weeks of re-evaluation. More importantly, it preserved institutional knowledge that would otherwise be lost when the people who hold it leave.

The Lightweight ADR Format

ADRs should be lightweight. If writing an ADR takes more than 30 minutes, the format is too heavy. Here is a minimal format that captures everything necessary:

# ADR-007: Use Kafka for Event Streaming

## Status
Accepted (2025-11-15)

## Context
We need an event streaming system for order events. These events
must be consumed by 4 downstream services (billing, inventory,
analytics, audit). The audit service requires message replay
capability for compliance.

## Decision
We will use Apache Kafka as our event streaming platform.

## Alternatives Considered

### RabbitMQ
Pros: simpler setup, team has experience
Cons: no native message replay, would need a separate event store
      for audit compliance
Rejected because: the operational complexity of RabbitMQ + event store
exceeds the complexity of running Kafka

### AWS SNS/SQS
Pros: managed service, no infrastructure to maintain
Cons: no message replay, vendor lock-in, higher per-message cost
      at our volume (estimated 2M events/day)
Rejected because: no replay capability and cost at scale

### Custom event store on PostgreSQL
Pros: simple, uses existing infrastructure
Cons: not designed for high-throughput streaming, would need
      custom consumer group logic
Rejected because: reinventing the wheel for a solved problem

## Consequences
- We need Kafka operational expertise (training for the team)
- We accept Kafka's operational complexity (ZooKeeper, partitioning)
- We gain message replay, which satisfies the audit compliance requirement
- We gain a scalable backbone for future event-driven features

The Sections Explained

Status:
  Proposed, Accepted, Deprecated, or Superseded (by ADR-XXX)
  Include the date. Dates matter for historical context.

Context:
  What is the situation? What problem are we solving?
  What constraints exist? This is the most important section.
  A future reader who understands the context can evaluate
  the decision themselves.

Decision:
  What did we decide? One or two sentences. Be specific.

Alternatives Considered:
  What else did we think about? Why did we reject each?
  This is the second most important section. It prevents
  re-evaluation of options that were already rejected.

Consequences:
  What are the positive and negative outcomes of this decision?
  Be honest about the costs, not just the benefits.
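
The five sections above lend themselves to an automated check. The following is a minimal sketch, assuming each section appears as a level-2 Markdown heading ("## Status" and so on) as in the ADR-007 example; the name missing_sections is hypothetical and not part of any ADR tool.

```python
import re

# Section names follow the lightweight format described above.
REQUIRED_SECTIONS = [
    "Status",
    "Context",
    "Decision",
    "Alternatives Considered",
    "Consequences",
]


def missing_sections(adr_text: str) -> list[str]:
    """Return the required sections that have no '## <name>' heading."""
    # Collect every level-2 heading; '#' and '###' headings do not match.
    found = {
        match.group(1).strip()
        for match in re.finditer(r"^##[ \t]+(.+)$", adr_text, flags=re.MULTILINE)
    }
    return [name for name in REQUIRED_SECTIONS if name not in found]


example = """# ADR-007: Use Kafka for Event Streaming

## Status
Accepted (2025-11-15)

## Context
We need an event streaming system for order events.

## Decision
We will use Apache Kafka as our event streaming platform.
"""

print(missing_sections(example))
# ['Alternatives Considered', 'Consequences']
```

A check like this could run in CI against every file in the ADR directory, so an incomplete ADR is caught during PR review rather than a year later.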

When to Write an ADR

Not every decision needs an ADR. The threshold is:

Write an ADR when:
  - The decision is hard to reverse (infrastructure, data storage,
    language choice, major framework)
  - Multiple reasonable alternatives exist and the choice is not obvious
  - The decision affects multiple teams or services
  - Someone in the future will ask "why did we do it this way?"
  - The decision involves a significant trade-off

Do NOT write an ADR when:
  - The decision is trivially reversible (variable naming, file structure)
  - There is only one reasonable option
  - The decision is local to one function or file
  - The industry standard is clear and you are following it
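
The two checklists above can be read as a simple decision rule. Here is a rough sketch of that rule; the DecisionTraits fields and the precedence (a purely local decision never gets an ADR, otherwise any one criterion is enough) are assumptions made for illustration, not a rule the checklists state.

```python
from dataclasses import dataclass


@dataclass
class DecisionTraits:
    """Hypothetical flags mirroring the checklist items above."""
    hard_to_reverse: bool = False
    multiple_reasonable_options: bool = False
    affects_multiple_teams: bool = False
    significant_tradeoff: bool = False
    local_to_one_file: bool = False


def should_write_adr(traits: DecisionTraits) -> bool:
    """Return True when any 'write an ADR' criterion holds for a non-local decision."""
    # Assumption: a decision local to one function or file never needs an ADR.
    if traits.local_to_one_file:
        return False
    return any([
        traits.hard_to_reverse,
        traits.multiple_reasonable_options,
        traits.affects_multiple_teams,
        traits.significant_tradeoff,
    ])


database_choice = DecisionTraits(hard_to_reverse=True, multiple_reasonable_options=True)
variable_name = DecisionTraits(local_to_one_file=True)
print(should_write_adr(database_choice), should_write_adr(variable_name))
# True False
```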

Examples of ADR-Worthy Decisions

  - Choosing a database (PostgreSQL vs MongoDB vs DynamoDB)
  - Choosing a communication pattern (sync HTTP vs async events)
  - Choosing an authentication approach (JWTs vs sessions)
  - Deciding on a deployment strategy (Kubernetes vs serverless)
  - Choosing a programming language for a new service
  - Deciding to split a monolith into services (or not)
  - Choosing a testing strategy (unit-heavy vs integration-heavy)
  - Deciding on an API style (REST vs GraphQL vs gRPC)

Examples of Non-ADR Decisions

  - Which linter rules to enable (put in linter config)
  - How to name a variable (code review feedback)
  - Which test framework to use (usually obvious for the language)
  - Whether to use tabs or spaces (code style guide)

Where to Store ADRs

Store ADRs in the repository, close to the code they describe.

Recommended structure:
  project-root/
    docs/
      adr/
        001-use-postgresql-for-storage.md
        002-adopt-event-sourcing-for-orders.md
        003-migrate-to-typescript.md
        007-use-kafka-for-event-streaming.md
        template.md

Why in the repository:
  - Version controlled (history is preserved)
  - Close to the code (found by anyone with the repo)
  - Changes are reviewed in PRs (team input on decisions)
  - Searchable with grep/ripgrep
  - Travels with the code (no external wiki link rot)
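
Because the ADRs are plain files in the repository, a few lines of scripting can turn them into an index. The sketch below assumes the file layout shown above and the heading conventions of the ADR-007 example ("# ADR-NNN: Title" and "## Status"); summarize and build_index are hypothetical helper names.

```python
import re
from pathlib import Path


def summarize(adr_text: str) -> tuple[str, str]:
    """Extract (title, status) from one ADR's text."""
    # Title: the first level-1 heading, e.g. "# ADR-007: Use Kafka ...".
    title = re.search(r"^#[ \t]+(.+)$", adr_text, flags=re.MULTILINE)
    # Status: the first non-empty line after the "## Status" heading.
    status = re.search(
        r"^##[ \t]+Status[ \t]*\n+(.+)$", adr_text, flags=re.MULTILINE
    )
    return (
        title.group(1).strip() if title else "(no title)",
        status.group(1).strip() if status else "(no status)",
    )


def build_index(adr_dir: Path) -> list[str]:
    """Return one 'file | status | title' line per numbered ADR file."""
    lines = []
    # "[0-9]*.md" skips template.md, which does not start with a digit.
    for path in sorted(adr_dir.glob("[0-9]*.md")):
        title, status = summarize(path.read_text(encoding="utf-8"))
        lines.append(f"{path.name} | {status} | {title}")
    return lines
```

Calling build_index(Path("docs/adr")) from the project root would list every ADR with its current status, which makes superseded or deprecated decisions easy to spot.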

Number them sequentially, zero-padded as in the listing above so the files sort in order. The numbers provide ordering and easy reference ("see ADR-007").
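
Picking the next number by hand invites collisions when two people draft ADRs at once, so it may be worth scripting. This is a small sketch, assuming three-digit zero-padded prefixes as in the listing above; slugify and next_adr_filename are hypothetical helpers.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def next_adr_filename(existing: list[str], title: str) -> str:
    """Pick the next sequential number from filenames like '007-some-title.md'."""
    # Files without a numeric prefix (such as template.md) are ignored.
    numbers = [
        int(match.group(1))
        for name in existing
        if (match := re.match(r"(\d+)-", name))
    ]
    next_number = max(numbers, default=0) + 1
    return f"{next_number:03d}-{slugify(title)}.md"


existing = [
    "001-use-postgresql-for-storage.md",
    "002-adopt-event-sourcing-for-orders.md",
    "template.md",
]
print(next_adr_filename(existing, "Use Kafka for Event Streaming"))
# 003-use-kafka-for-event-streaming.md
```

In a real repository the existing list would come from the ADR directory, for example the file names returned by pathlib's Path("docs/adr").iterdir().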

The ADR Lifecycle

ADRs are not permanent truths. They are records of decisions made in a specific context. When the context changes, decisions may change too.

Lifecycle:
  1. Proposed: someone writes the ADR and opens a PR
  2. Accepted: team reviews, discusses, and merges
  3. Active: the decision is in effect and guides development
     (the Status field still reads Accepted)
  4. Deprecated: the context has changed enough that this decision
     no longer applies, but we have not replaced it yet
  5. Superseded: a new ADR replaces this one
     "Superseded by ADR-015"

Never delete old ADRs. Mark them as superseded and link to the
replacement. The old ADR still has value as historical context.
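
The lifecycle can be modeled as a small state machine over the four Status values. The transition table below is an assumption drawn from the ordering of the list above (for instance, that an accepted ADR may be superseded directly without being deprecated first); adjust it to your team's process.

```python
from enum import Enum


class Status(Enum):
    PROPOSED = "Proposed"
    ACCEPTED = "Accepted"
    DEPRECATED = "Deprecated"
    SUPERSEDED = "Superseded"


# Allowed moves between statuses. "Active" is not a separate status here:
# an accepted ADR is treated as in effect until it is deprecated or superseded.
ALLOWED = {
    Status.PROPOSED: {Status.ACCEPTED},
    Status.ACCEPTED: {Status.DEPRECATED, Status.SUPERSEDED},
    Status.DEPRECATED: {Status.SUPERSEDED},
    Status.SUPERSEDED: set(),  # terminal: superseded ADRs are kept, never revived
}


def can_transition(current: Status, new: Status) -> bool:
    """Return True if moving from `current` to `new` follows the lifecycle."""
    return new in ALLOWED[current]


print(can_transition(Status.ACCEPTED, Status.SUPERSEDED))  # True
print(can_transition(Status.SUPERSEDED, Status.ACCEPTED))  # False
```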

Superseding an ADR

# ADR-015: Migrate from Kafka to Redpanda

## Status
Accepted (2026-03-10)
Supersedes: ADR-007 (Use Kafka for Event Streaming)

## Context
Since ADR-007, our operational burden with Kafka has increased.
ZooKeeper management consumes 10+ hours per month. Redpanda offers
Kafka API compatibility with simpler operations (no ZooKeeper).

## Decision
We will migrate from Apache Kafka to Redpanda.

## Why ADR-007's Reasoning No Longer Applies
ADR-007 chose Kafka over alternatives primarily for message replay.
Redpanda provides the same replay capability with the same API,
so the core requirement is still met. The operational complexity
that ADR-007 accepted as a consequence is now the motivating
reason to switch.

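Note how the new ADR references the old one and explains what changed: the replay requirement from ADR-007 still holds, but the operational cost that ADR-007 accepted is now the reason to switch. The example is abbreviated; a complete ADR would also include the Alternatives Considered and Consequences sections.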
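
The other half of superseding is updating the old ADR so that readers of ADR-007 are pointed to its replacement. A minimal sketch follows, assuming the "## Status" heading convention from the examples; mark_superseded is a hypothetical helper, and it overwrites the previous status line, relying on version control to preserve the history.

```python
import re


def mark_superseded(adr_text: str, new_adr_id: str, date: str) -> str:
    """Replace the body of the '## Status' section with a superseded notice."""
    pattern = re.compile(
        r"(^##[ \t]+Status[ \t]*\n)"  # the Status heading line
        r".*?"                        # the current status body (lazy)
        r"(?=^##[ \t]|\Z)",           # stop at the next section or end of text
        flags=re.MULTILINE | re.DOTALL,
    )
    updated, count = pattern.subn(
        lambda m: f"{m.group(1)}Superseded by {new_adr_id} ({date})\n\n",
        adr_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no '## Status' section found")
    return updated


old = """# ADR-007: Use Kafka for Event Streaming

## Status
Accepted (2025-11-15)

## Context
We need an event streaming system for order events.
"""

print(mark_superseded(old, "ADR-015", "2026-03-10"))
```

Only the Status section changes; the Context, Decision, and other sections of the old ADR stay intact as historical record.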

Common ADR Mistakes

The Missing Alternatives

An ADR that describes only the chosen option is of little use. The value is in the alternatives and why they were rejected. Without them, a future reader does not know what was considered and will rerun the same evaluation.

Bad:
  Decision: We will use PostgreSQL.
  (Why? What else was considered? Why not MongoDB?)

Good:
  Decision: We will use PostgreSQL.
  Alternatives:
    MongoDB - rejected because our data is highly relational
    DynamoDB - rejected because we need complex queries
    SQLite - rejected because we need concurrent access
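
This mistake is easy to detect mechanically. The sketch below assumes the layout used in the ADR-007 example, where each alternative is a "### Name" heading followed by a "Rejected because:" line; alternatives_without_reason is a hypothetical helper and would need adapting for the one-line layout shown just above.

```python
import re


def alternatives_without_reason(adr_text: str) -> list[str]:
    """List '### <name>' alternatives whose text lacks a 'Rejected because' line."""
    # Isolate the Alternatives Considered section (up to the next '## ' heading).
    section = re.search(
        r"^##[ \t]+Alternatives Considered[ \t]*\n(.*?)(?=^##[ \t]|\Z)",
        adr_text,
        flags=re.MULTILINE | re.DOTALL,
    )
    if section is None:
        raise ValueError("no '## Alternatives Considered' section found")
    # re.split with one capture group yields [preamble, name1, body1, name2, body2, ...].
    parts = re.split(r"^###[ \t]+(.+)$", section.group(1), flags=re.MULTILINE)
    names, bodies = parts[1::2], parts[2::2]
    return [
        name.strip()
        for name, body in zip(names, bodies)
        if "rejected because" not in body.lower()
    ]


adr = """## Alternatives Considered

### RabbitMQ
Pros: simpler setup, team has experience
Rejected because: no native message replay

### AWS SNS/SQS
Pros: managed service, no infrastructure to maintain

## Consequences
- We need Kafka operational expertise
"""

print(alternatives_without_reason(adr))
# ['AWS SNS/SQS']
```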

The Novel-Length ADR

An ADR that takes 2 hours to write and 20 minutes to read is too long. Keep it to one page. If you need more space, the decision is probably too broad — split it into multiple ADRs.
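
A rough length check can flag ADRs that are drifting toward novel length. This sketch assumes a reading speed of 200 words per minute and a five-minute reading budget; both numbers are illustrative defaults, and the function names are hypothetical.

```python
def reading_minutes(adr_text: str, words_per_minute: int = 200) -> float:
    """Estimate reading time from a whitespace-separated word count."""
    return len(adr_text.split()) / words_per_minute


def is_too_long(
    adr_text: str, max_minutes: float = 5.0, words_per_minute: int = 200
) -> bool:
    """Return True when the estimated reading time exceeds the budget."""
    return reading_minutes(adr_text, words_per_minute) > max_minutes


short_adr = "word " * 400    # 400 words, about 2 minutes
long_adr = "word " * 3000    # 3000 words, about 15 minutes
print(reading_minutes(short_adr), is_too_long(short_adr))  # 2.0 False
print(reading_minutes(long_adr), is_too_long(long_adr))    # 15.0 True
```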

The Retroactive ADR

Writing ADRs after the fact is harder but still valuable:

"We've been running on Kafka for a year. Should we write an ADR now?"

Yes. Interview the engineers who made the decision. Write down
what they remember about the context and alternatives. It will
not be as accurate as a contemporaneous ADR, but it is far
better than no record at all.

Real-World Example: The Decision Nobody Remembered

A team used MongoDB for their primary data store. When a new tech lead joined, they asked why. Nobody could explain the reasoning — the engineer who chose it had left two years ago. The new tech lead proposed migrating to PostgreSQL, arguing that the data was relational.

Three months into the migration, they discovered the reason: the original team had chosen MongoDB for its flexible schema because the product was in rapid prototyping mode and the data model changed weekly. By the time the migration started, the data model had stabilized, so PostgreSQL was in fact viable. But the team learned that only by accident, three months in. With an ADR, a 1-week evaluation would have confirmed it before the migration began.

The ADR would have said: "Chose MongoDB for schema flexibility during rapid prototyping. Re-evaluate when the data model stabilizes." The new tech lead would have read it, confirmed the model was stable, and started the migration immediately — with full context about what to watch for.

Common Pitfalls

  - Not recording alternatives: the most valuable part of an ADR is the alternatives that were rejected and why. Without this, the decision will be re-evaluated from scratch.
  - Writing ADRs that are too long: if it takes more than 30 minutes to write or more than 5 minutes to read, it is too long. One page is enough.
  - Not writing ADRs at all: "we'll remember" is a lie. People leave, memories fade, and context is lost. Even a hastily written ADR is better than nothing.
  - Deleting superseded ADRs: old ADRs have historical value. Mark them as superseded and link to the replacement. Do not delete them.
  - Writing ADRs for trivial decisions: not every choice needs an ADR. Save them for hard-to-reverse decisions with meaningful trade-offs. Linter rules and variable names do not need ADRs.

Key Takeaways

  - ADRs capture WHY, not just WHAT. They record the context, alternatives considered, and trade-offs that led to a technical decision.
  - Use a lightweight format: Status, Context, Decision, Alternatives Considered, Consequences. Writing an ADR should take 30 minutes or less.
  - Store ADRs in the repository, numbered sequentially, close to the code they describe. Repository-stored ADRs are version-controlled, searchable, and do not suffer from wiki link rot.
  - Write ADRs for decisions that are hard to reverse, have multiple reasonable alternatives, or will be questioned by future engineers. Skip ADRs for trivial or obvious decisions.
  - Never delete old ADRs. Mark them as superseded and link to the replacement. The historical context remains valuable even when the decision changes.