ADRs & RFCs

Architecture Decision Records (ADRs) and Requests for Comments (RFCs) are the two dominant written formats for recording technical decisions in software organizations. They differ in scope and stage — RFCs propose, ADRs record — and most mature engineering teams use both. Adopting them is one of the highest-leverage communication moves a team can make: it converts tribal knowledge into durable documentation, surfaces disagreements early, and creates a searchable history of why the system looks the way it does.

Origin

RFC (Request for Comments) — The term comes from the IETF (Internet Engineering Task Force), which has published internet standards as numbered RFCs since 1969. The first RFC (RFC 1) documented the design of the ARPANET host software. The cultural convention of "propose in writing, get comments, iterate, publish" spread from IETF to the broader open-source world and, eventually, into corporate engineering practice.
ADR (Architecture Decision Record) — Popularized by Michael Nygard in a 2011 blog post titled Documenting Architecture Decisions. Nygard's observation: architectural decisions made years ago shape every daily choice, but the reasoning is usually lost. ADRs are a cheap, lightweight way to preserve that reasoning.

Both conventions are now standard across tech companies that have grown beyond the single-team stage.

The Frameworks

RFC: Request for Comments

An RFC is a proposal document, written before the decision is made, explicitly intended to invite feedback.

Standard RFC structure:

1. Title & Metadata        (author, date, status, reviewers)
2. Summary                 (TL;DR in 2-3 sentences)
3. Motivation              (why this problem is worth solving)
4. Proposed Solution       (the recommendation)
5. Alternatives Considered (at least 2, with why they were rejected)
6. Consequences            (what becomes easier, what becomes harder)
7. Open Questions          (unresolved issues, for reviewers to weigh in)
8. Decision Log            (filled in as reviewers comment)

ADR: Architecture Decision Record

An ADR is a record document, written after (or at the moment of) the decision, intended as a durable artifact.

Standard ADR structure (Nygard format):

1. Title       (e.g., "ADR-042: Use Postgres instead of MySQL")
2. Status      (proposed / accepted / deprecated / superseded by ADR-XXX)
3. Context     (what forces are at play, what problem we're solving)
4. Decision    (the decision, stated clearly)
5. Consequences (positive, negative, neutral outcomes of this decision)

The minimalism is the point — a 1-page ADR is often better than a 5-page one because it is more likely to be written, read, and updated.

How They Fit Together

RFCs and ADRs are stages of the same process:

Problem noticed
       |
       v
   [Draft RFC]    <-- propose solution, invite comments
       |
       v
   [Review period: 1-2 weeks]
       |
       v
   [Decision made]
       |
       v
    [ADR]         <-- record the outcome for posterity
       |
       v
   Implementation, referenced by ADR

Not every decision needs both. Heuristic:

Use only an ADR:   Small, local decision; clear best answer
Use only an RFC:   Exploratory; outcome uncertain
Use both:          Cross-team impact, lasting consequences

How to Use It

Writing an Effective RFC

1. Write the TL;DR first. If you cannot fit the proposal in 3 sentences,
   you are not clear on it yet.
2. Always include alternatives. An RFC with one option is a decree,
   not a proposal.
3. Name the reviewers. "The team" will not review; Priya, Sam, and
   Jordan will.
4. Set an explicit deadline. "Comments close Friday" forces engagement.
5. Summarize discussion in the document. Do not let the real discussion
   live only in Slack threads.

Writing an Effective ADR

1. Write it in under 30 minutes. The value is in having it, not in
   perfection.
2. Be concrete about Context. "Scalability was an issue" is not useful
   in 2 years; "We were hitting 900 qps with 200ms p99" is.
3. State the Decision in one sentence. "We will use Postgres for all
   new services as of 2024-Q1."
4. Be honest about Consequences. Include the bad ones. Future you will
   thank past you.
5. Number them sequentially and store them in-repo.

Tech & Company Example

A platform team is deciding on an observability vendor.

RFC-2024-017: Adopt OpenTelemetry + Grafana for Tracing

Summary
  Replace New Relic with OpenTelemetry SDKs feeding into Grafana Tempo.
  Estimated savings: $840k/yr. Estimated migration effort: 6 engineer-weeks.

Motivation
  New Relic contract renews in 90 days with a 35% price hike.
  Current usage is 2x the plan we're on, so actual cost is worse.
  Meanwhile, Grafana stack is already our primary metrics + logs backend.

Proposed Solution
  - Adopt OpenTelemetry SDKs in Go, TypeScript, Python, Kotlin services.
  - Route traces to self-hosted Grafana Tempo.
  - Sunset New Relic in two phases: new services off NR in Q3, all off in Q4.

Alternatives Considered
  A. Stay on New Relic, negotiate contract: saves $, doesn't save on the
     platform fragmentation cost.
  B. Honeycomb.io: excellent product, but adds another vendor + cost.
  C. Datadog APM: feature-rich, priciest option, and we already avoid
     DD for cost reasons on metrics.

Consequences
  + Unified observability stack, single query language (LogQL/TraceQL)
  + Cost savings fund 2 more engineer headcount in ops
  - 6 weeks of migration effort, must do during quiet period
  - Tempo is less feature-rich than NR for complex correlation queries

Open Questions
  - Do we self-host Tempo or use Grafana Cloud Tempo?
  - Retention policy (currently 30 days on NR; is that correct?)

Decision Log
  [To be filled in as reviewers comment]

After a 2-week review, the decision is made. The ADR is then:

ADR-089: Use OpenTelemetry + Grafana Tempo for tracing

Status: Accepted, 2024-05-12
Supersedes: ADR-041 (which adopted New Relic in 2021)

Context
  New Relic contract renewal priced at $1.2M/yr (35% increase) with
  actual usage running 2x planned. Grafana stack is already primary
  for metrics (Mimir) and logs (Loki). Observability platform
  fragmentation has been a recurring complaint in platform surveys.

Decision
  Adopt OpenTelemetry SDKs for all services; route traces to self-hosted
  Grafana Tempo. Sunset New Relic by 2024-Q4.

Consequences
  + Single observability stack reduces context-switching and auth sprawl.
  + Cost savings of ~$840k/yr.
  + OpenTelemetry is vendor-agnostic; we can swap backends later.
  - Engineers lose some advanced NR features (Applied Intelligence).
  - Self-hosting Tempo adds ops load (~0.5 FTE).
  - Migration takes 6 engineer-weeks; sequenced in Q3 quiet window.

See RFC-2024-017 for full analysis.

The RFC captured the decision process; the ADR captures the outcome. Two years later, when someone asks "why are we on Tempo?", they have an answer in 90 seconds.

When It Works

Any decision that will outlast the team that made it
Technical choices with long-lived consequences (languages, databases, protocols)
Org-level decisions about how work gets done (branching, on-call, SDLC)
Onboarding — new engineers can read the ADR history to understand the system

When It Does Not Work

Trivial decisions (what color the button is) — creates overhead without value
Decisions under active experimentation — freezing a decision too early in an experimental space is wasteful
Cultures without writing norms — ADRs/RFCs only work if people actually read them

Common Failure Modes

RFC as fait accompli — Publishing an RFC that is already a final decision in disguise. Reviewers detect this and disengage.
ADR amnesia — Writing ADRs then never linking to them or referencing them; they become a graveyard.
The RFC that never closes — A proposal that gathers comments forever without anyone making a call. Set explicit deadlines.
Copy-paste context — Context sections that say "we have scaling challenges" with no specifics; useless in 2 years.
Missing alternatives — An RFC with only one option is not soliciting feedback, it is announcing a decision.
Misaligned scope — Using an ADR for a decision that needs broad input (should be an RFC) or an RFC for a local decision (should be an ADR).

Nygard ADR — The minimalist original (Title, Status, Context, Decision, Consequences).
MADR (Markdown ADR) — Richer template with explicit Decision Drivers, Considered Options, and Pros/Cons.
Y-statements — One-sentence ADRs: "In the context of X, facing Y, we decided Z to achieve W, accepting U."
Python PEPs, Rust RFCs, React RFCs — Public open-source RFC processes; useful templates to study.
Design Doc — Google-origin term for what many call RFC. Overlapping scope; often synonymous.
Amazon 6-pager — Narrative alternative for business/product decisions (see next subtopic).