Tracking & Paying Down Debt

Knowing that you have technical debt is not useful on its own: every codebase has some. What matters is knowing where it is, how much it costs, and when to pay it down. Most teams fail at this because their debt tracking is either nonexistent or buried in a Jira backlog alongside 400 other tickets that nobody looks at.

A debt register is not a backlog. A backlog is where tickets go to die. A debt register is a living document that connects technical shortcuts to business impact and drives real allocation decisions.

The Debt Register

A debt register is a short, prioritized list of known technical debt items. It lives in a visible place — a shared document, a wiki page, a dedicated board. Not buried in the backlog.

Debt Register Format:
  | ID | Description             | Quadrant             | Blast Radius | Customer Impact    | Dev Slowdown | Owner    | Payback Trigger        |
  |----|-------------------------|----------------------|--------------|--------------------|--------------|----------|------------------------|
  | D1 | Monolithic deploy       | Deliberate Prudent   | High         | Medium (downtime)  | High         | Platform | >30 min deploy window  |
  | D2 | No integration tests    | Inadvertent Reckless | Medium       | High (regressions) | Medium       | Backend  | 3rd regression in prod |
  | D3 | Hardcoded feature flags | Deliberate Prudent   | Low          | None               | Low          | Frontend | >10 flags in codebase  |
  | D4 | Manual DB migrations    | Deliberate Prudent   | High         | High (data risk)   | Medium       | Platform | Next hire onboards     |

Keep it under 15 items. If you have more than 15, you are tracking too granularly or you have a much bigger problem than debt.
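
If the register lives in code or a script rather than a wiki page, the format above can be sketched roughly as follows. This is a minimal illustration, not a prescribed tool: the names `DebtItem`, `DebtRegister`, and `MAX_ITEMS` are hypothetical, and the checks simply encode the 15-item cap and the requirement that every entry has a payback trigger.

```python
from dataclasses import dataclass, field

MAX_ITEMS = 15  # the cap suggested above; adjust to taste


@dataclass
class DebtItem:
    # Field names mirror the register columns; they are illustrative only.
    id: str
    description: str
    quadrant: str
    blast_radius: str      # "Low", "Medium", or "High"
    customer_impact: str
    dev_slowdown: str
    owner: str
    payback_trigger: str


@dataclass
class DebtRegister:
    items: list[DebtItem] = field(default_factory=list)

    def add(self, item: DebtItem) -> None:
        # An entry without a trigger is a wish, not a register item.
        if not item.payback_trigger.strip():
            raise ValueError(f"{item.id}: every entry needs a payback trigger")
        if any(existing.id == item.id for existing in self.items):
            raise ValueError(f"{item.id}: duplicate ID")
        # Enforce the cap so the register stays short and curated.
        if len(self.items) >= MAX_ITEMS:
            raise ValueError("register is full; resolve or merge an item first")
        self.items.append(item)


register = DebtRegister()
register.add(DebtItem("D1", "Monolithic deploy", "Deliberate Prudent",
                      "High", "Medium (downtime)", "High", "Platform",
                      ">30 min deploy window"))
```

Rejecting the sixteenth item, rather than silently accepting it, forces the conversation about what to merge, fix, or drop.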

What Makes a Good Debt Entry

Each entry needs three things to be actionable:

1. A clear description of what the debt IS (not what to do about it)
   Bad:  "Refactor the payment module"
   Good: "Payment module has no separation between billing logic and
          Stripe API calls, making it impossible to add a second
          payment provider without rewriting"

2. A measurable impact
   Bad:  "Slows us down"
   Good: "Every payment-related feature takes ~40% longer because
          engineers must understand Stripe internals to change
          billing logic"

3. A payback trigger (when, not if)
   Bad:  "When we have time"
   Good: "Before we add PayPal support or when payment feature
          velocity drops below 1 feature/sprint"

Prioritizing Debt

Not all debt deserves attention. Some debt is cheap to carry. Some will bankrupt you. Prioritize using three dimensions:

Blast Radius

How much of the system does this debt affect?

Low blast radius:
  A poorly named variable. A messy utility function. A missing test
  for an edge case. These are annoying but contained.

Medium blast radius:
  A database schema that makes certain queries slow. A shared library
  with a bad API. An authentication module that's hard to extend.

High blast radius:
  A monolithic architecture that prevents independent deployment.
  A data model that fundamentally does not match the domain.
  A deployment process that requires manual steps and causes outages.

Customer Impact

Does this debt affect users directly?

Direct customer impact:
  - Slow page loads because of unoptimized queries
  - Bugs from missing validation
  - Downtime during deployments
  - Data inconsistencies visible to users

Indirect customer impact:
  - Features take longer to ship (users wait for improvements)
  - Bugs in internal tools cause support delays
  - Monitoring gaps mean you learn about problems from users, not alerts

No customer impact:
  - Messy internal code that works correctly
  - Inconsistent naming conventions
  - Missing documentation for stable systems

Development Slowdown

How much does this debt tax ongoing development?

Measuring dev slowdown:
  - Track how long features take vs how long they "should" take
  - Ask engineers: "What slowed you down this sprint?"
  - Count the number of "workarounds" in recent PRs
  - Measure how long onboarding takes for new engineers
  - Track how often engineers touch code they did not plan to touch

Shopify tracks "developer experience" metrics including build times, test suite duration, and deploy frequency. When these metrics degrade, it signals accumulating debt. They do not wait for a crisis — they treat degradation as a leading indicator.

The Prioritization Matrix

                    High Customer Impact    Low Customer Impact
High Dev Slowdown   FIX NOW                 FIX SOON
Low Dev Slowdown    FIX SOON                MONITOR (maybe never fix)

If something has high blast radius, push it up one tier regardless of the matrix position: MONITOR becomes FIX SOON, and FIX SOON becomes FIX NOW. High blast radius means the problem will get worse faster than you expect.
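
The matrix and the blast-radius rule can be expressed as a small function. This is a sketch under one simplifying assumption: each dimension is reduced to a high/low boolean, so a "Medium" rating from the register has to be rounded one way or the other before calling it. The names `TIERS` and `priority` are hypothetical.

```python
TIERS = ["MONITOR", "FIX SOON", "FIX NOW"]  # lowest to highest urgency


def priority(customer_impact_high: bool, dev_slowdown_high: bool,
             blast_radius_high: bool = False) -> str:
    # Matrix position: each "high" dimension raises the tier by one,
    # so low/low is MONITOR, one high is FIX SOON, both high is FIX NOW.
    tier = int(customer_impact_high) + int(dev_slowdown_high)
    # High blast radius pushes the item up one tier, capped at FIX NOW.
    if blast_radius_high:
        tier = min(tier + 1, len(TIERS) - 1)
    return TIERS[tier]


print(priority(customer_impact_high=False, dev_slowdown_high=False))
print(priority(customer_impact_high=False, dev_slowdown_high=False,
               blast_radius_high=True))
```

The first call lands in MONITOR; the second shows the same item promoted to FIX SOON once its blast radius is rated high.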

The 20% Rule

Allocate 20% of your engineering capacity to debt reduction. Every sprint, every cycle.

For a 2-week sprint (10 working days) with 2 engineers:
  Total capacity: ~20 engineer-days
  20% allocation: ~4 engineer-days for debt work
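
The same arithmetic generalizes to other team sizes and percentages. A minimal sketch, assuming capacity is simply engineers times working days with no allowance for holidays or meetings; `debt_allocation` is a hypothetical helper name.

```python
def debt_allocation(engineers: int, sprint_working_days: int,
                    share: float = 0.20) -> tuple[int, float]:
    """Return (total engineer-days, engineer-days reserved for debt work)."""
    total = engineers * sprint_working_days
    return total, total * share


# The example above: 2 engineers, 10 working days in the sprint.
total, debt_days = debt_allocation(engineers=2, sprint_working_days=10)
print(f"{total} engineer-days total, {debt_days:.0f} for debt work")
```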

This is not:
  - A "tech debt sprint" once a quarter (debt accumulates between sprints)
  - A vague promise to "clean up when we have time" (you never have time)
  - An engineer's side project (it needs to be planned and prioritized)

This IS:
  - Planned work with clear outcomes
  - Pulled from the debt register, not invented on the fly
  - Reviewed and demoed like any other work
  - Protected from being stolen by feature work

Google's famous "20% time" was originally about innovation, but many teams use a similar allocation for debt and infrastructure. The exact percentage matters less than the consistency. 15% works. 25% works. 0% does not work, and "whenever we have time" is 0%.

Basecamp (now 37signals) builds in "cooldown" periods between cycles — six weeks of feature work followed by two weeks of fixing, refactoring, and debt reduction. This prevents the gradual accumulation that makes debt invisible until it is overwhelming.

The Tech Debt Standup

Once a week, ask one question: "What is slowing us down?"

Tech Debt Standup Format (15 minutes, weekly):
  Each engineer answers:
    1. What slowed me down this week? (specific, not vague)
    2. Is this already on the debt register?
    3. If not, should it be?

  Then as a team:
    4. Do we need to reprioritize the register?
    5. What debt work are we doing next sprint?

This is not a regular standup. It is a dedicated space for surfacing friction. In a regular standup, people report what they did. In a debt standup, people report what the codebase did to them.

Why a Separate Meeting

Debt discussions get crowded out by feature discussions every time. Product managers care about features. Designers care about user experience. Engineers are usually the only people who feel the pain of debt directly, and they often cannot articulate it in business terms during a sprint planning meeting.

A dedicated debt standup gives engineers a space to surface problems and translate them into business impact:

Bad framing (in sprint planning):
  "We should refactor the notification system."

Good framing (in debt standup, then brought to planning):
  "The notification system has no queue. Every new notification type
   requires touching 4 files and a deploy. It took 3 days to add
   SMS notifications when it should have taken 4 hours. If we add a
   queue, every future notification type is a 2-hour task."

The second framing gets funded. The first gets deprioritized forever.

Paying Down Debt in Practice

The Strangler Approach

Do not schedule a "debt sprint" where you stop all feature work and clean up. Outside of a genuine crisis, this rarely works.

Why debt sprints fail:
  1. Stakeholders resent the pause in feature delivery
  2. Engineers try to fix everything at once and finish nothing
  3. The debt comes back because the habits that created it haven't changed
  4. Nobody can point to business value from the sprint

Instead, attach debt work to feature work:

The "leave it better" rule:
  When you touch a module to add a feature, also fix one piece of
  debt in that module. Not all of it. One thing.

  Adding a new API endpoint to the payments module?
  Also add the missing input validation that's been on the register.

  Building a new notification type?
  Also extract the shared notification logic into a proper service.

This approach means debt gets paid down in the areas where you are actively working, which are also the areas where debt is most likely to slow you down.

Dedicated Debt Tickets

For debt that is not near active feature work, pull items from the register into the sprint as planned work:

Good debt ticket:
  Title: Add database migration tooling
  Why: Manual migrations risk data loss and take 2 hours each.
       We've done 6 this quarter.
  Outcome: Migrations run via CLI command with rollback support.
  Effort: 3 days
  Business value: Eliminates ~12 hours/quarter of manual work and
                  reduces data loss risk.

Bad debt ticket:
  Title: Refactor database layer
  Why: It's messy
  Outcome: Cleaner code
  Effort: Unknown
  Business value: Unknown
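
One way to sanity-check the "Business value" line of a ticket is a rough payback calculation. The sketch below uses the numbers from the good ticket above and assumes an 8-hour engineer-day, which the ticket itself does not state. It counts only time saved; the reduced data loss risk is real value but is not quantified here. `payback_quarters` is a hypothetical helper.

```python
HOURS_PER_DAY = 8  # assumption: one engineer-day is 8 working hours


def payback_quarters(effort_days: float, hours_saved_per_quarter: float) -> float:
    """Quarters until the time saved equals the time invested."""
    if hours_saved_per_quarter <= 0:
        raise ValueError("no measurable saving; the ticket needs a clearer 'Why'")
    return effort_days * HOURS_PER_DAY / hours_saved_per_quarter


# Migration tooling: 3 days of effort; 6 migrations x 2 hours saved per quarter.
print(payback_quarters(effort_days=3, hours_saved_per_quarter=6 * 2))  # 2.0
```

The bad ticket cannot even be run through this function, because both its effort and its saving are unknown.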

Tracking Progress

Measure debt reduction the same way you measure feature delivery:

Debt metrics to track:
  - Number of items on the debt register (should trend down or stay stable)
  - Age of oldest item (if something has been there 6 months, either
    fix it or accept it's not actually debt worth tracking)
  - Sprint velocity trend (if debt reduction is working, velocity
    should gradually improve)
  - Incident frequency (many incidents trace back to known debt)
  - Onboarding time for new engineers (a proxy for codebase health)
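
The first two metrics can be computed directly from the dates on which register items were opened. This is a rough sketch: `register_health` and `STALE_AFTER_DAYS` are hypothetical names, the sample dates are made up for illustration, and 180 days stands in for "6 months".

```python
from datetime import date

STALE_AFTER_DAYS = 180  # roughly the six months mentioned above


def register_health(opened_dates: list[date], today: date) -> dict:
    """Summarize register size and age from each item's opened date."""
    ages = [(today - opened).days for opened in opened_dates]
    return {
        "item_count": len(ages),
        "oldest_age_days": max(ages, default=0),
        # Items old enough that the team should fix them or drop them.
        "stale_items": sum(age >= STALE_AFTER_DAYS for age in ages),
    }


report = register_health(
    [date(2024, 1, 10), date(2024, 5, 1), date(2024, 6, 20)],
    today=date(2024, 7, 15),
)
print(report)
```

Reviewing this summary in the weekly debt standup gives the "fix it or drop it" conversation a concrete trigger.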

Real-World Debt Paydown Stories

Slack's engineering team famously paused feature development in 2016 to focus on reliability and infrastructure. They called it a "quality release" and spent several months on it. The result was a dramatic improvement in stability. But they had to do it because they had waited too long: the debt had reached crisis levels. The 20% rule is meant to keep you from ever needing a "quality release."

Etsy built a culture of continuous deployment and continuous improvement. Engineers were expected to leave code better than they found it. They measured deploy frequency, lead time, and failure rate. Debt was never allowed to accumulate to crisis levels because it was being paid down continuously.

Common Pitfalls

The Jira graveyard. Filing debt as tickets in the backlog feels productive. It is not. Those tickets get deprioritized sprint after sprint until nobody remembers why they were filed. A debt register is not a backlog; it is a small, curated, prioritized list.

Treating debt reduction as optional. If debt reduction only happens when there is "leftover" capacity, it never happens. Protect the 20% allocation the way you protect sprint commitments.

No payback triggers. "We'll fix it when we have time" means "we'll never fix it." Every debt item needs a concrete condition that triggers remediation.

Measuring effort instead of impact. "We spent 4 days on debt reduction" is not meaningful. "We reduced deploy time from 30 minutes to 5 minutes" is meaningful. Tie debt work to outcomes.

Boiling the ocean. Trying to fix all debt at once fixes nothing. Pick the highest-impact item, finish it, measure the improvement, then pick the next one.

Not celebrating debt paydown. Feature launches get announcements. Debt reduction gets silence. This trains the team to deprioritize debt work. Celebrate the 30-minute deploy that became a 5-minute deploy. Celebrate the test suite that now catches regressions.

Key Takeaways

Keep a debt register: a short, prioritized, living document with clear descriptions, measurable impact, and payback triggers. Not a Jira backlog.

Prioritize debt by blast radius, customer impact, and development slowdown. High blast radius moves an item up a tier because that debt gets worse fastest.

Allocate 20% of every sprint to debt reduction. Not a quarterly sprint, not "when we have time": every sprint, consistently.

Run a weekly tech debt standup. One question: "What is slowing us down?" Surface friction, update the register, plan next actions.

Attach debt work to feature work when possible. The "leave it better" rule ensures debt gets paid in the areas where it matters most.