Theory of Constraints
Eliyahu Goldratt introduced the Theory of Constraints (TOC) in his 1984 novel The Goal. The premise is deceptively simple: every system has at least one constraint, and at any given time one of them usually limits overall throughput. The system can only move as fast as its slowest part. Optimizing anything other than the constraint is an illusion of progress. You feel productive. The metrics on the non-bottleneck look great. But the system output does not change. TOC gives engineers a disciplined method for finding the real bottleneck, exploiting it, and then moving on to the next one.
The Five Focusing Steps
Goldratt formalized TOC into five repeating steps:
1. IDENTIFY - Find the constraint (the slowest step)
2. EXPLOIT - Get maximum throughput from the constraint as-is
3. SUBORDINATE - Align everything else to support the constraint
4. ELEVATE - Invest to increase the constraint's capacity
5. REPEAT - The constraint has moved; find the new one
This is a cycle, not a one-time fix. Once you elevate a constraint far enough, something else becomes the bottleneck. The system never stops having a constraint. What changes is which part of the system it lives in.
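The cycle can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the stage names, the weekly capacities, and the "triple the capacity" elevation are all hypothetical, and the Exploit and Subordinate steps are left out because they do not change the capacity numbers in this toy model.

```python
def find_constraint(capacity):
    """Step 1 (Identify): the stage with the lowest capacity."""
    return min(capacity, key=capacity.get)


def system_throughput(capacity):
    """A serial pipeline delivers no more than its slowest stage."""
    return min(capacity.values())


# Hypothetical weekly capacities for a four-stage delivery pipeline.
capacity = {"develop": 20, "review": 12, "qa": 5, "deploy": 30}

history = []
for _ in range(3):  # Step 5 (Repeat)
    stage = find_constraint(capacity)
    history.append((stage, system_throughput(capacity)))
    capacity[stage] *= 3  # Step 4 (Elevate): illustrative tripling

for stage, throughput in history:
    print(f"constraint={stage:8s} throughput={throughput}/week")
```

With these made-up numbers the constraint starts in QA, moves to review, and then returns to QA, while system throughput rises from 5 to 12 to 15 per week.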
Why Engineers Get This Wrong
Most engineering teams optimize whatever is in front of them. The frontend team makes the UI faster. The backend team refactors the API layer. The infrastructure team upgrades the database. Everyone is busy. Everyone is shipping. But if the actual constraint is the deployment pipeline taking 45 minutes, none of that work changes how fast value reaches users.
Before identifying the constraint:
Code Review (2h) -> CI Build (30min) -> Manual QA (5 days) -> Deploy (2min)
Total cycle time: ~5.1 days
Actual bottleneck: Manual QA
Team A optimizes CI from 30min to 10min.
New total: ~5.09 days
Improvement: ~0.3%
Team B automates 80% of QA checks.
New total: ~1.1 days
Improvement: 78%
Team A worked hard. Team B worked on the constraint.
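The comparison can be checked directly from the stage durations. The sketch below assumes calendar days of 24 hours, and it assumes that automating 80% of QA checks cuts QA time by 80%; both are simplifications made for illustration.

```python
MINUTES_PER_DAY = 24 * 60  # assumption: calendar days, not 8-hour working days


def total_minutes(stages):
    return sum(stages.values())


def improvement(before, after):
    """Fractional reduction in total cycle time."""
    return (total_minutes(before) - total_minutes(after)) / total_minutes(before)


baseline = {
    "code_review": 2 * 60,
    "ci_build": 30,
    "manual_qa": 5 * MINUTES_PER_DAY,
    "deploy": 2,
}
team_a = {**baseline, "ci_build": 10}                    # CI: 30min -> 10min
team_b = {**baseline, "manual_qa": 1 * MINUTES_PER_DAY}  # assumed: QA time cut by 80%

for name, scenario in [("baseline", baseline), ("team A", team_a), ("team B", team_b)]:
    days = total_minutes(scenario) / MINUTES_PER_DAY
    gain = improvement(baseline, scenario)
    print(f"{name:8s} {days:.2f} days  improvement {gain:.1%}")
```

Keeping every stage in the same unit makes it obvious how little a 20-minute saving matters next to a stage measured in days.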
Identifying the Constraint in Software Systems
Constraints show up in predictable places in engineering organizations:
Build & Deploy Pipeline
Symptom: Engineers complain about slow feedback loops
Look for: Longest stage in CI/CD pipeline
Common: Test suites that grew unchecked, sequential builds
that could be parallel, manual approval gates
Code Review
Symptom: PRs sit open for days
Look for: Review queue depth, number of qualified reviewers
Common: One senior engineer reviews everything, no review
rotation, unclear ownership
Decision Making
Symptom: Work stalls waiting for "alignment" or "approval"
Look for: How many people must agree before work starts
Common: Architecture review boards that meet monthly,
managers who must approve every technical decision
External Dependencies
Symptom: Teams blocked on other teams
Look for: Cross-team request queues, API contract disputes
Common: Platform team with 30 consumers and no self-serve,
shared database with no ownership model
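One way to turn these symptoms into a measurement is to look at where work waits longest. The sketch below assumes you can export, per work item and stage, the hours it sat idle before the stage started; the record format, stage names, and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (work item, stage, hours spent waiting before the stage started).
wait_log = [
    ("PR-1", "code_review", 30), ("PR-1", "ci", 1), ("PR-1", "approval", 4),
    ("PR-2", "code_review", 52), ("PR-2", "ci", 2), ("PR-2", "approval", 6),
    ("PR-3", "code_review", 41), ("PR-3", "ci", 1), ("PR-3", "approval", 3),
]


def median_wait_by_stage(records):
    """Group waiting times by stage and take the median of each group."""
    waits = defaultdict(list)
    for _item, stage, hours in records:
        waits[stage].append(hours)
    return {stage: median(hours) for stage, hours in waits.items()}


def likely_constraint(records):
    """The stage where work waits longest is the first place to look."""
    medians = median_wait_by_stage(records)
    return max(medians, key=medians.get)


print(median_wait_by_stage(wait_log))
print(likely_constraint(wait_log))
```

The median is used rather than the mean so that a single stuck item does not point at the wrong stage. Queue time is only a proxy for the constraint, so treat the result as a starting point for investigation.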
Exploit Before You Elevate
Step 2 (Exploit) is the most underused step. Engineers jump straight to "elevate" — throw money or people at the problem. But exploitation means getting maximum output from the constraint without adding resources.
Constraint: QA team can test 5 features per sprint
Exploit:
- Prioritize which features actually need QA
- Give QA automated smoke tests so manual testing
focuses on edge cases
- Move QA earlier in the process (shift left)
- Write better acceptance criteria so QA does not
have to guess
Elevate (only after exploit is exhausted):
- Hire more QA engineers
- Buy better testing infrastructure
- Build a dedicated test environment
Exploitation is almost free. Elevation costs real money and time. Do exploitation first.
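The first exploit tactic, prioritizing which features actually need QA, can be as simple as ranking the backlog and spending the scarce manual QA slots on the riskiest items. In this sketch the feature names and the 1-10 risk scores are invented for illustration; how you score risk is up to your team.

```python
QA_CAPACITY = 5  # features per sprint, from the example above

# Hypothetical backlog with a made-up 1-10 risk score per feature.
backlog = {
    "payments-refactor": 9, "login-copy-change": 1, "new-export-api": 7,
    "button-color": 1, "db-migration": 10, "search-ranking": 6,
    "footer-link": 1, "rate-limiter": 8,
}


def triage(features, capacity):
    """Send the riskiest features to manual QA, the rest to automated smoke tests."""
    ranked = sorted(features, key=features.get, reverse=True)
    return ranked[:capacity], ranked[capacity:]


manual_qa, smoke_only = triage(backlog, QA_CAPACITY)
print("manual QA: ", manual_qa)
print("smoke only:", smoke_only)
```

No capacity was added here. The constraint still handles 5 features per sprint, but those 5 are now the ones where manual testing matters most.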
Subordination: The Hardest Step
Subordination means telling non-constraints to slow down if they are creating waste. This is politically difficult. If the backend team can produce 20 features per sprint but QA can only test 5, the backend team should not produce 20. Producing 15 features that sit in a queue is not productivity — it is inventory. In software, inventory is work-in-progress: unmerged branches, untested features, unreleased code. It has carrying costs: merge conflicts, context switching, stale implementations.
Without subordination:
Backend ships 20 features -> QA queue grows -> 15 features
sit untested -> merge conflicts multiply -> rework increases
With subordination:
Backend ships 5 features -> QA tests 5 features -> 5 features
reach production -> backend uses remaining time to write
automated tests, reduce QA burden, improve documentation
The backend team feels "slower" but the system moves faster.
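A small simulation makes the queue growth concrete. It is a deliberately simple model using the rates from the example (20 and 5 features per sprint) over a hypothetical six sprints, and it does not model the rework cost of stale features, which would make the unsubordinated case look even worse.

```python
def simulate(sprints, backend_rate, qa_rate):
    """Return (features released, features left waiting for QA)."""
    queue = 0
    released = 0
    for _ in range(sprints):
        queue += backend_rate         # backend hands finished features to QA
        tested = min(queue, qa_rate)  # QA can only test so many per sprint
        queue -= tested
        released += tested
    return released, queue


print(simulate(6, backend_rate=20, qa_rate=5))  # without subordination
print(simulate(6, backend_rate=5, qa_rate=5))   # with subordination
```

Both runs release 30 features, because QA sets the pace either way. The only thing the extra backend output buys is a queue of 90 untested features.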
Real-World Engineering Examples
Example: Microservice Latency
A request travels through five services. Service C takes 800ms. Services A, B, D, and E take 50ms each. Total latency: 1000ms.
A (50ms) -> B (50ms) -> C (800ms) -> D (50ms) -> E (50ms)
Optimizing A from 50ms to 10ms: total = 960ms (4% improvement)
Optimizing C from 800ms to 200ms: total = 400ms (60% improvement)
Always profile before optimizing. Always optimize the constraint.
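The arithmetic above can be reproduced with a short helper. It assumes the five services are called strictly one after another, so total latency is the sum of the individual latencies; the function names are illustrative.

```python
def total_latency(services):
    """Sequential calls: total latency is the sum of per-service latency."""
    return sum(services.values())


def improvement_from(services, name, new_ms):
    """Return (new total in ms, fractional improvement) if one service is sped up."""
    before = total_latency(services)
    after = total_latency({**services, name: new_ms})
    return after, (before - after) / before


chain = {"A": 50, "B": 50, "C": 800, "D": 50, "E": 50}

print("bottleneck:", max(chain, key=chain.get))
for name, new_ms in [("A", 10), ("C", 200)]:
    after, gain = improvement_from(chain, name, new_ms)
    print(f"{name}: {chain[name]}ms -> {new_ms}ms, total {after}ms ({gain:.0%})")
```

In a real system the per-service numbers would come from tracing or profiling data rather than a hand-written dictionary.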
Example: Hiring Pipeline
Your team needs to grow from 5 to 10 engineers. The pipeline:
Sourcing (50 candidates/week) -> Phone Screen (20/week) ->
Technical Interview (3/week) -> Offer (2/week, limited by interview output)
Constraint: Technical interviews (3/week)
- Only 2 interviewers qualified
- Each interview takes 2 hours including write-up
Exploit: Train 3 more interviewers, use structured rubrics
to speed write-ups, batch interviews on 2 days
Subordinate: Stop sourcing 50/week when you can only
interview 3. Reduce sourcing, increase quality.
Elevate: Hire a recruiting coordinator, build interview
tooling
Example: Incident Response
Mean time to resolution (MTTR) is 4 hours. The breakdown:
Detection (5min) -> Alerting (2min) -> Triage (15min) ->
Diagnosis (3h) -> Fix (30min) -> Deploy (10min)
Constraint: Diagnosis (3 hours)
Exploit: Better logging, runbooks, dashboards that surface
the probable cause automatically
Subordinate: Do not invest in faster alerting (2min is fine)
Elevate: Build anomaly detection, invest in observability
tooling
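Expressing each phase as a share of the total shows why alerting is not worth the investment. The sketch below uses the durations from the breakdown above; the "what if" scenarios (alerting made instant, diagnosis halved) are hypothetical.

```python
phases_min = {
    "detection": 5, "alerting": 2, "triage": 15,
    "diagnosis": 180, "fix": 30, "deploy": 10,
}
total = sum(phases_min.values())

# Print phases from largest to smallest with their share of total resolution time.
for phase, minutes in sorted(phases_min.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{phase:10s} {minutes:4d} min  {minutes / total:6.1%}")


def mttr_after(phase, new_minutes):
    """Total resolution time in minutes if one phase changes duration."""
    return total - phases_min[phase] + new_minutes


print("alerting made instant:", mttr_after("alerting", 0), "min")
print("diagnosis halved:     ", mttr_after("diagnosis", 90), "min")
```

Diagnosis accounts for roughly three quarters of the total, so even a perfect alerting system would save only 2 of 242 minutes.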
TOC & Work-in-Progress Limits
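TOC is closely related to Kanban and lean manufacturing, which originated earlier in the Toyota Production System. WIP limits rest on the same logic as TOC: if you limit work-in-progress to match the constraint's capacity, you stop building inventory and start finishing work. TOC's own version of this idea is drum-buffer-rope scheduling, which releases new work only at the pace the constraint can absorb.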
Without WIP limits:
10 things started, 2 things finished per week
8 things aging in the queue
With WIP limits (matched to constraint):
3 things started, 2-3 things finished per week
Near-zero queue, faster cycle time
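The link between WIP and cycle time is captured by Little's Law: average time in the system equals average work-in-progress divided by average throughput. The sketch below applies it to the numbers above, assuming the "things started" are items in progress at steady state and taking the lower figure of 2 finished per week in both cases.

```python
def average_cycle_time(wip, throughput_per_week):
    """Little's Law: average time in system = average WIP / average throughput."""
    return wip / throughput_per_week


print(average_cycle_time(10, 2))  # weeks per item without WIP limits
print(average_cycle_time(3, 2))   # weeks per item with WIP matched to the constraint
```

Under these assumptions an item takes about 5 weeks on average without WIP limits and about 1.5 weeks with them, even though the same number of items is finished each week. Little's Law only holds for a stable system, so read these as rough averages rather than predictions for any single item.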
Common Pitfalls
- Optimizing non-constraints: The most common mistake. It feels productive but does not improve system throughput. Measure the whole system, not individual components.
- Skipping exploitation: Jumping to "hire more people" or "buy better tools" before extracting full value from the current constraint. Exploitation is cheaper and faster.
- Ignoring that constraints move: After you fix one bottleneck, another emerges. Teams celebrate the fix and stop looking. TOC is a continuous cycle.
- Confusing busy with productive: A team at 100% utilization is not necessarily productive. If they are producing work that sits in a queue downstream, they are creating inventory, not value.
- Political resistance to subordination: Telling a team to slow down is politically risky in many organizations, both for the team and for whoever suggests it. Frame it as "redirect capacity to support the constraint" instead.
Key Takeaways
- Every system has at least one constraint, and at any given time one of them usually limits throughput. Find it before you optimize anything.
- The five focusing steps (Identify, Exploit, Subordinate, Elevate, Repeat) are a cycle, not a checklist.
- Exploit the constraint before investing in elevation. Get maximum throughput from what you have.
- Subordinate non-constraints to the constraint. Overproduction upstream creates inventory and waste.
- After you fix a constraint, the bottleneck moves to another part of the system. Immediately look for the new one.
- Measure system throughput, not local efficiency. A fast component feeding a slow component is waste.