Chesterton's Fence

Overview

In his 1929 book The Thing, G.K. Chesterton proposed a simple rule: if you find a fence built across a road and cannot see the reason for it, do not tear it down. First, find out why it was put there. Only after you understand its purpose should you decide whether to remove it.

In engineering, the fence is that strange piece of code, that seemingly pointless process, that "unnecessary" configuration. The temptation is strong: this looks wrong, so remove it. But the fence was put there by someone who had a reason. Your job is to find that reason before you act.

The Principle

The wrong way:
  "This code looks unnecessary. I'll delete it."
  "This process is slow. I'll skip it."
  "This config setting is weird. I'll change it to something sensible."

The right way:
  "This code looks unnecessary. Why was it written?"
  "This process is slow. What problem does it solve?"
  "This config setting is weird. Why was it set this way, and what depends on it?"

The rule is not "never remove the fence."
The rule is "understand the fence before you remove it."

Engineering Fences

The "Unnecessary" Code

Scenario: You're refactoring the payment module. You find this:

  if user.country == "BR":
      amount = round(amount, 2)
      # force recalculation after rounding
      tax = calculate_tax(amount, user.country)

You think: "This is weird. We already round amounts globally.
And why recalculate tax after rounding? I'll clean this up."

The fence: Brazil has tax calculation rules where the tax is
computed on the rounded amount, not the original amount. A 0.01
difference in the base amount can cascade into a compliance violation.
An engineer discovered this after a government audit flagged
discrepancies. The comment could be better, but the code is correct.

What happens when you remove the fence:
  Tax calculations for Brazilian users are off by fractions of a cent.
  No one notices for months.
  An audit finds thousands of incorrect tax filings.
  The fix costs more than the original implementation.
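
One way to protect a fence like this is to pin its behavior with a
test before touching it. The sketch below is illustrative only: the
18% rate, the calculate_tax body, and the sample amount are
assumptions made up for the example, not Brazil's actual tax rules
or the original module's code. It uses Decimal so that half-cent
cases are exact.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
HYPOTHETICAL_RATE = Decimal("0.18")  # assumption for illustration only


def to_cents(value: Decimal) -> Decimal:
    """Round a monetary value to two decimal places, half up."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def calculate_tax(amount: Decimal) -> Decimal:
    """Hypothetical stand-in for the real tax function."""
    return to_cents(amount * HYPOTHETICAL_RATE)


def tax_with_fence(amount: Decimal) -> Decimal:
    # Fence: tax is computed on the rounded base amount.
    return calculate_tax(to_cents(amount))


def tax_without_fence(amount: Decimal) -> Decimal:
    # The "cleanup": tax is computed on the unrounded amount.
    return calculate_tax(amount)


amount = Decimal("100.026")  # e.g. a price after a fractional discount
print(tax_with_fence(amount), tax_without_fence(amount))  # 18.01 18.00
```

A test asserting the first value would fail as soon as someone
removed the rounding step, which turns a silent drift into a visible
build failure.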

The "Slow" Process

Scenario: Your deployment process requires a 30-minute staging test
after every deploy before promoting to production. You think:
"Our CI tests are comprehensive. This staging step is wasted time.
Let's skip it and deploy straight to production."

The fence: Two years ago, a dependency upgrade passed all CI tests
but caused a memory leak that only manifested under sustained,
production-like traffic. The staging test with synthetic load was
added after that incident. CI tests run with clean state; staging
tests run with accumulated state.

What happens when you remove the fence:
  Most deploys are fine, which reinforces the belief that
  the staging step was unnecessary.
  Months later, a similar dependency issue reaches production.
  The 30 minutes you saved per deploy is dwarfed by the
  incident response.
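
The difference between clean state and accumulated state can be
shown in a few lines. This is a minimal sketch, not the incident
described above: the Handler class and its leak are hypothetical,
and tracemalloc stands in for whatever memory metric a real staging
environment would watch.

```python
import tracemalloc


class Handler:
    """Hypothetical request handler with a leak: it keeps every payload."""

    def __init__(self) -> None:
        self._history = []  # never trimmed, so it grows with every request

    def handle(self, payload: bytes) -> int:
        self._history.append(payload)
        return len(payload)


def memory_growth(handler: Handler, requests: int, size: int = 1024) -> int:
    """Return bytes of traced memory retained after `requests` calls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(requests):
        handler.handle(bytes(size))
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


# CI-style check: fresh state, one request, correct result. It passes.
assert Handler().handle(b"ping") == 4

# Staging-style check: one handler under sustained load.
growth = memory_growth(Handler(), requests=2000)
print(f"retained after 2000 requests: {growth} bytes")
```

The single-request check says nothing about the leak. Only the
sustained run shows memory growing with the number of requests.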

The "Pointless" Meeting

Scenario: There's a weekly 15-minute sync between the backend team
and the data team. Most weeks, both teams say "nothing to report."
You think: "This meeting is a waste. Cancel it. People can reach
out asynchronously if they need something."

The fence: The meeting was created after a data pipeline broke
because the backend team changed a table schema without telling
the data team. The meeting isn't for the weeks when nothing
changed. It's for the one week when something did change and
someone would otherwise forget to mention it.

What happens when you remove the fence:
  Three months later, a schema change breaks the data pipeline.
  Someone says: "Didn't we use to have a meeting for this?"

How to Investigate the Fence

Before removing anything, follow this process.

Step 1: Check the history
  → git log on the file or function
  → git blame on the specific lines
  → Search for related commits, PRs, or issues
  → Look for comments, commit messages, or linked tickets
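
The history checks in Step 1 map onto a handful of git commands.
The sketch below only builds and prints them; the file path, line
range, and search string are hypothetical placeholders.

```python
import shlex


def history_commands(path: str, start: int, end: int, needle: str) -> list[list[str]]:
    """Build git commands that trace why lines start..end of path exist."""
    return [
        # Every commit that touched the file, following renames.
        ["git", "log", "--follow", "--oneline", "--", path],
        # Who last changed each of the suspicious lines, and in which commit.
        ["git", "blame", "-L", f"{start},{end}", "--", path],
        # Commits that changed how often the string occurs (the "pickaxe").
        ["git", "log", f"-S{needle}", "--oneline", "--", path],
        # Full change history of just that line range.
        ["git", "log", f"-L{start},{end}:{path}"],
    ]


# Hypothetical file, line range, and search string.
for cmd in history_commands("payments/tax.py", 30, 33, "calculate_tax"):
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, capture_output=True,
text=True) from inside the repository. The commit hashes that come
back are the starting point for finding the linked PR or issue.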

Step 2: Ask the people
  → Who wrote this code? Ask them why.
  → If they're gone, ask the team's longest-tenured member.
  → Check Slack/email archives for discussions about the change.

Step 3: Check for edge cases
  → Is there a test for this code? What does it test?
  → Are there production logs showing when this code path executes?
  → Is this code related to a specific customer, region, or compliance rule?
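
If no production logs exist for the code path, one option is to
instrument the branch and observe it before deciding. A minimal
sketch, assuming a hypothetical shipping function and region code;
the logger name and counter are also made up for the example.

```python
import logging
from collections import Counter

logger = logging.getLogger("fence_audit")
fence_hits: Counter = Counter()


def record_fence_hit(name: str, **context: object) -> None:
    """Count and log each time a suspicious branch actually runs."""
    fence_hits[name] += 1
    logger.info("fence hit: %s context=%s", name, context)


def shipping_fee_cents(subtotal_cents: int, region: str) -> int:
    """Hypothetical function containing a branch nobody can explain."""
    if region == "XX":  # the mysterious special case
        record_fence_hit("shipping_region_xx", subtotal_cents=subtotal_cents)
        return 0
    return 500


for region in ["US", "XX", "DE", "XX"]:
    shipping_fee_cents(2_000, region)

print(fence_hits["shipping_region_xx"])  # 2
```

Zero hits over a short window is weak evidence. Some paths only run
at month end, at year end, or for a single customer, so the
observation period should cover those cycles.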

Step 4: Understand the failure mode
  → What would happen if this code didn't exist?
  → What failure was this code written to prevent?
  → Can you reproduce the original problem in a test environment?

Step 5: Now decide
  → If you understand the reason AND the reason no longer applies:
      remove the fence.
  → If you understand the reason AND the reason still applies:
      keep the fence (maybe improve it).
  → If you cannot find the reason:
      keep the fence. Add a comment explaining what you know.
      Revisit later when you have more context.
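
The decision in Step 5 can be written down as a small function. This
is only a sketch of the rule as stated above; the names are
hypothetical. The point it makes explicit is that "reason unknown"
defaults to keeping the fence.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    REMOVE = "remove the fence"
    KEEP = "keep the fence (maybe improve it)"
    KEEP_AND_DOCUMENT = "keep the fence, document what you know, revisit later"


def decide(reason_still_applies: Optional[bool]) -> Decision:
    """Map what you learned in Steps 1-4 to an action.

    None means the reason could not be found.
    """
    if reason_still_applies is None:
        return Decision.KEEP_AND_DOCUMENT
    return Decision.KEEP if reason_still_applies else Decision.REMOVE


for finding in (False, True, None):
    print(finding, "->", decide(finding).value)
```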

Applying Chesterton's Fence

In Code Reviews

When reviewing code that removes or changes existing behavior:

  Bad review comment:
    "LGTM, nice cleanup!"

  Good review comment:
    "This removes the retry logic on the payment endpoint.
     Do you know why the retry was added originally?
     I see it was introduced in commit abc123 after issue #456."

  As a reviewer, your job is to be the voice of the fence.
  Ask the author to demonstrate they understand what they're removing
  and why it's safe to remove.

In Refactoring

Refactoring is where Chesterton's Fence matters most. You are
explicitly changing code that works. The risk is introducing bugs
into code that had none.

Before refactoring:
  1. Read the git history for the entire file, not just
     the function you're changing.
  2. Run the existing tests. If there are no tests, write
     tests that capture current behavior BEFORE you change anything.
  3. Search for comments with words like "hack," "workaround,"
     "temporary," "do not remove," or issue numbers.
  4. Check if the code handles edge cases that are not obvious
     from the happy path.
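
The comment search in item 3 is easy to automate. A minimal sketch:
the marker list comes from the item above, while the sample source
text is hypothetical. A pattern like "#" followed by digits also
matches things that are not issue numbers, so treat hits as leads to
read, not as proof.

```python
import re

# Words and patterns that often mark a deliberate fence.
MARKERS = re.compile(
    r"\b(hack|workaround|temporary|do not remove)\b|#\d+",
    re.IGNORECASE,
)


def find_fence_markers(source: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for every line with a marker."""
    return [
        (lineno, line.strip())
        for lineno, line in enumerate(source.splitlines(), start=1)
        if MARKERS.search(line)
    ]


# Hypothetical source text to scan.
SAMPLE = """\
def charge(card, amount):
    time.sleep(0.5)  # workaround for replica lag, see #456
    # DO NOT REMOVE: upstream hangs without this timeout
    return gateway.charge(card, amount, timeout=30)
"""

for lineno, line in find_fence_markers(SAMPLE):
    print(f"{lineno}: {line}")
```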

The safest refactoring preserves behavior exactly. If you find
yourself saying "I'm also fixing this while I'm here," stop.
That's a separate change that deserves its own investigation.

In Process Changes

Engineering processes are fences too. They were usually created
in response to a painful incident.

Process                          Likely fence reason
──────────────────────────────────────────────────────────────
Manual deploy approval           A bad deploy caused an outage
Required security review         A vulnerability reached production
Mandatory design doc for         A large project failed due to
  projects over 2 weeks            unclear requirements
Two-person rule for              An accidental database change
  database migrations              caused data loss
Change freeze before holidays    An incident happened when
                                   everyone was unavailable

Before removing a process, find the incident that created it.
If the underlying risk still exists, you need the process
(or a better replacement).

When the Fence Should Come Down

Chesterton's Fence is not an argument for never changing anything. It is an argument for understanding before changing.

Remove the fence when:

  The reason no longer applies:
    "This code handles a bug in Python 2.7. We migrated to
     Python 3.11 two years ago."

  A better solution exists:
    "This manual deploy checklist prevents bad deploys. An automated
     deploy pipeline with health checks does the same thing faster."

  The cost exceeds the benefit:
    "This approval process prevents bad changes but adds 3 days
     to every deploy. The cost of slowness now exceeds the cost
     of the occasional bad change it prevents."

  The context has changed:
    "This rate limiter was set for 100 requests per second when
     we had 1,000 users. We now have 100,000 users and the limit
     is causing legitimate traffic to be dropped."
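
The rate limiter case is simple arithmetic. The sketch below uses
the numbers from the example and assumes, purely for illustration,
that demand grows linearly with the number of users.

```python
from fractions import Fraction


def per_user_rps(limit_rps: int, users: int) -> Fraction:
    """Average requests per second available to each user under a shared limit."""
    return Fraction(limit_rps, users)


def scaled_limit(limit_rps: int, old_users: int, new_users: int) -> int:
    """Limit that keeps the original per-user share (linear-demand assumption)."""
    return limit_rps * new_users // old_users


print(per_user_rps(100, 1_000))           # 1/10
print(per_user_rps(100, 100_000))         # 1/1000
print(scaled_limit(100, 1_000, 100_000))  # 10000
```

The per-user share fell by a factor of 100. Whether 10,000 requests
per second is the right new limit depends on what the limiter was
protecting, which is exactly what the investigation should establish
before the number changes.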

The Cost of Ignoring the Fence

Real-world examples of removed fences:

Fence: Sleep(500ms) before database write
Reason: The database had a replication lag issue
Removed because: "Sleeps in code are always bad"
Result: Read-after-write consistency failures
         across the application for two weeks

Fence: Disabled the ORM's lazy loading for one specific query
Reason: The query caused an N+1 problem that took down the service
Removed because: "We should use consistent patterns"
Result: N+1 problem returned, service degradation during peak hours

Fence: Hardcoded timeout of 30 seconds on one API call
Reason: The upstream service sometimes hung indefinitely
Removed because: "We have a global timeout of 10 seconds"
Result: The upstream service hung. The 10-second global timeout did
        not apply to that specific HTTP client, so calls blocked
        indefinitely and exhausted the thread pool.
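
The timeout case could have been caught by checking which timeout
actually applies to each client before removing the hardcoded one.
A minimal sketch of such a check; ClientConfig and the client names
are hypothetical and do not model any particular HTTP library.

```python
from dataclasses import dataclass
from typing import Optional

GLOBAL_TIMEOUT_SECONDS = 10.0


@dataclass
class ClientConfig:
    """Hypothetical description of one outbound HTTP client."""

    name: str
    uses_global_timeout: bool
    own_timeout: Optional[float] = None  # None means no explicit timeout


def effective_timeout(cfg: ClientConfig) -> Optional[float]:
    """Return the timeout that actually applies, or None for "waits forever"."""
    if cfg.own_timeout is not None:
        return cfg.own_timeout
    return GLOBAL_TIMEOUT_SECONDS if cfg.uses_global_timeout else None


def unprotected(clients: list[ClientConfig]) -> list[str]:
    """Names of clients with no timeout at all."""
    return [c.name for c in clients if effective_timeout(c) is None]


before = [ClientConfig("orders", True), ClientConfig("upstream", False, own_timeout=30.0)]
after = [ClientConfig("orders", True), ClientConfig("upstream", False)]  # fence removed

print(unprotected(before), unprotected(after))  # [] ['upstream']
```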

Common Pitfalls

Removing code you don't understand because it "looks wrong": Looking wrong is not the same as being wrong. Investigate before removing. The ugliest code in the codebase often handles the ugliest edge cases.
Assuming previous engineers were less skilled: They had context you don't have. They dealt with constraints you don't know about. The code may look bad because the problem was hard, not because the engineer was bad.
Keeping the fence forever out of fear: The principle says understand before removing, not never remove. If you understand the reason and it no longer applies, remove it. Dead code and obsolete processes have their own costs.
Replacing the fence with something that doesn't serve the same purpose: You replace a manual approval process with an automated one. But the automated one doesn't check the thing the manual process was actually checking. You removed the fence and built a gate in a different spot.
Not documenting why the fence exists: If you find a fence, understand it, and decide to keep it, add a comment explaining why. The next person should not have to repeat your investigation.

Key Takeaways

Before removing code, processes, or configuration, understand why they exist. The rule is not "never change anything" but "understand first, then decide."
Use git history, blame, commit messages, issue trackers, and conversations with team members to find the reason behind the fence. Most fences were built in response to a real problem.
The most dangerous refactors remove behavior that handles edge cases the refactorer has never encountered. Write tests for current behavior before changing anything.
When you find a fence, understand it, and decide to keep it, add documentation. Comments like "this handles edge case X from incident Y" save the next engineer from repeating the investigation.
Remove the fence when the reason no longer applies, a better solution exists, or the cost clearly exceeds the benefit. Chesterton's Fence is a gate to thoughtful change, not a wall against all change.