Estimation Without Lying

Software estimates are always wrong. This is not a character flaw — it is a fundamental property of estimating novel work under uncertainty. Despite this, estimates remain useful. They enable planning, coordination, and trade-off decisions. The skill is not becoming a perfect estimator. It is becoming a less-wrong estimator who communicates uncertainty honestly, so that the people depending on your estimates can make good decisions.

Why Estimates Are Always Wrong

Software estimation fails for structural reasons, not personal ones:

Reason 1: Novel work
  You are not estimating how long it takes to do something
  you have done before. You are estimating how long it takes
  to figure out how to do something new. The uncertainty is
  in the discovery, not the execution.

Reason 2: Hidden complexity
  Every task has invisible sub-tasks you discover only after
  starting. "Add a date picker" sounds like 2 hours until you
  discover timezone handling, locale formatting, accessibility
  requirements, and mobile browser compatibility.

Reason 3: Dependencies on others
  Your task requires a code review, a design decision, a
  DevOps change, or a response from a third-party API vendor.
  You can estimate your work but not the wait times.

Reason 4: Interruptions and overhead
  You estimate 6 hours of coding, but you only get 3 hours
  of focus time in a day because meetings, Slack, and PR
  reviews consume the rest.

Reason 5: Optimism bias
  Humans consistently underestimate task duration. This is
  called the "planning fallacy" (Kahneman & Tversky, 1979)
  and it affects experts and novices equally.

Understanding why estimates fail does not mean giving up on estimation. It means building practices that account for these failure modes.

T-Shirt Sizing

When precision is not needed (and it often is not), use relative sizing instead of time estimates.

T-shirt sizes:
  XS  ->  less than half a day
  S   ->  half a day to one day
  M   ->  2-3 days
  L   ->  a week
  XL  ->  more than a week (break this down further)

Advantages:
  - Faster to assign than hour estimates
  - Encourages thinking in ranges, not false precision
  - Easier to calibrate across a team
  - Removes the illusion of precision

Usage:
  "This task is a Medium — a couple days of work."
  vs
  "This task is 14 hours" (false precision that implies confidence
  you do not have)

T-shirt sizing works best for sprint planning and backlog grooming. It communicates rough effort without creating the expectation of exact timelines.

Reference-Based Estimation

The most accurate estimation method is comparing new work to completed work that you have actual data on.

Process:
  1. Find a similar completed task (the reference)
  2. Note how long the reference actually took
  3. Identify differences between the reference and the new task
  4. Adjust the estimate based on those differences

Example:
  New task: "Add Stripe webhook handler for subscription events"
  
  Reference: "Add Stripe webhook handler for payment events"
  That took 3 days (1 day implementation, 1 day testing, 1 day review/fixes)
  
  Differences:
  - Subscription events are more complex (more event types)
  - We already have the webhook infrastructure from the payment work
  - Need to coordinate with the billing team on event format
  
  Estimate: 3-4 days
  (more complex events, but infrastructure exists, minus coordination time)

Building a Reference Library

Keep track of how long things actually take:

Task                              Estimated    Actual    Ratio
---------------------------------------------------------------
Payment webhook handler           2 days       3 days    1.5x
User profile API endpoint         1 day        1.5 days  1.5x
Database migration (5 tables)     3 days       5 days    1.7x
Frontend form with validation     1 day        2 days    2.0x
CI/CD pipeline setup              2 days       4 days    2.0x

After collecting 10-20 data points, you will see your personal estimation bias. Most engineers consistently underestimate by 1.5-2x. Knowing your multiplier is more valuable than improving your raw estimates.

Padding Honestly

"Padding" has a bad reputation because it is associated with sandbagging — inflating estimates to look good when you finish early. That is dishonest. But honest padding — explicitly accounting for known unknowns — is essential.

Honest padding categories:

  1. Implementation uncertainty: "This API might have quirks"
     Add 20-50% depending on how unfamiliar the technology is

  2. Integration risk: "This has to work with system X"
     Add 25-50% for each external system you integrate with

  3. Review and iteration: "The code review might surface issues"
     Add 1-2 days for review, feedback, and fixes

  4. Context-switch overhead: "I have meetings 3 days this week"
     Estimate in effective hours, not calendar hours
     (5 hours of focus per day, not 8)

  5. Testing and edge cases: "I know about 3 cases, there are
     probably 5 more"
     Add 30-50% for test writing and edge case discovery

Communicating the Padding

Do not hide the padding. Make it explicit:

Bad: "This will take 5 days"
     (actually 3 days of work + 2 days of padding you will not explain)

Good: "The core implementation is about 3 days. Code review and
       iteration adds another day. There is integration risk with
       the payment provider that could add 1-2 days. My estimate
       is 4-6 days."

The second version is more useful because the stakeholder understands which parts are certain and which are risky. If they ask "can it be done in 3 days?" you can say "yes, if we skip the integration testing and accept the risk."

Communicating Uncertainty

An estimate without a confidence level is meaningless. "5 days" could mean "definitely 5 days" or "could be 3, could be 12, I'm guessing 5."

Formats for communicating uncertainty:

  Ranges:
    "3-5 days" (tight range, high confidence)
    "2-8 days" (wide range, low confidence)
    "1-3 weeks" (very uncertain, probably needs a spike)

  Confidence levels:
    "5 days, 80% confident" (I'd bet on it)
    "5 days, 50% confident" (coin flip)
    "5 days, 20% confident" (I'm guessing)

  Best/likely/worst:
    "Best case: 3 days (everything goes smoothly)
     Likely: 5 days (normal discovery and iteration)
     Worst case: 10 days (the API integration is harder than expected)"

Wide ranges are not failure. They are honesty. A wide range tells the stakeholder "this is uncertain, and here is why." A narrow range based on no information tells the stakeholder "I am certain" when you are not — and they will plan accordingly, and be disappointed.

Estimation Anti-Patterns

The Anchoring Trap:
  Someone says "this should take about 2 days, right?" and
  now your estimate is anchored to 2 days regardless of the
  actual complexity. Estimate before hearing anyone else's number.

The Deadline-Driven Estimate:
  "We need this by Friday." You estimate Friday even though
  the work is 8 days. Now you have a lie that will collapse
  on Thursday. Estimate the work first, then negotiate scope.

The Precision Trap:
  "This will take 37 hours." Nobody can estimate to that precision.
  Round to half-days or days. False precision undermines credibility.

The Unchanged Estimate:
  You estimated 3 days on Monday. On Wednesday, you've discovered
  the problem is harder than expected. But you do not update the
  estimate because you already committed. Update early and often.

The Solo Estimate for Team Work:
  You estimate your part at 3 days but forget that the designer
  needs 2 days and the reviewer needs 1 day. Estimate end-to-end
  delivery, not just your keyboard time.

When You Are Asked for a Number You Cannot Give

Sometimes someone demands a number and you genuinely do not know. This is common early in a project or when working with unfamiliar technology.

Responses that work:

"I can't give you a useful estimate right now. I need
 half a day to investigate before I can estimate with any
 confidence. Can I get back to you tomorrow?"

"The honest answer is somewhere between 1 week and 4 weeks.
 I know that's not helpful. If you need a tighter range,
 I need to spend 2 days on a spike first."

"I can estimate the first phase (2-3 days). After that,
 I'll know enough to estimate the rest."

"For planning purposes, I'd say 2 weeks. But I want to
 flag that this has high uncertainty — I'll update the
 estimate after the first 3 days."

All of these are better than making up a number.

Real-World Example: The Three-Day Task

An engineer estimated "3 days" for adding a feature. The breakdown they had in their head:

Day 1: implement the feature
Day 2: write tests
Day 3: code review and fixes

What actually happened:

Day 1: implement the feature (spent 4 hours, had meetings)
Day 2: finish implementation, discover an edge case
Day 3: handle the edge case, start writing tests
Day 4: finish tests, submit PR
Day 5: code review feedback: redesign needed for one component
Day 6: redesign and update tests
Day 7: second review, minor fixes, merged

The 3-day estimate became 7 days. Not because the engineer was bad at their job, but because:

They estimated coding time, not calendar time
They did not account for meetings reducing focus time
They did not account for discovery during implementation
They did not account for review iteration

Applying the 2x multiplier (from their historical data) to the 3-day estimate would have given 6 days — much closer to reality.

Common Pitfalls

Estimating coding time instead of calendar time — you do not get 8 hours of coding per day. You get 4-5 on a good day. Estimate in calendar time, accounting for meetings and overhead.
Not tracking actual time spent — without data on how long things actually take, you cannot calibrate your estimates. Track actuals, even roughly.
Giving a single number instead of a range — a single number implies confidence you probably do not have. Give ranges with context about what drives the uncertainty.
Never updating estimates — estimates are living things. When you learn new information, update the estimate and communicate the change immediately.
Anchoring to someone else's expectation — estimate the work, then compare to the deadline. If they do not match, negotiate scope, not estimates.

Key Takeaways

Estimates are always wrong because software is novel work with hidden complexity, external dependencies, and optimism bias. Accept this and build practices around it.
Use T-shirt sizing for rough planning and reference-based estimation for precision. Compare new work to completed work with actual time data.
Pad honestly and explicitly. Account for integration risk, review cycles, and the difference between coding time and calendar time.
Communicate uncertainty with ranges and confidence levels. Wide ranges are honesty, not weakness.
Track how long things actually take. After 10-20 data points, you will know your personal multiplier and your estimates will improve significantly.