Build vs. Buy at Scale

Why This Matters at the Director/VP Level

As a team lead or senior manager, build-vs-buy is a tactical question: "Should we use this library or write our own?" At the director/VP level, it becomes a strategic one: "Should we build an entire platform, or partner with a vendor who does this for a living?" The stakes are dramatically higher. You're not choosing a library — you're committing millions of dollars, dozens of engineers, and years of organizational direction.

I've watched organizations get this wrong in both directions. Companies that built everything from scratch because they believed they were special, only to drown in maintenance costs. Companies that bought everything and then couldn't differentiate, couldn't move fast, and were held hostage by vendor roadmaps. The right answer is almost always a thoughtful mix — but getting that mix right is genuinely hard.

This section will give you frameworks to make these decisions well, at scale, across your entire organization.


The Total Cost of Ownership (TCO) Model

The single biggest mistake in build-vs-buy decisions is comparing the wrong numbers. People compare the cost of building (engineering salaries for initial development) against the cost of buying (the vendor's annual license fee). Both numbers are incomplete, and the comparison is misleading.

True Cost of Building

When you build something in-house, here's what you're actually signing up for:

  • Initial development cost. The engineering time to build version 1.0. Most people estimate this, and most people underestimate it by 2-3x.
  • Ongoing maintenance. Bug fixes, security patches, dependency updates, infrastructure costs. This is typically 15-25% of the initial build cost per year, every year, forever.
  • Opportunity cost. Those engineers could be building your actual product instead. This is the cost people most consistently ignore, and it's often the largest one.
  • Knowledge concentration risk. The people who built it will eventually leave. When they do, you're left with a system that nobody fully understands.
  • Feature development. Your users (internal or external) will want new features. You're now running a product team for an internal tool.
  • On-call and operational burden. Someone has to wake up at 3 AM when it breaks.

True Cost of Buying

When you buy a solution, here's what you're actually signing up for:

  • License fees. The sticker price. Often negotiable, especially at scale.
  • Integration costs. Getting the vendor's product to work with your systems. This is routinely underestimated and can exceed the license cost in year one.
  • Customization limitations. The vendor's product won't do exactly what you want. You'll spend time on workarounds, or you'll change your processes to fit the tool.
  • Vendor lock-in. The deeper you integrate, the harder it is to leave. Switching costs grow over time.
  • Ongoing vendor management. Contract negotiations, relationship management, escalations, security reviews.
  • Price increases. Vendors raise prices. Once you're locked in, your negotiating leverage decreases. I've seen vendors double their prices at renewal because they knew the switching cost was prohibitive.

Building a TCO Model

A good TCO model covers at least a 3-5 year horizon and includes all of the costs above. Here's a simplified template:

Year 0-1 (Build): Initial development cost + integration + training
Year 0-1 (Buy): License fee + integration cost + training + customization

Year 2-5 (Build): Maintenance (20% of build cost/year) + feature development + ops cost + opportunity cost of engineers
Year 2-5 (Buy): License fee (assume 5-10% annual increase) + ongoing integration maintenance + vendor management overhead

The opportunity cost line is where most analyses fall apart. If you have five engineers maintaining an internal CI/CD platform, that's not just their salary — it's the features they're not building for your customers.
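To make the template concrete, here is a minimal sketch of it in Python. The class names, field names, and the way costs are grouped are my own assumptions for illustration, not a standard model; plug in your own line items and figures.

```python
from dataclasses import dataclass


@dataclass
class BuildCosts:
    initial_build: float         # year 0-1: development + integration + training
    maintenance_rate: float      # fraction of initial build cost per year, e.g. 0.20
    feature_dev_per_year: float
    ops_per_year: float
    opportunity_per_year: float  # estimated value of what those engineers are not building


@dataclass
class BuyCosts:
    license_year_one: float
    integration_year_one: float  # integration + training + customization
    annual_increase: float       # e.g. 0.07 for a 7% yearly license increase
    integration_upkeep_per_year: float
    vendor_mgmt_per_year: float


def build_tco(c: BuildCosts, years: int = 5) -> float:
    """Total cost of building over `years`, treating year 1 as the build year."""
    recurring = (
        c.initial_build * c.maintenance_rate
        + c.feature_dev_per_year
        + c.ops_per_year
        + c.opportunity_per_year
    )
    return c.initial_build + recurring * (years - 1)


def buy_tco(c: BuyCosts, years: int = 5) -> float:
    """Total cost of buying over `years`, compounding the license increase annually."""
    total = c.license_year_one + c.integration_year_one
    for year in range(2, years + 1):
        license_fee = c.license_year_one * (1 + c.annual_increase) ** (year - 1)
        total += license_fee + c.integration_upkeep_per_year + c.vendor_mgmt_per_year
    return total
```

Setting `opportunity_per_year` to zero reproduces the incomplete comparison described above, which is a quick way to see how much of the answer depends on that single line.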


The Make-or-Buy Framework

TCO is necessary but not sufficient. You also need a strategic framework for deciding what to build and what to buy. Here's the one I use:

The Two-Axis Model

Think about every system along two dimensions:

  1. Is it a competitive differentiator? Does this system give you an advantage over your competitors? Does it directly contribute to what makes your product unique?
  2. Is it operationally critical? If this system goes down, does your business stop functioning?

This gives you four quadrants:

High differentiation, high criticality → BUILD. This is your core product, your secret sauce. You should own this completely. Examples: your core recommendation engine if you're Netflix, your matching algorithm if you're a marketplace.

High differentiation, low criticality → BUILD (but invest proportionally). These are things that make you unique but aren't mission-critical. Invest in them, but don't over-engineer.

Low differentiation, high criticality → BUY (carefully). These are things that every company needs and that must work reliably: email, authentication, payment processing, monitoring. Buy from established vendors, but negotiate hard and maintain the ability to switch.

Low differentiation, low criticality → BUY (or use open source). Don't waste any engineering time here. Examples: office productivity tools, project management software, internal wikis.
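The four quadrants reduce to a small lookup, which can be handy when you are triaging a long inventory of systems. This is only a sketch: the function name is hypothetical, and the two yes/no inputs are still judgment calls that someone has to make honestly.

```python
def recommend(differentiator: bool, critical: bool) -> str:
    """Map the two axes of the model to the quadrant's default recommendation."""
    if differentiator and critical:
        return "BUILD"
    if differentiator:
        return "BUILD (but invest proportionally)"
    if critical:
        return "BUY (carefully)"
    return "BUY (or use open source)"


# Illustrative inventory: (system, is it a differentiator?, is it operationally critical?)
systems = [
    ("matching algorithm", True, True),
    ("payment processing", False, True),
    ("internal wiki", False, False),
]
for name, differentiator, critical in systems:
    print(f"{name}: {recommend(differentiator, critical)}")
```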

The "Special Snowflake" Test

Before you classify something as a differentiator, apply the "special snowflake" test. Ask yourself honestly: "Are our requirements genuinely unique, or do we just think they are?"

In my experience, about 80% of the time that an engineering team says "our requirements are too unique for any vendor solution," they're wrong. They're conflating familiarity with uniqueness. They're used to doing things a certain way, and they can't imagine doing them differently.

The honest test is this: "If I hired a new VP of Engineering from outside the company, would they look at this system and say 'yes, this absolutely had to be custom-built,' or would they say 'why didn't you just use [well-known vendor]?'"


Vendor Risk Assessment

When you decide to buy, you need a framework for evaluating vendor risk. At scale, a vendor failure doesn't just inconvenience a team — it can take down a business unit.

Key Risk Dimensions

Financial viability. Is this vendor going to be around in 5 years? Check their funding, revenue trajectory, customer base. If they're a startup, understand the risk. If they get acqui-hired, what happens to the product?

Concentration risk. How dependent are you on this single vendor? If they raise prices 50%, can you walk away? If they have an outage, can you keep operating?

Security risk. What data are you giving them? What access do they have to your systems? How mature is their security program?

Roadmap alignment. Is their product roadmap going where you need it to go? If they pivot to serve a different market, will the product still meet your needs?

Integration depth. How deeply embedded is this vendor in your systems? The deeper the integration, the higher the switching cost, and the higher the risk.

Mitigation Strategies

  • Abstraction layers. Put an interface between your code and the vendor's API. If you need to switch vendors, you only change the implementation behind the interface.
  • Data portability. Ensure you can export your data in a standard format. Test this regularly, not just when it's time to migrate.
  • Multi-vendor strategies. For critical, non-differentiating systems, consider using two vendors or maintaining the ability to quickly switch.
  • Contract protections. Source code escrow, SLA guarantees with financial penalties, price cap clauses at renewal.
  • Regular reassessment. Review vendor relationships annually. The market changes; better options emerge; your needs evolve.
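The abstraction-layer strategy from the list above is the one most directly expressed in code. Below is a minimal sketch, assuming an email-sending use case; `EmailSender`, `VendorAEmail`, and `VendorBEmail` are hypothetical names, and the adapters return a string where a real one would call the vendor's SDK.

```python
from typing import Protocol


class EmailSender(Protocol):
    """The interface your application code depends on."""

    def send(self, to: str, subject: str, body: str) -> str: ...


class VendorAEmail:
    """Hypothetical adapter; a real one would call vendor A's SDK here."""

    def send(self, to: str, subject: str, body: str) -> str:
        return f"vendor-a:{to}"


class VendorBEmail:
    """Hypothetical adapter for a second vendor with the same interface."""

    def send(self, to: str, subject: str, body: str) -> str:
        return f"vendor-b:{to}"


def send_welcome(sender: EmailSender, user_email: str) -> str:
    # Application code only knows about EmailSender, never a specific vendor.
    return sender.send(user_email, "Welcome", "Thanks for signing up.")
```

Switching vendors then means writing one new adapter and changing where it is constructed, rather than touching every call site. The interface only helps if it stays vendor-neutral, so resist leaking one vendor's concepts into it.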

Platform Decisions That Affect Multiple Teams

At the director/VP level, many build-vs-buy decisions affect not just one team but multiple teams across the organization. These require a different decision-making process.

The Stakeholder Problem

When a platform decision affects ten teams, you'll get ten different opinions. Some teams will want to build because they want control. Others will want to buy because they want to move fast. Some will want vendor A; others will want vendor B.

You cannot make these decisions by consensus. Consensus-driven platform decisions lead to lowest-common-denominator outcomes — or worse, no decision at all, which means teams go their own way and you end up with fragmentation.

A Decision Process That Works

  1. Appoint a decision owner. One person (usually a senior architect or platform team lead) owns the recommendation. They gather input but make the call.
  2. Define evaluation criteria upfront. Before looking at any options, agree on what matters. Weight the criteria. This prevents after-the-fact rationalization.
  3. Time-box the evaluation. Give the team 2-4 weeks to evaluate options. Analysis paralysis is a real risk. Set a decision date and stick to it.
  4. Prototype, don't just evaluate. For critical decisions, build quick prototypes with 2-3 options. You learn more in a week of prototyping than a month of analysis.
  5. Make the decision and commit. Once the decision is made, everyone commits. Disagree and commit is essential here. You cannot have teams undermining the chosen direction.
  6. Communicate the reasoning. Share not just the decision, but the framework, the criteria, and the reasoning. This builds trust even with people who disagreed.
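Step 2, weighting criteria before looking at options, can be made mechanical with a simple weighted score. The sketch below is one common way to do it, not the only one; the criteria names, weights, and 1-5 scores are made up for illustration.

```python
def weighted_score(weights: dict[str, float], scores: dict[str, float]) -> float:
    """Weighted average of an option's scores, using weights agreed upfront."""
    missing = weights.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


def rank_options(
    weights: dict[str, float], options: dict[str, dict[str, float]]
) -> list[tuple[str, float]]:
    """Return (option, score) pairs, best first."""
    scored = [(name, weighted_score(weights, s)) for name, s in options.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative only: weights are fixed before any option is scored.
weights = {"cost": 3, "reliability": 5, "integration_effort": 2}
options = {
    "vendor_a": {"cost": 4, "reliability": 3, "integration_effort": 5},
    "vendor_b": {"cost": 3, "reliability": 5, "integration_effort": 3},
}
ranking = rank_options(weights, options)
```

The score is an input to the decision owner's call, not a replacement for it. Its main value is that the weights are written down before anyone has a favorite.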

Managing Vendor Relationships at Scale

When you're spending millions with vendors, relationship management becomes a strategic function.

Negotiation Leverage

Your biggest leverage is credible alternatives. Before any negotiation, know your BATNA (Best Alternative to a Negotiated Agreement). Can you switch to a competitor? Can you build it yourself? The vendor will test whether your alternatives are real.

Volume discounts. If you're using a vendor across multiple teams or business units, consolidate your purchasing. Fragmented purchasing leaves money on the table.

Multi-year commitments. Vendors will give significant discounts for multi-year deals. But be careful — you're trading flexibility for savings. Only commit long-term if you're confident in the vendor.

Executive relationships. At scale, you should have a relationship with the vendor's leadership, not just your sales rep. When things go wrong (and they will), that executive relationship is how you get problems resolved quickly.

Vendor Management Best Practices

  • Hold quarterly business reviews (QBRs) with strategic vendors
  • Track vendor performance against SLAs rigorously
  • Maintain a vendor scorecard that covers reliability, support quality, roadmap delivery, and cost
  • Have an escalation path that doesn't start with a support ticket
  • Review spend annually and renegotiate proactively, not just at renewal time

Strategic vs. Tactical Build/Buy Decisions

Not every build-vs-buy decision needs a month-long analysis. Learn to distinguish strategic decisions from tactical ones.

Strategic Decisions (Take Your Time)

These are decisions that are expensive to reverse, affect multiple teams, and lock you into a direction for years. Examples:

  • Choosing a cloud provider
  • Selecting a core data platform
  • Building vs. buying an internal developer platform
  • Choosing an ERP or CRM system

For these, invest in thorough analysis. Run pilots. Talk to references. Model the TCO. Get executive alignment.

Tactical Decisions (Move Fast)

These are decisions that are relatively easy to reverse, affect one team, and have limited blast radius. Examples:

  • Choosing a logging library
  • Selecting a testing framework
  • Picking a project management tool for one team

For these, set a short time-box (days, not weeks), pick the best option available, and move on. You can always switch later. The cost of deliberation often exceeds the cost of a suboptimal choice.

The Reversibility Test

When you're not sure whether a decision is strategic or tactical, ask: "How hard would it be to reverse this decision in a year?" If the answer is "very hard and very expensive," it's strategic. If the answer is "annoying but doable," it's tactical.


Long-Term Cost Analysis

Build-vs-buy economics change over time, and you need to account for that.

The Build Cost Curve

Building is expensive upfront and then levels off — but it never goes to zero. Maintenance costs accumulate. Technical debt accrues. The system needs to keep up with changing requirements. Many organizations underestimate the long tail of build costs.

A useful rule of thumb: over a 5-year period, the initial build cost is typically only 30-40% of the total cost of ownership. The other 60-70% is maintenance, operations, and feature development.

The Buy Cost Curve

Buying starts cheap (relatively) and then increases over time. License fees go up. You add more users. You need more capacity. The vendor introduces new pricing tiers. Your negotiating leverage decreases as switching costs increase.

The Crossover Point

There's often a crossover point where building becomes cheaper than buying on a per-year basis. For large organizations with strong engineering teams, this crossover often happens around year 3-4. But getting to that crossover point requires surviving years 1-3, where building is significantly more expensive.

The question is whether you can afford the upfront investment, whether you have the engineering capacity, and whether the market will wait for you to build.
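It helps to separate two different crossovers: the year in which building becomes cheaper per year, and the year in which cumulative build spend catches up with cumulative buy spend. The sketch below computes both from yearly cost lists; the function names and all figures are illustrative assumptions.

```python
from itertools import accumulate
from typing import Optional, Sequence


def per_year_crossover(build: Sequence[float], buy: Sequence[float]) -> Optional[int]:
    """First year (1-indexed) in which building costs less than buying that year."""
    for year, (b, v) in enumerate(zip(build, buy), start=1):
        if b < v:
            return year
    return None


def cumulative_break_even(build: Sequence[float], buy: Sequence[float]) -> Optional[int]:
    """First year in which total build spend to date is no more than total buy spend."""
    totals = zip(accumulate(build), accumulate(buy))
    for year, (b, v) in enumerate(totals, start=1):
        if b <= v:
            return year
    return None


# Illustrative yearly costs: a large upfront build, then flat upkeep,
# against a license bill that grows roughly 10% a year.
build_costs = [1_500_000, 600_000, 600_000, 600_000, 600_000]
buy_costs = [500_000, 550_000, 605_000, 665_000, 730_000]

print(per_year_crossover(build_costs, buy_costs))
print(cumulative_break_even(build_costs, buy_costs))
```

With these numbers, building is cheaper per year from year 3, yet cumulative spend has not broken even within five years. That gap is the upfront investment you have to be able to carry.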


Real-World Examples

The Wrong Build Decision: Custom Monitoring Platform

A mid-size tech company decided to build their own monitoring and alerting platform because "none of the existing tools met our needs." The initial estimate was 6 months with a team of 4 engineers.

Two years later, they had 12 engineers working on the monitoring platform. It had become an internal product with its own roadmap, its own on-call rotation, and its own backlog. Meanwhile, their actual product — the thing that made them money — was starved for engineering talent.

The monitoring platform was decent but not great. It did about 70% of what Datadog or New Relic could do, at roughly 3x the cost when you factored in the fully loaded cost of those 12 engineers. And it was always behind the commercial tools on features because, well, monitoring wasn't their business.

When a new VP of Engineering joined and audited the team allocation, she migrated the company to Datadog within 6 months. The 12 engineers were redeployed to product teams. Customer-facing feature velocity roughly doubled.

The lesson: just because you can build it doesn't mean you should. The question isn't "can our engineers build a monitoring platform?" (of course they can). The question is "is building a monitoring platform the highest-value use of our engineering time?"

The Smart Buy: Payment Processing

A fintech startup initially considered building their own payment processing infrastructure. They had strong engineers, and they thought they could build something better than what was available.

Their CTO pushed back. She argued that payment processing is low differentiation (it's table stakes — every fintech needs it), high criticality (if payments break, the business stops), and extraordinarily complex (regulatory compliance, fraud detection, multi-currency support, bank integrations).

They chose Stripe instead. The integration took 3 weeks. They launched their product 4 months earlier than they would have if they'd built payments in-house. That 4-month head start let them capture market share before a competitor launched a similar product.

Five years later, they're spending $2M/year on Stripe fees. Could they have built it cheaper? Maybe, but they would have needed a team of 8-10 engineers dedicated to payments infrastructure, which at fully loaded cost would have been $2-3M/year anyway, plus the opportunity cost of those engineers not working on product features.

The lesson: buying lets you move faster and focus on what actually differentiates you. Speed to market has a monetary value that rarely shows up in TCO spreadsheets.


Common Mistakes

1. Underestimating the True Cost of Building

People estimate the build cost and forget about maintenance, on-call, feature requests, knowledge transfer, and opportunity cost. Double your initial estimate, then add 20% per year for maintenance, and you'll be closer to reality.

2. Overestimating Uniqueness

"Our requirements are too unique for any vendor" is almost always wrong. Challenge this assumption aggressively. Talk to the vendors. You might be surprised how flexible their solutions are.

3. Ignoring Opportunity Cost

If you have 5 engineers building internal tools, that's 5 engineers not building product features. What's the revenue impact of those missing features? This number is hard to calculate precisely, but it's rarely zero and often very large.

4. Making the Decision Based on Current State

Build-vs-buy should be evaluated over a 3-5 year horizon, not based on where you are today. A vendor that's too expensive for your current scale might be the obvious choice at your projected scale.

5. Letting Engineers Make Business Decisions

Engineers naturally want to build things. It's what they love doing. But build-vs-buy is a business decision, not a technical one. Involve engineers in the evaluation, but make the decision based on business value.

6. Not Reassessing Past Decisions

The market changes. Your needs change. A build decision that made sense 3 years ago might not make sense today. A vendor that was too immature 2 years ago might be perfect now. Review your major build-vs-buy decisions annually.

7. Death by Vendor Sprawl

The opposite mistake: buying too many point solutions from too many vendors. You end up with 47 SaaS tools, no integration between them, and a procurement nightmare. Consolidate where you can.

8. Failing to Plan for Migration

Every buy decision should include a migration plan. Not because you expect to migrate, but because having a credible exit strategy gives you negotiating leverage and protects you if the vendor fails.


The Open Source Middle Ground

There's a third option that doesn't fit neatly into "build" or "buy," and at scale it deserves its own consideration: open source.

When Open Source Is the Right Choice

Open source can be the best of both worlds — you get a mature, battle-tested solution without the vendor lock-in or the ongoing license fees. But it comes with its own costs that people routinely underestimate.

Good candidates for open source adoption:

  • Infrastructure tooling with strong communities (Kubernetes, PostgreSQL, Kafka, Prometheus)
  • Frameworks and libraries with active maintenance and broad adoption
  • Tools where the open source version is genuinely production-ready, not just a teaser for the commercial product

Poor candidates for open source adoption:

  • Projects maintained by a single person or a tiny team — bus factor risk is too high
  • Software that requires deep expertise to operate and your team doesn't have that expertise
  • Tools where the open source version is deliberately hobbled to push you toward the paid tier

The Hidden Costs of Open Source

Open source isn't free. You're trading license fees for operational responsibility:

  • Operations expertise. Running PostgreSQL is straightforward. Running PostgreSQL at scale with replication, failover, backup, and performance tuning requires real expertise. You either hire for that expertise or develop it internally — both cost money.
  • Upgrades and patching. You're responsible for staying current. Security patches need to be applied promptly. Major version upgrades can be complex and risky.
  • Integration work. You build and maintain the integration with your systems.
  • Support. When things break at 3 AM, there's no vendor to call. You're on your own (or you buy commercial support, which partially negates the cost advantage).

The Managed Service Hybrid

The sweet spot for many organizations is using open source software through a managed service (like Amazon RDS for PostgreSQL, or Confluent for Kafka). You get the open source ecosystem without the operational burden, at a cost that's typically lower than a proprietary alternative but higher than self-managing.

This is increasingly the right answer for most non-differentiating infrastructure: use the open source technology through a managed service, and invest your engineering time elsewhere.


Building Your Organization's Decision-Making Muscle

One thing I want to emphasize: build-vs-buy isn't a one-time decision. It's a capability that your organization needs to develop. The best engineering organizations have a repeatable, lightweight process for making these decisions consistently and well.

Creating a Decision Record

For every significant build-vs-buy decision, create a brief written record that captures:

  • What decision was made and when
  • What alternatives were considered
  • What criteria were used
  • What the expected costs and benefits were
  • Who made the decision

This record serves two purposes. First, it helps you learn from past decisions — you can review whether your predictions were accurate and improve your decision-making over time. Second, it provides context for future leaders who might wonder "why did we build this ourselves?" or "why did we choose this vendor?"
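If you want these records to be queryable rather than scattered across documents, a small structured type is enough. This is a sketch with hypothetical field names that mirror the list above; the review helper assumes the annual cadence used elsewhere in this section.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionRecord:
    title: str
    decided_on: date
    decision: str                  # e.g. "build", "buy", "open source"
    alternatives: tuple[str, ...]  # options that were considered and rejected
    criteria: tuple[str, ...]      # what the options were judged on
    expected_costs: str
    expected_benefits: str
    decision_owner: str

    def is_due_for_review(self, today: date, review_after_days: int = 365) -> bool:
        """True once the record is at least `review_after_days` old."""
        return (today - self.decided_on).days >= review_after_days


# Illustrative record; every value here is made up.
record = DecisionRecord(
    title="Monitoring platform",
    decided_on=date(2023, 1, 15),
    decision="buy",
    alternatives=("build in-house", "self-hosted open source"),
    criteria=("5-year TCO", "on-call burden", "feature coverage"),
    expected_costs="License plus integration work in year one",
    expected_benefits="Engineers redeployed to product teams",
    decision_owner="Platform team lead",
)
```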

Annual Portfolio Review

Once a year, review your portfolio of build-vs-buy decisions as a whole. Ask:

  • Are we building too many things that aren't differentiators?
  • Are we over-reliant on any single vendor?
  • Have any of our build decisions turned into maintenance sinkholes?
  • Have any vendors become significantly better or worse since we last evaluated?
  • Are there new open source options that have matured since we last looked?

This review keeps your portfolio healthy and prevents the slow accumulation of technical debt and vendor risk.


Business Value

Getting build-vs-buy right at scale is one of the highest-leverage activities for a director or VP. Here's the concrete value:

  • Engineering efficiency. Every engineer freed from maintaining undifferentiated internal tools is an engineer who can work on revenue-generating features. At fully loaded cost ($250-400K/year for a senior engineer), even freeing up 3-5 engineers creates $750K-$2M in redeployed capacity.
  • Speed to market. Buying well-established solutions lets you move faster. In competitive markets, launching months earlier can be worth millions in captured revenue.
  • Cost optimization. A rigorous TCO model often reveals that the "cheap" option (building) is actually the expensive one, or that a renegotiated vendor contract can save hundreds of thousands per year.
  • Risk reduction. Proper vendor risk assessment prevents the nightmare scenario: a critical vendor failing with no backup plan, or an internal system collapsing after the original builders leave.
  • Strategic focus. The most valuable outcome of getting build-vs-buy right is focus. Your engineering organization is building things that matter — things that differentiate your business — and buying everything else.

Organizations that get this right consistently outperform those that don't. They ship faster, spend more efficiently, and their engineers are happier because they're working on interesting problems instead of reinventing the wheel.


Common Pitfalls

  • Underestimating the true cost of building. Estimating only the initial development cost while ignoring maintenance, on-call, feature requests, knowledge transfer, and opportunity cost leads to decisions that look cheap but become very expensive.
  • Overestimating uniqueness of requirements. "Our requirements are too unique for any vendor" is almost always wrong. Failing to challenge this assumption aggressively means building custom solutions for problems that well-established vendors have already solved.
  • Ignoring opportunity cost. Every engineer maintaining an internal tool is an engineer not building product features. The revenue impact of those missing features is rarely zero and often very large.
  • Letting engineers make what is fundamentally a business decision. Engineers naturally want to build things. But build-vs-buy should be evaluated on business value, not technical interest. Involve engineers in evaluation, but decide based on ROI.
  • Not reassessing past decisions. A build decision that made sense three years ago may not make sense today. A vendor that was too immature two years ago may be perfect now. Failing to review annually means accumulating stale commitments.
  • Death by vendor sprawl. The opposite of building too much: buying too many point solutions from too many vendors creates integration nightmares and procurement overhead that erodes the benefits of buying.

Key Takeaways

  1. Always model TCO over 3-5 years, including opportunity cost.
  2. Build what differentiates you. Buy everything else.
  3. Apply the "special snowflake" test ruthlessly.
  4. Vendor risk assessment is as important as the initial vendor selection.
  5. Platform decisions that affect multiple teams need a clear decision owner, not consensus.
  6. Reassess major build-vs-buy decisions annually — the landscape changes.
  7. The best build-vs-buy decision is the one that maximizes engineering focus on what actually makes your company money.