Custom Application Dev · 6 min read · 16 April 2024

Build vs Buy for Data-Intensive Applications: A Framework for the Decision

Most organisations have a default position on build versus buy, and it was formed by experience rather than analysis. Either they buy everything, then customise until the tool is unrecognisable and every upgrade is a migration project, or they build everything and indefinitely maintain systems that a vendor would patch, update, and support as part of the licence. Neither default is defensible as a general principle.

The failure mode for buyers is customisation debt. You adopt a tool because it covers 70% of your needs. You configure it, extend it, write workarounds for the gaps, and eventually you have a heavily customised installation that no longer upgrades cleanly and that only two people on your team fully understand. The vendor releases a new version. You can't take it without a significant project. The tool that was supposed to reduce your maintenance burden has become a different kind of maintenance burden.

The failure mode for builders is the opposite. The system you built in six weeks takes six months to extend when requirements change. The engineer who built it leaves, taking with them the context that wasn't in the documentation because there wasn't time to write documentation. Five years later, the system is running in production, it works, nobody knows how, and changing anything is a project that nobody wants to prioritise because the current system technically works.

Both of these are real outcomes, not theoretical risks. The question is which one your decision is more likely to produce.

The Costs That Don't Appear in the Procurement Conversation

When organisations evaluate bought tools, the analysis usually covers licence cost, implementation cost, and training. The costs that get underweighted are the structural ones that compound over time.

Vendor lock-in. Your workflows get encoded in the vendor's system: their data model, their automation logic, their integration patterns. When you want to migrate, whether because they got acquired, raised prices, or simply stopped being the right tool, the migration cost is proportional to how deeply you are embedded. For data-intensive applications, that depth is usually substantial.
Customisation debt. Every workaround you build around the tool's limitations makes the next upgrade harder. This is especially acute for data tools, where the customisation often happens in SQL, configuration, or proprietary scripting that sits outside normal software engineering practices and doesn't get the same review, testing, or documentation.
Per-seat or usage-based pricing that scales against you. A tool that costs a sensible amount for a team of five is a different proposition for a team of fifty. Data tools specifically tend to price on volume (number of rows, number of API calls, data processed), which means your costs scale with how much data you handle rather than with the value the tool delivers.
Vendor trajectory risk. The product you're buying today is the product the vendor prioritises for their median customer. If your use case diverges from the median — and data use cases frequently do — the product will drift away from your needs over time. Features you depend on get deprecated. The roadmap goes in a different direction.
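
To make the pricing point concrete, here is a minimal sketch of how the annual bill for a volume-priced tool moves as data grows. The function name, the price per million rows, the starting volume, and the growth rate are all illustrative assumptions, not any vendor's actual pricing.

```python
def projected_annual_costs(
    rows_millions: float,
    price_per_million_rows: float,
    yearly_growth: float,
    years: int,
) -> list[float]:
    """Annual cost of a volume-priced tool, year by year.

    All inputs are hypothetical; substitute your own contract terms.
    """
    costs = []
    volume = rows_millions
    for _ in range(years):
        costs.append(round(volume * price_per_million_rows, 2))
        volume *= 1 + yearly_growth  # data volume compounds each year
    return costs


# Illustrative only: 100M rows, 50 per million rows, volume doubling yearly.
costs = projected_annual_costs(100, 50.0, 1.0, 5)
print(costs)
```

Under these assumed numbers the bill doubles every year even if the number of people using the tool stays flat, which is the kind of curve worth projecting before signing.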

The Costs of Building That Get Underestimated

Custom builds have mirror-image failure modes. The initial build cost is usually estimated reasonably well. Everything that comes after it is not.

Maintenance is forever. Every system you build is a system you commit to maintaining until you decommission it. Security patches, dependency updates, performance degradation at scale, edge cases that only appear in production — these are real ongoing costs that are rarely included in the business case for building. The engineer who built the system and can resolve issues in an hour leaves. Their replacement takes a week. That change in operational cost is invisible at build time.

Requirements change. The system you designed to solve today's problem will be asked to solve tomorrow's problem. Custom systems that weren't designed with extensibility in mind — because extensibility is hard to justify when you don't know what the extensions will be — become expensive to extend. The six-week build becomes the eighteen-month refactor.

Four Questions That Drive the Decision

Rather than treating build versus buy as a binary choice, treat it as a structured analysis across four questions.

Is this a differentiating capability? If the thing you're building is core to how you compete — if it encodes proprietary logic, if it reflects a way of working that's genuinely yours — build it. Own it. A tool built to serve the median customer will never fully capture what makes your operation specific. If it's table stakes — a capability that every company in your space needs and uses the same way — buy it. The competitive advantage of building your own email system is zero.
Does the off-the-shelf option cover 80% or 60% of your needs? At 80%, buy it and accept the gaps. The cost of customising around the remaining 20% is usually manageable, and the vendor's roadmap may close some of it. At 60%, the customisation required to make the tool work for your use case will likely exceed the cost of building what you actually need. The 60% threshold is not precise; it's a prompt to be honest about how much of the tool you'll actually use versus work around.
How fast will your requirements change? Fast-changing requirements are destructive to bought tools. Each change is a customisation that adds to the debt. For rapidly evolving domains — new product lines, regulatory change, data use cases that are still being discovered — a custom build you control is often cheaper over three years than a customised SaaS tool that you're fighting every quarter.
Do you have the engineering capacity to maintain a custom system for five years? This is the question that gets answered optimistically most often. Yes, you have a strong engineering team now. Will they still be there in five years? Will they still own this system, or will it have been passed to whoever is available? A custom system that isn't maintained becomes a liability. The honest answer to this question should inform the build decision more than most teams allow it to.
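
The four questions can be sketched as a simple tally. This is one possible reading of the framework, not a scoring model: the `Assessment` fields, the equal weighting of the questions, and the choice to treat maintenance capacity as a gate on building are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    differentiating: bool          # core to how you compete?
    coverage: float                # share of needs the tool covers, 0.0 to 1.0
    fast_changing: bool            # requirements expected to change quickly?
    can_maintain_five_years: bool  # honest answer, not the optimistic one


def recommend(a: Assessment) -> str:
    """Tally the signals from the four questions (equal weights assumed)."""
    build_votes = 0
    buy_votes = 0

    if a.differentiating:
        build_votes += 1
    else:
        buy_votes += 1  # table stakes: buy it

    if a.coverage >= 0.8:
        buy_votes += 1
    elif a.coverage <= 0.6:
        build_votes += 1
    # between the two thresholds, coverage alone does not decide

    if a.fast_changing:
        build_votes += 1

    if build_votes > buy_votes:
        # Capacity acts as a gate: a build nobody maintains is a liability.
        if not a.can_maintain_five_years:
            return "reconsider: no capacity to maintain a build"
        return "build"
    if buy_votes > build_votes:
        return "buy"
    return "judgement call"


print(recommend(Assessment(True, 0.6, True, True)))
print(recommend(Assessment(False, 0.85, False, True)))
```

The value of writing it down this way is less the output than the inputs: it forces an explicit answer to each question, including the capacity one.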

The Middle Path

The most pragmatic position is usually a hybrid: buy the infrastructure, build the application logic. Use managed services for the undifferentiated heavy lifting — databases, queues, compute, storage. Own the layer that encodes your specific business rules, your data models, your integration logic.

This captures most of the benefit of both approaches. You're not maintaining a database engine or a message broker. But you own the code that defines how your business operates and how your data moves through it. The infrastructure vendors are your utility providers. The application layer is yours.
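
In code, that line between utility and application often shows up as a narrow interface. The sketch below is hypothetical throughout: `OrderStore`, `InMemoryStore`, and `apply_discount` are invented names, and the in-memory class merely stands in for whatever managed database you would actually buy.

```python
from typing import Protocol


class OrderStore(Protocol):
    """Boundary to bought infrastructure (hypothetical interface)."""

    def save(self, order_id: str, total: float) -> None: ...

    def get(self, order_id: str) -> float: ...


class InMemoryStore:
    """Stand-in for a managed database; swap in a real client in practice."""

    def __init__(self) -> None:
        self._rows: dict[str, float] = {}

    def save(self, order_id: str, total: float) -> None:
        self._rows[order_id] = total

    def get(self, order_id: str) -> float:
        return self._rows[order_id]


def apply_discount(store: OrderStore, order_id: str, percent: float) -> float:
    """Owned business rule: it depends only on the OrderStore interface."""
    discounted = round(store.get(order_id) * (1 - percent / 100), 2)
    store.save(order_id, discounted)
    return discounted


store = InMemoryStore()
store.save("A1", 200.0)
print(apply_discount(store, "A1", 10))
```

Because the business rule only knows about the interface, replacing the storage vendor means writing a new adapter rather than rewriting the rule.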

The framing of "build versus buy" obscures this option because it implies a binary choice at the system level. In practice, every production system is a mix. The question is where you draw the line between commodity infrastructure and proprietary logic — and whether you draw it deliberately, or let it happen by default.

The right question is total cost of ownership: what does this decision cost over five years, including the scenarios where the vendor fails us or requirements change significantly? Most procurement decisions don't ask it with enough rigour, and the regret usually sets in when one of those scenarios arrives.
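
A rough way to ask that question with numbers is to total each option over the five-year horizon and then rerun it with a bad scenario added. Everything in this sketch is assumed for illustration: the function name, the cost categories, and every figure.

```python
from typing import Optional


def five_year_tco(
    upfront: int,
    annual: int,
    one_off_costs: Optional[dict[int, int]] = None,
    years: int = 5,
) -> int:
    """Sum upfront, recurring, and scenario costs over the horizon.

    one_off_costs maps a 1-based year number to a cost incurred that year,
    such as a forced migration or a major rework. Figures are hypothetical.
    """
    total = upfront + annual * years
    for year, cost in (one_off_costs or {}).items():
        if 1 <= year <= years:  # ignore events outside the horizon
            total += cost
    return total


# Hypothetical figures for one bought tool and one custom build.
buy_base = five_year_tco(upfront=20_000, annual=30_000)
buy_vendor_exit = five_year_tco(20_000, 30_000, {3: 80_000})  # forced migration
build_base = five_year_tco(upfront=150_000, annual=25_000)
build_major_change = five_year_tco(150_000, 25_000, {4: 40_000})  # big rework

print(buy_base, buy_vendor_exit, build_base, build_major_change)
```

With these made-up inputs buying stays cheaper even in the bad scenario; different inputs can flip that, and the point is to run the scenarios rather than compare only the base cases.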

Written by ATHING

We design and build data infrastructure, automation pipelines, and AI systems for organisations that need them to work.
