The AI Build-vs-Buy Trap: Why the Wrong Question Is Costing You Months

Most teams walk into this decision carrying the wrong mental model. They ask: should we build or buy? That framing is already broken. The real question is: where does this capability sit on your differentiation curve? Everything flows from that.

Get it wrong and you either spend six months building commodity infrastructure that a $200/month SaaS would have covered, or you chain your core product to a vendor whose roadmap you can't control and whose pricing will move the moment you're dependent on it. Both failure modes are common across MENA and beyond — and both are avoidable.

The Differentiation Test

Before touching a procurement page or a code editor, I run every AI capability through three questions:

  1. Does this touch the thing users pay for? If yes, you likely own it.
  2. Would a competitor buying the same tool close the gap with you? If yes, it's commodity — buy it.
  3. Does your data or domain context make this meaningfully better than the generic version? If yes, build the delta, not the whole stack.
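The three questions can be written down as a small function, which is handy when you want to run a backlog of capabilities through the same test. This is a minimal sketch, not a formal model: the `Capability` type, the field names, and the order in which the answers are checked are my assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Hypothetical record of the three answers for one AI capability."""
    name: str
    touches_paid_value: bool       # Q1: does it touch the thing users pay for?
    competitor_can_buy_same: bool  # Q2: would buying the same tool close the gap?
    data_improves_it: bool         # Q3: does your data or domain context beat generic?


def differentiation_test(cap: Capability) -> str:
    """Return a coarse recommendation. The check order is an assumption."""
    if cap.data_improves_it:
        return "build the delta"
    if cap.competitor_can_buy_same:
        return "buy"
    if cap.touches_paid_value:
        return "own"
    return "buy"


# Example answers are illustrative, not measurements.
llm_call = Capability("raw LLM call", True, True, False)
orchestration = Capability("prompt orchestration", True, False, True)

print(differentiation_test(llm_call))       # buy
print(differentiation_test(orchestration))  # build the delta
```

Checking the data question first reflects the "build the delta, not the whole stack" advice: proprietary data is a reason to own a thin layer even when everything underneath it is commodity.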

At Etera AI, we work on a multi-agent LLM travel platform where this line is drawn constantly. The LLM call itself? Commodity: use the API. Prompt orchestration tuned to specific inventory, margins, and user personas? That's the moat. The embedding pipeline over proprietary travel content? Own it. The vector database hosting it? Almost certainly buy: Pinecone, Weaviate, a managed Postgres with pgvector, whatever fits your ops posture.

The line isn't build or buy. It's almost always both, layered deliberately.

Where Teams Get This Wrong

Mistake 1: Treating "AI" as a monolithic build decision.

People say "we're building our AI in-house" as if that means something coherent. A production AI feature involves model inference, retrieval, orchestration, evaluation, observability, and user-facing UX. Each layer has a different build/buy calculus. Collapsing them into one decision is how you end up rebuilding LangChain from scratch because someone didn't want a dependency.
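One way to keep the layers from collapsing into a single decision is to record a call per layer and flag the ones nobody has decided yet. A rough sketch follows; the layer names come from the paragraph above, while the `plan` values and the `undecided_layers` helper are hypothetical.

```python
from typing import List, Mapping

# Layers of a production AI feature, as listed above.
LAYERS = (
    "inference",
    "retrieval",
    "orchestration",
    "evaluation",
    "observability",
    "ux",
)


def undecided_layers(plan: Mapping[str, str]) -> List[str]:
    """Return the layers that have no explicit 'build' or 'buy' decision."""
    return [layer for layer in LAYERS if plan.get(layer) not in {"build", "buy"}]


# Illustrative plan only; your own split will differ.
plan = {
    "inference": "buy",
    "retrieval": "buy",
    "orchestration": "build",
    "evaluation": "build",
}

print(undecided_layers(plan))  # ['observability', 'ux']
```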

Mistake 2: Confusing integration work with building.

Calling an API and writing prompt logic around it is integration, not building. That's fine, and it's often the right call. But don't let it masquerade as a serious in-house capability. If the vendor shuts down, deprecates the model, or reprices it out of your budget, your "in-house AI" goes with it.

Mistake 3: Letting compliance anxiety drive the wrong direction.

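I see this constantly in regulated industries across the Gulf. A team decides to self-host an open-source model because they're worried about data residency, without first checking what the commercial API they're avoiding actually does with prompts: where they are processed, how long they are retained, and whether they are used for training. Many enterprise tiers offer no-training defaults, short or zero retention, and regional processing, but the terms vary by vendor and plan. The result of skipping that check: months of MLOps work that wasn't necessary. Read the data processing agreements first. The engineering decision comes after.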

Mistake 4: Underpricing future switching costs.

Vendor lock-in in AI isn't just about APIs — it's about the fine-tuning datasets, the evaluation sets, the prompt libraries, and the integration code you build around a specific model's quirks. Prompts engineered around one model's failure modes can actively degrade on the next generation — this is a well-documented class of migration pain that teams hit when moving between major model versions. Abstract your model interface from day one.

Here's what that looks like in a realistic scenario — imagine a travel agent that needs to extract structured booking intent from a user message:

```python
# Don't do this: tightly coupled to one provider and model
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    response_format={"type": "json_object"},
)

# Do this: thin adapter layer, swappable backend
# BookingIntentExtractor wraps your model_client and owns the schema
extractor = BookingIntentExtractor(client=model_client, config=model_config)
intent = extractor.extract(user_message)
# Swap model_client from OpenAI to Anthropic or a self-hosted model
# without touching extraction logic or downstream agents
```

That abstraction costs you an afternoon. Skipping it can cost you a quarter when a model is deprecated or a pricing tier changes.
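For a sense of how small that adapter can be, here is one possible shape for it. Treat it as a sketch under assumptions: `ModelClient`, `ModelConfig`, `BookingIntent`, the `complete_json` method, and the two-field schema are all hypothetical names I'm introducing, and `CannedClient` stands in for a real provider so the example runs offline.

```python
import json
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical per-backend settings."""
    model: str
    temperature: float = 0.0


class ModelClient(Protocol):
    """The only surface the rest of the codebase is allowed to depend on."""

    def complete_json(self, system: str, user: str, config: ModelConfig) -> str:
        ...


@dataclass(frozen=True)
class BookingIntent:
    destination: str
    nights: int


class BookingIntentExtractor:
    """Owns the prompt and the schema; knows nothing about any provider SDK."""

    SYSTEM_PROMPT = 'Return JSON with keys "destination" (string) and "nights" (integer).'

    def __init__(self, client: ModelClient, config: ModelConfig) -> None:
        self._client = client
        self._config = config

    def extract(self, user_message: str) -> BookingIntent:
        raw = self._client.complete_json(self.SYSTEM_PROMPT, user_message, self._config)
        data = json.loads(raw)
        return BookingIntent(destination=str(data["destination"]), nights=int(data["nights"]))


class CannedClient:
    """Stand-in backend for the sketch; a real one would call a provider API here."""

    def __init__(self, reply: str) -> None:
        self._reply = reply

    def complete_json(self, system: str, user: str, config: ModelConfig) -> str:
        return self._reply


model_client = CannedClient('{"destination": "Dubai", "nights": 3}')
model_config = ModelConfig(model="any-backend")
extractor = BookingIntentExtractor(client=model_client, config=model_config)
intent = extractor.extract("Three nights in Dubai, please")
print(intent)  # BookingIntent(destination='Dubai', nights=3)
```

Each real provider gets its own small class that satisfies `ModelClient`. Swapping backends then means changing which class you construct, and the same canned client doubles as a test fixture for downstream agents.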

The Fast-Build Reality Check

With modern tooling — Claude, Codex, hosted embeddings, managed vector stores — the engineering hours to stand up a working AI feature are genuinely small. We're talking days to a functional prototype, weeks to something production-hardened. This changes the calculus.

The bottleneck isn't writing code anymore. It's:

  • Data: cleaning, structuring, and getting access to the right internal datasets
  • Integrations: connecting to legacy systems that weren't built to be connected to
  • Compliance: legal review of what goes to which model under which terms
  • Decisions: getting alignment on what "good" looks like before you can evaluate anything

When I hear "we can't build that in-house, it'll take too long," I push back. Unless the blocker is one of the four things above, the time estimate is usually carrying ghost overhead from pre-LLM-era development culture.

A Working Framework

Here's the decision tree I actually use:

  • Core differentiator + your data makes it better → Build the logic, buy the infrastructure
  • Commodity capability, no proprietary data advantage → Buy, integrate thin, stay portable
  • Regulatory or data-residency constraint → Investigate commercial enterprise terms before defaulting to self-host
  • Uncertain? → Buy first, instrument everything, revisit in 90 days when you have real usage data

That last point matters. A lot of build decisions get made on hypothetical scale that never arrives. Buy until the unit economics break, then build. You'll have actual data to justify the investment — and actual usage patterns to design against.
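"Buy until the unit economics break" is simple arithmetic once you have usage data. The sketch below compares the vendor bill against an amortized build cost month by month. Every number in it is made up for illustration, and the function names and the straight-line amortization are my assumptions, not a costing standard.

```python
from typing import Optional, Sequence


def monthly_build_cost(build_cost: float, amortize_months: int, run_cost: float) -> float:
    """One-off build cost spread evenly over a period, plus monthly running cost."""
    return build_cost / amortize_months + run_cost


def first_month_buy_exceeds_build(
    usage_by_month: Sequence[int],
    price_per_unit: float,
    build_monthly: float,
) -> Optional[int]:
    """Return the first 1-based month where the vendor bill exceeds the build cost."""
    for month, units in enumerate(usage_by_month, start=1):
        if units * price_per_unit > build_monthly:
            return month
    return None  # buying stays cheaper across the observed window


# Hypothetical inputs: 60,000 to build, amortized over 24 months, 1,500/month to run.
build_monthly = monthly_build_cost(build_cost=60_000, amortize_months=24, run_cost=1_500)
usage = [50_000, 120_000, 180_000, 250_000]  # billable requests per month

print(build_monthly)  # 4000.0
print(first_month_buy_exceeds_build(usage, price_per_unit=0.02, build_monthly=build_monthly))  # 4
```

With these inputs the vendor bill runs 1,000, 2,400, 3,600, then 5,000 per month, so it crosses the 4,000 build line in month four. If the function returns `None`, the hypothetical scale never arrived and buying is still the right call.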

The Honest Default

For most product teams in the region, the right answer is: buy more than you think, build at the layer that's uniquely yours, and abstract every external dependency behind an interface you control.

The goal isn't to own more of the stack. The goal is to own the parts that compound in your favor — and move fast everywhere else.

If you're mid-decision and the conversation has stalled on "build vs buy" without anyone having mapped it to differentiation first, that's the sign to stop and reframe. The months you'll save are sitting in that reframe.
