Build vs Buy: When an Operator Should In-House Dip Interpretation Instead of Paying a Service Company

Every subsurface organisation with image logs eventually faces the same line item: dip interpretation. Someone has to unwrap the imagery from two different microresistivity imaging tools, pick the sinusoids, fit the curves, and hand back depth, dip, and azimuth for every fracture and bedding plane in the well. For decades the default answer was to pay a service company by the well or by the interval, treat interpretation as a variable cost, and never think about it again. That default is now worth re-examining — not for ideological reasons, but because the unit economics have moved. When an in-house model picks a metre of image log in roughly 30 seconds, runs 5x faster than a manual interpreter, and clears accuracy bars a petrophysicist will sign off on, the build-vs-buy calculation stops being obvious.

This piece is for the CTO, the subsurface manager, or the head of digital who has to make that call. The argument, drawn from a roughly twenty-month engagement with a mid-sized Middle East carbonate operator we partnered with, is narrow and specific: the gate that decides build-vs-buy is not model novelty. The architecture is, frankly, a solved problem. The gate is data access and MLOps maturity. Get those two right and the maths favours building; get either wrong and you should keep writing the cheque.

The economics that actually changed

Start with the cost the service company is replacing. Manual sinusoid picking is slow, expert-bound, and serial. The classical computer-vision alternative is not much better: a path-morphology approach of the kind widely cited in the literature runs at roughly 4 to 5 minutes per metre and is prone to false positives in vuggy, fractured carbonate. Either way, a single deep well with hundreds of metres of imaged interval is days of a senior interpreter's time, and that interpreter is the scarcest resource in the building.

The in-house model we built reaches a different operating point. The fracture-and-bedding picker — internally AutoFrac — interprets at about 30 seconds per metre, a 5x speed-up over manual picking. Its sibling vug-detection tool, AutoVug, posts the same 5x improvement. And the well-to-well correlation layer — W2W — lifted interpretation productivity by 60% and interpretation accuracy by 75%, hitting 95% target-location precision and 90% stratigraphic success on the operator's own wells. Those last four numbers are the ones that change a CFO's mind, because they are not "the model is fast" — they are "the model is fast and the geoscientist trusts the output enough to redeploy their hours upmarket."
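To make that operating point concrete, the arithmetic can be sketched as follows. The 30 seconds per metre and the 5x factor come from the engagement; the 300 m interval is a hypothetical well chosen purely for illustration, and the manual rate is only what the 5x factor implies, not a measured figure.

```python
# Back-of-envelope interpretation time for one imaged interval.
MODEL_SECONDS_PER_METRE = 30   # reported picker rate
SPEEDUP_OVER_MANUAL = 5        # reported speed-up over manual picking


def interpretation_hours(interval_m: float, seconds_per_metre: float) -> float:
    """Hours needed to interpret an imaged interval at a given rate."""
    return interval_m * seconds_per_metre / 3600


# Manual rate implied by the 5x factor (an inference, not a measurement).
implied_manual_s_per_m = MODEL_SECONDS_PER_METRE * SPEEDUP_OVER_MANUAL

interval_m = 300  # hypothetical imaged interval, for illustration only
model_hours = interpretation_hours(interval_m, MODEL_SECONDS_PER_METRE)
manual_hours = interpretation_hours(interval_m, implied_manual_s_per_m)

print(f"model: {model_hours:.1f} h, manual (implied): {manual_hours:.1f} h")
print(f"interpreter hours freed: {manual_hours - model_hours:.1f}")
```

The last line is the figure that matters for the capacity argument: the hours released per well, rather than the fee avoided.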

That is the real prize. Build-vs-buy is usually framed as cost avoidance — stop paying the per-well fee. The larger value is capacity: every hour a senior interpreter does not spend tracing sinusoids is an hour spent on integration, uncertainty, and decisions only a human should make. The model is not there to replace the geoscientist; it is there to move them up the value chain.

Why model novelty is not the gate

Here is the uncomfortable truth for anyone hoping to win on algorithmic cleverness: the architecture is not the moat. The picker is a Detection Transformer adapted to set prediction — each fracture is a sinusoid parameterised by depth, dip, and azimuth, the decoder emits a fixed set of object queries, and a Hungarian bipartite matching loss assigns predictions to ground-truth picks with no anchors and no non-maximum suppression. The vug detector is a deterministic CV pipeline — adaptive thresholding, contour extraction, then area-and-circularity gating to suppress false positives. None of this is secret. Both are buildable by any competent applied-AI team from the public literature.
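As a rough illustration of how little magic is involved, the matching step can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the engagement's implementation: the cost weights, the borehole radius, and the sinusoid sign convention are all hypothetical, and an exhaustive search stands in for the Hungarian algorithm (it finds the same optimum on small sets; in practice a solver such as SciPy's `linear_sum_assignment` would typically be used).

```python
import math
from itertools import permutations

# A pick is (depth_m, dip_deg, azimuth_deg).


def trace_depth(pick, alpha_deg, radius_m):
    """Depth of the sinusoid at borehole azimuth alpha, in one common convention:
    amplitude is radius * tan(dip), and the deepest point lies at the dip azimuth."""
    depth, dip, azimuth = pick
    amplitude = radius_m * math.tan(math.radians(dip))
    return depth + amplitude * math.cos(math.radians(alpha_deg - azimuth))


def azimuth_gap(a_deg, b_deg):
    """Smallest absolute difference between two azimuths, in degrees."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)


def pair_cost(pred, truth):
    """Hypothetical matching cost; the three scale factors are assumptions."""
    return (abs(pred[0] - truth[0]) / 0.5
            + abs(pred[1] - truth[1]) / 90
            + azimuth_gap(pred[2], truth[2]) / 180)


def match(preds, truths):
    """One distinct query index per ground-truth pick, minimising total cost.
    Queries left unmatched are supervised as 'no object'."""
    return min(
        permutations(range(len(preds)), len(truths)),
        key=lambda idx: sum(pair_cost(preds[i], t) for i, t in zip(idx, truths)),
    )


# Four object queries, two true picks (all values invented for illustration).
preds = [(1201.9, 12, 350), (1200.1, 61, 95), (1205.0, 30, 200), (1202.0, 10, 5)]
truths = [(1200.0, 60, 90), (1202.0, 11, 2)]
print(match(preds, truths))
```

In this toy case queries 0 and 3 are near-duplicates of the same fracture. Because the assignment is one-to-one, only one of them is matched and the other is trained toward "no object", which is why no non-maximum suppression step is needed.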

What is not commoditised — what actually separates a working in-house capability from a science-fair demo — is everything around the model. And that is exactly where the build-vs-buy decision lives.

[Interactive figure: build-vs-buy posture by operator tier in 2026. The split is sorted by tier (NOCs: build; Western IOCs: partner; mid-tier independents: buy), and the deciding variable is the depth of the proprietary subsurface corpus an operator owns: the deeper the owned corpus (the sourced NOC band runs from ADNOC's 50+ years to Aramco's 90 years), the further toward build a tier sits. The corpus-depth axis, the gate thresholds and the IOC / independent marker positions are illustrative.]

Gate one: data access

A set-prediction model is data-hungry by construction. The supervision per query is sparse, the matching is global, and the model has to discover the number of objects in each patch on its own, so it needs to see enough geology to generalise. In our engagement that dependence was brutally visible. Sweeping the number of training wells, classification error fell from 93.1% at 3 wells, to 18.4% at 6, to 1.06% at 9. On the full 14-well dataset, the fractures-only model landed at a 2.54% classification error. The curve is steep and non-linear: the difference between a useless model and a deployable one was a handful of wells.

But raw well count is only half the story, and this is the part that decides build-vs-buy. Those 14 wells shared one depositional environment — a single fractured carbonate system, imaged with two consistent microresistivity imaging tools. That homogeneity is what made 14 wells enough. A model trained on one carbonate play does not transfer for free to a clastic field two basins over; it sits on an out-of-distribution cliff the moment the geology changes. So the data-access question is not "do you have wells?" It is: do you own a deep, labelled, single-environment corpus of image logs with raw dip picks — and can your team get at it without a procurement cycle for every file?

This is where the build case is strongest for a national oil company or a focused independent operating a coherent acreage, and weakest for a diversified operator with a scatter of one-off wells across unrelated plays. If you own the corpus and the depositional environment is coherent, you have the asset a service company can never replicate: data they are not allowed to see. If you do not, no amount of model engineering closes the gap, and you should keep buying.

There is a labelling multiplier worth naming. Through overlapping image patches and aggressive augmentation, the operator's roughly 900 raw image–ground-truth pairs were grown into a training set of more than 55,000 — about a 65x expansion. That DataOps engineering is what let a 14-well corpus behave like a far larger one. Build-vs-buy is not "how many wells do you have"; it is "how many wells, times how good is your data pipeline at extracting supervision from them."
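The mechanics behind that kind of multiplier can be sketched as follows. This is an illustrative sketch only: the window length, stride, and number of azimuth shifts are hypothetical, and circularly rolling the unwrapped image in azimuth is one plausible augmentation, not necessarily the one used in the engagement.

```python
def window_starts(n_rows, window, stride):
    """Start rows of every full depth window that fits in an image of n_rows."""
    if n_rows < window:
        return []
    return list(range(0, n_rows - window + 1, stride))


def roll_azimuth(patch, shift_cols):
    """Circularly shift the columns of an unwrapped borehole image.
    A feature at column c moves to column (c + shift_cols) mod n_cols."""
    k = shift_cols % len(patch[0])
    if k == 0:
        return [row[:] for row in patch]
    return [row[-k:] + row[:-k] for row in patch]


def shifted_azimuth(azimuth_deg, shift_cols, n_cols):
    """Azimuth label that stays consistent with a rolled image."""
    return (azimuth_deg + 360 * shift_cols / n_cols) % 360


# Hypothetical numbers: a 1000-row image cut into 200-row patches.
non_overlapping = len(window_starts(1000, 200, stride=200))
overlapping = len(window_starts(1000, 200, stride=25))
azimuth_shifts = 4  # hypothetical number of rolled copies per patch

print(non_overlapping, overlapping, overlapping * azimuth_shifts)
```

The point of the sketch is that the multiplier is a product of pipeline choices (stride, augmentations, label bookkeeping), and every one of them has to keep the dip picks aligned with the transformed image.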

Gate two: MLOps maturity

The second gate is the one most build cases underestimate. A model that picks fractures in a notebook is a science project. A model that picks fractures every day, on new wells, with monitored drift, versioned weights, reproducible training runs, and a path back to a human when confidence is low — that is a product, and it is a different discipline entirely.

The in-house capability in our engagement ran on a real production stack, not a laptop: on-premise multi-GPU infrastructure, an experiment-tracking and model-versioning system, governed data storage, and a custom MLOps orchestration layer to move a model from training run to served prediction and to catch it when its inputs drift. Even the tool maturity was tracked like a product — the picker, the vug detector, and the correlation layer were each versioned and shipped against defined service-level deliverables, not handed over as a one-off model file.
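Two of those production behaviours, catching input drift and routing low-confidence picks back to a human, can be sketched in plain Python. The source does not say how either was implemented in the engagement, so everything here is an assumption: the population stability index (PSI) is swapped in as one widely used drift statistic, and the 0.8 confidence threshold and the `confidence` field name are hypothetical.

```python
import math


def psi(reference, incoming, n_bins=10, eps=1e-6):
    """Population stability index of one input feature:
    sum over bins of (new_share - ref_share) * ln(new_share / ref_share)."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            i = int((v - lo) / width)
            counts[min(max(i, 0), n_bins - 1)] += 1  # clip out-of-range values
        return [max(c / len(values), eps) for c in counts]

    ref, new = shares(reference), shares(incoming)
    return sum((n - r) * math.log(n / r) for r, n in zip(ref, new))


def route(picks, min_confidence=0.8):
    """Split picks into auto-accepted and human-review queues."""
    auto = [p for p in picks if p["confidence"] >= min_confidence]
    review = [p for p in picks if p["confidence"] < min_confidence]
    return auto, review


# Synthetic feature values, invented for illustration.
training_feature = [i / 100 for i in range(100)]
new_well_feature = [0.5 + i / 200 for i in range(100)]
print(f"PSI vs training data: {psi(training_feature, new_well_feature):.2f}")

auto, review = route([{"depth_m": 1200.0, "confidence": 0.95},
                      {"depth_m": 1202.0, "confidence": 0.41}])
print(len(auto), "auto-accepted;", len(review), "sent to an interpreter")
```

A common rule of thumb treats a PSI above roughly 0.25 as a significant shift; whatever statistic and threshold are chosen, the operational point is that someone owns the alert and the review queue.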

This is the capability a service company has and most operators do not. When you pay for interpretation, you are also paying — invisibly — for their pipeline, their drift monitoring, their reproducibility, their on-call. If you build, you inherit all of it. An operator with a mature data-and-ML platform absorbs that cost cheaply because the muscle already exists. An operator without one is signing up to stand up an MLOps function before it sees a single interpreted well — and that is frequently the hidden line item that sinks an in-house business case that looked great on the model accuracy alone.

A tiered decision, not a binary one

Put the two gates together and the answer is rarely "everyone should build." It is tiered. An operator that owns a deep, coherent, single-environment image-log corpus and runs a mature ML platform — typically a national oil company or a focused independent on its core acreage — is past the gate; building is a capacity and sovereignty win, and the per-well service fee becomes margin reclaimed. An operator with the data but not the platform should build the data asset now and partner for the platform until the MLOps muscle exists. An operator with neither — a diversified player with scattered wells and no ML function — should keep buying, because both gates are shut and model novelty will not open them.
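The tiers above reduce to a small decision table, sketched here for clarity. The function and argument names are hypothetical; the logic simply restates the two gates, with the data gate checked first because a platform without a coherent corpus still lands on "buy".

```python
def recommend(owns_coherent_corpus: bool, has_ml_platform: bool) -> str:
    """Map the two gates (data access, MLOps maturity) to a posture."""
    if owns_coherent_corpus and has_ml_platform:
        return "build"
    if owns_coherent_corpus:
        return "build the data asset, partner for the platform"
    # Without a deep, single-environment corpus the data gate is shut,
    # regardless of platform maturity.
    return "buy"


for corpus in (True, False):
    for platform in (True, False):
        print(f"corpus={corpus!s:5} platform={platform!s:5} -> "
              f"{recommend(corpus, platform)}")
```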

The investment scales with ambition. In this engagement the platform was sized across tiers from roughly USD 250–350K for a focused capability, through USD 650–800K, up to USD 1.5–4M for a full sovereign AI platform spanning fractures, vugs, and well-to-well correlation. The right tier is not the most expensive one; it is the one matched to how much of your interpretation backlog sits in a single depositional environment you control.

The two-question test

Answer "yes" to both and build; if either is "no," keep buying until it turns into a "yes." Data: do we own a deep, labelled corpus of image logs in a single, coherent depositional environment, and can our team reach it without friction? Platform: do we have — or can we cheaply stand up — the MLOps to train, serve, monitor, and version a model in production, with a human-in-the-loop fallback?

Model novelty is not on that list, and that is the whole point. We have seen this pattern hold across operators in the Middle East and the United States: the teams that win at in-house interpretation are not the ones with the cleverest architecture. They are the ones who owned the data and had the engineering discipline to put a model into production and keep it there.

Key takeaways

The economics moved: an in-house picker that runs at ~30 s/m and 5x faster than manual, with a correlation layer lifting productivity +60% and interpretation accuracy +75% (95% target-location precision, 90% stratigraphic success), changes the build-vs-buy maths against paid service-company dip interpretation.
Model novelty is NOT the gate. The Detection-Transformer picker and the CV vug pipeline are buildable from public literature by any competent applied-AI team. The moat is everything around the model.
Gate one is data access: classification error fell from 93.1% (3 wells) to 18.4% (6 wells) to 1.06% (9 wells), and the fractures-only model trained on all 14 wells landed at 2.54%, but only because all 14 shared ONE depositional environment. Own a deep, coherent, single-environment corpus with raw picks, plus a DataOps pipeline like the one that grew ~900 pairs into 55,000+ (~65x), or keep buying.
Gate two is MLOps maturity: production GPU infra, experiment tracking and model versioning, governed storage, drift monitoring, and a custom orchestration layer with human-in-the-loop fallback. A service company already has this; building means inheriting it.
The decision is tiered (~USD 250–350K to 1.5–4M), not binary. NOCs and focused independents with the data and the platform should build; operators with data but no platform should build the data asset and partner for the platform; diversified players with neither should keep buying.
