Classical CV vs Deep Learning for Carbonate Vug Quantification

There is a reflex in applied AI that says the newest architecture is always the right answer. Spend enough time shipping models into upstream workflows and you learn to distrust that reflex. Over a roughly twenty-month engagement with a mid-sized Middle East carbonate operator, we built a Detection Transformer for fractures and bedding planes: a genuinely deep-learning system, set-prediction loss and all. For the other reservoir-quality feature on the same image logs (vugs, the millimetre-to-centimetre dissolution cavities that carry secondary porosity) we deliberately did not reach for deep learning. We built a classical computer-vision pipeline instead. This piece is about why, and about the two criteria that decided it: speed and interpretability.

What a vug actually demands of a model

A fracture on an image log is a sinusoid — a single global curve with three parameters (depth, dip, azimuth). That is a set-prediction problem, and it is exactly the shape of problem a DETR-style model is built for. A vug is the opposite kind of object. It is a small, irregular, locally-defined blob of low resistivity — dark on a water-based-mud high-resolution borehole image log because the conductive vug-filling fluid contrasts with the resistive carbonate matrix. There is no global geometry to regress. What a petrophysicist wants out of a vug is not "is this a vug, yes or no" but a measurement: the area of each cavity, how round it is, how the population is distributed across a depth interval, and whether its resistivity signature is physically consistent with secondary porosity.
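To make the contrast concrete, here is a minimal sketch of why a planar fracture reduces to three numbers. It assumes a cylindrical borehole of known radius, depth increasing downward, and dip measured relative to the plane perpendicular to the borehole axis; the function name, the radius, and the example values are illustrative assumptions, not figures from the project.

```python
import math

def fracture_trace_depth(theta_deg, z0_m, dip_deg, dip_azimuth_deg, radius_m):
    """Depth (m, increasing downward) where a planar fracture crosses the
    borehole wall at angular position theta_deg on the unrolled image.

    Assumed convention: the deepest point of the trace lies at the dip azimuth.
    """
    amplitude_m = radius_m * math.tan(math.radians(dip_deg))
    return z0_m + amplitude_m * math.cos(math.radians(theta_deg - dip_azimuth_deg))

# Hypothetical example: 0.108 m hole radius, plane dipping 45 degrees toward 90 degrees.
trace = [fracture_trace_depth(t, 1000.0, 45.0, 90.0, 0.108) for t in range(0, 360, 30)]
peak_to_trough_m = max(trace) - min(trace)  # equals 2 * radius * tan(dip)
```

Every pixel on the trace is determined by depth, dip, and azimuth once the radius is known. A vug offers no such closed form, which is why it has to be measured contour by contour.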

That reframes the engineering target. The deliverable is not a detection score — it is a per-vug statistics table, computed every 0.1 m down the well: count, total area, mean area, standard deviation of area, plus area, circularity, and azimuth spectra. A model that returns a segmentation mask but cannot hand you those numbers, in physical units, has not solved the petrophysicist's problem. It has solved a benchmark.
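As a rough illustration of that deliverable, the sketch below bins a list of detected vugs into 0.1 m intervals and computes the count, total area, mean area, and standard deviation of area for each interval. The input layout (depth, area pairs), the function name, and the choice of population rather than sample standard deviation are assumptions made for the example; the circularity and azimuth spectra are omitted for brevity.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def interval_stats(vugs, step_m=0.1):
    """vugs: iterable of (depth_m, area_cm2). Returns {interval_top_m: stats}."""
    bins = defaultdict(list)
    for depth_m, area_cm2 in vugs:
        # Small epsilon guards against float error at exact interval boundaries.
        index = math.floor(depth_m / step_m + 1e-9)
        bins[index].append(area_cm2)

    table = {}
    for index in sorted(bins):
        areas = bins[index]
        table[round(index * step_m, 6)] = {
            "count": len(areas),
            "total_area_cm2": sum(areas),
            "mean_area_cm2": mean(areas),
            "std_area_cm2": pstdev(areas),  # population std; 0.0 for a single vug
        }
    return table

# Hypothetical detections: (depth in m, area in cm^2).
example = [(1000.12, 2.0), (1000.17, 4.0), (1000.31, 1.5)]
stats = interval_stats(example)
```

The point of the table is that every row is in physical units and can be traced back to the individual contours that produced it.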

The classical pipeline, as an engineering object

It is tempting to file "classical CV" under "not real AI", but the vug pipeline is a precisely engineered algorithm with a dozen tunable stages, each doing identifiable work. It runs as a deterministic sequence:

Local-variation enhancement by top-K mode subtraction: strip the dominant intensity modes so the faint dark vugs stand out against a normalised background.
Gaussian-modulated adaptive thresholding: a local, spatially varying threshold (block size 31 by default, referenced to the Gaussian-weighted local mean) that binarises the image without a single global cutoff that would wash out low-contrast cavities.
Contour extraction via Suzuki-Abe border following: turn the binary blobs into closed contours.
Geometric refinement: compute each contour's area by the shoelace formula (a discrete form of Green's theorem) and its circularity against its minimum enclosing circle, then keep only contours whose circularity falls inside the 0.3–1.0 gate and whose area is physically plausible.
Aggregation: de-duplicate overlapping detections by centroid distance and an IoU merge at a 20% threshold.
Physics-aware filtering: a Laplacian-variance contrast filter and a mean-intensity check (inner circle versus enlarged circle), sign-aware for water-based mud, to reject false positives that survive the geometric tests.
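The aggregation step in the list above can be sketched as follows. This is a minimal illustration, not the production code: it assumes each detection is a set of (row, col) pixels, that two detections merge only when their centroids are close and their IoU reaches the 20% threshold, and that a single greedy pass is enough. The `max_centroid_dist` value and the way the two criteria are combined are assumptions.

```python
import math

def block(r0, c0, h, w):
    """Helper for the example: a filled rectangle of pixels."""
    return {(r, c) for r in range(r0, r0 + h) for c in range(c0, c0 + w)}

def centroid(pixels):
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)

def iou(a, b):
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def merge_duplicates(detections, iou_threshold=0.20, max_centroid_dist=5.0):
    """Greedy single pass: fold each detection into the first kept one it overlaps."""
    merged = []
    for det in detections:
        det = set(det)
        for i, kept in enumerate(merged):
            (r1, c1), (r2, c2) = centroid(det), centroid(kept)
            close = math.hypot(r1 - r2, c1 - c2) <= max_centroid_dist  # hypothetical cutoff
            if close and iou(det, kept) >= iou_threshold:
                merged[i] = kept | det
                break
        else:
            merged.append(det)
    return merged

# Two overlapping detections of one cavity plus one distant detection.
catalog = merge_duplicates([block(0, 0, 4, 4), block(1, 1, 4, 4), block(20, 20, 4, 4)])
```

In the example the first two blocks share 9 of 23 pixels (IoU of about 0.39), so they collapse into one entry, while the distant block stays separate.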

Every constant in that list is a knob a domain engineer can read, defend, and re-tune. When we moved from the first vertical well to a horizontal well about 10 km away, logged with a compact microresistivity tool, and to a second vertical well about 12 km away, we did not retrain anything. We adjusted two filter thresholds (the Laplacian and mean-based values) and, for the noisier tool, the adaptive-threshold block size. One pipeline, three wells, two imaging tools, two well orientations. That is portability you can audit.
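To show what those two re-tuned thresholds act on, here is a minimal sketch of a Laplacian-variance filter and an inner-versus-enlarged-circle mean check on a small image patch. It assumes a water-based-mud image where vugs are darker than the matrix, so the surrounding ring should be brighter than the interior. The function names, the enlargement factor of 1.5, and the threshold values in the example are hypothetical, not the project's settings.

```python
import math
from statistics import mean, pvariance

def laplacian_variance(img):
    """Variance of the 4-neighbour discrete Laplacian over interior pixels."""
    rows, cols = len(img), len(img[0])
    lap = [
        img[r - 1][c] + img[r + 1][c] + img[r][c - 1] + img[r][c + 1] - 4 * img[r][c]
        for r in range(1, rows - 1)
        for c in range(1, cols - 1)
    ]
    return pvariance(lap)

def ring_contrast(img, centre, radius, enlarge=1.5):
    """Mean of the ring between radius and enlarge*radius, minus the inner mean.

    Positive when the interior is darker than its surroundings (water-based mud).
    """
    cr, cc = centre
    inner, ring = [], []
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            d = math.hypot(r - cr, c - cc)
            if d <= radius:
                inner.append(value)
            elif d <= enlarge * radius:
                ring.append(value)
    if not inner or not ring:
        return 0.0
    return mean(ring) - mean(inner)

def passes_physics_filter(img, centre, radius, lap_var_min, contrast_min):
    return (laplacian_variance(img) >= lap_var_min
            and ring_contrast(img, centre, radius) >= contrast_min)

def make_patch(size=11, background=200, vug_value=None, radius=2.0):
    """Synthetic patch: uniform background with an optional dark disc at the centre."""
    mid = size // 2
    return [[vug_value if vug_value is not None
             and math.hypot(r - mid, c - mid) <= radius else background
             for c in range(size)] for r in range(size)]

vug_patch = make_patch(vug_value=50)
flat_patch = make_patch()
```

Moving to a different tool or mud response would, under this sketch, mean changing only `lap_var_min` and `contrast_min`, which is the kind of adjustment described above.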

Speed: the metric that decides what gets used

Interpretation throughput is not a vanity number in this domain — it is the difference between a method that gets run on every well and one that gets run on a demo well for a conference slide. The classical pipeline processes image logs at roughly 15 seconds per metre. The morphological path-opening baseline it replaced (Li et al., 2019) ran at about 5 minutes per metre. That is a 20x speedup, and it lands at the exact point in the workflow — the petrophysicist's desk, on a single well that is often more than 200 m long — where a 20x factor turns an overnight batch into an interactive pass.
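The arithmetic behind that claim is short enough to write out. The sketch below uses only the figures quoted above (about 15 s/m, about 5 min/m, and a 200 m interval as a representative length); the function name is illustrative.

```python
def interpretation_time_s(length_m, seconds_per_metre):
    """Total processing time for an interval at a fixed per-metre rate."""
    return length_m * seconds_per_metre

classical_s = interpretation_time_s(200, 15)       # classical pipeline
baseline_s = interpretation_time_s(200, 5 * 60)    # path-opening baseline
speedup = baseline_s / classical_s

classical_minutes = classical_s / 60   # 50 minutes
baseline_hours = baseline_s / 3600     # roughly 16.7 hours
```

At these rates a 200 m interval takes under an hour with the classical pipeline and most of a day with the baseline, which is the gap between an interactive pass and an overnight batch.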

A modern instance-segmentation network could of course be fast at inference. But "fast at inference" hides the real cost ledger: you first need labelled vug masks across enough wells to train it, and on this programme that label set did not exist — the entire point of the CV pipeline was to produce the first credible vug masks. The classical algorithm is fast on day one, on a single well, with zero training data. For a team that has to deliver before it can assemble a training corpus, that asymmetry is decisive.

Interpretability: per-vug statistics a DL mask does not give you

Speed gets the pipeline run; interpretability gets its output trusted. Because every vug is an explicit contour with a measured geometry, the pipeline emits per-vug numbers a black-box mask cannot. Across this field, individual vug areas ranged from about 1 to 12 cm², concentrated in the 1–3.5 cm² band (right-skewed, as carbonate dissolution populations tend to be), and circularity spanned 0.28 to 0.85 with a peak at 0.45–0.7 — a population dominated by semi-circular cavities, not perfect circles and not linear artifacts.

The single most load-bearing decision in the whole pipeline is the circularity gate. Set it at 0.3–1.0 and you reject linear fractures and wellbore-parallel artifacts — which are also dark and conductive — while keeping the genuinely rounded dissolution vugs. It is one geometric criterion that does the discriminative work an entire trained classifier would otherwise have to learn. The instrument below makes that gate tactile: drag it across the circularity axis and watch contours fall in and out of the vug catalog, including the three expert-missed vugs we recovered across the vertical and horizontal wells.

The pipeline's single geometric decision, made tactile. Each contour the detector traces sits on a circularity axis running from 0.28 (elongated, dissolution-aligned) to 0.85 (near-circular). A circularity gate of 0.3–1.0 rejects linear fractures and wellbore-parallel artifacts while keeping true vugs. Drag the gate: contours to its left fall out as rejected fractures (grey), and contours to its right enter the vug catalog (teal). The three orange contours are vugs that meet every geometric criterion the expert applied, yet were missed by manual picking across two of the validation wells. The gate bounds, the circularity span, the 1–12 cm² area range, and the three recovered intervals come from the article; each contour's exact coordinate is schematic (a plausible population, not a published catalog).
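For readers who want to see the gate as code, here is a minimal sketch of the geometric refinement: shoelace area, a minimum enclosing circle, and the 0.3–1.0 circularity gate. It assumes circularity is defined as contour area divided by the area of the minimum enclosing circle, which is one reasonable reading of "circularity against its minimum enclosing circle" but is not confirmed by the source. The brute-force circle search is chosen for clarity and only suits small contours; a library routine such as OpenCV's `cv2.minEnclosingCircle` would be the usual choice in practice.

```python
import math

def shoelace_area(pts):
    """Polygon area from ordered (x, y) vertices via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def _circle_two(a, b):
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)

def _circle_three(a, b, c):
    """Circumcircle of three points, or None if they are collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy, math.dist((ux, uy), a))

def min_enclosing_radius(pts, eps=1e-9):
    """Brute force: smallest 2-point or 3-point circle containing every vertex."""
    candidates = []
    n = len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            candidates.append(_circle_two(pts[i], pts[j]))
            for k in range(j + 1, n):
                circle = _circle_three(pts[i], pts[j], pts[k])
                if circle is not None:
                    candidates.append(circle)
    enclosing = [c for c in candidates
                 if all(math.dist(c[:2], p) <= c[2] + eps for p in pts)]
    return min(c[2] for c in enclosing)

def circularity(pts):
    """Assumed definition: contour area / area of the minimum enclosing circle."""
    radius = min_enclosing_radius(pts)
    return shoelace_area(pts) / (math.pi * radius ** 2)

def passes_gate(pts, low=0.3, high=1.0):
    return low <= circularity(pts) <= high

# Hypothetical contours in cm: a compact cavity and a fracture-like sliver.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
sliver = [(0.0, 0.0), (10.0, 0.0), (10.0, 0.5), (0.0, 0.5)]
```

Under this definition the square scores 2/π (about 0.64) and passes, while the 10 cm by 0.5 cm sliver scores about 0.06 and is rejected, which is the behaviour the gate is meant to deliver for elongated, fracture-like shapes.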

That recovery is the punchline on accuracy. Validated against the expert's interpretation-software ground truth, the pipeline reached roughly 85% agreement — and where it disagreed, it was frequently right: it caught vugs the manual interpreter had missed, across three discrete depth intervals. A final quality check confirmed the recovered cavities carried the lower-resistivity signature secondary porosity should, ruling out noise. Crucially, none of this is a confidence score you have to take on faith. Every claim traces to a measured contour you can pull up, overlay on the static image, and argue about — the way petrophysical interpretation has always worked.

So where does deep learning win?

Not nowhere — and not never. The honest framing is a division of labour by feature geometry and by data availability.

For fractures, deep learning already won on the same dataset. Fractures are global parametric curves, and the Detection Transformer's set-prediction objective fits them natively. But it earned that win at a price the vug problem could not pay: it is data-hungry. Sweeping the number of training wells for the fracture model showed classification error collapsing from ~93% at 3 wells to ~1% by 9 wells, so the model only becomes deployable once it has seen enough geology. We had 14 fracture wells. We had labelled vug masks on none.

For vugs, that inverts the calculus. With no training labels, a data-hungry model is a non-starter; a deterministic algorithm that works on the first well is the only thing that ships. So the right architecture is sequential, not competitive: the classical CV pipeline runs now, delivers interpretable per-vug statistics now, and — because its output is a clean per-vug mask with measured geometry — it manufactures exactly the labelled corpus a future supervised or semi-supervised segmentation model will need. The CV pipeline is not the deep-learning method's rival. It is its data-generation front end.

This is the pattern we keep seeing across the subsurface engagements we run, with operators across the Middle East and the United States: deep learning is the bridge to a more automated future, but the first span is almost always built from classical, interpretable, label-free computer vision. Pick the tool that matches the object's geometry and your data reality — not the one with the newest acronym.

Key takeaways

A vug is a small, locally-defined, low-resistivity blob, not a global parametric curve — so the deliverable is a per-vug measurement table (area, circularity, azimuth spectra every 0.1 m), not a detection score. That target favours an explicit, interpretable CV pipeline over a black-box mask.
The classical pipeline is a precisely engineered algorithm (top-K mode subtraction, Gaussian-modulated adaptive thresholding, Suzuki-Abe contours, shoelace area, minimum-enclosing-circle circularity, centroid+IoU dedup, and Laplacian/mean physics-aware filtering) with every constant auditable. Porting across three wells, two imaging tools and two orientations needed only the two filter thresholds re-tuned, plus the adaptive-threshold block size for the noisier tool, and no retraining.
Speed decided adoption: ~15 s/m vs ~5 min/m for the path-opening baseline — a 20x edge that turns vug interpretation interactive on 200 m+ wells, with zero training data required on day one.
Interpretability built trust: vug areas 1–12 cm² (concentrated 1–3.5 cm²), circularity 0.28–0.85 (peak 0.45–0.7), ~85% agreement with the expert's interpretation-software ground truth — and it caught vugs the expert missed across three discrete intervals, each one a measurable contour rather than a confidence number.
Deep learning is the bridge, not the replacement: fractures (global curves, 14 wells) went to a Detection Transformer; vugs (no labelled masks) went to classical CV, whose per-vug output becomes the training corpus for a future supervised vug-segmentation model. Match the tool to the object's geometry and your data reality.

References

[1] Li, X., et al. Path-morphology baseline for borehole-image feature extraction (2019). The ~5 min/m computational baseline against which the classical vug pipeline's ~15 s/m throughput is measured.

[2] Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. End-to-End Object Detection with Transformers (DETR). ECCV (2020). The set-prediction formulation underpinning the companion fracture model. https://arxiv.org/abs/2005.12872
