Hold a piece of vuggy carbonate up to the light and it looks like frozen sea-foam: a dense rock shot through with holes, some pin-sized, some big enough to swallow a fingertip. Those holes are vugs — dissolution cavities left behind when slightly acidic groundwater ate away at the grains and cement of a limestone or dolomite long after the rock was deposited. For a reservoir engineer they are not a curiosity; they are where the oil lives and the path it flows through. For a machine-learning engineer arriving in carbonates from a conventional computer-vision background, vugs are the thing your model has to find, count, and measure — and to do that well you first have to understand why a blob on a borehole image is worth a pipeline of its own. We will start from the geology, move to the image, and finish at the algorithm, because in our work with a mid-sized Middle East carbonate operator the three turned out to be one problem viewed from three angles.
Primary versus secondary porosity
Porosity is the fraction of the rock that is empty space available to hold fluid, and the subtlety that trips up newcomers is that not all of it is created the same way. Primary porosity is the void space the rock is born with — the gaps between sand grains, the chambers of fossil shells — and it is broadly depositional and reasonably predictable. Secondary porosity is everything diagenesis creates afterwards, and dissolution is its headline act: acidic fluids percolate through the rock over geological time, dissolving soluble minerals and leaving cavities where solid grains used to be. In carbonates, far more chemically reactive than quartz sandstones, this dissolution-driven porosity can dominate the reservoir story. Vugs are its most visible signature — discrete dissolution pores, often roughly equant, scattered through an otherwise tight matrix.
This is why carbonates are notoriously harder to characterise than sandstones, and the difficulty is not only about storage. Vuggy porosity adds capacity, but the more consequential question is connectivity: whether vugs are touching — vug-to-vug connected, sometimes via dissolution-enlarged fractures — or isolated in the matrix decides whether they carry flow or merely inflate a porosity number that never produces. A porosity tool reports an average over its measurement volume, and an average has no idea whether the pore space is one connected vug system or evenly distributed micro-porosity. Two intervals can log identical average porosity and behave nothing alike; the vugs are the difference, and the vugs are exactly what gets averaged away.
The practical consequence for anyone building a model: the geometry of each vug is the signal, not just the total fraction. A geologist reading an image log is implicitly reading a population — how many vugs, how big, how round, how clustered — because those properties hint at the dissolution history and at whether the cavities are likely connected. Collapse everything to one porosity percentage and you have thrown away the very structure that separates a flowing reservoir from a tight one. The job is to preserve the population.
What a vug looks like on an image log
Now to the image, where a data scientist's intuition usually has to be retrained. A borehole image log — from one of two different microresistivity imaging tools — is essentially a resistivity photograph of the wellbore wall, unrolled into a flat strip where each pixel's brightness encodes how resistive that patch of rock is. In water-based mud, the conductive drilling fluid fills any open cavity, so a fluid-filled vug reads as low resistivity — a dark blob against the brighter, more resistive carbonate matrix around it.
That single sentence carries the whole detection problem. A vug, on a high-resolution borehole image log, is a low-resistivity dark blob: a connected cluster of dark pixels, roughly rounded, sitting in a lighter background. The catch — what makes this a real engineering problem rather than a one-line threshold — is that dark pixels are not unique to vugs. Fractures are dark and linear; bed boundaries are dark and laminar; pad gaps and tool artefacts are dark for reasons that have nothing to do with geology. The detector's entire job is to separate blob-shaped dark (a vug) from line-shaped dark (a fracture) from band-shaped dark (a bed) from junk-shaped dark (an artefact). Geometry is what does the separating.
Two image realities shape the engineering before any algorithm runs. The raw resistivity is rescaled to a 0–255 intensity range so the pipeline can treat it as a grayscale image, and coverage is partial — the button pads do not wrap the full borehole, so a strip typically images up to around 80% of the wall and the unmeasured columns must be masked out before anything dark is counted. The resolution is also coarse: a single image pixel corresponds to roughly 3 cm of true depth, so there is an irreducible ±3 cm depth uncertainty baked into every measurement and small vugs live just a few pixels across. You are doing blob detection at the edge of the sensor's resolving power.
From blob to number: count, area, circularity
Here is the conceptual leap that turns geology into supervision. A geologist's vug pick is qualitative — this interval is vuggy — but a model needs the population expressed as numbers, and the right numbers are the ones that describe each blob's geometry. In our work the pipeline reduced every detected vug to a small, physically meaningful feature vector, computed and summarised on a fixed 10 cm (0.1 m) depth grid down the well. Three properties carry most of the signal.
Count — how many discrete vugs per interval — is the crudest and most intuitive measure of dissolution intensity. A 10 cm window with twenty vugs is a different rock from one with two.
Area — the physical size of each cavity, recovered by tracing the blob's contour and applying the shoelace (Green's-theorem) area formula to it. In this Middle East carbonate the detected vug areas spanned roughly 1 to 12 cm², with the bulk of the population concentrated in the 1 to 3.5 cm² range and a clear right-skew — many small vugs, a long tail of large ones. Reported zone by zone, the typical spread tightened to about 1 to 6 cm². That distribution is the reservoir-quality fingerprint: a shift in the area histogram between two zones is a shift in how the rock will store and flow.
Circularity — how round each blob is, on a scale where a perfect circle scores 1.0 and an elongated sliver scores near 0. This is the single most discriminating geometric feature, because it is what physically separates a vug from a fracture. A dissolution vug is roughly equant; a fracture is a long, thin, low-circularity streak. In this field the measured vug circularity spanned 0.28 to 0.85, with a pronounced peak between 0.45 and 0.7 — semi-circular dominant, exactly the equant-but-not-perfect shape dissolution produces. That peak is not an incidental statistic; it is the prior the detector exploits.
The instrument above makes that decision tactile. Every contour the pipeline traces lands somewhere on the circularity axis, and a circularity gate of 0.3 to 1.0 admits genuine vugs while rejecting the linear, low-circularity contours that are really fractures or wellbore-parallel artefacts. Drag the gate: elongated contours fall out while the equant blobs enter the vug catalog — including a handful that an expert's manual pick missed but that meet every geometric criterion. A single threshold separates two geological objects that look identically dark to a naive pixel counter.
Why this is a computer-vision problem, and what the pipeline does
Vug detection is a clean example of a problem where classical computer vision is the right tool, not a deep network: there is no large labelled corpus of per-vug masks to learn from, only a coarse, partially-imaged grayscale strip and a strong geometric prior. So the pipeline is a deterministic, parameterised CV stack that runs in stages. It begins with local-variation enhancement by top-k mode subtraction — the dominant background intensities in a window are estimated and subtracted so dark blobs pop out against their local surroundings rather than a global average, essential when a single image log runs well over 200 m and brightness drifts down the well. Gaussian-modulated adaptive thresholding over a local block (block size 31, retuned to 51 on one of the validation wells) then binarises blob-from-background without a single global cutoff. Contour extraction (Suzuki–Abe border following) traces each connected dark region into an outline, from which area (shoelace) and circularity (against a minimum-enclosing circle) are computed directly. A geometry gate — the circularity gate of 0.3–1.0 plus an area filter — rejects fractures and slivers; a centroid-difference and IoU rule (intersection-over-union threshold around 0.2) merges duplicate contours; and contrast filters (a Laplacian-variance check and a mud-type-aware mean-intensity test) strip the false positives that survive on shape alone. Out the far end comes the per-10 cm statistical summary: count, total and mean area, and the area, circularity, and azimuth spectra.
Validation matters as much as the algorithm. The pipeline was checked against expert picks across three carbonate wells — a vertical well, a horizontal well logged with a compact microresistivity tool roughly 10 km away, and a second vertical well about 12 km from the first — deliberately spanning two imaging tools and two orientations to prove the geometry generalises, with only two or three thresholds retuned per well. Against the experts' own picks the automated vug-area estimate landed within a mean absolute error of about 1.21 cm², and, tellingly, recovered vugs at several depth intervals the manual pick had missed entirely.
The connection to machine learning
If you came here as an ML engineer wondering where the deep learning is, the honest answer is: not yet, and that is the point. This classical CV pipeline is not the end state — it is the ground-truth factory for one. Hand-picking vugs across kilometres of image log to build a supervised training set is exactly the bottleneck that makes a learned vug-segmentation model impractical to start. A deterministic, validated, geometry-aware detector that emits per-vug masks at 0.1 m resolution across whole wells is precisely the labelling engine that makes a downstream semi-supervised or supervised segmentation model feasible — the blobs it traces today become the masks a network trains on tomorrow. Understanding what a vug is — a dark, near-equant dissolution pore at circularity 0.45–0.7, sized 1–12 cm², carrying the secondary porosity that decides whether a carbonate flows — is the prerequisite for every model that will eventually learn to find one. Get the geology and the geometry right first, and the learning has something true to learn from.
Key takeaways
- Vugs are dissolution pores — secondary, diagenetic porosity created when acidic fluids dissolve carbonate grains long after deposition. In carbonates this secondary porosity can dominate the reservoir, and a single averaged porosity log smears away the discrete-cavity structure that decides whether the rock flows.
- On a high-resolution borehole image log in water-based mud, a fluid-filled vug reads as a low-resistivity dark blob against the brighter matrix. The detection problem is separating blob-shaped dark (vug) from line-shaped dark (fracture) and band-shaped dark (bed) — geometry, not brightness, does the separating.
- Quantifying each vug per 10 cm — count, area (1–12 cm², concentrated 1–3.5 cm²), and circularity (0.28–0.85, peak 0.45–0.7) — converts a qualitative 'vuggy' pick into the structured reservoir-quality signal the detector reports. The area histogram and circularity peak are the fingerprint.
- Circularity is the most discriminating feature: a circularity gate of 0.3–1.0 admits equant dissolution vugs while rejecting linear, low-circularity fractures and artefacts. It is the single geometric decision at the centre of the pipeline.
- The detector is a deterministic computer-vision stack — top-k mode subtraction, Gaussian adaptive thresholding (block 31/51), Suzuki–Abe contours, area/circularity gating, IoU deduplication, Laplacian and mean-intensity contrast filters — validated against expert picks across three carbonate wells (two different microresistivity imaging tools, vertical/horizontal) to ~1.21 cm² mean absolute error, recovering vugs experts missed. Its per-vug masks are the ground-truth factory for a future learned vug-segmentation model.