Thin-Structure Segmentation Beyond Well Logs

The story of VeerNet is usually told as a well-log story, and that framing hides the part that actually generalises. EarthScan built VeerNet to lift curves off scanned raster well logs, so the wells are where the numbers live. But the wells were never the hard problem. The hard problem was that the thing the model had to find was almost nothing: two constant curves per log, each about one pixel wide, drawn across a raster in which roughly 97 percent of the pixels are background. Learning to find a thin bright line in a mostly empty field is a shape of problem, not a subject-matter problem, and once you see it that way the tooling stops being about petrophysics and starts being about a whole family of line-art tasks that share the same lopsided arithmetic.

This note is a primer for that transfer, not a report on our engagement. We have described the VeerNet architecture and its results elsewhere; the point here is complementary. We want to say clearly which parts of the curve-segmentation playbook are portable to domains that have nothing to do with subsurface data, which parts are not, and why the answer turns almost entirely on one number: the fraction of pixels that are foreground.

The shape underneath the subject

Take a raster well log, a topographic map, a mechanical drawing, a piping-and-instrumentation schematic, and a screenshot of a line chart, and strip away what each one is about. What remains is startlingly similar: a field of near-empty background with a small amount of thin, high-contrast line-art laid on top. The lines carry the information and the lines are what the model must segment, and in every case they occupy a small single-digit percentage of the pixels at most. The subject matter differs completely. The pixel statistics barely differ at all.

That shared shape is why the same failure modes recur. A naive segmentation loss discovers that predicting background everywhere scores extremely well, because background everywhere is almost right. On our logs, calling every pixel background is roughly 97 percent accurate before the model has learned anything, and a plain cross-entropy objective will happily sit in that basin. The thin foreground is where all the information is and almost none of the pixels are, so the gradient that should teach the model to find it is drowned out by the gradient that rewards leaving it alone. This is not a well-log pathology. It is the signature pathology of every thin-structure domain, and any tool that solves it for one of them is, structurally, solving it for all.
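The do-nothing baseline is easy to reproduce on a toy raster. The sketch below is illustrative only: `synthetic_line_raster` and its dimensions are hypothetical stand-ins rather than our data generator, chosen so that two one-pixel curves occupy about 3 percent of the pixels. It scores an all-background prediction on pixel accuracy and on foreground intersection-over-union.

```python
import numpy as np

def synthetic_line_raster(height=400, width=64, n_curves=2, seed=0):
    """Toy binary mask: n_curves one-pixel-wide wandering lines (1 = foreground)."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((height, width), dtype=np.uint8)
    rows = np.arange(height)
    for k in range(n_curves):
        # Each curve is a random walk in column position, one pixel per row.
        start = (k + 1) * width // (n_curves + 1)
        steps = rng.integers(-1, 2, size=height)  # values in {-1, 0, 1}
        cols = np.clip(start + np.cumsum(steps), 0, width - 1)
        mask[rows, cols] = 1
    return mask

def pixel_accuracy(pred, target):
    """Share of pixels where prediction and target agree."""
    return float((pred == target).mean())

def foreground_iou(pred, target):
    """Intersection-over-union computed on the foreground class only."""
    inter = np.logical_and(pred == 1, target == 1).sum()
    union = np.logical_or(pred == 1, target == 1).sum()
    return float(inter / union) if union else 1.0

target = synthetic_line_raster()
all_background = np.zeros_like(target)  # the "model" that has learned nothing

print("foreground fraction:", target.mean())
print("pixel accuracy     :", pixel_accuracy(all_background, target))
print("foreground IoU     :", foreground_iou(all_background, target))
```

On this toy raster the empty prediction reaches a pixel accuracy of about 0.97 or higher while its foreground IoU is exactly zero, which is the basin described above.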

What carries across, what does not

The parts of the playbook that transfer are the parts that respond to the foreground fraction rather than to the subject. Three of them do most of the work.

The first is the architecture. An encoder-decoder with skip connections, the family Ronneberger, Fischer, and Brox introduced for biomedical images, is a thin-structure architecture before it is a medical one [1]. Its skip connections exist precisely so that fine spatial detail lost in the downsampling path is restored in the upsampling path, which is exactly the detail a one-pixel line lives in. That property has nothing to do with what the pixels depict. It is why the same architecture family that segments cell membranes also segments roads, contours, and our curves.
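A toy illustration of why the skip connection matters for a one-pixel line, under loud assumptions: there are no learned weights here, average pooling stands in for the encoder's downsampling, nearest-neighbour repetition stands in for the decoder's upsampling, and the function names are hypothetical. It is a sketch of the mechanism, not of VeerNet or of U-Net itself.

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling; height and width must be even."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour 2x upsampling."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

# A one-pixel-wide vertical line in a 16x16 raster.
image = np.zeros((16, 16))
image[:, 5] = 1.0

# Encoder path: two downsampling steps. Decoder path: two upsampling steps.
coarse = avg_pool2(avg_pool2(image))
decoded = upsample2(upsample2(coarse))

# Without a skip connection the line comes back smeared and attenuated.
band_width = np.count_nonzero(decoded[0])
print("columns touched after decode:", band_width, "peak value:", decoded.max())

# A skip connection concatenates the full-resolution encoder features as an
# extra channel, so the decoder still sees exactly where the line was.
with_skip = np.stack([decoded, image], axis=0)
print("columns touched in skip channel:", np.count_nonzero(with_skip[1, 0]))
```

The decoded path alone spreads the line over a four-pixel band at a quarter of its original intensity; the skip channel keeps it at one pixel, which is the detail a thin-structure mask depends on.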

The second is the loss. The reason a plain objective fails on thin structure is class imbalance, and the fix is a loss that can be told to care disproportionately about the rare class. The Tversky loss generalises the Dice objective with tunable weights on false positives and false negatives, so you can penalise missing a foreground pixel far more heavily than falsely painting one [2]. On our curves that mattered concretely: our best curve R-squared, 0.9891, came under Tversky loss rather than the unweighted alternatives, and the same reasoning applies unchanged to any domain where the line is rare and the background is cheap. The loss does not know it is looking at a resistivity trace. It knows the positive class is scarce, and that is the only fact it needs.
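A minimal NumPy sketch of the Tversky loss described above, following the index TP / (TP + alpha * FP + beta * FN) from [2]. The function name, the default weights, and the toy masks are illustrative assumptions, not the settings from our runs; with alpha = beta = 0.5 the expression reduces to the Dice objective.

```python
import numpy as np

def tversky_loss(prob, target, alpha=0.3, beta=0.7, eps=1e-7):
    """1 - Tversky index. alpha weights false positives, beta false negatives."""
    prob = np.asarray(prob, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    tp = np.sum(prob * target)            # soft true positives
    fp = np.sum(prob * (1.0 - target))    # soft false positives
    fn = np.sum((1.0 - prob) * target)    # soft false negatives
    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return float(1.0 - index)

# Toy example: 10 foreground pixels out of 100 (illustrative values only).
target = np.zeros(100)
target[:10] = 1

miss = np.zeros(100)
miss[:5] = 1      # finds half the line, misses the other half (5 FN)
extra = np.zeros(100)
extra[:15] = 1    # finds the whole line, paints 5 extra pixels (5 FP)
empty = np.zeros(100)  # predicts background everywhere

for name, pred in [("miss", miss), ("extra", extra), ("empty", empty)]:
    dice = tversky_loss(pred, target, alpha=0.5, beta=0.5)
    tv = tversky_loss(pred, target)  # beta > alpha: misses cost more
    print(f"{name:5s} dice={dice:.3f} tversky={tv:.3f}")
```

Raising beta above alpha makes the prediction that misses half the line more expensive than it is under Dice and the prediction that over-paints cheaper, and the all-background prediction sits at a loss of essentially 1 either way.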

The third is the framing of accuracy itself. On thin structure, pixel accuracy is a vanity metric, since the do-nothing baseline already scores in the high nineties. The honest measures are the ones that ignore the easy background: intersection-over-union on the foreground class and, once the mask is turned back into a usable output, the error of the recovered line. For us that recovered output is a depth-indexed curve, and the honest number is the per-curve mean absolute error: 0.0277 and 0.1241 for the two curves, from the same Tversky run that gave us the 0.9891 R-squared. A map team would recover a contour polyline and measure its geometric error; a charts team would recover a series and measure its value error. The metric changes name across domains but not character: it always measures the thin thing, never the empty field.
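One way to picture "the error of the recovered line" is sketched below. It assumes, purely for illustration, a mask holding a single curve with rows as the depth axis, and it recovers the curve as the mean foreground column per row; `mask_to_curve` and `curve_mae` are hypothetical helpers, not the post-processing used in the engagement, and the error here is in pixel columns rather than in calibrated curve units.

```python
import numpy as np

def mask_to_curve(mask):
    """Per depth row, the mean column of foreground pixels (NaN where none)."""
    mask = np.asarray(mask).astype(bool)
    cols = np.arange(mask.shape[1])
    counts = mask.sum(axis=1)
    sums = (mask * cols).sum(axis=1)
    curve = np.full(mask.shape[0], np.nan)
    hit = counts > 0
    curve[hit] = sums[hit] / counts[hit]
    return curve

def curve_mae(pred_curve, true_curve):
    """Mean absolute error over rows where both curves are defined."""
    valid = ~np.isnan(pred_curve) & ~np.isnan(true_curve)
    return float(np.mean(np.abs(pred_curve[valid] - true_curve[valid])))

# Toy masks: 10 depth rows, 20 columns, one one-pixel-wide curve each.
rows = np.arange(10)
true_mask = np.zeros((10, 20), dtype=np.uint8)
true_mask[rows, 5] = 1

pred_mask = np.zeros((10, 20), dtype=np.uint8)
pred_mask[rows[:5], 5] = 1   # first half of the rows: exactly right
pred_mask[rows[5:], 6] = 1   # second half: off by one column

mae = curve_mae(mask_to_curve(pred_mask), mask_to_curve(true_mask))
print("recovered-curve MAE (columns):", mae)
```

The measure only looks at where the line ended up, so the empty background contributes nothing to it, which is the property the paragraph above asks of an honest metric.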

The interactive instrument plots five line-art domains along one axis, the foreground fraction. The well-log curve lessons transfer to any thin-structure line-art domain because all of them share one property: the foreground the model must find is a tiny fraction of the pixels. The well-log anchor carries the sourced numbers from the engagement archive: two curves per log over roughly 97 percent background, so at most about 3 percent foreground, on which the encoder-decoder reached a peak curve R-squared of 0.9891 under Tversky loss, with a per-curve depth MAE of 0.0277 and 0.1241 from that same run. The other four domains (engineering line art, cartographic contours, P&ID schematics, and chart trend lines) are placed at illustrative foreground fractions on the same axis to show they live in the same thin band, not to claim measured transfer accuracies. The one adjustable element is the transfer threshold, the maximum foreground fraction at which the thin-structure playbook still holds: wherever it is sensibly set, every line-art domain falls inside it, on the same side of the line as the wells. The per-domain foreground fractions and the threshold band are illustrative stand-ins for the shared shape, not measured results.

What does not carry is everything downstream of the mask. Turning a curve mask into a depth-calibrated log requires knowing about depth tracks, header scales, and the physical meaning of the two curves, and none of that transfers to a map or a chart. The synthetic-data generator we built to train on is likewise subject-specific: it fabricates plausible log curves with the right widths and spacings, and it would teach a road model nothing. The transferable layer sits strictly between the raw raster and the clean mask. Above the mask, every domain is on its own, and pretending otherwise is how a cross-domain claim quietly becomes false.

Why the map people got here first

None of this is our discovery, and the honest version of the primer says so. The cartographic community was extracting thin line-art from mostly empty rasters well before curve digitisation became a segmentation problem. Mnih and Hinton were detecting roads, lines a few pixels wide occupying a small share of an aerial scene, with learned models more than a decade ago [3]. The reason their methods rhyme with ours is not influence but convergence: two teams facing the same foreground fraction arrive at the same defences. When independent fields keep reinventing the same class-weighting and the same detail-preserving architecture, the common cause is the pixel statistics they share, and that convergence is the strongest evidence that shape, not subject, is what governs.

The instrument above makes that argument visible. It fans five line-art domains along a single axis, the foreground fraction, and anchors it with the one domain where we have measured numbers: our logs, at most about 3 percent foreground over roughly 97 percent background, with the sourced R-squared and MAE attached. The other four are placed at illustrative fractions, not measured ones, because we have not run VeerNet on maps or charts and will not pretend we have. The placement is the point regardless of its precision: every line-art domain sits far to the thin side of any sensible transfer threshold, on the same side of the line as the wells.

The primer, in one move

If you are starting a thin-structure segmentation problem in any domain, the fastest thing we can hand you is not our model but our diagnosis: measure your foreground fraction first. If the lines you care about are a small single-digit percentage of your pixels, you are in the regime we were, and three moves apply. Use a detail-preserving encoder-decoder so the thin structure survives downsampling. Use a loss with an explicit knob for the rare class so the objective cannot cheat by predicting emptiness. Judge the result on foreground overlap and recovered-line error, never on pixel accuracy. Everything past the mask is yours to build, but the way into the mask is the same road, and other people paved most of it before we got there.
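The first step of that diagnosis can be sketched in a few lines, assuming you already have some binary ground-truth masks (nonzero = line). The helper names and the 5 percent cut-off are hypothetical: the cut-off is an illustrative reading of "small single-digit percentage", not a measured threshold.

```python
import numpy as np

def foreground_fraction(masks):
    """Share of foreground (nonzero) pixels pooled over a collection of masks."""
    total = sum(m.size for m in masks)
    foreground = sum(int(np.count_nonzero(m)) for m in masks)
    return foreground / total

def in_thin_regime(fraction, max_fraction=0.05):
    """True if the pooled foreground share is at or below the cut-off.

    max_fraction is a hypothetical, illustrative value; pick your own.
    """
    return fraction <= max_fraction

# Toy masks standing in for labelled rasters from your own domain.
mask_a = np.zeros((100, 100), dtype=np.uint8)
mask_a[:, 10] = 1      # one vertical line
mask_a[:, 50] = 1      # a second vertical line
mask_b = np.zeros((100, 100), dtype=np.uint8)
mask_b[3, :] = 1       # one horizontal line

fraction = foreground_fraction([mask_a, mask_b])
print(f"foreground fraction: {fraction:.3%}")
print("thin-structure regime:", in_thin_regime(fraction))
```

Pooling over the whole collection rather than averaging per image keeps a few unusually dense rasters from being over- or under-weighted; either choice is defensible as long as it is stated.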

Limitations

The transferable claim in this note is deliberately narrow, and it is worth marking its edges. The sourced numbers (the peak curve R-squared of 0.9891 and the per-curve MAE of 0.0277 and 0.1241, both from the same Tversky-loss run, together with the two constant curves per log and the roughly 97 percent background share) are all from our own well-log runs and nothing else; they are not evidence about accuracy in any other domain. The foreground fractions the instrument plots for engineering line art, cartographic contours, schematics, and chart trend lines are illustrative placements chosen to sit in the same thin band, not measured statistics from those domains, and a real dataset in any of them could sit meaningfully higher or lower depending on line density and rendering resolution. The transfer we argue for is of method, not of weights: nothing here suggests a model trained on our logs would work on a map without retraining, only that the same architecture family, loss family, and evaluation discipline apply. And the whole argument holds only in the thin-foreground regime: a domain where the target class is a large fraction of the pixels (dense textures, filled regions, most natural-image segmentation) is a different problem with different defences, and the playbook above does not claim it.

References

[1] Ronneberger, O., Fischer, P., and Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2015, LNCS 9351, pp. 234-241. The encoder-decoder with skip connections that made pixel-accurate thin-structure segmentation practical from small datasets. https://link.springer.com/chapter/10.1007/978-3-319-24574-4_28

[2] Salehi, S. S. M., Erdogmus, D., and Gholipour, A. Tversky Loss Function for Image Segmentation Using 3D Fully Convolutional Deep Networks. Machine Learning in Medical Imaging (MLMI) 2017, LNCS 10541, pp. 379-387. The tunable false-positive and false-negative weighting that lets a loss cope when the foreground class is a tiny fraction of the pixels. https://link.springer.com/chapter/10.1007/978-3-319-67389-9_44

[3] Mnih, V., and Hinton, G. E. Learning to Detect Roads in High-Resolution Aerial Images. European Conference on Computer Vision (ECCV) 2010, LNCS 6316, pp. 210-223. An early demonstration that thin line-art occupying a small share of an overhead scene can be extracted with learned models, the cartographic cousin of curve extraction. https://link.springer.com/chapter/10.1007/978-3-642-15567-3_16
