The phrase "subsurface digital twin" almost always gets drawn from the top down. Someone sketches a living reservoir model, wires it to real-time sensors and a fast simulator, and shows it updating as the field produces. The picture is not wrong, but it starts at the rung that photographs well and skips the one carrying the weight. A twin is the visible end of a feedstock chain, and the invisible start of that chain is a layer of machine-readable curves. Before anything can be aligned to depth, calibrated to physics, or kept current, the well-log curves have to exist as numbers a machine can read, not as pixels on a scanned page. This note is about that base rung, because it is the one that decides whether any of the rungs above it can stand.
We should be clear about what this is not. It is not a build guide for a digital twin, and it is not a retelling of how we digitise a raster log; the architecture and the recovery evaluation live in the VeerNet paper [3] and we will not repeat them here. This is a positioning argument. The digital-twin idea, as Grieves originally framed it, is a virtual counterpart kept in correspondence with a physical asset through a data connection [1]. That framing quietly names the binding constraint: the twin is only as faithful as the data it stays in correspondence with. The industrial-systems survey literature makes the same point structurally, treating a twin as a layered, data-driven construct rather than a single model, so that the integration underneath is what carries the trust [2]. Read those two together and the conclusion is uncomfortable for anyone starting at the top: the subsurface twin's honesty is set at the bottom, in the layer most projects treat as plumbing.
A twin is the top of a ladder, not the whole of it
Picture the chain as four rungs. At the bottom are vectorised curves: the gamma-ray, resistivity, porosity, and density traces lifted off a scanned log and turned back into depth-indexed values. Above that sit depth-aligned features, where those curves are registered to a common depth reference and to each other, so a reading at one depth means the same thing across curves. Above that are calibrated models, where petrophysical or geological relationships are fit to those aligned curves. And at the top is the live twin, the calibrated model kept current as new wells and new measurements arrive. Each rung is built out of the rung below it. None of them can be built out of thin air.
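To make the second rung concrete, here is a minimal sketch of what registering curves to a common depth reference can look like: each curve is linearly interpolated onto one shared depth grid. This is an illustration under stated assumptions, not the pipeline used on the archive. The function name `resample_to_grid`, the toy depths, and the toy readings are all hypothetical.

```python
from bisect import bisect_left


def resample_to_grid(depths, values, grid):
    """Linearly interpolate a depth-indexed curve onto a shared depth grid.

    `depths` must be sorted ascending. Grid points outside the sampled
    interval get None rather than an extrapolated value.
    """
    out = []
    for d in grid:
        if d < depths[0] or d > depths[-1]:
            out.append(None)  # no data here; do not invent a reading
            continue
        i = bisect_left(depths, d)
        if depths[i] == d:
            out.append(values[i])  # exact sample, no interpolation needed
            continue
        d0, d1 = depths[i - 1], depths[i]
        v0, v1 = values[i - 1], values[i]
        out.append(v0 + (v1 - v0) * (d - d0) / (d1 - d0))
    return out


# Hypothetical toy curves sampled at different depths (invented values).
gr_depths, gr_values = [1000.0, 1000.5, 1001.0], [40.0, 60.0, 80.0]
res_depths, res_values = [1000.25, 1000.75, 1001.25], [2.0, 4.0, 6.0]

# One shared grid, so a row index means the same depth for every curve.
grid = [1000.5, 1000.75, 1001.0]
gr_on_grid = resample_to_grid(gr_depths, gr_values, grid)
res_on_grid = resample_to_grid(res_depths, res_values, grid)
```

The sketch also shows why this rung inherits the base rung's quality: interpolation only rearranges the recovered values, so any error in them is carried onto the grid unchanged.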
The load-bearing claim is that the base rung is not just first in time; it is a fidelity gate on everything above. Depth-aligned features cannot be more trustworthy than the curves they align. A calibrated model cannot be more faithful than the features it is fit to. And a live twin cannot correspond to the real subsurface any better than the model it keeps updating. Errors do not average out as you climb; they set a ceiling. If the digitised curve is off by a systematic amount at a given depth, no amount of clever alignment or calibration above it recovers the truth, because the truth was already lost at the rung where the pixels became numbers. This is why we treat digitisation not as a preprocessing chore but as the first structural decision in a twin programme.
The base rung has a scale problem and a fidelity number
Two things make the base rung real rather than rhetorical. The first is scale. On the Texas onshore archive we worked from, the raw feedstock was 136,771 TIF raster images paired against 7,781 LAS files as ground truth. That asymmetry is itself the argument for machine recovery: there are far more scanned images than there are already-digital logs, and the gap between them is exactly the value locked in paper that a twin needs freed before it can be fed. Hand-tracing at that volume is not a plan, it is a wish. The base rung has to be built by a model that recovers curves at archive scale, or it does not get built.
The second is fidelity, and here we can be specific rather than hopeful. Recovering a curve is a regression problem once the predicted curve mask is turned back into depth-indexed values, and it can be scored. On our runs the peak coefficient of determination (R-squared) against LAS ground truth reached 0.9891, and the lowest mean absolute error (MAE) was 0.0132. Those figures are not uniform across every curve and every well, and the VeerNet evaluation is candid about where the recovery is strong and where it struggles [3]; the point here is narrower. A twin inherits whatever variable set the base rung hands up, and on these logs that set is fixed: two curves in Track 3, the neutron-porosity and bulk-density pair, and three curves across Tracks 1 and 2, the gamma-ray or spontaneous-potential or caliper trace and the resistivity trace. Every rung above inherits exactly those curves at exactly that recovered fidelity. The twin does not get to wish for cleaner inputs later.
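For readers who want the two scores spelled out, the sketch below computes the coefficient of determination and the mean absolute error for a recovered curve against its ground truth, using the standard textbook definitions. It is not the VeerNet evaluation code. The function names and the four toy readings are hypothetical and are not values from the archive.

```python
def r_squared(truth, recovered):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(truth) / len(truth)
    ss_res = sum((t - r) ** 2 for t, r in zip(truth, recovered))
    ss_tot = sum((t - mean_t) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(truth, recovered):
    """Average absolute difference between truth and recovered values."""
    return sum(abs(t - r) for t, r in zip(truth, recovered)) / len(truth)


# Invented toy values on a shared depth index (not archive data).
las_truth = [0.2, 0.4, 0.6, 0.8]
recovered = [0.21, 0.39, 0.62, 0.78]

r2 = r_squared(las_truth, recovered)
mae = mean_absolute_error(las_truth, recovered)
print(f"R-squared = {r2:.4f}, MAE = {mae:.4f}")
```

Both scores assume the recovered curve and the ground truth are sampled at the same depths, which is why the comparison only makes sense after the mask has been converted back to depth-indexed values.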
The instrument above is the argument made draggable. The four rungs stack from vectorised curves at the bottom to the live twin at the top, and the single lever sets the fidelity of the base rung. Pull that fidelity down and the rungs above go dark from the top: the twin blinks out first, then the calibrated models, then the aligned features, because each one loses the trustworthy input it was standing on. Push it back up and the ladder relights from the bottom. The base rung never goes dark, because it is not standing on anything else; it is the thing everything else is standing on. That asymmetry is the whole point. You can lose the twin by degrading the base, but you cannot lose the base by degrading the twin, because the base is upstream of the twin in a way the pretty picture hides.
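The gating behaviour just described can be sketched in a few lines. The threshold values here are hypothetical, picked only to make the ordering visible; they are not measured hand-off points, and the names `RUNGS` and `lit_rungs` are ours for illustration.

```python
# Rungs listed bottom to top. Thresholds are HYPOTHETICAL illustrative values.
RUNGS = [
    ("vectorised curves", None),       # base rung: stands on nothing else
    ("depth-aligned features", 0.50),
    ("calibrated models", 0.70),
    ("live twin", 0.90),
]


def lit_rungs(base_fidelity):
    """Return {rung name: lit?} for a base-rung fidelity in [0, 1].

    A rung is lit only if the rung below it is lit and the base fidelity
    clears that rung's threshold. The base rung is always lit.
    """
    state = {}
    below_lit = True
    for name, threshold in RUNGS:
        if threshold is None:
            lit = True
        else:
            lit = below_lit and base_fidelity >= threshold
        state[name] = lit
        below_lit = lit
    return state


# Pulling the lever down darkens the ladder from the top.
for fidelity in (0.95, 0.80, 0.60, 0.30):
    dark = [name for name, lit in lit_rungs(fidelity).items() if not lit]
    print(fidelity, "dark:", dark)
```

The function takes only the base fidelity as input. Nothing in the state of the upper rungs feeds back into the base, and that one-way dependency is the asymmetry the paragraph describes.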
Why the fidelity of the base rung is not negotiable later
There is a tempting objection: surely the twin, being adaptive, can correct for a noisy base layer as it ingests more data. Sometimes, at the margins, a downstream model can smooth over a random error in the input. But a digitisation error is usually not random. A curve recovered with a systematic bias at high values, or a depth reference that is off by a consistent offset, produces a structured error that a calibration step will happily absorb into its fitted parameters and then reproduce forever. The twin does not correct that error; it launders it, and now it looks like signal. The correspondence Grieves described [1] becomes a correspondence to a distorted version of the well, kept faithfully up to date. A twin that is precisely wrong is more dangerous than one that is obviously incomplete, because it invites decisions.
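A toy calculation shows how a calibration step absorbs a structured error. In the sketch below a hypothetical linear relationship is fit to a curve that was digitised with a constant offset. The relationship, the offset size, and every value are assumptions made for illustration; a real bias could just as easily depend on amplitude or depth.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical "true" curve and a hypothetical true relation y = 2x + 1.
true_curve = [0.1 * i for i in range(1, 11)]
target = [2.0 * x + 1.0 for x in true_curve]

# Assumed digitisation error: every reading is shifted by the same amount.
OFFSET = 0.1
digitised = [x + OFFSET for x in true_curve]

# Calibrate against the biased curve.
slope, intercept = fit_line(digitised, target)

# Residuals are numerically zero, so the fit looks perfect from the inside.
max_residual = max(abs(y - (slope * x + intercept))
                   for x, y in zip(digitised, target))

# Apply the same calibration to a correctly digitised reading.
clean_reading = 0.5
prediction_error = (slope * clean_reading + intercept) - (2.0 * clean_reading + 1.0)

print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"max residual={max_residual:.2e} error on clean input={prediction_error:.3f}")
```

The fitted intercept moves away from the true value of 1 to compensate for the offset, while the residuals give no hint that anything is wrong. The error only surfaces when the calibration meets an input that does not carry the same bias.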
This is also why the base rung is the right place to spend effort early rather than late. Every rung above it is a multiplier on whatever quality the base rung delivered. A one-time investment in recovering curves faithfully, at the scale the archive demands, pays off at every higher rung and at every future well the twin ingests, because those curves are the feedstock the whole structure metabolises. Spending the same effort three rungs up, tuning a calibration to compensate for a shaky input, buys a fix that holds only for the wells you tuned it on and breaks the moment the input distribution shifts. The economics of a ladder favour the bottom.
What "first step" actually means
Calling digitisation the first step toward a subsurface twin is not a modest claim about sequencing. It is a claim about dependency. The twin is downstream of curves the way a building is downstream of its foundation: you can admire the top floor all you like, but its position in space was decided at the bottom. On the archive we worked from, that bottom was 136,771 raster images scored against 7,781 LAS files, with recovery reaching a peak R-squared of 0.9891 and a lowest MAE of 0.0132, handing a fixed five-curve variable set up the ladder. Whether a subsurface twin built on that base is any good is a question about the rungs above, and we make no promises about them here. But whether it can exist at all is a question about the rung below, and the answer is that it cannot exist without one. The least photogenic layer is the one the whole idea rests on.
Limitations
This is a positioning argument, not a demonstration of a working twin, and it should be read that way. We built and evaluated the base rung, the recovery of machine-readable curves from raster logs; we have not built the aligned-feature, calibrated-model, and live-twin rungs on top of it as a delivered system, so the ladder above the base is an argument about dependency, not a report on a shipped stack. The four rungs and the specific fidelity thresholds at which the instrument darkens each one are illustrative structure chosen to make the gating visible, not measured hand-off points from a real pipeline; only the file counts, the per-log curve counts, and the recovery fidelity figures are sourced from the engagement archive. The fidelity numbers themselves are peak and best-case values on one operator's logs, not uniform guarantees across every curve, well, or field, and the VeerNet paper is the place to see where recovery is weaker [3]. Finally, the claim that downstream rungs cannot exceed the base rung's fidelity is a statement about error propagation in this kind of dependency chain, not a theorem; a downstream step could in principle add independent information from other data sources, which this ladder deliberately does not model, because our subject is the curve feedstock specifically.
References
[1] Grieves, M., and Vickers, J. Digital Twin: Mitigating Unpredictable, Undesirable Emergent Behavior in Complex Systems. In Transdisciplinary Perspectives on Complex Systems, Springer (2017), pp. 85-113. The origin framing of the digital twin as a virtual counterpart kept in correspondence with a physical asset through a data connection. https://link.springer.com/chapter/10.1007/978-3-319-38756-7_4
[2] Tao, F., Zhang, H., Liu, A., and Nee, A. Y. C. Digital Twin in Industry: State-of-the-Art. IEEE Transactions on Industrial Informatics 15(4) (2019), pp. 2405-2415. The survey framing of a twin as a layered, data-driven construct whose trust rests on the data integration beneath it. https://ieeexplore.ieee.org/document/8477101
[3] Maiti, T., Nassim, M. Q., Patwardhan, N., and Singh, T. VeerNet: Using Deep Neural Networks for Curve Classification and Digitization of Raster Well-Log Images. Journal of Imaging 9(7), 136 (2023). The architecture and evaluation behind recovering machine-readable curves from scanned logs. https://www.mdpi.com/2313-433X/9/7/136