Seismic fold, the number of traces that sample each subsurface bin and so a direct measure of data density, has jumped from around 40 in the early 2000s to over 4,000 today, a hundred-fold increase that is fundamentally reshaping what machine learning can do in geophysics.
Why this matters
Every technological leap in upstream geoscience has been triggered by the same question: have we found all the oil? The answer has always been no, but the tools required to prove it have grown far more sophisticated. Twenty years ago, a standard 3D seismic survey ran at 40-fold with offsets under 4.5 kilometres. Today, ultra-long-offset ocean-bottom-node acquisitions routinely reach 18–24 kilometre offsets at 4,000-fold, delivering two orders of magnitude more data per square kilometre.
Fold (early 2000s): ~40
Fold (today): 4,000+
Data density increase: ~100×
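As a rough check on those figures, the short Python sketch below works through the arithmetic. The `nominal_fold_2d` helper uses the textbook 2D relation (channels times group interval, divided by twice the shot interval); the 96-channel geometry is a hypothetical example chosen only so that the formula lands on 40-fold, not a survey described in this article.

```python
import math

def nominal_fold_2d(n_channels, group_interval_m, shot_interval_m):
    """Textbook nominal 2D CMP fold: channels * group interval / (2 * shot interval)."""
    return n_channels * group_interval_m / (2.0 * shot_interval_m)

fold_2000s = 40    # figure quoted in the article
fold_today = 4000  # figure quoted in the article

ratio = fold_today / fold_2000s
orders_of_magnitude = math.log10(ratio)

# Hypothetical legacy geometry (assumption, for illustration only).
example_fold = nominal_fold_2d(n_channels=96, group_interval_m=25.0, shot_interval_m=30.0)

print(f"fold ratio: {ratio:.0f}x, orders of magnitude: {orders_of_magnitude:.1f}")
print(f"example 2D geometry gives {example_fold:.0f}-fold")
```

Real 3D and ocean-bottom-node surveys compute fold per bin from inline and crossline geometry, so this is only the simplest case of the idea.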
That volume explosion is not just a storage problem. It unlocks an entirely new class of data-driven workflows — full-waveform inversion, physics-informed neural networks, automated velocity model building — that were mathematically possible but operationally infeasible at 40-fold. The shift from stacking and post-stack migration to reverse-time migration and blended acquisitions mirrors the shift in machine learning from linear regression to gradient-based optimisation at scale.
The current state
Machine learning is not new to geophysics. Linear regression, singular-value decomposition, and principal-component analysis have been foundational tools for decades — only recently rebranded under the ML umbrella. What changed is scale and application. Low-hanging use cases — fault picking, horizon interpretation, top-of-salt delineation — are already production-grade in most interpretation workflows. Denoising, the easiest supervised-learning target because legacy processing chains produce natural labels, is now routine.
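To make the point about long-standing tools concrete, here is a minimal sketch of singular-value decomposition used as a denoiser. Everything in it is synthetic and assumed for illustration: the section size, the single flat event, the noise level, and the choice of rank 1. It is not a description of any production denoising chain.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic section: 64 traces x 256 samples with one flat event (a rank-1 signal).
n_traces, n_samples = 64, 256
samples = np.arange(n_samples)
wavelet = np.exp(-0.5 * ((samples - 128) / 6.0) ** 2)  # Gaussian pulse
amplitudes = np.linspace(1.0, 0.5, n_traces)            # amplitude decay across traces
clean = np.outer(amplitudes, wavelet)
noisy = clean + 0.2 * rng.standard_normal(clean.shape)

def svd_denoise(section, rank):
    """Keep only the `rank` largest singular components of the section."""
    u, s, vt = np.linalg.svd(section, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

denoised = svd_denoise(noisy, rank=1)

err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(f"error before: {err_noisy:.2f}, after rank-1 SVD: {err_denoised:.2f}")
```

The rank acts as the only tuning parameter here. Dipping or curved events are not low-rank in this simple layout, which is one reason learned denoisers have become attractive.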
But the frontier has moved. Physics-informed neural networks are being applied to trace reconstruction and multiple suppression. Full-waveform inversion, itself an optimisation problem typically solved by gradient descent, is converging with machine learning: gradient QC, local-minima escape, and initial-velocity-model generation are all active research fronts. Travel-time computation, migration-swing reduction, and post-migration footprint attenuation are transitioning from research to deployment.
Full-waveform inversion: a seismic imaging technique that iteratively updates a velocity model by minimising the misfit between observed and modelled waveforms, solving the full wave equation rather than relying on simplified ray-based approximations.
The constraint that makes subsurface ML harder than vision is physics. A seismic amplitude encodes elastic impedance contrasts, acquisition geometry, and wave-equation propagation, none of which resemble the statistics of natural images. Transfer learning from ImageNet fails. But when physics is baked into the loss function or the architecture, performance jumps. A denoising method originally developed for satellite imagery, adapted with subsurface priors, outperformed traditional geophysical filters in trials conducted by Viridien's processing teams.
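One way to picture "physics baked into the loss function" is the sketch below, which reconstructs a sparsely sampled trace by minimising a data-misfit term plus a physics-residual term. It is deliberately not a neural network: the same two-term loss a physics-informed network would minimise is solved here with linear least squares. The oscillator equation u'' + ω²u = 0, the 3 Hz frequency, the sampling pattern, and the weight `lam` are all assumptions standing in for the wave equation and real acquisition.

```python
import numpy as np

n, dt = 201, 0.01
omega = 2.0 * np.pi * 3.0            # assumed 3 Hz signal
t = np.arange(n) * dt
u_true = np.sin(omega * t)

# Data term: only every 10th sample is observed.
obs_idx = np.arange(0, n, 10)
d = u_true[obs_idx]
M = np.zeros((obs_idx.size, n))
M[np.arange(obs_idx.size), obs_idx] = 1.0

# Physics term: finite-difference form of u'' + omega^2 u = 0 (scaled by dt^2).
P = np.zeros((n - 2, n))
rows = np.arange(n - 2)
P[rows, rows] = 1.0
P[rows, rows + 1] = -(2.0 - (omega * dt) ** 2)
P[rows, rows + 2] = 1.0

# Minimise ||M u - d||^2 + lam * ||P u||^2 as one stacked least-squares problem.
lam = 1.0
A = np.vstack([M, np.sqrt(lam) * P])
b = np.concatenate([d, np.zeros(n - 2)])
u_rec, *_ = np.linalg.lstsq(A, b, rcond=None)

# Baseline without physics: straight-line interpolation between observations.
u_lin = np.interp(t, t[obs_idx], d)

err_physics = np.max(np.abs(u_rec - u_true))
err_linear = np.max(np.abs(u_lin - u_true))
print(f"max error with physics term: {err_physics:.4f}, linear interpolation: {err_linear:.4f}")
```

The physics term fills the gaps that the data term cannot see. Replacing the unknown vector `u` with the output of a network, and the oscillator with the wave equation, gives the general shape of a physics-informed approach.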
What changed
The shift is not technological alone; it is also organisational. Shivaji, Senior Strategy and Business Development Manager at Viridien, who has two decades of experience in velocity modelling and seismic interpretation, describes the transition bluntly [1]: the repetitive parameter-tuning and manual QC that consumed weeks per project are being automated. The human role is migrating from execution to lateral thinking.
Lateral thinking means borrowing breakthroughs from adjacent fields. Medical imaging denoisers adapted for seismic. Satellite image enhancement applied to migration artefacts. Large language models predicting optimal processing parameters based on basin context. The knowledge that used to walk out the door when senior geophysicists retired now accumulates in training corpora and parameter databases, accessible to the next generation without re-learning two decades of trial and error.
But automation has limits. A model trained to distinguish cats from dogs will confidently mislabel an edge case unless a human steps in. Similarly, an FWI run can converge to a local minimum, and hence to a geologically implausible velocity model, if the initial model is poor. The human validates plausibility, connects disparate domains, and asks the question the model was never trained to answer.
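The sensitivity to the starting model can be reproduced in a toy setting. The sketch below is not real FWI: it replaces the wave equation with a single Ricker arrival delayed by distance divided by velocity, and it uses a finite-difference gradient instead of the adjoint-state method. The 2 km path, 5 Hz wavelet, step size, and starting velocities are all assumptions chosen for illustration.

```python
import numpy as np

def ricker(t, f_peak):
    """Ricker wavelet with peak frequency f_peak (Hz), centred at t = 0."""
    arg = (np.pi * f_peak * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)

def model_trace(velocity, t, distance_km=2.0, f_peak=5.0):
    """Toy forward model: one arrival delayed by distance / velocity."""
    return ricker(t - distance_km / velocity, f_peak)

def misfit(velocity, observed, t):
    """Least-squares waveform misfit between modelled and observed traces."""
    residual = model_trace(velocity, t) - observed
    return 0.5 * (t[1] - t[0]) * np.sum(residual ** 2)

def invert(v_start, observed, t, step=0.05, n_iter=300, h=1e-4):
    """Plain gradient descent on velocity using a central-difference gradient."""
    v = v_start
    for _ in range(n_iter):
        grad = (misfit(v + h, observed, t) - misfit(v - h, observed, t)) / (2.0 * h)
        v -= step * grad
    return v

t = np.arange(0.0, 2.0, 0.002)   # time axis in seconds
v_true = 2.0                     # km/s
observed = model_trace(v_true, t)

v_good = invert(2.1, observed, t)  # start within half a cycle of the true arrival
v_poor = invert(2.5, observed, t)  # start more than half a cycle away

print(f"good start -> {v_good:.3f} km/s, misfit {misfit(v_good, observed, t):.2e}")
print(f"poor start -> {v_poor:.3f} km/s, misfit {misfit(v_poor, observed, t):.2e}")
```

From the nearby start, descent recovers the true velocity. From the distant start, the modelled wavelet locks onto the wrong cycle and the inversion stalls in a local minimum well away from the true value, which is the situation a human reviewer (or a better initial model) has to catch.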
Implications
The immediate implication is workforce recalibration. Geophysicists who built careers on manual horizon picking or iterative parameter sweeps face the same pressure that drafters faced when CAD arrived. The survivors will be those who treat ML as a cognitive lever — dumping repetitive logic into models, freeing time for cross-domain synthesis and frontier problems.
The deeper implication is expansion beyond hydrocarbons. Geothermal resource characterisation, lithium deposit mapping tied to geothermal brines, hydrogen and helium exploration, freshwater aquifer delineation — all require subsurface imaging at scale. The same FWI and ML pipelines developed for oil and gas transfer directly. Geothermal alone will not replace hydrocarbons in the energy mix, but it will demand integrated geoscience: seismic, gravity, magnetics, remote sensing, and geology in a single interpretive loop. Machine learning orchestrates that integration.
What's next
The next frontier is literal: off-Earth. Asteroid mining, lunar regolith characterisation, and Martian subsurface water detection are no longer science fiction. NASA's Artemis programme and private ventures by SpaceX and Blue Origin are building the transport layer; geoscientists will build the resource layer. Seismic methods adapted for low-gravity, airless environments. Ground-penetrating radar for ice and mineral detection. Machine learning models trained on terrestrial analogues and transferred to extraterrestrial data.
The concept is not new (the 1998 film Armageddon sent oil drillers to an asteroid), but the timeline has compressed. Students entering geoscience programmes today will work on projects beyond Earth within their careers. The toolchain they inherit (high-fold seismic acquisition, physics-informed ML, automated inversion) was built for terrestrial oil and gas but generalises to any subsurface imaging problem, on any planetary body.
Back on Earth, the convergence of machine learning and full-waveform inversion will continue. Research groups are already demonstrating FWI driven entirely by neural networks, replacing the adjoint-state solver with learned gradients. Production deployment is years away, but the gap is narrowing. When it closes, the boundary between 'geophysics' and 'machine learning' will dissolve entirely — leaving only subsurface inverse problems solved at the scale the data demands.
Takeaways
- Seismic data density has grown 100× in two decades, enabling ML workflows that were operationally infeasible at lower fold.
- The human role is shifting from manual execution to lateral thinking — connecting breakthroughs across medical imaging, satellite processing, and subsurface geophysics.
- Geothermal, lithium, hydrogen, and freshwater exploration inherit the same ML-driven seismic toolchain developed for oil and gas.
- The next generation of geoscientists will apply these methods off-Earth — to asteroids, the Moon, and Mars — within their careers.
References
[1] Shivaji (Senior Strategy and Business Development Manager, Viridien), interview transcript, 2024.