There is a version of the well-log digitisation pitch that ends the meeting early: the one where you promise to build a better data set than the companies that already own the data. Enverus and IHS have spent decades assembling licensed well headers, production histories, and interpreted logs, and they rent that corpus to nearly everyone who drills. If your plan is to collect a rival corpus and undercut them on price, you have picked the one contest they are structurally certain to win. They started twenty years ago, the data compounds, and every new licence deepens the moat. We are a digitisation team, not a data broker, and our first strategic decision was to stop fighting on that ground at all.
This is a note about the axis we chose instead. It is deliberately not the VeerNet whitepaper, which is about the architecture and the training. It is about the commercial shape: who owns the input, what a unit of output costs to produce, and why those two facts let a small team hold a defensible position next to companies a thousand times its size. We do not compete for the data. We compete for the work of making a customer's own data usable, and that is a different market with a different cost structure.
The moat is real, and it is not the moat you attack
Start by being honest about the incumbent's advantage, because underrating it is how challengers die. The subsurface-data majors own their corpus outright, they license it, and the licence is the product. Christensen's observation applies cleanly: an incumbent optimises the axis it already leads and defends it with everything it has, because that is where its revenue lives [1]. Attacking the corpus means attacking the exact thing the incumbent is best in the world at protecting. You will lose, and you will lose expensively.
The economics of information make the same point from the other side. Shapiro and Varian described why the owner of a large proprietary corpus prices access rather than giving it away: serving one more query costs almost nothing, but the whole collection is enormously valuable, so the owner meters it [2]. That metering is the incumbent's business model, and it is also its blind spot. A licence prices access to data the vendor owns. It says nothing about data the vendor does not own and has no reason to touch.
The data nobody is selling is the data the customer already has
Every operator we have worked with is sitting on an archive of its own scanned logs. Old wells, acquired assets, paper records photographed and filed, raster images that are legally and physically the customer's property and that no incumbent will ever license back to them, because the incumbent does not have them. As a concrete illustration of the scale, a single public archive we trained and tested against holds 136,771 raster TIF scans and 7,781 LAS files. An operator's private version of that pile is the same shape: large, trapped, and worthless as long as the curves live as pixels instead of numbers.
That is the wedge. We are not asking the customer to buy someone else's data. We are turning the customer's own data, which they already paid for once, into something a petrophysicist can compute on. The incumbent cannot follow us here without abandoning the model that makes it money, because there is no corpus to license, only a service to perform. This is the different axis, and it is the whole strategy in one sentence: do not sell data you own, sell the digitisation of data the customer owns.
Why the cost structure holds
A positioning claim is only worth as much as its unit economics, so here is the arithmetic that makes the wedge defensible rather than merely clever. When you sell licensed data, your cost floor is the licence you paid to assemble the corpus. When you sell digitisation of the customer's own archive, your cost floor is compute. One trained model serves every scan, so turning one more raster log into curves costs a slice of a rented GPU. The rental tiers we built against were 750 and 1,800 EUR per month. There is no per-record data cost, because there is no data to license and the customer supplies the input for free.
That difference is the reason the position exists. An incumbent's marginal record carries a share of a corpus that took decades and a great deal of money to build. Our marginal record carries a share of a monthly GPU bill. We priced the service at 1,200 USD per seat per year, which works because the thing under the price is compute divided by throughput, not a data licence divided by scarcity. The pricing follows the cost structure, and the cost structure follows the axis we chose.
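The "compute divided by throughput" claim can be made concrete with a short sketch. The GPU tiers are the 750 and 1,800 EUR per month figures from the text. The throughput values are hypothetical placeholders, since this note does not state a measured logs-per-month number, and the function name is illustrative only.

```python
# Sketch: marginal compute cost per digitised log.
# The GPU tiers (750 and 1,800 EUR/month) come from the text.
# The throughput values are HYPOTHETICAL placeholders, not measured results.

GPU_TIERS_EUR_PER_MONTH = (750.0, 1800.0)


def cost_per_log_eur(gpu_rent_eur_per_month: float, logs_per_month: float) -> float:
    """Spread one month of GPU rent evenly over the logs served that month."""
    if logs_per_month <= 0:
        raise ValueError("logs_per_month must be positive")
    return gpu_rent_eur_per_month / logs_per_month


if __name__ == "__main__":
    # Hypothetical throughputs, chosen only to show how the cost scales.
    for logs_per_month in (1_000, 10_000, 100_000):
        for rent in GPU_TIERS_EUR_PER_MONTH:
            cost = cost_per_log_eur(rent, logs_per_month)
            print(f"{rent:>6.0f} EUR/month at {logs_per_month:>7,} logs/month "
                  f"-> {cost:.4f} EUR per log")
```

The point of the sketch is the shape, not the numbers: the per-log cost falls in direct proportion to throughput and never includes a data-licence term, which is why the throughput risk matters so much to the cost story.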
The market that arithmetic implies
None of this requires the challenger to win the whole market, and the sizing makes that plain. Framed the way the incumbents frame it, the total addressable market for oil and gas transactions data runs to about 134 billion USD, and the serviceable slice, the oil and gas technology market where a digitisation tool actually competes, is around 6.7 billion USD. We never modelled taking the 134 billion. Out-hoarding the firms that own that market is exactly the fight we refused. We modelled a serviceable-obtainable share of about 3 percent of the 6.7 billion, roughly 180 million USD by the end of year five.
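The sizing chain can be checked in a few lines. This is a sketch using only the figures quoted above; the function and constant names are illustrative. Note that a flat 3 percent of 6.7 billion USD works out to about 201 million USD, while the stated figure of roughly 180 million USD corresponds to a share of about 2.7 percent, so the stated projection presumably reflects rounding or ramp assumptions that this note does not spell out.

```python
# Sketch: market-sizing arithmetic using only the figures quoted in the text (USD).
TAM_USD = 134e9          # total addressable market
SAM_USD = 6.7e9          # serviceable slice
TARGET_SHARE = 0.03      # modelled serviceable-obtainable share
STATED_SOM_USD = 180e6   # end-of-year-five figure as stated in the text


def obtainable_usd(serviceable_usd: float, share: float) -> float:
    """Obtainable market as a flat share of the serviceable market."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return serviceable_usd * share


def implied_share(obtainable: float, serviceable_usd: float) -> float:
    """Share of the serviceable market that a given obtainable figure implies."""
    if serviceable_usd <= 0:
        raise ValueError("serviceable_usd must be positive")
    return obtainable / serviceable_usd


if __name__ == "__main__":
    flat = obtainable_usd(SAM_USD, TARGET_SHARE)
    print(f"Serviceable as share of total: {SAM_USD / TAM_USD:.1%}")
    print(f"Flat 3 percent of serviceable: {flat / 1e6:.0f} million USD")
    print(f"Share implied by stated figure: "
          f"{implied_share(STATED_SOM_USD, SAM_USD):.2%}")
```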
Three percent is a modest number on purpose. It is the number you can defend when you are not trying to displace the corpus owner, only to serve a job the corpus owner ignores. Early demand was consistent with that read: we carried three signed letters of intent from prospective customers spanning explorers, a private-equity holder, and an operator, which is a small sample but a real one, and it came from the wedge, not from a promise to out-collect anybody.
What we are not claiming
The honest boundary of this argument is that a good axis is necessary, not sufficient. Choosing to compete on the customer's own archive at a compute-priced margin gets you a position the incumbents have no incentive to defend, per Porter's point that the durable move is to pick ground the leader has not fortified [3]. It does not get you a working product. The digitisation has to be accurate enough that a petrophysicist trusts it, the model has to generalise off the public training corpus onto a given operator's messier scans, and the cost story only holds if serving throughput is high enough to keep the per-log number low. Those are real risks the positioning does not remove. The strategy tells you which fight to have. It does not win the fight.
Limitations
The market figures are the sizing we used, framed as the incumbents frame it, and they estimate a market, not audited revenue: the 134 billion USD total, the 6.7 billion USD serviceable slice, the 3 percent target, and the 180 million USD end-of-year-five projection are a plan, and plans miss. The three letters of intent are a genuine but tiny demand signal, not a repeatable sales motion, and none is a booked contract. The 136,771 TIF and 7,781 LAS counts describe the public archive we illustrate the wedge with, not any one customer's private pile, whose size and quality will vary. The 750 and 1,800 EUR per month GPU tiers are real rental figures, but the per-log cost that follows depends on serving throughput, an engineering result covered elsewhere and not defended here. The positions on the map are a strategic judgement, not a measured coordinate. And the whole argument assumes the digitisation is good enough to sell, which this note asserts rather than proves.
The one decision that mattered
If there is a single transferable idea in this, it is that a challenger's most important choice is often subtractive: deciding which contest not to enter. We could have spent our first two years and all our money trying to assemble a corpus to rival companies that had a twenty-year head start, and we would have had nothing to show for it. Instead we conceded the data they own, entirely and on purpose, and went after the data they do not, the customer's own trapped archive, at a cost structure they cannot match without changing what they are. The moat is real. We just chose not to swim across it.
References
[1] Christensen, C. M. The Innovator's Dilemma: When New Technologies Cause Great Firms to Fail. Harvard Business School Press (1997). Why incumbents optimise the axis they already lead, and how an entrant wins on a different one the incumbent has no incentive to defend. https://www.hbs.edu/faculty/Pages/item.aspx?num=46
[2] Shapiro, C., and Varian, H. R. Information Rules: A Strategic Guide to the Network Economy. Harvard Business School Press (1999). The economics of information as an asset, and why the owner of a large proprietary corpus meters access rather than giving it away. https://www.hbs.edu/faculty/Pages/item.aspx?num=170
[3] Porter, M. E. How Competitive Forces Shape Strategy. Harvard Business Review 57, no. 2 (1979), pp. 137-145. Entry barriers and the case for choosing where to compete rather than accepting the ground an incumbent has already fortified. https://hbr.org/1979/03/how-competitive-forces-shape-strategy