The Number We Deleted: Retracting a Claim Mid-Manuscript

There is a quiet moment in every applied-research project that no one puts on a slide: the moment you delete a true-sounding number because you can no longer prove it. It is not a dramatic retraction. No press release, no correction notice. A single sentence leaves a draft between one revision and the next, and the published paper is stronger for its absence. We want to describe one such deletion in detail, because the discipline it demanded is more transferable than any model we built. On a roughly twenty-month engagement with a mid-sized Middle East carbonate operator, we submitted a peer-reviewed manuscript describing a detector we had built on their borehole image logs. An early draft carried a speed claim. A reviewer pushed on it. We took it out. This is the story of that one line, and of why taking it out was the right call rather than a defeat.

The claim, exactly as it stood

The detector in question found small dissolution cavities in carbonate, and its whole appeal was that it was cheap to run relative to the method it replaced. So an early draft said what any author would want to say: the pipeline processed a metre of image log in under 30 seconds, against roughly four to five minutes per metre for the morphological baseline it stood in for. Written out, that is close to an order-of-magnitude gain, and it is the kind of sentence a reader remembers. It was also, at the time we wrote it, something we believed. The number came from real runs. It was not invented.

That last point matters, because the discipline we are describing is not about catching fabrication. Fabrication is easy to condemn and, in a serious group, rare. The harder case is the number that is honestly obtained and still cannot be defended, and this was that case. The speed figure had been measured, but it had been measured on particular hardware, on particular intervals, with a particular baseline implementation, and none of those conditions were pinned down in the manuscript with enough precision that a stranger could reproduce the comparison and land on the same ratio.

What the reviewer actually asked

The reviewer did not accuse us of anything. The challenge was narrower and more uncomfortable than an accusation: on what, precisely, was this timing measured, and would it hold on someone else's machine against someone else's baseline. That is the correct question, and it exposes an asymmetry that sits at the centre of peer review. The burden of proof is on the author, not the reader. A claim is not admissible because it is probably true. It is admissible because the person making it can hand over enough of the measurement conditions that a skeptic could, in principle, check it. Speed is the most treacherous number in this respect, because it is the one most sensitive to conditions no one records: the CPU, the memory pressure, the exact input length, the reference implementation, whether the baseline was tuned or taken off the shelf.

We sat with the number and asked ourselves a single question: if a stranger re-ran this, on their own tools, would they reproduce the ratio we had printed? We could not say yes with the confidence a published claim deserves. The honest answer was that we did not know how much of the advantage was the method and how much was our particular setup. And a number you cannot underwrite is a liability the moment it appears in print, because it travels: it gets quoted, it becomes the headline a reader carries away, and if it later fails to reproduce it discredits every claim next to it, including the ones that were sound.

The lifecycle of a claim withdrawn under peer review. An early draft asserted that a detector on carbonate borehole image logs processed a metre of log in under 30 seconds against a four-to-five-minute baseline. A reviewer asked on what conditions that had been measured; the figure could not be made reproducible across the tools and hardware it was clocked on, so it was withdrawn in revision and is absent from the published record. The single orange element is the deletion itself, at the WITHDRAWN stage, because the deletion is the argument: the honest record is the version without the line. Toggle between Withdraw and Defend to see the two responses to a challenged number and why withdrawal was correct here, the speed claim was not load-bearing. The lower panels separate what the record keeps, inspectable per-cavity geometry and a defensible method, from the one number it dropped, shown as dropped rather than plotted. No throughput claim is made anywhere. All provenance is sourced from the engagement archive with identifying details masked.

Two ways to respond, and why we chose the second

Faced with a challenged number, an author has two moves. The first is to defend it: add a methods paragraph, specify the hardware, re-run a cleaner benchmark, and argue the reviewer into acceptance. The second is to withdraw it: concede that the claim, as measured, cannot be made reproducible within the scope of this paper, and remove it. Both are legitimate. The choice between them is an engineering judgement about cost and about what the argument actually needs.

Defending the speed line would have meant building a benchmark rig we did not have: matched hardware, a controlled baseline implementation, timing repeated across intervals with variance reported, all of it written up to a standard a reviewer could audit. That is real work, and it is work in service of a claim that was not load-bearing. The paper's case for the detector never rested on speed. It rested on the fact that the method produced explicit, measurable per-cavity geometry that a reader could inspect and argue with, which is a property you demonstrate on a page rather than clock on a stopwatch. Once we named what the argument needed, the speed line was revealed as ornamentation: pleasant, memorable, and inessential. So we withdrew it in revision, and it is absent from the published record.

There is a reflex in this genre to keep every favourable number, on the theory that more evidence is always better. That reflex is wrong. Evidence you cannot underwrite is not neutral padding; it is a weak point that an adversarial reader will find and lever, and when it gives way it takes stronger claims down with it. The discipline is to carry only what you can defend, and to feel the deletion of a true-sounding line as a strengthening rather than a loss.

The record is the version without it

The subtle part of retraction inside a live manuscript is what happens to the withdrawn claim afterward. It does not become a secret. It becomes a documented decision. The right way to hold it is the way we are holding it here: the number existed, a reviewer challenged it, we could not make it reproducible, we removed it, and we say so plainly rather than pretending the draft never carried it. That is different from erasing history and different from quietly hoping no one remembers the earlier version. The published paper does not mention the speed line, because a paper should assert only what it can defend. This account mentions it, because the provenance of a deletion is itself part of an honest record.

And having removed the number, we do not get to lean on it. It would be incoherent to withdraw a claim under peer review and then cite it in a blog post as though the withdrawal were a technicality. So we make no throughput claim here at all. Whether the detector is fast is, for the purposes of this piece, unknown and unclaimed. What we are claiming is narrower and, we think, more durable: that the way you treat a challenged number is a truer signal of a group's rigour than any number it manages to keep.

For completeness, and in one line only, because a companion piece covers it: the same programme sent a different feature to a deep-learning model while this detector used no learned weights at all, a division of labour set by the geometry of each object. That is a separate argument, made elsewhere. The argument here is about the speed line and what we did with it.

Why this generalises past one paper

The temptation the speed line embodies is not confined to academic writing. Every dashboard, every internal benchmark, every vendor slide carries numbers that were honestly obtained under conditions no one wrote down, and most of them are never challenged, so they harden into fact. Peer review is valuable precisely because it supplies the challenge that internal work rarely gets: a skeptical reader whose job is to ask on what this was measured and whether it would hold elsewhere. The habit worth exporting from the manuscript to the rest of the work is to pre-empt that reader. Before a number ships, ask whether a stranger could reproduce it from what you are prepared to disclose. If the answer is no, the number is not yet a result. It is a hope with a decimal point, and the disciplined move is to hold it back until it earns its place, or to let it go.

Key takeaways

Retraction inside a live manuscript is a routine act of integrity, not a scandal: an early draft carried a speed claim (under 30 seconds per metre against a four-to-five-minute baseline), a reviewer challenged it, and it was withdrawn in revision and is absent from the published record.
The hard case is not fabrication but the honestly-obtained number you cannot defend: the speed figure came from real runs, but the hardware, input intervals and baseline implementation were never pinned down precisely enough for a stranger to reproduce the ratio.
Under adversarial peer review the burden of proof sits with the author, and speed is the most condition-sensitive number of all: admissibility depends on disclosing enough measurement conditions that a skeptic could check the claim, not on the claim probably being true.
Withdrawing beat defending because the claim was not load-bearing: the paper's case rested on inspectable per-cavity geometry, so building a benchmark rig to rescue an inessential number was the wrong use of effort, and deleting the line strengthened the argument.
Having withdrawn the number, the piece does not rely on it: no throughput claim is made here at all, and the honest record is the version without the line, with the deletion itself documented rather than erased.

Limitations

This is an account of a single deletion in a single manuscript, offered as a worked example rather than a general theory of when to withdraw. The judgement that the speed line was inessential was ours, made against this paper's specific argument; a different paper, where throughput genuinely carried the case, would have owed the reader the benchmark rig instead of a deletion, and the discipline there would have been to build it, not to drop the claim. We deliberately make no speed claim in this piece, so nothing here should be read as a statement about how fast the detector was or was not. The reconstruction of the reviewer exchange is from our side of the record; the reviewer's own reasoning is inferred from the challenge, not quoted. Identifying details of the operator, the wells, and the personnel are masked throughout for confidentiality.

References

[1] Committee on Publication Ethics (COPE). Guidelines on corrections and retractions to the published record. The distinction between correcting, withdrawing, and retracting a claim, and the principle that the record should be transparent about changes.

[2] Li, X., et al. Path-morphology baseline for borehole-image feature extraction (2019). The morphological baseline against which the withdrawn speed line had been compared before it was removed in revision.

The Number We Deleted: Retracting a Claim Mid-Manuscript

The claim, exactly as it stood

What the reviewer actually asked

Two ways to respond, and why we chose the second

The record is the version without it

Why this generalises past one paper

Limitations

References

Continuous AI for explorers

About Earthscan

Products

Legal

Follow us on