The Refinery

The durable business of this era converts private judgement into legible, priced outcomes and artifacts.

Jun 12, 2026

The easiest causal reading of Sarah Guo’s excellent recent essay The Untrainable is some version of … hide.

The models will absorb whatever can be measured, so retreat to work that can’t be. Get inside private workflows. Defend the judgment no benchmark can reach, and keep retreating ahead of the benchmarks. The despair underneath this is real. If every legible thing is on its way to becoming training data, the obvious answer might be to become less legible.

Why is it wrong?

Simply put, the untrainable is not a place to hide. Illegibility is not a moat. It is unrefined crude.

The durable business of this era is not the company that merely occupies private ground truth. It will be the companies that own the process for converting private ground truth into legible, priced, traceable, trainable artifacts. The companies that win this decade will not be the ones hiding from absorption. They will be the ones running it.

That process is….

Sarah’s essay is right about the terrain, and right that the next great application companies need permission: access to private correctness, private data, private workflows, private trust. But the terrain is not the business. The business is what you do once you are inside it.

You refine.

Legible means cheap for someone else to verify. You know: a test suite makes code legible. A signed clinical note makes medical judgment legible. A resolved support ticket makes customer service legible. A price makes an outcome legible. A spec makes intent legible :)

Kerosene

When kerosene commoditized, the spoils didn’t go to the producers who fled to artisanal corners of the oil business. They went to John D. Rockefeller, who owned refining, the step that turned crude into the thing everyone had to buy. Commoditization was never the threat to that position: it was the engine of it.

The same position is open today, at far greater scale.

Crude in: private judgment, tacit knowledge, customer history, institutional memory, undocumented workflow, relationship trust, the unwritten definition of done inside tens of thousands of firms.
Product out: evals, traces, specs, corrections, outcome contracts, benchmarks, prices.

Throughput rises with every model generation. Standing in front of this process is the despair. Standing on top of it? That’s a helluva business mannggg.

This new tide has a schedule

The demand side is guaranteed. The labs replace their own flagships on a schedule, cut prices before anyone forces them to, and treat last quarter’s model as this week’s footnote. The faster the models improve, the more the refinery earns.

Anything you can measure becomes a benchmark. Anything that becomes a benchmark gets trained against. Anything trained against commoditizes. Sarah names this the absorption frontier.

Here I’ll just call it the tide, because it 1) has a schedule and 2) does not run backward.

For example, HumanEval took about three years to die. MATH took three and a half. SWE-bench Verified was introduced as the serious test of real software work, and frontier models began crossing scores that would have sounded absurd not long before. Different tasks, different decades of accumulated human skill turning into a singular shape. And each new sandcastle melts faster than the last, because the tide is the thing improving.

The hiding strategist looks at current gaps and thinks: sanctuary.

All the judgment, review, trust, context, taste, liability, and organizational memory the model can’t touch. I know, let’s LIVE THERE!

But it is inventory, already waiting in a queue.

It is what the current tide hasn’t reached yet. And it is what someone will refine next.

That is the subtle distinction. Private access matters enormously. It may be the first requirement of any great AI application company. But access is not defensibility by itself. Being inside the workflow only matters if you instrument the workflow. Being trusted by the customer only matters if that trust gives you the right to observe correction, define outcomes, capture judgment, and improve the system.

Sarah is right that the ground shrinks under whoever stands on it. So the question is not where you can hide forever. There is no forever. The question is whether you can keep converting the present frontier of illegibility into the next layer of legible product.

The artifact depreciates. The refinery compounds.

So keep at it, until it’s fully digested. As a strong stomach digests whatever it eats. As a blazing fire takes whatever you throw on it, and makes it light and flame. — X. 31

A price is an eval with a billing API

The companies that follow the refining process are converting illegibility into legible product as fast as the relationship allows.

Intercom prices Fin, its support agent, at 99 cents per resolution. The agent closes the conversation and the customer pays a dollar. This gets celebrated as the masterstroke of selling illegible value: outsiders can’t verify the work, so get inside and price the outcome.

But that is not a defense of illegibility. It is the conversion of illegibility into a unit of account with a price and a billing API attached.

Every resolved conversation is a labeled example: this input, these actions, this output, graded by the one judge that matters, the customer’s money.
And read the fine print on who decides what resolved means. That definition is the product.
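
To make that concrete, here is a minimal sketch of a resolved conversation as a labeled example, with the definition of resolved as a swappable rule. Everything in it is an assumption for illustration: the `Conversation` fields, the two rules, and `bill_and_label` are hypothetical names, not Intercom's actual schema, policy, or API. Only the 99-cent price comes from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

PRICE_PER_RESOLUTION_CENTS = 99  # the 99-cent figure quoted in the essay


@dataclass(frozen=True)
class Conversation:
    """One support conversation: input, actions, output (hypothetical schema)."""
    customer_message: str
    agent_actions: Tuple[str, ...]
    agent_reply: str
    customer_confirmed: bool   # assumed signal: the customer said it was solved
    escalated_to_human: bool   # assumed signal: a human had to take over


# The definition of "resolved" is just a function, so it can be read and swapped.
ResolutionRule = Callable[[Conversation], bool]


def resolved_strict(c: Conversation) -> bool:
    # Assumed rule: counts only if the customer confirmed and no human stepped in.
    return c.customer_confirmed and not c.escalated_to_human


def resolved_lenient(c: Conversation) -> bool:
    # Assumed rule: counts whenever no human had to step in.
    return not c.escalated_to_human


def bill_and_label(
    conversations: Iterable[Conversation], rule: ResolutionRule
) -> Tuple[int, List[Tuple[Conversation, bool]]]:
    """Return (amount owed in cents, list of (conversation, resolved?) labels)."""
    labeled = [(c, rule(c)) for c in conversations]
    resolved_count = sum(1 for _, ok in labeled if ok)
    return resolved_count * PRICE_PER_RESOLUTION_CENTS, labeled
```

The same pass over the data produces the invoice and the training labels, and changing the rule changes both at once. That is the sense in which the definition is the product.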

The company is not defending illegible territory. It is running a legibilization machine at a dollar per datum.

Abridge is another example. It is medicine’s moat-money-can’t-buy story: ambient AI that turns the clinical conversation into the note, deployed across large health systems and scaled across thousands of clinicians.

At first glance, this looks like exactly the kind of private, illegible workflow the hiding argument would celebrate. Clinical judgment. Physician trust. Patient context. Messy conversation. Institutional deployment. Regulatory friction. Deep workflow integration.

But Abridge is not winning because the work stays illegible.

It is winning because the work becomes traceable.

Every sentence of every draft note links back to the audio that produced it. Every correction a physician makes before sign-off is an expert label captured at the point of care. The final artifact lands inside the system of record.

The note is not merely documentation. It is the interface where tacit clinical judgment becomes auditable, correctable, and eventually trainable.
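
As a rough sketch of what traceable could mean in data terms: each draft sentence carries a pointer to its audio span, and each physician edit becomes a record pairing the model's text with the expert's. The field names and the `corrections_to_labels` helper are hypothetical, invented for illustration, and say nothing about how Abridge actually stores or uses its data.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DraftSentence:
    text: str
    audio_start_s: float  # start of the supporting audio span, in seconds
    audio_end_s: float    # end of the supporting audio span, in seconds


@dataclass(frozen=True)
class Correction:
    sentence_index: int            # which draft sentence the physician edited
    corrected_text: Optional[str]  # None means the sentence was deleted


def corrections_to_labels(
    draft: List[DraftSentence], corrections: List[Correction]
) -> List[Dict[str, object]]:
    """Turn pre-sign-off edits into (audio span, model text, expert text) records."""
    by_index = {c.sentence_index: c for c in corrections}
    labels: List[Dict[str, object]] = []
    for i, sentence in enumerate(draft):
        correction = by_index.get(i)
        if correction is None:
            continue  # in this sketch, unedited sentences yield no correction label
        labels.append({
            "audio_span_s": (sentence.audio_start_s, sentence.audio_end_s),
            "model_text": sentence.text,
            "expert_text": correction.corrected_text,
        })
    return labels
```

Each record ties an expert's judgment to the exact evidence it was made on, which is what makes the note auditable as well as trainable.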

That too is not hiding from the absorption frontier. It is building a refinery inside one of the most valuable private workflows in the economy.

Mercor does the same: the company pays bankers, lawyers, consultants, physicians, and other professionals to write their judgment down, sells the output to the frontier labs, and then does the thing only trusted insiders are supposed to be able to do: it turns professional work into benchmarks. One is named APEX!

What good white-collar work is (the kind of thing that once lived in apprenticeship, review, taste, correction, and institutional context) is getting written down case by case.

That does not mean every professional judgment is now solved. But the frontier is moving very quickly.

And the next valuable position belongs to whoever has the relationships, permission, and workflow surface area to write down the next layer.

The obvious objection

If every sale is a label, the refinery commoditizes its own ground. It does. That is the oil story retold. Every barrel got cheaper for decades and the refining position compounded anyway.

In AI, the artifact you publish depreciates. The benchmark gets beaten. The eval becomes table stakes. The spec becomes training data. The resolved ticket becomes a pattern. The signed note becomes an example.

But the relationship that produces the next artifact does not depreciate in the same way.

The durable advantage is not the static artifact. It is, as Sarah puts it, the permission to sit inside the work while fresh judgment is still being formed. It is the flow of new decisions. It is the right to observe the correction. It is the discipline to write down what everyone else leaves implicit.

Hiding does not opt you out. Judgment exercised through a model leaves exhaust, and exhaust gets refined by someone.

You are the seller or the leak.

My own writing does not escape this. I have spent a year arguing that implementation is disposable and the spec is the asset. But a spec is judgment made explicit, and explicit means trainable. What I wrote last year is, or soon can be, crude in someone’s refinery pipeline.

The moat was never the page. It was the pen and the permission to be inside the work. The flow of fresh decisions. The habit of making judgment explicit before the tide arrives.

That is why The Untrainable is right about the terrain and incomplete as a final map. The untrainable is where value survives for now. The refinery is how that value becomes a company.

The most cited benchmark score of this year is not a monument. It is a surveyor’s stake: proof of where the crews just finished, and fair warning of where they go next.

When they reach your ground, will you be the territory? Or the one drawing the map?

Read it, you really should

Meditations on Tech
