Scale economies

A business has scale economies when its unit cost falls as volume rises. It’s a supply-side force — the advantage is in your cost structure, not in your users — and it’s the one moat the model layer genuinely has. Don’t confuse it with network effects (demand-side) or switching costs (lock-in).

Section 01

The mechanism

Fixed costs get spread over more units. A chip fab, a film, a search index, a frontier training run — each costs a fortune to build and almost nothing to serve one more time. The more you sell, the lower the cost per unit, so the biggest player can be both the cheapest and the most profitable.
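A minimal sketch of that arithmetic, assuming one up-front fixed cost and a constant cost to serve each extra unit. The numbers are hypothetical and `unit_cost` is an illustrative helper, not something from any source; the point is only how fast the fixed-cost share of each unit shrinks as volume grows.

```python
def unit_cost(fixed_cost: float, marginal_cost: float, units: int) -> float:
    """Average cost per unit: the fixed cost spread over volume, plus the per-unit marginal cost."""
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost / units + marginal_cost


# Hypothetical figures, chosen only for illustration.
FIXED = 1_000_000.0   # one-time cost to build the fab, film, index, or model
MARGINAL = 0.5        # cost to serve one more unit

for units in (1_000, 100_000, 10_000_000):
    cost = unit_cost(FIXED, MARGINAL, units)
    print(f"{units:>12,} units: {cost:,.2f} per unit")
```

At low volume the fixed cost dominates each unit; at high volume the unit cost approaches the marginal cost, which is where the biggest player's pricing room comes from.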

As Elon Musk frames the hard-tech version: competitiveness is set by two things together — level of technology and level of scale. Maximize both and the cost position is very hard to attack.

Section 02

The dangerous part: scale economies invite commoditization

Scale economies drive concentration into a few big players, but they are not a moat against an equally large rival. If two competitors both have scale, the cost advantage cancels and the product competes on price. So scale economies are necessary but not sufficient for a durable advantage: on their own they push a category toward a low-margin commodity with a handful of giants. (That’s why they so often need pairing with switching costs or a network effect to become durable.)

Section 03

Scale economies shared

This is the most powerful variant, and the name is Nick Sleep’s term for the Costco and Amazon model. Instead of pocketing the savings from scale, you hand them back to customers as lower prices. Lower prices drive more volume; more volume drives more scale; more scale drives lower costs, which fund still-lower prices. The result is a deliberately under-monetized flywheel that a margin-maximizing competitor structurally cannot match without breaking its own model.

It looks like leaving money on the table. It’s actually buying an ever-widening cost moat with the foregone margin. Amazon turned this into a religion.
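The loop can be sketched as a toy simulation. Everything here is an assumption made for illustration: a constant-elasticity demand curve, cost-plus pricing, and made-up parameters. `run_flywheel` is a hypothetical helper, not a model of Costco or Amazon. Two firms start at the same volume and differ only in the markup they take on top of unit cost.

```python
def run_flywheel(markup: float, rounds: int = 5,
                 fixed_cost: float = 1_000_000.0, marginal_cost: float = 1.0,
                 demand_scale: float = 1e8, elasticity: float = 1.5,
                 start_volume: float = 100_000.0) -> list[tuple[float, float, float]]:
    """Return (price, volume, unit_cost) per round for a firm pricing at cost plus markup."""
    volume = start_volume
    history = []
    for _ in range(rounds):
        # Unit cost at the current scale: fixed cost spread over volume, plus marginal cost.
        cost = fixed_cost / volume + marginal_cost
        # Cost-plus pricing: a lower markup hands more of the scale savings to customers.
        price = cost * (1.0 + markup)
        # Assumed constant-elasticity demand: lower price, more volume.
        volume = demand_scale * price ** (-elasticity)
        history.append((price, volume, fixed_cost / volume + marginal_cost))
    return history


sharer = run_flywheel(markup=0.05)   # shares the savings
keeper = run_flywheel(markup=0.50)   # keeps the margin

for i, (s, k) in enumerate(zip(sharer, keeper), start=1):
    print(f"round {i}: sharer price {s[0]:.2f}, volume {s[1]:,.0f} | "
          f"keeper price {k[0]:.2f}, volume {k[1]:,.0f}")
```

Under these assumptions the low-markup firm ends every round with more volume and a lower unit cost than the high-markup firm. To close the gap, the high-markup firm would have to give up the margin its model depends on.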

Section 04

Do AI models have scale economies? Yes — twice over

This is the one classic moat the model layer really has, on two distinct axes:

Cost amortization. A training run’s enormous fixed cost is spread across every inference that follows. The provider serving the most tokens amortizes a frontier run over the most volume — the textbook fixed-cost-over-volume case. See the cost-of-a-query model.
The data/usage flywheel. More usage yields more interaction data, more long-horizon agentic traces, and more feedback to improve the next model. (Note: this is a scale economy, not a network effect — the benefit returns to the provider through a better product, not to users through each other.)
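The amortization axis can be put in per-token terms. This is a rough sketch with hypothetical figures, and it is not the cost-of-a-query model referenced above. `cost_per_million_tokens` is an illustrative helper that treats the training run as a single fixed cost and inference as a flat cost per million tokens.

```python
def cost_per_million_tokens(training_cost: float, tokens_served: float,
                            serving_cost_per_million: float) -> float:
    """All-in cost per million tokens: amortized training cost plus the cost to serve."""
    if tokens_served <= 0:
        raise ValueError("tokens_served must be positive")
    millions_served = tokens_served / 1_000_000
    return training_cost / millions_served + serving_cost_per_million


# Hypothetical figures, not estimates for any real lab.
TRAINING_COST = 1_000_000_000.0   # one frontier training run
SERVING = 0.50                    # inference cost per million tokens

high_volume = cost_per_million_tokens(TRAINING_COST, 1e15, SERVING)
low_volume = cost_per_million_tokens(TRAINING_COST, 1e14, SERVING)

print(f"provider serving 1e15 tokens: {high_volume:.2f} per million")
print(f"provider serving 1e14 tokens: {low_volume:.2f} per million")
```

With the same training bill, the provider serving ten times the tokens carries one tenth of the amortized training cost on each token it sells.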

And the labs are running the shared version: Anthropic cut Opus pricing while still compute-constrained, and consumption rose by more than the cut, because users had been forcing Opus-grade problems onto cheaper models. The goal isn’t per-token margin; it’s proliferation, then per-customer expansion. That’s scale-economies-shared as an adoption flywheel, and the demand response is the Jevons paradox in the wild.
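The price-cut arithmetic is worth making explicit, because a cut only grows spending if usage grows by enough. No figures for the Opus cut are given here, so the numbers below are hypothetical, and both helper functions are illustrative rather than taken from any source.

```python
def breakeven_usage_growth(price_cut: float) -> float:
    """Fractional usage growth needed to keep total spend flat after a fractional price cut."""
    if not 0.0 <= price_cut < 1.0:
        raise ValueError("price_cut must be in [0, 1)")
    return 1.0 / (1.0 - price_cut) - 1.0


def spend_change(price_cut: float, usage_growth: float) -> float:
    """Fractional change in total spend: new price times new volume, relative to before."""
    return (1.0 - price_cut) * (1.0 + usage_growth) - 1.0


# Hypothetical scenario: a 50% price cut.
CUT = 0.50
print(f"break-even usage growth: {breakeven_usage_growth(CUT):.0%}")
print(f"spend change if usage grows 60%:  {spend_change(CUT, 0.60):+.0%}")
print(f"spend change if usage grows 150%: {spend_change(CUT, 1.50):+.0%}")
```

The Jevons-style outcome is the last case: usage grows past the break-even point, so total spend rises even though every token got cheaper.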

But recall Section 02: scale economies invite commoditization. Several labs now have frontier-grade scale, so the cost advantage partly cancels — which, combined with weak network effects and low switching costs, is why raw intelligence is commoditizing even though the underlying moat is real.

Where this is used

One of three moat forces Peregrinations keeps distinct — alongside network effects and switching costs. Any cost-structure or flywheel argument links here.