Computational Abundance

The bubble builds the intelligence grid. The bust distributes it.

May 23, 2026

Across the AI supply chain, millions of accelerators are in motion: ordered, manufactured, staged, shipped, queued for facilities that do not yet have power.

At any moment, many of them sit on pallets in warehouses around the world. Climate-controlled, security-gated, quiet. Waiting for a transformer. Waiting for a substation. Waiting for steel from another state.

Some may never be plugged into the slots they were ordered for. For others, one, two, even three generations of silicon will ship before they see a power connection.

The first AI capital cycle is overbuilding the machinery of intelligence before the economy knows how to use it. The chips, asleep on pallets, will go to work anyway.

I. The Plug

The limit on AI is electricity, permitting, and time.

The hyperscalers guided to over $700 billion of AI and cloud infrastructure spending for 2026 alone.^[1] To fund it, they have turned to record corporate debt and a quarter-trillion dollars of venture capital in a single quarter (Appendix A).^[2]^[3]

None of it ships intelligence to a customer. It ships steel, copper, silicon, and concrete to a power easement that does not yet exist.^[4]

Amazon told investors European grid interconnections can take up to seven years. A datacentre takes about two years to build.

The model trains in months. The transformer arrives in years.

[Image: high-voltage transmission towers and power lines crossing open country under a clear sky.] Caption: The bottleneck is no longer the chip. It is the substation, the transformer, the queue.

The market priced AI as if compute and intelligence were the same thing. They are not. Compute is silicon in racks connected to substations connected to fuel that has to be physically delivered. The chip is the easy part.

Geography entered the argument in 2026. The same natural gas cost six to seven times as much in Europe and Asia as on the U.S. Gulf Coast.^[5] For a continuously loaded 100 MW campus, the fuel spread alone runs roughly $100 million a year (Appendix B).
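The order of magnitude can be checked on the back of an envelope. The sketch below is not the Appendix B model. It takes the 7,754 Btu/kWh heat rate quoted in Appendix B, assumes the campus draws a flat 100 MW of gas-fired power for all 8,760 hours of the year, and solves for the regional price spread that would produce a $100 million annual gap. The function names are illustrative.

```python
HEAT_RATE_BTU_PER_KWH = 7_754  # EIA 2024 average for gas-fired generation (Appendix B)
HOURS_PER_YEAR = 8_760         # assumption: flat load, no downtime
BTU_PER_MMBTU = 1_000_000


def annual_fuel_mmbtu(load_mw: float) -> float:
    """Gas burned per year, in MMBtu, to supply a constant electrical load."""
    kwh_per_year = load_mw * 1_000 * HOURS_PER_YEAR
    return kwh_per_year * HEAT_RATE_BTU_PER_KWH / BTU_PER_MMBTU


def implied_spread_usd_per_mmbtu(load_mw: float, annual_gap_usd: float) -> float:
    """Regional gas-price difference that would produce the given annual cost gap."""
    return annual_gap_usd / annual_fuel_mmbtu(load_mw)


fuel = annual_fuel_mmbtu(100)
spread = implied_spread_usd_per_mmbtu(100, 100e6)
print(f"{fuel:,.0f} MMBtu of gas per year")
print(f"${spread:.2f} per MMBtu implied spread")
```

Under these assumptions the campus burns about 6.8 million MMBtu a year, so a $100 million gap corresponds to a spread of a little under $15 per MMBtu between regions.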

The deeper fact is temporal.

Can operators plug chips in before they depreciate?

A bridge delayed by two years is still a bridge. A GPU delayed by two years has crossed an architectural boundary. Hopper to Blackwell. Blackwell to Rubin. Rubin to Feynman.

And the scarce asset is the slot: permitted power, cooling, land, transformer capacity, networking, operators, the legal right to turn electricity into computation.

Every order book is a claim on a future slot. Every “X gigawatts” announcement is a claim on a slot. The order book equals demand only if the slots arrive on time.

In January 2026, AWS raised H200 Capacity Block pricing by 15%.^[6] For the first time in two decades, cloud compute got more expensive. Slot scarcity did that.

II. The Order Book Is Not Demand

Across the largest named projects—xAI’s Colossus, the Oracle/OpenAI Abilene campus, the DOE Blackwell systems, Meta’s multi-gigawatt roadmap—roughly a fifth of the accelerator facility load ordered or planned is actually energised (Appendix C).

That is one half of the gap. The other half is utilisation.

Cast AI’s 2026 survey of enterprise Kubernetes clusters measured an average sustained GPU utilisation of 5% (Appendix D).^[7]

Across the broad surface of paid-for capacity, much of the ordered compute is doing little economic work. The rest is optionality.

The order book is not demand. It is the right to access scarce compute if the workloads arrive.

Capacity announcements are not energised power. A gigawatt of accelerators is a gigawatt of transformers, substations, gas contracts, water permits, and skilled labour.

Reservations are not utilisation. A reserved instance running idle is a depreciating asset, not a productive one.

Capex is not revenue. A balance-sheet wager that the agentic economy will mature on schedule is not the same thing as the schedule maturing.

What has been priced as a vertical demand curve for compute is a vertical demand curve for optionality on compute. In a bottlenecked world, paying for the right to compute later is strategically correct. The problem is duration mismatch: the option is priced as insurance against compute scarcity, but the underlying asset depreciates against an architectural cadence faster than the workload-maturity clock. Options expire. Blackwell-era chips ordered for 2025 deployment run into Rubin and Rubin Ultra. The chips ordered after that run into Feynman-era roadmap pressure.

Aggressive AI-infrastructure pricing assumed agentic workflows would convert model capability into labour substitution on a near-term timetable. The deployment rails did not arrive on time. The frontier-lab demand curve scales. The economy-wide curve materialises eventually. Just not in the energisation window of the chip already ordered.

The frameworks are arriving: NIST AI RMF, ISO/IEC 42001, sector-specific eval suites, the open-source agent-testing ecosystem.^[8] None mature in eighteen months.

The closest analogy is autonomous driving. Google and Waymo demonstrated Level 4 around 2016, impressive enough to convince many observers broad deployment was a year or two away.^[9] A decade later, commercial deployment remains confined to a handful of geofenced cities.

Industrial-grade autonomy requires reliability several orders of magnitude beyond the demo. The last few nines take a decade of guardrails, QA, simulation, verification, geofencing, fleet management, and regulatory engagement. Agentic workflows will follow the same arc.

Capability arrives before verification, regulation, and integration. It always has—in medicine, in finance, in software itself.^[10]

And almost all AI demand growth has concentrated in software engineering. Cursor, Claude Code, Copilot, Codex, Devin: coding built AI’s first billion-dollar revenue lines. Software engineering spent fifty years building exactly the distribution layer the deployment economy requires: compilers, type systems, linters, version control, test frameworks, CI/CD, sandboxed runtimes, code review, IDEs, and open-source ecosystems with documented APIs.

Q1 2026 made the asymmetry impossible to miss.

The frontier labs with the deepest capability and the broadest consumer reach posted quarterly operating losses in the billions while their user growth stalled. The lab whose product lived inside software engineering’s fifty-year-old verification rails was the one converging toward profitability.^[11] The capability was in all three. The deployment scaffolding was not. The quarter’s figures are recorded, with dates, in Appendix M.

Within coding, the adoption curve is approaching its top. Late-2025 surveys from Stack Overflow, JetBrains, and GitHub’s Octoverse put professional-developer adoption of AI tools above 70%.^[12] The one vertical whose distribution layer exists is saturating in new-user conversion. Further growth comes from deepening usage among existing users, a slower, harder curve.

The order book extrapolates coding’s adoption to verticals where scaffolding does not yet exist.

This is the recurring error of every infrastructure cycle that confuses what is possible with what is deployable today. Railroads in the 1870s. Telegraph in the 1840s. Fibre in 1999. The technology is correctly identified. The timetable is not.

III. The Depreciation Clock

Premium power goes to premium silicon.

A hyperscaler with a scarce power permit and an even scarcer substation will not run last-generation chips in that slot. Frontier training optimises for the current architecture: memory bandwidth, interconnect topology, sparsity support, low-precision throughput, and on-package networking.

This produces a strange asset class. The world’s most ambitious computing hardware, paid for at full price, becomes economically displaced inside the very facilities built to house it, because the marginal frontier watt belongs to something newer. Inventory ages. Some of it never reaches a powered rack, because that rack powers its successor.

The depreciation clock has three hands.

Architecture cadence. Two-year cycles, each generation delivering multiples of performance per watt on the frontier workloads the labs care about.

Software stack alignment. Frontier kernels, compiler optimisations, library support, and reference implementations migrate to current silicon.

Power contract. A 100 MW interconnection built in 2027 will not be filled with the GPUs ordered for it in 2024 if the GPUs ordered in 2026 do much more work per watt on the workloads that pay for them.

The three hands rotate together.
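The architecture hand can be made concrete against the wait for power. The sketch below assumes a strict two-year cadence, as described above, and an accelerator ordered at the launch of its own generation; it counts how many newer generations ship before the slot is energised. The function name and sample delays are illustrative, with the seven-year case taken from the European interconnection figure in Section I.

```python
CADENCE_YEARS = 2  # assumption: one new architecture every two years


def generations_shipped_while_waiting(delay_years: float) -> int:
    """Newer architectures released while an accelerator waits for power."""
    return int(delay_years // CADENCE_YEARS)


for delay in (2, 4, 7):
    n = generations_shipped_while_waiting(delay)
    print(f"{delay}-year wait: {n} newer generation(s) ship first")
```

A two-year wait costs one generation, a four-year wait two, and a seven-year interconnection queue three.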

And yet the chip does not stop being useful when it stops being frontier.

In an earlier era, the chip you bought in 1996 was useless in 2000. That era is over. With Moore’s law slowing, per-transistor cost has plateaued.

Per-watt performance still improves, but in narrower bands and on specific workloads. A 7 nm chip from 2019 still does floating-point arithmetic perfectly well. A Hopper-class accelerator from 2023 still performs trillions of multiplications per second per device.

What has changed is the marginal economics of frontier training. The newest accelerators are categorically better there, and they will stay that way.

But most of what the economy asks compute to do is not training a trillion-parameter model from scratch. It is inference, embedding, batch processing, fine-tuning, retrieval, ranking, simulation, rendering, robotics control loops, scientific computing. These workloads care about price per useful operation, latency per query, throughput per dollar, and reliability over time.

A Hopper-era, or plateau, GPU—repriced below the cost basis its first owner needed to recover—is a fantastic inference engine, a fantastic fine-tuning platform, a fantastic simulator. It loses only one contest: the next billion-dollar frontier training run.

And the same AI boom that paid for the order book is bidding up the inputs that make new silicon. Memory and advanced packaging now dominate the cost of an accelerator, with HBM undersupplied through 2027 (Appendix H).

The plateau gets cheaper as the frontier evicts it. The frontier gets more expensive because the boom has bid up its inputs. Meanwhile, software compounding lifts useful work per dollar on the plateau chip.

The frontier gets more expensive faster than it gets universally better. The plateau gets cheaper faster than it gets obsolete.

IV. The Backbone and the Last Mile

The right historical analogy is fibre.

Between 1996 and 2001, telecom carriers laid millions of miles of long-haul fibre across the United States. WorldCom, Global Crossing, Qwest, 360networks, Williams Communications. The capital structures collapsed almost entirely. By the early 2000s, a vast share of installed long-haul fibre was dark. The cost basis of the asset was crushed against the wall of cancelled customer demand.

And then it became the spine of the internet.

But it took twenty years.

The bottleneck was never the backbone. The backbone was finished by 2001. The bottleneck was distribution: the last copper mile between the optical trunk and the household. Dial-up at kilobit speeds, DSL at megabit speeds, then cable DOCSIS, fibre-to-the-curb, fibre-to-the-home, 3G, 4G, 5G. The optical fibre did its job in eighteen months. The distribution layer took two decades.

Netflix streams on that fibre. YouTube serves on that fibre. AWS replicates across it. The applications were waiting on the last mile.

The dot-com bust was correct ideas on the wrong timetable, financed at the wrong duration, against an underbuilt distribution layer.

AI is building the same shape.

The compute backbone is growing at unprecedented scale: hyperscaler campuses, sovereign clusters, megaprojects in Texas, Wyoming, the UAE, Saudi Arabia, Korea, Japan, and when the land runs out, the sea. The accelerator order books reach years into the future.

The models are capable. The distribution layer is not.

In AI, the distribution layer is reliability engineering, verification, observability, permissions, audit, rollback; agentic harnesses, distillation, quantisation, and the optimisations that make smaller models perform like larger ones on commodity hardware; integration into industry software stacks, regulatory regimes, procurement, insurance, liability, and workflow ownership. A frontier training run produces an input. Turning that input into intelligence the economy can apply is the distribution problem.

Coding got there first because coding has compile, lint, test, and revert built in by default. Every output verifies in seconds against deterministic rails. Most other domains do not have that. They build their distribution layer from scratch, against domain-specific failure modes, under regulatory regimes that did not contemplate autonomous agents, inside organisations whose change-management cycles measure in years.

In the meantime, the backbone is being financed at duration assumptions the distribution layer cannot meet.

The reset may propagate more quietly than the fibre crash. Hyperscaler balance sheets are larger than the telecom carriers’ were. Internal cascading absorbs more of the inventory before it ever reaches an external market. Sovereign and enterprise buyers can hold compute on long horizons without forced sale.

The AI cycle may not need WorldComs and Global Crossings to deliver the same outcome. The shape is the same: backbone overbuilt against an underbuilt distribution layer, capacity repriced as the timetable resets, infrastructure that no first owner can monetise becoming the foundation the next owner can.

Bubbles build infrastructure.

V. The Repricing Mechanism

The important event is migration.

The mechanism is simpler than a financial crisis. Hyperscalers have a finite number of premium power slots. The marginal economics force them to fill those slots with the newest hardware. Last-generation chips do not have to fail commercially to lose their place. They only have to fall behind the per-watt economics of the next generation on the workloads the slot was built for.

What happens after is where Abundance lives.

[Image: an NVIDIA HGX H100 board, eight gold-topped GPUs on a black baseboard, on a dark reflective surface.] Caption: A Hopper-class accelerator. Displaced from the frontier, perfect for the plateau. Photo: NVIDIA.

The compute flows through the slot hierarchy by several channels at once. The dominant and quietest is the hyperscaler-internal cascade: chips bought for frontier training drop to internal inference, then to ad ranking, then to batch work. The same silicon quietly serves different economics inside the same firm, year by year. It shows up only as falling unit prices in cost-per-token guidance and cheaper hyperscaler inference SKUs.

In May 2026, Anthropic took over the entire ~290 MW capacity of Colossus 1, the Memphis facility built around 220,000+ Nvidia processors, under an agreement disclosed in SpaceX’s S-1: $1.25 billion per month through May 2029, with reduced ramp fees and a 90-day termination right for either party.^[13] xAI shifted internal frontier training to Colossus 2 in Southaven, Mississippi, deploying next-generation GB200 Blackwell hardware.

Meanwhile, the planned expansion of the Oracle/OpenAI Stargate campus in Abilene from 1.2 GW to 2.0 GW collapsed in early 2026 under financing complexity, revised OpenAI demand forecasts, and a winter weather event that took the liquid-cooling loop offline for several days. Microsoft took over the adjacent expansion with Crusoe, adding two buildings and an on-site 900 MW gas plant alongside Oracle’s existing 350 MW backup.^[14]

Beyond the hyperscaler tier, the institutional layer absorbs another wave: tier-2 colocation, regional clouds, sovereign clusters, enterprise on-premise, university research, industrial sites with cheap local power, and crypto-era datacentres rebuilt for a different workload.

Underneath the institutional layer is the edge: a workstation in a research lab, a server in a small business, a single accelerator in a household. The slow-Moore cushion plus algorithmic deflation means a last-generation accelerator running a distilled, quantised model is a capable assistant on a personal machine.

The Slot Hierarchy

Frontier slot - newest accelerators, premium permitted power, largest training runs. Few sites. Enormous capex. Capability discovery.

Plateau slot - previous-generation accelerators, cheaper power, inference, fine-tuning, simulation. Many sites. Modest capex per site. Capability diffusion.

Edge slot - laptops, phones, embedded GPUs, local agents. Near-zero marginal cost. Ubiquitous. Capability access.

A second axis runs through the slot hierarchy.

Inside the plateau tier, hyperscaler public cloud is the premium access layer. The spread of up to 10× between hyperscaler on-demand H100 SKUs and specialist clouds delivering the same hardware (Appendix J) is not depreciation. The chip is identical. The premium is for trust, energised slot, procurement, immediate availability, and enterprise support.

That premium is durable for one class of customer and a tax for another. The boom phase had one scarce unit: access to any serious GPU capacity. Hyperscalers dominated it. The deployment phase has a different scarce unit: verified intelligence inside a workflow at acceptable cost. That lives wherever regional cloud, sovereign cluster, second-tier colocation, enterprise rack, or industrial site provides verified compute at the lowest fully-burdened cost.

Hyperscalers may retain revenue share while losing workload share. The cloud keeps the frontier. The plateau escapes the cloud.

So the new chip wins wherever the slot is scarce. And in the migration, the old chip discovers a different slot. Frontier scarcity is contiguous high-density power: hundreds of megawatts to gigawatts behind a single substation, with the cooling and operator depth for synchronous training. Plateau abundance is fragmented low-density power: regional colocation, sovereign and university clusters, industrial behind-the-meter capacity, commercial buildings, household nodes. At full chip price, a new accelerator cannot justify those slots. Written down, an older one can.

The financial damage of the bust is real but secondary. Some neocloud operators levered against five-year PPAs and six-month customer books do not survive. Some equity gets wiped. Some debt gets restructured. The losses fall hardest on the investors who underwrote the cycle at its peak. The structural event, underneath, is migration. The silicon changes hands, changes facility, changes country, and changes scale, from concentrated frontier campuses to distributed plateau and edge slots that did not previously have access to compute at this price.

Internal cascade, institutional absorption, edge fabric: each shows up most often as a falling unit price on a billing page. The physical chip may never leave the hyperscaler that ordered it. Abundance arrives anyway.

The first buyer takes the loss. The second buyer takes the asset.

The mechanism has been operating across every prior datacentre accelerator generation. T4 at year 8 still serves production inference across AWS and GCP at a fraction of its launch price. A100 at year 6 carries the bulk of enterprise inference at heavily depreciated cloud-SKU pricing. H100 at year 3–4 already offers more memory bandwidth per dollar than a new B200 at retail. The bust does not create the cascade. It scales one that has been continuously operating since at least 2017.

In every prior infrastructure boom this is how the abundance arrived. Canals in Britain. Railroads in the United States. Optical fibre at the turn of the millennium. The infrastructure outlives the company that built it. The cost basis resets. The customer who could not afford the asset at the asking price finds it accessible at the recovery price. The customer who could not access it at all, because it was concentrated in a few hands at a few sites, finds it nearby.

Liquidation makes headlines. Redistribution changes the world.

VI. Stockfish for Everything

The maximalist version of the AI story imagines one enormous model doing everything. A country of geniuses in a datacentre. A digital workforce replacing labour wholesale.

Stockfish is the strongest chess engine in human history. It wins because thirty years of engineering have produced a system in which a small neural network sits inside ruthlessly optimised search code, transposition tables, opening books, endgame tablebases, move-ordering heuristics, pruning techniques, and a testing harness so disciplined that no patch lands without statistical proof of improvement.
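Stockfish's public testing framework, Fishtest, gates patches with a sequential probability ratio test (SPRT). The sketch below shows the idea in its simplest form and is not Fishtest's implementation: it uses Wald's SPRT on decisive games only, ignoring draws and the Elo-based hypotheses the real framework uses. The thresholds `p0`, `p1`, `alpha`, and `beta` are illustrative assumptions.

```python
import math


def sprt_decision(wins: int, losses: int,
                  p0: float = 0.50, p1: float = 0.52,
                  alpha: float = 0.05, beta: float = 0.05) -> str:
    """Wald's sequential probability ratio test on decisive games.

    H0: the patched engine wins with probability p0 (no improvement).
    H1: it wins with probability p1 (a real improvement).
    """
    # Log-likelihood ratio of H1 against H0 for the games seen so far.
    llr = wins * math.log(p1 / p0) + losses * math.log((1 - p1) / (1 - p0))
    upper = math.log((1 - beta) / alpha)  # accept H1 at or above this bound
    lower = math.log(beta / (1 - alpha))  # accept H0 at or below this bound
    if llr >= upper:
        return "accept patch"
    if llr <= lower:
        return "reject patch"
    return "keep testing"


print(sprt_decision(600, 500))  # strong evidence of improvement
print(sprt_decision(500, 600))  # strong evidence against
print(sprt_decision(50, 50))    # not enough games yet
```

The test keeps playing games until the evidence crosses one bound or the other, which is what lets a harness demand statistical proof without fixing the sample size in advance.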

Modern Stockfish now outperforms the original AlphaZero-style benchmark systems. A small neural network wrapped in engineered scaffolding, running on commodity CPUs, beats a much larger pure-neural approach that required GPU compute just to play.

Stockfish matters here because it shows how useful intelligence becomes reliable: by embedding learned judgement inside an engineered system that constrains, tests, and corrects it.

Most economically useful AI will look the same.

The same arc—imitation learning, then reinforcement learning, then engineered environments—has played out three times in three different domains.

Game-playing AI ran it through the 2010s, ending in Stockfish-class systems: a small NNUE network wrapped in three decades of search, tablebases, and a statistical testing harness, outperforming pure-RL approaches on commodity CPUs.

Self-driving ran it through the 2010s and 2020s, ending in Waymo’s L4 stack: specialised perception networks wrapped in HD maps, geofencing, hand-engineered planners, redundant safety layers, and simulation running billions of synthetic miles. The wrapped-and-bounded approach has outpaced pure end-to-end neural driving by a decade in commercial deployment.

Large language models are running it now: imitation at pre-training scale, then RLHF and verifiable-reward RL, and now the environments phase, wrapping frontier models inside sandboxed execution, tool use, retrieval, verification, and rollback.

Three domains, one shape: learned components inside engineered scaffolding, outperforming the unbounded learned system on production tasks. The Stockfish architecture is what deployable AI already looks like.

The economy does not need one god-model. It needs a million Stockfishes.

The electrification cycle produced thousands of appliances, each a bounded engineered system that turned generic grid power into specific economic value. The AI cycle is producing the same shape, in software.

A frontier model is the evaluation function. The system that surrounds it is the chess engine. Without that scaffolding, the model is a powerful piece evaluator playing without rules. With it, the model becomes an engineered agent: bounded, testable, verifiable, deployable.

The model is the evaluator. Capability per token. Powerful, expensive, improving.

The scaffolding is the engine. Search, verification, tool use, permissions, observability, rollback. Mostly conventional software.

The workflow is the game. Legal review, customer support, accounting close, materials design, surgical scheduling, each with its own rules, scoring function, and acceptable failure modes.

The diffusion is the network of engines. Thousands of bounded agents, each specialised, each verified, each cheap to run.

The scaffolding is also the safety answer. The same envelope that verifies an agent’s output constrains its failure: permissions bound what it may touch, observability records what it did, rollback undoes what it should not have done.
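A minimal sketch of such an envelope, assuming a toy key-value workflow state: every name here (`Envelope`, `apply`, `non_negative`) is hypothetical and stands in for whatever permission system, audit log, and validator a real deployment would use.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Envelope:
    allowed_keys: set[str]                                   # permissions
    state: dict[str, str] = field(default_factory=dict)
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)  # observability

    def apply(self, key: str, value: str,
              validate: Callable[[dict[str, str]], bool]) -> bool:
        """Apply one agent-proposed write; return True only if it was committed."""
        if key not in self.allowed_keys:
            self.audit_log.append(("denied", key, value))
            return False
        snapshot = dict(self.state)        # rollback point
        self.state[key] = value
        if not validate(self.state):
            self.state = snapshot          # rollback: undo the write
            self.audit_log.append(("rolled_back", key, value))
            return False
        self.audit_log.append(("committed", key, value))
        return True


def non_negative(state: dict[str, str]) -> bool:
    """Toy validator: every stored amount must parse as a non-negative number."""
    return all(float(v) >= 0 for v in state.values())


env = Envelope(allowed_keys={"invoice_total"})
results = [
    env.apply("invoice_total", "120.00", non_negative),  # within bounds: committed
    env.apply("invoice_total", "-5.00", non_negative),   # fails validation: rolled back
    env.apply("payroll", "999.00", non_negative),        # outside permissions: denied
]
print(results, env.state)
for entry in env.audit_log:
    print(entry)
```

The agent's proposal never touches state it was not granted, a failed check restores the previous state, and every attempt leaves a record whether or not it succeeded.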

This is what the migrated infrastructure will host: smaller, distilled, specialised models running on plateau-tier accelerators, wrapped in heavy verification software, deployed against narrow but valuable workflows.

The frontier labs will keep training larger models on the newest hardware. They are correct to. But capability deployment happens on the cheaper, older, more plentiful tier, with smaller models distilled out of the frontier and re-engineered for industrial use.

The recipe is standard. A frontier model serves as the teacher, generating millions of synthetic reasoning traces and logit distributions for a target workflow. A smaller dense student, single-GPU class, is fine-tuned to match those traces, then aligned with reinforcement learning against an automated validator: a compiler, a schema check, a domain-specific scorer. The result is a deterministic specialist that retains most of the teacher’s accuracy on the bounded workflow at a small fraction of the inference cost.
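The "match the teacher's logit distributions" step is commonly implemented as a soft-target loss. The essay does not specify one, so the sketch below assumes the usual choice: KL divergence between temperature-softened teacher and student distributions. It is written in plain Python for a single token position, with made-up logits; a real pipeline would compute this over batches in a tensor library.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: list[float],
                      student_logits: list[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [4.0, 1.0, 0.5]        # illustrative logits, not real model output
good_student = [3.8, 1.1, 0.4]
poor_student = [0.5, 1.0, 4.0]
print(distillation_loss(teacher, good_student))
print(distillation_loss(teacher, poor_student))
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so minimising it pulls the student toward the teacher's full ranking of options rather than only its top answer.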

Frontier intelligence discovers, judges, and creates. Plateau intelligence executes, repeats, monitors, and verifies.

The cost of running a given level of intelligence on a given piece of hardware falls every year. Distillation, quantisation, sparsity, speculative decoding, KV-cache compression, mixture-of-experts routing, FlashAttention, compiler advances—each compounds on the others. Measured at cost-per-token for a given capability level, the effect is orders of magnitude.^[15]

The same chip, two years later, runs better software, against smaller models, with cheaper inference, on a broader set of workloads. The depreciation curve and the algorithm curve run in opposite directions. The first owner carries the hardware loss. The second owner inherits the algorithmic gain.

They diverge from the moment the chip ships.

Hardware depreciates. Algorithms compound.

A bridge does not become a better bridge while it waits to be opened. A length of fibre does not become more capable while it sits unlit. An AI accelerator does. The chip on the pallet, waiting two or three years for a power connection, is incubating in capability. The hardware sits still. The world compounds around it.

Project that forward against a glut of Hopper-class and Blackwell-class accelerators all benefiting from the same compounding software stack. Inference prices fall sharply on hyperscaler SKUs and on the open market alike. Capability per dollar continues to climb.

Abundance is what happens when repriced compute meets three other layers: a software stack that lets smaller models do bigger work, engineered scaffolding that constrains those models to verifiable envelopes, and institutional verification frameworks such as evals, audit, rollback and regulatory acceptance.

VII. The Asymmetry

Time does two opposite things to AI hardware at once.

For the first owner, time destroys value. The chip misses its frontier slot, gets overtaken by the next generation, becomes economically wrong for the high-density power site it was meant to occupy.

For the second owner, time creates value. The same chip runs distilled models, cheaper inference stacks, better kernels, and domain-specific workflows that did not exist when it was manufactured. The deployment layer, in those intervening years, learned how to extract more intelligence from it.

Plateau hardware wins where the workload threshold is met, the model fits inside the older silicon’s memory footprint, latency is acceptable, and power is cheap enough that operating margin survives. The frontier retains the workloads where capability appetite remains uncapped: the largest training runs, the most demanding inference latencies, the models that exceed the previous generation’s VRAM ceiling, the deployments behind premium-priced gas turbines where every watt has to earn its keep.

The frontier also retains a class of capability that has yet to be discovered. ARC-AGI-3, the interactive reasoning benchmark released in 2026, flatlined every frontier model at launch. The climb since is what the frontier is for.

Intelligence compounds in capability and falls in price faster than institutions can metabolise intelligence at any fixed level. Once a model is good enough for contract review, tutoring, diagnosis support, software analysis, scheduling, monitoring, or simulation, the deployment layer needs years to build the verification, liability, procurement, and trust infrastructure required to use it. Industry continues pushing that same level of capability down the cost curve by 10×, 100×, 1,000×.

The world will never again have scarce intelligence.

This is the intelligence endowment. At every threshold of useful capability, intelligence falls toward ambient cost before the deployment layer has finished absorbing the previous wave. The endowment is practically inexhaustible.

VIII. The Intelligence Layer

The cost of computation has never had a stable demand curve.

Every prior collapse in compute price produced a larger market overall, organised around use cases that were uneconomic at the previous price—business software, video games, AI. Cheaper bandwidth produced streaming, social networks, cloud, SaaS, video calls, online gaming, the mobile internet.

When the price of a plateau slot collapses, three things happen—

Things once scheduled become continuous. Climate models. Materials search. Robotics simulation. Drug-candidate screening. What used to be a study becomes a query. What used to be a query becomes a background process.

Things once centralised become local. Inference on every laptop and phone, with full privacy and zero latency. Personal agents on the user’s device, not the cloud.

Things once expert-only become ambient. Legal review against every contract. Diagnostic second opinions for every imaging study. Tutoring at the marginal cost of a kilowatt-hour. Industrial optimisation for every shop floor.

Demand grows faster than price falls.

Create an Age of Wonders named this: intelligence as infrastructure.^[16] The post-bust world is what that phrase means in practice.

Bandwidth became an ambient property around 2010. Electricity became ambient in the 1930s. Refrigeration. Plumbing. Roads. Each started as an event and became infrastructure: present, cheap, taken for granted.

Compute is in the middle of that transition. Intelligence is following it.

[Image: rows of server racks in a datacentre lit in deep blue, receding into the dark.] Caption: Compute cheap enough to diffuse through everything.

The shape of the post-bust intelligence layer is already visible. Smaller models embedded inside ordinary software. Local agents on laptops and phones. Open-weights ecosystems running on commodity hardware. Specialist models for thousands of narrow tasks.

The frontier labs will still exist. The frontier models will still be expensive. The capability gradient will keep climbing. But the deployment surface, where the economic work gets done, will be the cheap, plentiful, diffused tier.

Carlota Perez named the pattern. The installation phase is when capital floods into a new infrastructure faster than society can reorganise around it. The financial crisis arrives when the timetable proves wrong. The deployment phase follows, decade-long, on the infrastructure left behind. Canals, railroads, electricity, highways, fibre. Each produced a financial event that obscured an infrastructural arrival. The arrival is what mattered.

Paul David showed it took American factories forty years after electrification to put a motor on every machine. The productivity payoff arrived only when they did.^[17]

Electrification happened in the 1890s. The productivity payoff arrived in the 1930s.

Forty years to put a motor on every machine. The AI cycle is at the start of the same long arc, on a compressed clock. Intelligence will become boring infrastructure.

The AI bubble is the price of making intelligence accessible.

The first financiers inherit the loss.

Civilisation inherits ambient intelligence.

Technical Appendix

Key arithmetic and supporting data behind the quantitative claims in the essay body. Inline citations map to the reference section below.

A. AI Capex Aggregate, 2025–2026

The bond-issuance line signals a regime change: historically self-funding technology platforms have shifted to external debt markets as free cash flow compresses under physical buildout.

Source	Figure	Period	Notes
Hyperscaler AI/cloud capex guidance (Microsoft, Alphabet, Meta, Amazon, Oracle, CoreWeave)	>$700B aggregate	FY2026 plans	Industry sum from Reuters aggregation across hyperscaler guidance; Moody’s projects ~$820B in FY2027.^[1]
Hyperscaler corporate bond issuance	$121B (2025) → $175B projected (2026)	Annual	Top-five U.S. hyperscalers; ~4× historical annual average. Amazon’s $54B March 2026 issuance was the company’s largest-ever debt transaction. Bank of America projection for 2026.^[2]
AI venture funding	~$255B	Single quarter (Q1 2026)	PitchBook tracked quarterly deployment, exceeding the full 2025 total. Dominated by OpenAI ($122B), Anthropic ($30B), and xAI ($20B) financings; excluding the top-five transactions, broader quarterly deal value contracts by 73%.^[3]
U.S. datacentre investment	$45.7B PE	FY2025	S&P Global Market Intelligence; PE deployment represented 72% of $63.35B total sector investment.^[18]

The figures are not comparable line items. Capex, debt issuance, venture funding, and private-equity deployment are different cash flows. Together they describe the scale and structural mutation of finance directed at the AI infrastructure stack.

Amazon’s trailing-twelve-month free cash flow fell roughly 95% to $1.2 billion in the same window it issued $54 billion in bonds. Meta issued $30 billion in October 2025. Alphabet issued $32 billion in February 2026, including a rare 100-year bond. Oracle planned $25 billion for 2026 against an upward-revised $50 billion FY2026 capex guide.

B. Energy Marginal Cost

Using the U.S. Energy Information Administration’s 2024 average operating heat rate for natural gas-fired generation (7,754 Btu/kWh)^[19] and front-month gas prices in May 2026:^[5]

Hub	Front-month gas	Fuel-only generation cost	Multiple vs. Henry Hub
Henry Hub (U.S.)	$2.83/MMBtu	~$22/MWh	1.0×
TTF (Europe)	~$17/MMBtu	~$132/MWh	~6.0×
JKM (Asia)	~$18.80/MMBtu	~$146/MWh	~6.6×

For a 100 MW continuously loaded campus (8,760 hours per year at full load = 876,000 MWh annual draw), the fuel-only marginal-cost spread between Henry Hub and TTF is approximately $96M/year. The Henry Hub to JKM spread is larger, roughly $108M/year, so the TTF figure is the lower bound on the fuel-only delta between the U.S. and the two import regions. It excludes transmission, capacity charges, cooling overhead, financing, and carbon costs, and is not a full delivered-power cost comparison.
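The arithmetic behind the table and the spread can be reproduced in a few lines. This is a minimal sketch using only the heat rate and hub prices quoted above; the function names are illustrative, not taken from any source.

```python
HEAT_RATE_BTU_PER_KWH = 7_754  # EIA 2024 average for natural gas-fired generation

def fuel_cost_per_mwh(gas_price_per_mmbtu: float,
                      heat_rate: float = HEAT_RATE_BTU_PER_KWH) -> float:
    """Fuel-only generation cost in $/MWh."""
    # Btu/kWh * 1,000 kWh/MWh / 1,000,000 Btu/MMBtu = heat_rate / 1,000 MMBtu/MWh
    return gas_price_per_mmbtu * heat_rate / 1_000

def annual_fuel_spread(price_low: float, price_high: float,
                       load_mw: float = 100.0, hours: int = 8_760) -> float:
    """Annual fuel-only cost difference in dollars for a continuously loaded campus."""
    annual_mwh = load_mw * hours
    return (fuel_cost_per_mwh(price_high) - fuel_cost_per_mwh(price_low)) * annual_mwh

HUBS = {"Henry Hub": 2.83, "TTF": 17.00, "JKM": 18.80}  # $/MMBtu, front-month, May 2026

for hub, price in HUBS.items():
    multiple = price / HUBS["Henry Hub"]
    print(f"{hub}: ${fuel_cost_per_mwh(price):.0f}/MWh, {multiple:.1f}x Henry Hub")

spread = annual_fuel_spread(HUBS["Henry Hub"], HUBS["TTF"])
print(f"Henry Hub to TTF spread: ${spread / 1e6:.0f}M/year")
```

Swapping in a different heat rate or campus size changes the dollar figure but not the regional multiples, which depend only on the hub prices.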

C. Named-Project Facility-Load Estimates

Roughly a fifth of disclosed ordered load is plugged in.

Project	Reported scale	Facility load	Confirmed energised
xAI Colossus 1	220,000+ Nvidia processors^[20]	~290 MW (consistent with reported facility power)	~290 MW; 300 MW expansion under build
xAI Colossus programme	~1,000,000 target	~1.3 GW target (heuristic)	Partially built
Oracle / OpenAI Abilene	400,000 GB200-class^[21]	~1.2 GW (facility design, per Oracle / Crusoe disclosure)	2 of 8 buildings operational (~300 MW)
DOE Solstice + Equinox	110,000 Blackwell^[22]	~140 MW (heuristic)	Equinox planned H1 2026
Meta roadmap 2025–2027	“Multi-million” Blackwell/Rubin^[23]	Multi-GW	Undisclosed

The heuristic of ~1.3 kW of all-in facility load per top-end accelerator^[24] is an order-of-magnitude estimate. Where projects have disclosed permitted facility design power (Colossus 1, Abilene), the table uses that figure instead. The point of the table is the gap between announced and energised, not the precise kW-per-chip multiplier.

Summed across the three projects with disclosed status (Colossus ~1.3 GW + Abilene 1.2 GW + DOE 0.14 GW), the order book implies ~2.6 GW of facility load. The mid-2026 energised share, Colossus 1 (~290 MW) plus Abilene 2-of-8 (~300 MW), totals ~0.6 GW.
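The heuristic and the energised share can be checked directly. This is a sketch using only the figures in the table and the paragraph above; the dictionary keys and function name are illustrative.

```python
KW_PER_ACCELERATOR = 1.3  # all-in facility load heuristic used in this appendix

def heuristic_load_mw(accelerators: int, kw_each: float = KW_PER_ACCELERATOR) -> float:
    """Order-of-magnitude facility load in MW for a given accelerator count."""
    return accelerators * kw_each / 1_000

# Ordered facility load in GW, as summed in the text
ORDERED_GW = {"Colossus programme": 1.3, "Abilene": 1.2, "DOE Solstice + Equinox": 0.14}
# Confirmed energised load in GW, mid-2026
ENERGISED_GW = {"Colossus 1": 0.29, "Abilene (2 of 8 buildings)": 0.30}

total_ordered = sum(ORDERED_GW.values())
total_energised = sum(ENERGISED_GW.values())
energised_share = total_energised / total_ordered

print(f"Colossus 1 heuristic: {heuristic_load_mw(220_000):.0f} MW")
print(f"DOE heuristic: {heuristic_load_mw(110_000):.0f} MW")
print(f"Ordered {total_ordered:.2f} GW, energised {total_energised:.2f} GW, "
      f"share {energised_share:.0%}")
```

On these inputs the share comes out at about 22%, which is the "roughly a fifth" quoted at the top of this appendix.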

D. Energisation and Utilisation

The binding constraint is workflow design, not chip capability.

The body does not multiply the roughly one-fifth energised share and the Cast AI 5% utilisation share together. They describe different stages of the compute pipeline:

Energisation share describes ordered facility load that has been turned on. Source: public sample of named hyperscaler and sovereign projects in Appendix C. Roughly a fifth of disclosed ordered load in the sample is plugged in.
Utilisation share describes the time-fraction of an active GPU spent on productive work. Source: Cast AI 2026 Kubernetes survey across AWS, Azure, and GCP enterprise clusters.^[7] Approximately 5% mean sustained.

Frontier training clusters at the labs almost certainly run far higher utilisation than enterprise cloud fleets. The named hyperscaler projects also represent only a subset of total accelerator orders. Treat the two figures as bounding cases rather than a unified ratio. Together they describe a material gap between paid-for capacity and economically productive capacity.

The Cast AI 2026 State of Kubernetes Optimization Report shows the broader pattern: 8% average sustained CPU utilisation, 20% memory utilisation, 69% CPU overprovisioning headroom, 79% memory overprovisioning headroom, and fewer than 2% of GPU workloads on spot instances, primarily because spot capacity was effectively unavailable at hyperscaler scale. Specialised clusters using GPU time-slicing and MIG partitioning sustain up to 49% GPU utilisation.

E. Cost-Per-Token Decline

Training capex and inference opex have diverged by orders of magnitude.

Stanford HAI AI Index and Epoch AI compute-cost time series document a roughly 10× annual decline in inference cost at fixed capability level across the 2022–2026 window.^[15]

GPT-3.5-equivalent capability cost ~$20 per million tokens in late 2022 and ~$0.07 per million tokens in late 2024, a roughly 280× reduction over two years. GPT-4 launched at $30/$60 per million input/output tokens. GPT-4o offered the same capability tier at $3/$10 per million eighteen months later. DeepSeek V3 entered at $0.14 per million input tokens, undercutting GPT-4o by ~95%. Epoch AI reports a median annual decline of ~50× across the inference market.
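To compare the multi-year figures above with the annual rates quoted alongside them, a total decline can be converted to a constant yearly factor. This is a minimal sketch; the helper names are illustrative and the only inputs are the prices quoted above.

```python
def annualised_decline(total_factor: float, years: float) -> float:
    """Constant yearly factor that compounds to total_factor over the period."""
    return total_factor ** (1.0 / years)

def discount(new_price: float, old_price: float) -> float:
    """Fractional price reduction of new_price relative to old_price."""
    return 1.0 - new_price / old_price

gpt35_total = 20.0 / 0.07                                 # late 2022 vs late 2024, $/M tokens
gpt35_annual = annualised_decline(gpt35_total, years=2.0)
deepseek_vs_gpt4o = discount(0.14, 3.0)                   # input-token prices, $/M tokens

print(f"GPT-3.5-equivalent: {gpt35_total:.0f}x total, {gpt35_annual:.1f}x per year")
print(f"DeepSeek V3 vs GPT-4o input price: {deepseek_vs_gpt4o:.0%} lower")
```

On these inputs the two-year GPT-3.5-equivalent decline annualises to roughly 17× per year, between the ~10× and ~50× annual rates quoted in this appendix.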

The decline is attributable to hardware improvement, algorithmic efficiency (quantisation, distillation, sparsity, speculative decoding, KV-cache optimisation, mixture-of-experts routing, FlashAttention, and compiler advances), competitive pricing from open-weights models, and software stack maturity. The hardware contribution is much smaller than the headline rate. The software-and-competition contribution is dominant.

The training side carries the opposite economic profile. Trillion-parameter mixture-of-experts training requires synchronous gigawatt-class slots, multi-million-dollar coordination overhead across thousands of GPUs, and substantial risk of catastrophic instability. The inference side is memory-bandwidth-bound and runs cheaply on plateau-tier silicon: an MoE architecture reads only a fraction of the total parameter pool from memory per token, so the same silicon hosts a far larger model than its launch generation could deploy. That gap is the structural condition behind the diffusion shape this appendix measures.

F. The Grid-Avoidance Tax

The seven-year European grid interconnection queue^[4] and similar U.S. backlogs have driven hyperscalers and developers toward behind-the-meter generation: modular, on-site, gas-fired power deployed directly alongside datacentres. The canonical 2026 instance is the $1 billion strategic investment by Blackstone Tactical Opportunities and Halliburton into VoltaGrid. The investment supports a 7.5 GW forward order book of modular natural-gas systems through 2030 and manufacturing capacity sized for up to 300 MW per month of reciprocating engines and turbines.^[25] The pitch: skip the queue, install power in months rather than years, operate under simplified minor-source air permits.

The pitch is real. So is the thermodynamic cost.

Thermal efficiency of a generator is the ratio of its electrical output to fuel input, conventionally expressed via the operating heat rate (Btu of fuel per kWh of electricity):

$\eta = \frac{3{,}412.142}{\mathrm{HR}}$

Using the U.S. EIA’s 2024 average tested heat rates by prime mover:^[19]

Generation class	Heat rate (Btu/kWh)	Thermal efficiency	Gas consumption (ft³/kWh)
Combined-cycle utility gas (best in class)	7,548	~45.2%	~7.35
Average U.S. grid natural gas	7,754	~44.0%	~7.55
Reciprocating internal-combustion gas	8,924	~38.2%	~8.69
Simple-cycle gas turbine	10,999	~31.0%	~10.71

Gas consumption assumes approximately 1,027 Btu per cubic foot of pipeline natural gas.

The combined-cycle utility plant captures secondary heat through a steam turbine. Behind-the-meter simple-cycle and reciprocating units do not: the marginal payoff for captured heat does not justify the deployment delay and the engineering complexity at modular scale. Per kWh of electricity delivered, a simple-cycle gas turbine consumes roughly 45.7% more natural gas than a best-in-class combined-cycle plant, and emits CO₂ in direct proportion.

That 45.7% markup is the grid-avoidance tax. The chip is the same. The slot is the same. The workload is the same. The difference is the thermodynamic penalty paid to install power in months rather than years.

The tax compounds the regional cost spreads of Appendix B. In LNG-import regions, where the underlying fuel is already six or seven times the Henry Hub price, a behind-the-meter simple-cycle installation can run at roughly nine to ten times the fuel-only marginal cost of a Henry Hub combined-cycle baseline. Compute is a geography, and the cost of escaping the geography is itself a tax. This is the operational lower bound on the slot-scarcity rent: the marginal slot is not merely expensive. The most readily available substitutes for it are physically inefficient.
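The efficiency table, the 45.7% markup, and the compounded regional multiple all follow from the heat rates. This is a sketch using only the figures in this appendix and the hub multiples from Appendix B; the function names are illustrative.

```python
BTU_PER_KWH = 3_412.142     # energy content of one kWh of electricity
BTU_PER_CUBIC_FOOT = 1_027  # pipeline natural gas, as assumed above

HEAT_RATES = {  # Btu/kWh, from the table above
    "combined_cycle": 7_548,
    "grid_average_gas": 7_754,
    "reciprocating": 8_924,
    "simple_cycle": 10_999,
}

def thermal_efficiency(heat_rate: float) -> float:
    """Fraction of fuel energy converted to electricity."""
    return BTU_PER_KWH / heat_rate

def gas_ft3_per_kwh(heat_rate: float) -> float:
    """Cubic feet of gas burned per kWh generated."""
    return heat_rate / BTU_PER_CUBIC_FOOT

def fuel_penalty(heat_rate: float, baseline: float = HEAT_RATES["combined_cycle"]) -> float:
    """Extra fuel per kWh relative to the combined-cycle baseline, as a fraction."""
    return heat_rate / baseline - 1.0

for name, hr in HEAT_RATES.items():
    print(f"{name}: {thermal_efficiency(hr):.1%} efficient, "
          f"{gas_ft3_per_kwh(hr):.2f} ft3/kWh, {fuel_penalty(hr):+.1%} fuel")

# Compounding with the Appendix B price multiples (TTF ~6.0x, JKM ~6.6x Henry Hub)
tax = 1.0 + fuel_penalty(HEAT_RATES["simple_cycle"])
for hub, multiple in {"TTF": 6.0, "JKM": 6.6}.items():
    print(f"{hub} simple-cycle vs Henry Hub combined-cycle: {multiple * tax:.1f}x")
```

The same function gives reciprocating engines a penalty of about 18%, so the tax depends heavily on which behind-the-meter prime mover is installed.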

G. Slot Density, the OpEx Paradox, and the Migration Path

The body argues that compute migrates down a slot hierarchy: frontier → plateau → edge. A natural objection is that if energised power slots are universally scarce, next-generation silicon will displace even chips with zero remaining CapEx from every slot they could occupy. The migration would collapse before it started. Older hardware would become e-waste rather than redistributed asset.

Call this the OpEx vs slot paradox: even at zero CapEx, an older chip still carries an opportunity cost equal to the next-generation chip that could occupy the same slot. If slots are uniformly bottlenecked, the opportunity cost is uniformly high, and the rational operator runs the most efficient chip available everywhere. Migration is irrational. The older hardware is shredded.

The paradox resolves once you notice that slot scarcity is not uniform.

What is scarce is contiguous high-density power. Single permitted sites delivering hundreds of megawatts to gigawatts behind a single substation, with the cooling, networking depth, and operator capability to absorb a synchronous training run. Those sites are bottlenecked by transformer queues, substation permits, multi-year grid-interconnect cycles, and the institutional capability to operate a hyperscale facility. There are few in the world. Each one is intensely contested.

What is not scarce is decentralised low-density energised capacity. Tier-2 colocation facilities with hundreds of kilowatts of spare power. Sovereign and university clusters. Commercial buildings with industrial HVAC and stable kilowatt-class server rooms. Industrial sites with behind-the-meter generation. Office buildings, research labs, household nodes—millions of permitted, energised, underutilised electrical slots scattered through the built world. Using them requires no grid expansion. They already exist.

Slot tier	Power density	Site count (order of magnitude)	Workload type	Binding constraint
Frontier	100 MW – 1 GW contiguous	tens to low hundreds globally	Synchronous training	Substations, transformer queues, multi-year permitting
Plateau	1 MW – 100 MW distributed	thousands	Inference, fine-tuning, batch, simulation	Existing colocation, sovereign, and enterprise capacity
Edge	0.5 kW – 1 MW fragmented	millions	Local inference, agents, batch	Wall socket, room cooling, residential grid

These decentralised slots are currently empty of frontier compute not because they cannot host it, but because a new chip carrying a $30,000–$40,000 cost basis requires 90%+ utilisation at premium hyperscaler economics to justify the purchase. A brand-new accelerator running at 15% utilisation in a 50 kW office slot is a CapEx disaster.

Once the chip’s CapEx is written down toward zero, that arithmetic flips. A depreciated chip running at 20% utilisation in a 50 kW slot pays back trivially. The hardware becomes economic at utilisations and densities that were impossible at full price. The migration is not chip-replaces-chip in the same scarce slot. It is chip-discovers-new-slot in the previously-uneconomic decentralised tier.

A further mechanism aggregates these decentralised slots into useful continuous capacity. Intermittent compute distributed across time zones can be networked into a follow-the-sun virtual cloud: workloads route to whichever region is currently sitting on cheap energy. Inference, fine-tuning, batch processing, and simulation are latency-tolerant and do not require the synchronous gigawatt-class interconnects that frontier training requires.
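A toy version of follow-the-sun routing makes the mechanism concrete. Everything in this sketch is hypothetical: the three regions, their UTC offsets, the prices, and the 22:00 to 06:00 off-peak window are placeholders chosen for illustration, not data from any market.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    name: str
    utc_offset: int        # hours ahead of UTC
    off_peak_price: float  # $/kWh, hypothetical
    peak_price: float      # $/kWh, hypothetical

def price_at(region: Region, utc_hour: int) -> float:
    """Electricity price in a region at a given UTC hour.

    Assumption: off-peak pricing applies from 22:00 to 06:00 local time.
    """
    local_hour = (utc_hour + region.utc_offset) % 24
    is_off_peak = local_hour >= 22 or local_hour < 6
    return region.off_peak_price if is_off_peak else region.peak_price

def cheapest_region(regions: list[Region], utc_hour: int) -> Region:
    """Pick the region with the lowest current price (first one wins ties)."""
    return min(regions, key=lambda r: price_at(r, utc_hour))

REGIONS = [
    Region("americas", -6, 0.04, 0.12),
    Region("europe", +1, 0.04, 0.12),
    Region("asia", +8, 0.04, 0.12),
]

for hour in (0, 8, 16):
    print(f"{hour:02d}:00 UTC -> {cheapest_region(REGIONS, hour).name}")
```

With these placeholder values the batch queue rotates europe, americas, asia across the day. A real scheduler would also weigh data-transfer cost, queue depth, and data-residency rules, none of which is modelled here.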

This is the architecture of plateau-tier intelligence: hyper-cheap silicon harvesting the latent energy margins of the built world, aggregated temporally to deliver continuous utility on the latency-tolerant workloads that constitute the bulk of economic compute demand.

The deployment phase will have its own binding constraints—verification overhead in decentralised multi-tenant networks, WAN bandwidth for cross-region pipeline parallelism—but the thesis does not require these to be solved today. It requires only that the existence of hyper-capable, hyper-cheap silicon plus a vast tier of underutilised decentralised slots creates the economic incentive to solve them.

The OpEx paradox is real. But it is a paradox only in the hyperscaler frame. Once the frame widens to include the entire built world’s latent kilowatts, the paradox dissolves into a migration path.

H. The Plateau Crossover

The body argues that AI hardware depreciates out of frontier use long before it depreciates out of economic usefulness. This appendix formalises that claim in two stages: first as a CapEx crossover on the capital-cost side, then as a fuller economic crossover that brings in operating costs, architectural compatibility, and the workflow-discovery recursion. The headline thesis the appendix supports:

AI hardware depreciates against the frontier faster than it depreciates against usefulness. When capital write-down exceeds frontier improvement on a workload’s binding dimension, and when software progress keeps lowering the capability threshold, previous-generation accelerators become the natural substrate for plateau intelligence.

Definitions. The core dynamics involve four rates of change—frontier hardware growth ( $g_q$ ), software efficiency ( $r_s$ ), capital cost decay ( $\delta$ ), and workflow-discovery recursion ( $\beta$ )—plus one architectural correction factor ( $\mu$ ):

Symbol	Quantity	Empirical value	Source
$g_q$	Annual frontier hardware capability growth per chip, on the binding dimension (FLOPS/chip, GB/s/chip, FLOPS/W, etc., not capability-per-dollar)	Per-chip values from H100 SXM5 → B200 SXM, 3-yr annualised: ~1.31× (FP16/BF16 throughput); ~1.34× (memory bandwidth); ~1.34× (memory capacity, 80→192 GB); ~1.47× (FLOPS/W); ~1.65× (FP8 throughput)	Nvidia published specs; per-chip values, not per-dollar
$r_s$	Annual software-efficiency multiplier at fixed capability	~50× (inference cost-per-token); ~3× (pre-training compute-per-capability)	Stanford HAI AI Index; Epoch AI
$\delta$	Annual secondary-market capital cost decay	~0.25 (used H100 SXM5: ~$35K new 2023 → ~$15K used mid-2026)	Industry pricing
$\mu(t,\tau) \geq 1$	Architectural divergence penalty (software gains hard-coded to new silicon that does not backport cleanly)	Modest for FP16/FP8 inference; significant for FP4 and the latest attention/decoding kernels	ISA differences
$\beta \geq 0$	Workflow-discovery recursion rate (AI-assisted engineering productivity feedback)	Empirically positive; bounded above by deployment friction	This essay

The critical convention: $g_q$ is physical capability per chip on the binding dimension. It is not capability per dollar. Price enters separately through $P_0$ and $\delta$ . Different workloads bind on different dimensions, so the relevant $g_q$ is workload-specific. For cross-reference, Epoch AI tracks long-run FLOP/s per dollar at ~1.37×/yr. That figure is the combined effect of per-chip growth ( $g_q$ ) and launch-price growth ( $p_F$ ), and the identity below does not use it as a per-chip value.

The CapEx Crossover Identity. Let $\tilde{C}(t, \tau)$ denote useful capability per dollar of a chip purchased used at time $t$ , with age $\tau$ :

$\tilde{C}(t, \tau) = \frac{Q_0 \cdot E(t)}{P_0 \cdot (1-\delta)^\tau}$

where $Q_0$ is the chip’s physical capability on the binding dimension (fixed at manufacture) and $E(t)$ is the contemporary software multiplier. A new frontier chip at time $t$ —assuming approximately stable launch prices across generations (the $p_F$ correction is treated below)—has capability $Q_0 \cdot g_q^\tau$ and price $P_0$ :

$\tilde{C}_F(t) = \frac{Q_0 \cdot g_q^\tau \cdot E(t)}{P_0}$

The software multiplier $E(t)$ benefits both equally and cancels in the ratio. The structural factor—the CapEx Crossover Identity—is:

$\boxed{\frac{\tilde{C}(t, \tau)}{\tilde{C}_F(t)} = \frac{1}{[g_q \cdot (1-\delta)]^\tau}}$

When $g_q(1-\delta) < 1$ , plateau hardware delivers more useful capability per dollar than frontier hardware on the binding dimension, with an exponentially growing advantage in $\tau$ . This is a CapEx-only object. Operating costs enter in §H.5 below.

The threshold condition.

$g_q \cdot (1-\delta) < 1 \iff \delta > \frac{g_q - 1}{g_q}$

The secondary market must reprice faster than per-chip frontier capability advances on the dimension at hand. The condition is dimension-specific:

Binding capability dimension (per chip, H100→B200, 3-yr annualised)	$g_q$	Threshold $\delta^*$	Observed $\delta$	Identity fires?
FP16/BF16 throughput per chip	1.31	0.24	0.25	Yes (marginal)
Memory bandwidth per chip	1.34	0.25	0.25	Borderline (at threshold)
Memory capacity per chip	1.34	0.25	0.25	Borderline (at threshold)
FLOPS per Watt per chip	1.47	0.32	0.25	No
FP8 throughput per chip	1.65	0.39	0.25	No

The identity fires marginally on FP16/BF16 throughput and sits at the threshold on memory bandwidth and capacity, where $g_q(1-\delta) = 1.34 \times 0.753 \approx 1.009$ before rounding. It fails on the two dimensions where Blackwell made the largest architectural jump (FP8, FLOPS/W). This is the most conservative form of the identity. Two refinements strengthen the plateau case empirically.

The launch-price correction. Flagship retail prices have risen at $p_F \approx 1.05$ /yr (H100 SXM5 ~$30–35K in 2023, B200 SXM ~$40K in 2025). Including this in the derivation: a plateau chip launched $\tau$ years ago had launch price $P_0/p_F^\tau$ , and its used price today is $(P_0/p_F^\tau)(1-\delta)^\tau$ . The full identity:

$\frac{\tilde{C}(t, \tau)}{\tilde{C}_F(t)} = \left[\frac{p_F}{g_q(1-\delta)}\right]^\tau$

with crossover condition:

$\delta > \frac{g_q - p_F}{g_q}$

The threshold drops by $(p_F - 1)/g_q$ , the launch-price drift scaled down by frontier growth: three to four points at $p_F = 1.05$ . Updated table:

Binding capability dimension	$g_q$	Threshold $\delta^*$ ( $p_F = 1.05$ )	Observed $\delta$	Identity fires?
FP16/BF16 throughput per chip	1.31	0.20	0.25	Yes
Memory bandwidth per chip	1.34	0.22	0.25	Yes
Memory capacity per chip	1.34	0.22	0.25	Yes
FLOPS per Watt per chip	1.47	0.29	0.25	No (marginal)
FP8 throughput per chip	1.65	0.36	0.25	No

The identity fires cleanly on the three dimensions that govern most production inference, and fails on the two dimensions Blackwell optimised hardest for. The split matches the migration thesis exactly: plateau wins on the dimensions that govern plateau workloads, and frontier wins where it should.
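Both threshold tables come from one formula. The sketch below recomputes them from the $g_q$ values listed above and reports the margin between the observed $\delta$ and each threshold; the function and variable names are illustrative.

```python
G_Q = {  # annualised per-chip growth, H100 -> B200, from the tables above
    "FP16/BF16 throughput": 1.31,
    "Memory bandwidth": 1.34,
    "Memory capacity": 1.34,
    "FLOPS per Watt": 1.47,
    "FP8 throughput": 1.65,
}
OBSERVED_DELTA = 0.25  # annual secondary-market price decay

def threshold_delta(g_q: float, p_f: float = 1.0) -> float:
    """Minimum annual price decay for the crossover: delta* = (g_q - p_F) / g_q."""
    return (g_q - p_f) / g_q

print(f"{'dimension':<22}{'base':>8}{'p_F=1.05':>10}{'margin':>8}")
for name, g in G_Q.items():
    base = threshold_delta(g)
    corrected = threshold_delta(g, p_f=1.05)
    margin = OBSERVED_DELTA - corrected  # positive means the identity fires
    print(f"{name:<22}{base:>8.2f}{corrected:>10.2f}{margin:>+8.2f}")
```

Margins within a point or two of zero are the "marginal" cases: small changes in the assumed $\delta$ or $p_F$ flip them.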

Numerical validation. For mid-2026 used H100 ($15K, 3.35 TB/s, age 3 yr) versus new B200 ($40K, 8.0 TB/s, retail) on memory bandwidth, the full identity predicts:

$\frac{\tilde{C}_P}{\tilde{C}_F} = \left[\frac{p_F}{g_q(1-\delta)}\right]^\tau = \left[\frac{1.05}{1.34 \times 0.753}\right]^3 = \left[\frac{1.05}{1.009}\right]^3 = (1.041)^3 \approx 1.127$

Predicted plateau advantage: ~12.7%. Empirical observation (Table H.2): used H100 at 0.223 GB/s/$ vs new B200 at 0.200 GB/s/$ = 11.5% advantage. Model and empirical agree to within roughly a percentage point. The identity is quantitative, and it matches the market.
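The comparison can be rerun at full precision. This sketch takes the rounded model parameters quoted above ($p_F = 1.05$, $g_q = 1.34$, $1-\delta = 0.753$) for the prediction and the raw prices and bandwidths from Table H.2 for the empirical side; the function names are illustrative.

```python
def plateau_ratio(g_q: float, one_minus_delta: float, p_f: float, age_years: float) -> float:
    """Used-chip capability per dollar relative to a new frontier chip."""
    return (p_f / (g_q * one_minus_delta)) ** age_years

def per_dollar(capability: float, price: float) -> float:
    return capability / price

# Model side: rounded annual parameters from the text
predicted = plateau_ratio(g_q=1.34, one_minus_delta=0.753, p_f=1.05, age_years=3)

# Empirical side: HBM bandwidth in GB/s and price in dollars (Table H.2)
used_h100 = per_dollar(3_350, 15_000)
new_b200 = per_dollar(8_000, 40_000)
empirical = used_h100 / new_b200

print(f"predicted ratio: {predicted:.3f}")
print(f"empirical ratio: {empirical:.3f}")
print(f"gap: {abs(predicted - empirical) * 100:.1f} percentage points")
```

Evaluated without intermediate rounding, the prediction is about 1.127 and the empirical ratio about 1.117, a gap of about one percentage point.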

The second refinement—capability thresholds rather than capability appetites—formalises the lift to the workload-economic plane.

H.1: Log capability scaling. Inference benchmark scores scale approximately as $\beta_K \log_{10}(\text{compute})$ . A 4× FLOPS deficit corresponds to a benchmark gap of roughly 3–5 percentage points on MMLU, GPQA, or HumanEval at current scale. Most economically useful workloads have capability thresholds, not capability appetites: a customer-support router that needs 85% intent-classification accuracy does not benefit from a model scoring 92%. The operative question is not whether plateau capability matches frontier headroom. It is whether plateau capability exceeds the workload’s threshold at lower cost. The workload surface where plateau is sufficient grows monotonically with $E(t)$ .
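The slope implied by the 4× example can be backed out from the log relation. This is a sketch under the stated log-linear form only; `implied_beta_k` and `meets_threshold` are illustrative helpers, and the 85% and 92% figures are the ones from the router example above.

```python
import math

def benchmark_gap_points(compute_ratio: float, beta_k: float) -> float:
    """Score gap in percentage points for a given compute ratio."""
    return beta_k * math.log10(compute_ratio)

def implied_beta_k(gap_points: float, compute_ratio: float) -> float:
    """Slope (points per 10x of compute) implied by an observed gap."""
    return gap_points / math.log10(compute_ratio)

def meets_threshold(score: float, threshold: float) -> bool:
    """A threshold workload only asks whether the score clears the bar."""
    return score >= threshold

low = implied_beta_k(3.0, 4.0)
high = implied_beta_k(5.0, 4.0)
print(f"implied slope: {low:.1f} to {high:.1f} points per 10x compute")

# Both models clear the 85% bar, so the extra 7 points carry no value here.
print(meets_threshold(92.0, 85.0), meets_threshold(85.0, 85.0))
```

A 3 to 5 point gap at 4× corresponds to a slope of roughly 5 to 8 points per order of magnitude of compute.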

H.2: Memory-bound, not compute-bound. Autoregressive LLM inference is memory-bandwidth-bound, not FLOPS-bound. The relevant capability-per-dollar metric is bandwidth per dollar:

Hardware	State	Price	HBM bandwidth	GB/s per $
H100 SXM5	Retail (2023)	$35,000	3.35 TB/s	0.096
H100 SXM5	Used (mid-2026)	$15,000	3.35 TB/s	0.223
B200 SXM	Retail (mid-2026)	$40,000	8.0 TB/s	0.200

Used H100 delivers ~12% more memory bandwidth per dollar than a new B200 at retail. The B200’s structural advantages—192 GB VRAM, 4× FP8 throughput, NVLink 5 at 1.8 TB/s—bind on training workloads with synchronous gradient sync and on models exceeding 80 GB. Most production inference exhibits neither. Plateau hardware does not beat frontier globally. It beats frontier on the dimensions that govern specific deployment workloads.

An emerging exception: latent recursion. A research-stage class of architectures reasons by iterating a small weight-tied block through depth rather than stacking unique layers: latent-space reasoning and looped or recurrent-depth models (COCONUT, looped transformers, the HRM/TRM line). On the compounding side this is another entry in the $E(t)$ ledger alongside distillation, quantisation, and sparsity. It decouples reasoning depth from parameter count, lowering the model size and therefore the cost and hardware required to clear a given sufficiency threshold (H.1). That widens the workload surface a written-down chip can serve, and strengthens the plateau case.

On the binding dimension, recursion bifurcates the market rather than shifting it uniformly. Pure latent recursion loops a fixed state with a flat KV cache, so in isolation its inference tilts serial-latency and compute-bound rather than memory-bandwidth-bound, toward FP8 throughput and FLOPS-per-watt, where the crossover identity fails and newer silicon keeps its edge. But recursion will most plausibly arrive fused with sparse mixture-of-experts routing: dense looped blocks suffer representation collapse, and sparse routing is the known fix. That fusion splits the hardware market along tier lines.

A frontier recursive core can be co-designed with the latest silicon to hold a compact working set resident near compute and cycle a latent state through it at speed (the Groq and Cerebras direction). Its binding dimension becomes sequential latency, on-chip memory, and FLOPS-per-watt, the axes newest hardware owns. A plateau looped-MoE is the distilled descendant: a sparse weight-streaming engine that still loads its relevant experts from HBM each step, so its binding dimension stays memory-bandwidth-per-dollar, exactly the axis on which written-down silicon already beats new (this section).

So recursion does not make old hardware obsolete. It bifurcates the workload. The frontier discovers cognitive depth on resident cores tied to the newest process node. The plateau distributes the compressed descendants of that depth at the lowest memory-bandwidth cost. This is the shape the framework predicts rather than an observation—recursion is not a production scaling vector today—but it follows from the model’s one rule: score every architectural break by asking which binding dimension it makes scarce, and for whom.

H.3: Architectural Divergence Penalty. The cancellation of $E(t)$ in the headline identity assumes today’s software gains backport cleanly to older silicon. In practice, some optimisations are tied to physical blocks on the newest chips: FP4 quantisation runs natively on Blackwell’s 5th-gen Tensor Cores and emulates with severe penalty on Hopper. The latest speculative-decoding heads assume the memory hierarchies of newer interconnects. Define $\mu(t, \tau) \geq 1$ as the factor by which plateau hardware’s effective software multiplier falls short of the frontier’s because some contemporary gains cannot be ported; $\mu = 1$ means every gain backports cleanly. The corrected identity:

$\frac{\tilde{C}(t, \tau)}{\tilde{C}_F(t)} = \frac{1}{\mu(t, \tau) \cdot [g_q \cdot (1-\delta)]^\tau}$

$\mu$ partially offsets the depreciation advantage. Empirically, $\mu$ is close to 1 for workloads dominated by FP16/BF16/FP8 transformer inference (most production deployment today) and substantial for workloads that lean on the newest precision formats. The migration thesis is robust to modest $\mu > 1$ . It fails only if $\mu$ grows faster with age than $1/[g_q(1-\delta)]^\tau$ , which would require a sustained pattern of new software locking out previous generations on most economically useful workloads. The empirical record shows the reverse: distillation, quantisation, and inference kernels keep finding ways to fit modern models onto older silicon.
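The size of penalty that plateau hardware can absorb follows from setting the corrected ratio above 1. The notation $\mu^*$ is introduced here for illustration:

$\mu < \mu^* = \frac{1}{[g_q(1-\delta)]^\tau}$

If the launch-price correction is carried through as well (an assumption of this sketch, since the corrected identity above omits $p_F$ ), the bound becomes:

$\mu^* = \left[\frac{p_F}{g_q(1-\delta)}\right]^\tau$

For memory bandwidth at $\tau = 3$ with the values used in the numerical validation ( $p_F = 1.05$ , $g_q = 1.34$ , $1-\delta = 0.753$ ), $\mu^* \approx 1.13$ . On those numbers a used H100 keeps its bandwidth-per-dollar edge as long as un-backported software gains cost it less than about 13% of effective throughput.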

H.4: VRAM step-function. A 70B-parameter model at FP8 occupies ~70 GB before KV cache. On H100 (80 GB), it fits with little headroom and typically requires two-GPU tensor-parallel configurations, introducing all-reduce communication overhead of 9–23% of end-to-end decoding latency. On H200 (141 GB) or B200 (192 GB), the same model fits single-GPU with concurrent KV caches and no TP penalty. The CapEx identity assumes capability decays smoothly. In reality, model-fit thresholds create non-linear cliffs. Plateau hardware wins inside the cliff and loses outside it. The migration thesis holds because most production workloads fit inside the cliff for previous-generation hardware, even as frontier training requires the next cliff.
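The cliff can be shown with a simple fit check. This sketch counts only weights plus a KV-cache allowance; the 20 GB KV-cache figure is a hypothetical placeholder, and activation memory, fragmentation, and the usual preference for power-of-two tensor-parallel groups are ignored.

```python
import math

VRAM_GB = {"H100": 80, "H200": 141, "B200": 192}  # per-GPU memory, as quoted above

def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight footprint in GB: one billion parameters at one byte each is 1 GB."""
    return params_billion * bytes_per_param

def min_gpus(params_billion: float, bytes_per_param: float,
             kv_cache_gb: float, vram_gb: float) -> int:
    """Smallest GPU count whose pooled memory holds weights plus KV cache."""
    total_gb = weights_gb(params_billion, bytes_per_param) + kv_cache_gb
    return math.ceil(total_gb / vram_gb)

KV_CACHE_GB = 20.0  # hypothetical allowance for concurrent requests

for gpu, vram in VRAM_GB.items():
    n = min_gpus(70, 1.0, KV_CACHE_GB, vram)  # 70B parameters at FP8 (1 byte each)
    print(f"{gpu}: {n} GPU(s) for 70B FP8 + {KV_CACHE_GB:.0f} GB KV cache")
```

The step is discrete: the GPU count jumps from one to two the moment the working set crosses the card's memory, which is why the smooth CapEx identity needs this correction.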

H.5: The Economic Crossover (OpEx-extended). The CapEx identity is a numerator-only model. The full economic comparison requires operating costs. Define amortised capital cost per useful operation as $A(\tau, t)$ and operating cost per useful operation as $O(\tau, t, s, w)$ , where $s$ indexes the slot and $w$ the workload. Economic capability per dollar:

$\tilde{C}_{\text{econ}}(\tau, t, s, w) = \frac{Q_0 \cdot E(t) \cdot M_w(t, \tau) \, / \, \mu(t, \tau)}{A(\tau, t) + O(\tau, t, s, w)}$

where $M_w$ is the workload-compatibility factor (capability threshold met, VRAM fits, software stack supported). The CapEx Crossover Identity governs the numerator-side condition. The OpEx term determines where the comparison breaks operationally.

For slots that rely on behind-the-meter generation (the grid-avoidance pattern of Appendix F), the electricity cost carries a thermodynamic multiplier $\theta > 1$ relative to grid-supplied combined-cycle power: roughly 1.18 for reciprocating engines and 1.45 for simple-cycle turbines, from the Appendix F heat rates. The economic obsolescence boundary is reached when marginal revenue per token falls below the energy cost per token:

$R_{\text{token}} \;<\; \frac{\theta \cdot \mathrm{PUE} \cdot \mathrm{TDP}_{\mathrm{kW}} \cdot P_{\text{elec}}}{3600 \cdot \mathrm{TPS}}$

with $R_{\text{token}}$ in $/token, $\theta$ the dimensionless thermodynamic multiplier from Appendix F (~1.0 for utility-supplied combined-cycle power, ~1.45 for behind-the-meter simple-cycle generation), $\mathrm{PUE}$ the facility power-usage-effectiveness multiplier (typically 1.1–1.4 for modern datacentres), $\mathrm{TDP}_{\mathrm{kW}}$ the thermal design power in kilowatts, $P_{\text{elec}}$ the unit cost of utility power in $/kWh, and $\mathrm{TPS}$ the active token throughput in tokens per second. The factor of 3600 converts hours to seconds.

Sanity check. A 1 kW chip at $\mathrm{PUE} = 1.2$ , $\theta = 1.0$ , $0.10/kWh power, and 100 TPS yields:

$\frac{(1.0)(1.2)(1)(0.10)}{3600 \cdot 100} \;\approx\; 3.3 \times 10^{-7} \; \text{\$/token}$

—or roughly $0.33 per million tokens, within the empirical range for production inference.
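The boundary is easy to evaluate for other slots. This sketch implements the inequality above as written; the behind-the-meter case reuses the sanity-check inputs with $\theta = 1.45$ , and the revenue figure in the last line is a hypothetical placeholder.

```python
def energy_cost_per_token(theta: float, pue: float, tdp_kw: float,
                          price_per_kwh: float, tokens_per_second: float) -> float:
    """Energy cost in $/token: hourly power cost divided by tokens per hour."""
    dollars_per_hour = theta * pue * tdp_kw * price_per_kwh
    tokens_per_hour = 3_600 * tokens_per_second
    return dollars_per_hour / tokens_per_hour

def is_obsolete(revenue_per_token: float, cost_per_token: float) -> bool:
    """Economic obsolescence: revenue no longer covers the energy bill."""
    return revenue_per_token < cost_per_token

grid = energy_cost_per_token(theta=1.0, pue=1.2, tdp_kw=1.0,
                             price_per_kwh=0.10, tokens_per_second=100)
behind_meter = energy_cost_per_token(theta=1.45, pue=1.2, tdp_kw=1.0,
                                     price_per_kwh=0.10, tokens_per_second=100)

print(f"grid: ${grid * 1e6:.2f} per million tokens")
print(f"behind the meter: ${behind_meter * 1e6:.2f} per million tokens")

# Hypothetical revenue of $0.40 per million tokens: viable on grid power only.
print(is_obsolete(0.40e-6, grid), is_obsolete(0.40e-6, behind_meter))
```

Throughput sits in the denominator, so a chip serving half as many tokens per second reaches the boundary at twice the revenue per token.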

Behind-the-meter slots compress the operating margin by the factor $\theta$ . This is the mathematical link between Appendix F (the grid-avoidance tax) and Appendix H (the plateau migration). Older hardware migrates to slots where $\theta$ is small: grid-supplied utility power, university clusters, enterprise datacentres on bulk power contracts, sovereign facilities, and the colocation tier. It does not migrate to slots where $\theta$ is large (the behind-the-meter reciprocating-engine and turbine deployments most useful for frontier training when the grid will not connect in time).

H.6: The workflow-discovery recursion. The CapEx and OpEx framework above treats $E(t)$ as exogenous. The fourth structural rate is that $E(t)$ is itself accelerated by AI-assisted engineering labour. The predictive claim is the weak one: $\beta > 0$ . AI-assisted engineering improves the rate at which the software layer adapts to whatever compute is cheapest and available. The CapEx Crossover Identity holds for any $E(t)$ trajectory satisfying this condition.

A stronger illustrative result follows if engineering throughput $L(t)$ scales with deployed plateau intelligence, $L(t) = L_0 \cdot E(t)$ . Define the rate of software-efficiency improvement as:

$\frac{d \log E}{dt} = \alpha_0 + \beta \log L(t)$

Substituting the recursion and solving the resulting linear ODE in $u = \log E$ :

$E(t) = E_\infty \cdot \exp\!\big(A \, e^{\beta t}\big)$

A double exponential, bounded above by deployment friction (verification, testing, integration, organisational change-management). The functional form is an intuition pump for why plateau-fill may run faster than prior infrastructure cycles, not a forecast. The migration thesis fires on $\beta > 0$ alone.
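The constants the closed form leaves implicit can be recovered in three steps. This is a sketch under the stated assumption $L(t) = L_0 \cdot E(t)$ with $\beta > 0$ and natural logarithms. The symbols $c$ and $u_0$ are introduced here for illustration, and the constant $A$ is an integration constant, unrelated to the amortised capital cost $A(\tau, t)$ of H.5.

$u = \log E, \quad \log L = \log L_0 + u \;\Rightarrow\; \frac{du}{dt} = c + \beta u, \quad c = \alpha_0 + \beta \log L_0$

$u(t) = \left(u_0 + \frac{c}{\beta}\right) e^{\beta t} - \frac{c}{\beta}, \quad u_0 = \log E(0)$

$E(t) = e^{-c/\beta} \cdot \exp\!\left(\left(u_0 + \frac{c}{\beta}\right) e^{\beta t}\right), \quad E_\infty = e^{-c/\beta}, \quad A = u_0 + \frac{c}{\beta}$

Differentiating $u(t)$ returns $c + \beta u$ , which confirms the solution. Growth is double-exponential only when $A > 0$ , that is, when $\log E(0) > -c/\beta$ .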

H.7: Frontier slowdown. The $g_q$ values above assume the frontier continues to compound at recent rates. Three constraints suggest the rate may not hold indefinitely.

Moore’s law has stalled at the device level. The frontier still compounds, but through capital intensity, packaging innovation, and system-scale engineering rather than transistor scaling. Hyperscaler AI spending exceeds $700 billion in 2026.^[1] Frontier training compute has grown ~5×/year since 2020, but hardware FLOP/$ has improved only ~1.37×/year. Capability appears to scale approximately logarithmically with compute on standard benchmarks, so each additional order of magnitude buys a roughly fixed absolute increment, not a proportional one.

The identity is monotonic in slowdown. A frontier compounding at $g_q = 2.0$ per architectural release does not fire the CapEx identity. A frontier compounding at $g_q \approx 1.15$ annualised fires it strongly. If scaling continues, plateau hardware runs the compressed descendants of fresh capability. If scaling slows, plateau hardware runs durable capability for longer. The thesis is robust in both directions.

H.8: Boom-driven input-cost inflation. The binding inputs to new-generation silicon (HBM, advanced packaging, substrate capacity, networking, power delivery) are in tight supply against the same demand curve driving the order book. The $p_F \approx 1.05$/yr launch-price drift is not a noise term. It is the cost-side channel through which the boom widens the plateau wedge.

Epoch AI’s teardown places B200 module production cost at roughly $5,700–7,300, with HBM and advanced packaging accounting for ~two-thirds of variable unit cost.^[26] SemiAnalysis estimates memory could rise to ~30% of hyperscaler AI data-centre capex in 2026, up from ~8% in 2023–2024, with HBM undersupplied through 2027.^[27] If boom-driven inflation pushes $p_F$ from 1.05 to 1.08 over the next two generations, the threshold on FLOPS/W (currently failing at 0.29 against observed $\delta = 0.25$) drops to ~0.26, on the cusp of firing. The wedge does not need the frontier to slow, only to keep getting more expensive.

The cycle modulates. The cascade persists. The cost-side channel is cyclical, but the post-bust generation arrives later than naïve cyclical reasoning suggests: the production pipeline is committed years in advance. Hyperscaler order books already span Rubin and Rubin Ultra into 2027–2028 and Feynman into 2028–2029. What the bust changes is the clearing price of that pipeline as it lands, not the schedule. Committed chips arrive on the published cadence. Their secondary-market pricing collapses to plateau levels on a shorter clock than in prior cycles. The slot mechanism of §V handles pre-ordered and new-design generations identically.

Three depreciation regimes. The identity rests on distinguishing three depreciation rates that, in AI hardware, move at different speeds:

Regime	Driven by	Timescale	What it means
Frontier obsolescence	$g_q$, $\mu$	~2 years	Chip no longer optimal for the next-largest frontier training run
Capability obsolescence	Software stack abandonment, physical failure	5–10 years	Chip cannot run any useful workload
Economic obsolescence	OpEx vs revenue (slot $s$, workload $w$, thermodynamic multiplier $\theta$)	Set by slot, workload, and electricity price	Output worth less than operating cost

For most asset classes the three regimes move in lockstep. A car ages out of newness, off the dealership floor, and out of usefulness on similar schedules. AI hardware does not. The frontier moves on two years. Capability moves on five to ten. Economic obsolescence is set by the slot. The CapEx Crossover Identity gives the magnitude of the gap between frontier obsolescence and capability obsolescence—the years during which a chip has fallen off the frontier but is still capable, still supported, and still economic on slots its first owner did not want.

Three empirical tests. The framework admits three cleanly falsifiable observables on different timetables.

The earliest and sharpest one: by the end of 2027, secondary-market clearing prices for Blackwell B200 SXM should sit in the $22–28K range, a $\delta$ of approximately 0.25–0.35 over two to three years from launch, reproducing the Hopper depreciation arc one generation later. If Blackwell holds near launch (above $35K) into 2028, the eviction mechanism that powers the migration is not operating on the current generation. In that case the framework is wrong in the specific way it claims to be falsifiable: the inequality $\delta > (g_q - p_F)/g_q$ fails empirically, not theoretically.

The second runs on the workload surface: by the end of 2028, the largest share of economically deployed AI—measured by tokens served, queries answered, decisions made, dollars saved—should be running on Hopper-class and early-Blackwell hardware rather than on whatever the contemporary frontier silicon is at that time. If frontier hardware still serves the majority of production workloads in late 2028, the framework is wrong in a way no amount of careful unit analysis can rescue.

The third runs on the pricing surface itself: by the end of 2028, the spread between hyperscaler on-demand H100/H200 rentals and specialist-cloud or marketplace rentals for the same hardware should remain wide or widen—not converge. The spread today reaches up to 10× (Appendix J). If hyperscaler pricing converges down toward specialist pricing—closing the spread below approximately 2×—the bifurcation between premium access layer and economic substrate has not occurred, and the workload-share / revenue-share decoupling (§V, Appendix J) is falsified. The spread is checkable monthly from public price pages.

The deeper point. As long as $\delta > (g_q - p_F)/g_q$ on the dimension a workload binds on—after correcting for architectural divergence $\mu$ and matching the workload to slots where the thermodynamic multiplier $\theta$ is bearable—that workload migrates to plateau hardware once the chip is repriced. The first capital cycle pays the frontier price. The second buyer inherits the asset. The algorithms make the asset compound.

I. The Cascade in Progress

The framework predicts a cascade. The cascade is already running, observably, across every prior datacentre accelerator generation. The bust window does not invent the dynamic. It scales an existing one to public visibility.

Generation-by-generation status, mid-2026.

Generation	Launch	Launch price (top SKU)	Secondary clearing (mid-2026)	Current production role
Turing T4	2018	~$2,500–10,000 depending on SKU	~$1,500–2,500	Default inference on AWS g4dn, GCP equivalent SKUs; embeddings, ranking, smaller-model serving across enterprise
Ampere A100	2020	~$15,000–20,000 (40 / 80 GB SXM4)	~$5,000–8,000	Primary production workload on AWS p4d, Azure ND A100, GCP a2; substantial share of enterprise AI inference globally
Hopper H100	2022/2023	~$30,000–35,000 (SXM5)	~$15,000 (per Appendix H)	Current workhorse; frontier-adjacent; per Table H.2 already trades at higher memory-bandwidth-per-dollar than new B200 retail
Hopper H200	2024	~$30,000–40,000 (limited disclosed)	Limited secondary supply	Current production tier; increasingly inference-priced as Blackwell capacity ramps
Blackwell B200	2024/2025	~$40,000	Negligible secondary supply	Current frontier tier; first observable depreciation cycle still 12–24 months ahead

Each generation in the table is simultaneously serving a different tier of the workload stack today. The slot hierarchy is already in motion.

Cloud SKU repricing as the dominant migration channel. Public AWS pricing data shows g4dn (T4) on-demand prices have fallen roughly 60–65% relative to 2019 launch; p4d (A100) by roughly 40–55% relative to 2021 launch. The chips never left Amazon. The migration ran entirely inside the hyperscaler’s balance sheet, visible only as falling unit prices on customer-facing SKUs—the quiet form of the cascade the body describes.

The cascade is visible in a second pricing surface today: hyperscaler on-demand H100 SKUs list at approximately $6.88–$12.29 per hour, while specialist clouds and marketplace/spot networks deliver the same H100 capability at $1.25–$4.29 per hour—a spread of up to 10× on the same chip in the same year (full breakdown in Appendix J). The hyperscaler price is the slot rent. The marketplace price is the underlying chip economics, repriced for a buyer who can tolerate intermittency.

Hyperscaler depreciation extensions. Microsoft (2022), Alphabet (2023), Meta (2023), Oracle (2024), and Amazon (2024) have each extended useful-life assumptions for server infrastructure from approximately four years to six years. These are CFO-signed statements with Sarbanes-Oxley liability, audited by the Big 4, and defended against internal utilisation data no outside analyst sees. Each extension is an audit-level acknowledgement that the chips remain operationally useful far longer than the original capex cycle anticipated—precisely the gap between frontier obsolescence (~2 years) and capability obsolescence (~5–10+ years) the three-regime table predicts.

Microsoft’s FY2023 extension alone added ~$3.7 billion to annual operating income. The aggregate impact across the top five hyperscalers is on the order of $15 billion per year in reduced depreciation expense. The same CFOs are simultaneously issuing record bond volumes to fund new silicon and extending useful-life on the existing fleet. Together they imply that the productive compute fleet is becoming both larger and longer-lived than the original capacity model assumed.

Falsification test on depreciation policy. The cleanest forward signal against the plateau thesis would be a hyperscaler shortening AI-GPU depreciation in audited filings. As of mid-2026, no major hyperscaler has done so. The keynote curve makes capability claims unaudited. The audited curve makes utility claims unmarketed. The thesis tracks the second.

Anthropic / Colossus as cascade in miniature. The May 2026 Anthropic–SpaceX/xAI agreement (§V; ref [20]) is the cascade operating across firm boundaries on premium current-generation silicon. Even energised frontier-tier hardware is allocated to whichever tenant can put it to highest-value use. The dominant version of this cascade happens silently inside hyperscaler SKU price sheets. The Anthropic–Colossus version is the audible instance.

Cost-per-token compression as observable substrate-level repricing. Cost-per-token at GPT-3.5-equivalent capability fell ~280× over 2022–2024 (Appendix E). The decline is software-dominant. But the hardware-side decay rate is independently visible in SKU prices on ageing silicon, tracking the $\delta \approx 0.25$ band the framework predicts.

The validation that has already resolved. Per Table H.2, used H100 at $15K delivers 0.223 GB/s per dollar of memory bandwidth. New B200 at $40K delivers 0.200. The Plateau Crossover Identity is firing today, on the dominant production-inference dimension.
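
The Table H.2 comparison is a one-line division, reproduced here as a sketch. The memory-bandwidth figures (3,350 GB/s for H100 SXM and 8,000 GB/s for B200) are assumptions taken from public spec sheets rather than from the text above; the prices are the ones quoted.

```python
# Assumed peak memory bandwidth in GB/s (public spec-sheet values, not from the text).
BANDWIDTH_GBPS = {"H100 SXM (used)": 3350.0, "B200 (new)": 8000.0}
# Prices in USD as quoted in the paragraph above.
PRICE_USD = {"H100 SXM (used)": 15_000.0, "B200 (new)": 40_000.0}


def bandwidth_per_dollar(chip: str) -> float:
    """Memory bandwidth (GB/s) bought per dollar of purchase price."""
    return BANDWIDTH_GBPS[chip] / PRICE_USD[chip]


for chip in BANDWIDTH_GBPS:
    print(f"{chip}: {bandwidth_per_dollar(chip):.3f} GB/s per dollar")
```

The comparison is sensitive to both inputs: a used-H100 price above roughly $16,750 would reverse the ordering at these bandwidth figures.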

J. Hyperscaler Premium and the Plateau Substrate

The second axis named in §V—premium access layer versus economic substrate—is observable on the pricing surface today. The same chip clears at different prices depending on the bundle wrapped around it.

Snapshot, May 2026. Public pricing for H100-class capacity, by access tier:

Access tier	Example	Per-H100 GPU-hour
AWS on-demand	p5.48xlarge (8× H100)	~$6.88 ($55.04/hr ÷ 8)
AWS reserved (Capacity Blocks)	p5.48xlarge reserved	~$4.33 ($34.61/hr ÷ 8)
Azure on-demand	ND H100 v5 class	~$12.29
Specialist cloud, mid-band	CoreWeave, Lambda, Civo, Denvr	~$2.25–$4.29
Marketplace / spot network	vast.ai-class supply, GPUPerHour	~$1.25–$2.75
Owned hardware, amortised	New H100 PCIe ~$25K; SXM ~$35–40K	under $1.50 at high utilisation

The spread between hyperscaler on-demand and marketplace supply for the same chip in the same year reaches up to 10×. The structural delta is not depreciation. It is the cost of a bundle: trusted, energised, instantly available, supported, billed, and procured through enterprise channels. That bundle is durable for one class of customer and a tax for another.
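
The headline spread can be recomputed directly from the snapshot table. The sketch below uses the per-GPU-hour figures listed above. The amortisation helper covers hardware cost only, and its lifetime and utilisation defaults are illustrative assumptions, not figures from the table.

```python
# Per-H100 GPU-hour prices (USD) from the May 2026 snapshot table: (low, high).
RATES = {
    "AWS on-demand": (6.88, 6.88),
    "AWS reserved": (4.33, 4.33),
    "Azure on-demand": (12.29, 12.29),
    "Specialist cloud": (2.25, 4.29),
    "Marketplace / spot": (1.25, 2.75),
}
HYPERSCALER_ON_DEMAND = ("AWS on-demand", "Azure on-demand")


def max_spread(rates: dict) -> float:
    """Highest hyperscaler on-demand price over the lowest marketplace price."""
    top = max(rates[tier][1] for tier in HYPERSCALER_ON_DEMAND)
    floor = rates["Marketplace / spot"][0]
    return top / floor


def amortised_hourly(price_usd: float, years: float = 5.0, utilisation: float = 0.8) -> float:
    """Hardware-only cost per utilised GPU-hour; power and hosting are excluded."""
    return price_usd / (years * 8760 * utilisation)


spread = max_spread(RATES)
print(f"max spread: {spread:.1f}x")
print(f"below the 2x convergence bound: {spread < 2.0}")
print(f"amortised $40K SXM: ${amortised_hourly(40_000):.2f}/hr")
```

Rerunning the same calculation against refreshed price pages is one way to track the spread month by month.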

What the spread implies. Hyperscaler GPU rental pricing measures slot rent. A chip can move out of frontier status while the hyperscaler SKU built around it remains expensive. The H100 is the visible case: out of frontier as of Blackwell launch, still priced as scarce on hyperscaler clouds, simultaneously available at marketplace rates that approach amortised owned-hardware cost. AWS raising H200 Capacity Block prices by 15% in January 2026 (§I, body) is the same mechanism running on the next-generation chip: scarcity priced into the slot, not into the silicon.

The bifurcation, mapped. The premium access layer captures: largest training runs, premium managed inference, latency-sensitive global APIs, regulated enterprise workloads, customers paying for procurement simplicity, organisations without internal MLOps depth. The economic substrate captures: batch inference, embeddings, fine-tuning, document processing, simulation, scientific computing, bounded agentic systems, retrieval and ranking pipelines, narrow-vertical Stockfish workflows. Both surfaces persist. They serve different customers.

Revenue share versus workload share. The two metrics can decouple. Hyperscalers may continue to capture the majority of AI infrastructure revenue—because enterprise customers pay the bundle premium—while losing the majority of AI inference workload by token volume, query volume, or useful inference-hours. The workload migrates to wherever fully-burdened cost is lowest. The revenue follows wherever the trust premium is paid. The two surfaces measure different things and need not move together.

Multi-model routing volume as workload-share signal. OpenRouter—the multi-model routing platform serving 8M+ developers—disclosed token throughput growing from ~5 trillion to ~25 trillion tokens per week over the six months ending May 2026, a 5× expansion across a heterogeneous plateau-tier model surface (Anthropic, OpenAI, Google, Meta, DeepSeek, and open-weights).^[28] CapitalG led its $113M Series B with NVIDIA Ventures, Snowflake, Databricks, MongoDB, ServiceNow, a16z, and Menlo Ventures—the enterprise-data and silicon stacks underwriting the multi-model topology rather than a winner-take-all frontier outcome. The data confirms the topology the framework predicts. Whether the absolute level is durable depends on how the 2026–2027 procurement reset resolves.

Falsifier. The premium access layer thesis fails if the hyperscaler/marketplace spread on the same hardware converges below approximately 2× by the end of 2028 (the third empirical test in Appendix H): hyperscaler clouds would then be the natural home of plateau intelligence, and the revenue-share/workload-share decoupling would not materialise. As of May 2026, the spread is widening, not closing.

Caveat on snapshot volatility. Pricing snapshots will move. AWS Capacity Blocks reprice dynamically. Specialist clouds run promotional rates. Marketplaces fluctuate with supply. The structural claim does not depend on the specific dollar figures in the table being correct on any given month. It depends on the shape of the price surface—a wide, durable spread between premium and economic-substrate tiers for the same hardware. If that shape collapses, the bifurcation has not occurred.

The expected shape over the full cycle, recorded July 2026. Approximately 10× at the boom peak, holding wide through the installation phase—the test window above—then compressing through the repricing event toward a durable premium in the 2–3× band: above, never below, the 2× bifurcation floor this falsifier sets. Post-reset compression toward that band confirms the cascade. Convergence below 2× refutes it.

K. Macroeconomic Absorption and the Deployment-Phase Drag

The cascade described across §III–§V predicts that civilisational intelligence diffuses through depreciated silicon on fragmented power slots over a multi-year horizon, not a multi-quarter one. This appendix audits the contemporaneous macroeconomic data confirming that horizon—the wedge between localised firm-level efficiency gains and aggregate macro-level productivity growth—and locates the cascade inside the canonical Brynjolfsson Productivity J-Curve and Carlota Perez installation-to-deployment framework. The headline finding: the deployment window holds at a 5–10 year primary phase with continued macroeconomic absorption through 2035+. The cascade mechanism is structurally anti-fragile to which point in the macro projection band turns out right.

K.1. The Brynjolfsson J-Curve and intangible capital accumulation

Brynjolfsson, Rock, and Syverson (2021) formalised why a massive general-purpose technology boom can co-exist with stagnant aggregate productivity.^[29] Firms must accumulate substantial intangible capital—process redesign, retraining, verification scaffolding, organisational restructuring—before the technology’s productivity dividend appears in measured output. Because the intangible accumulation is expensed as opex (not capitalised), national accounts systematically under-measure both GDP and TFP during the absorption phase. Historical adjustments for computer-era intangibles found that true TFP was 15.9% higher than official measures by the end of 2017.

The J-Curve is now quantified in current AI data. PricewaterhouseCoopers’ 2026 AI Performance Study (1,217 senior executives across 25 industries) finds 74% of AI’s measurable economic value captured by the top 20% of firms, leaving 26% to the remaining 80%.^[30] The long tail of adopters has not yet completed the intangible-capital accumulation that would let them realise the gains, even where they have purchased the software.

The cash budget ratio is now visible. The SXSW 2026 CMO Survey of 400 organisations finds that positive-ROI AI deployments spent approximately $2.60 on training and change management per $1 spent on the AI software itself, with organisations failing to match this ratio experiencing tool abandonment rates of 60–70% within six months.^[31]

The implied market-value ratio is older but consistent. Brynjolfsson, Hitt, and Yang (2002) found $1 of physical computer hardware historically associated with approximately $9 of corporate market value—the market pricing the unmeasured complementary intangible capital long before the national accounts measured it.

K.2. Enterprise deployment friction is structural, not transient

The deployment friction is not theoretical. S&P Global Market Intelligence and the RAND Corporation (2025) report that 42% of corporate AI projects were scrapped in 2025 and 80.3% of AI initiatives failed to deliver business value—twice the failure rate of traditional, non-AI IT projects.^[32] Gartner projects that more than 40% of agentic AI projects will be cancelled by end-2027 due to escalating costs, lack of clear business value, and inadequate risk controls.^[32] Although 97% of surveyed enterprises have experimented with AI agents in some form, only 10–12% have successfully transitioned them into production environments.

At the workflow level, the downstream verification burden compounds the friction. The METR randomised controlled trial (2025) found experienced open-source developers were 19% slower using AI tools. The cognitive cost of reviewing non-deterministic AI output exceeded the speedup from generation.^[33]

Google’s DORA 2024 report associated a 25% increase in AI adoption with a 7.2% decrease in production delivery stability. Google Cloud (2026) reports 45% higher burnout among frequent AI users, with approval fatigue identified as the primary mechanism. GitClear’s analysis of 153 million changed code lines projected a doubling of code churn translating directly into delivery instability.

The pattern is now replicated across multiple independent sources: individual-level speedups of 20–56% on isolated tasks fail to translate into firm-level velocity because architectural integration and verification absorb the gains—the Productivity-Reliability Paradox.

K.3. The labour-market signal

The Stanford AI Index 2026 reports that early-career software-developer employment (ages 22–25) in AI-exposed roles fell approximately 20% from 2024 to early 2026, against stable aggregate white-collar wages.^[34] The signal validates the plateau cascade mechanism from a different surface. As AI automates the work entry-level developers historically did, human labour migrates upstream to verification, integration, and high-abstraction architecture. This is exactly the structure §VI predicts: a Stockfish-style bounded engine wrapping learned judgement.

The shift happens in entry-level roles first because that is where syntactic work concentrates. Senior engineers retain their wage premium because their work is verification-heavy rather than generation-heavy. So the labour market is not just a productivity surface but also a verification-of-mechanism surface: it shows the plateau cascade transferring labour exactly as the workflow-ownership thesis predicts, on the timeline the J-Curve framework predicts, in the direction the framework predicts.

K.4. Aggregate productivity dispersion and the Perez modification

U.S. aggregate labour productivity has grown at 1.6% annualised since Q4 2019—a modest acceleration from the 1.2% pre-pandemic decade. Forward projections are widely dispersed.

Goldman Sachs Global Macro Research projects AI-driven labour productivity acceleration to 1.7–1.9% through 2029, peaking at 1.9–2.3% in the early 2030s, with potential GDP growth elevated to 2.1–2.3% for the rest of the decade.^[35]

Penn Wharton Budget Model projects permanent annual potential-output gains of less than 0.04 percentage points, with aggregate GDP rising only 1.5% by 2035.^[36] Acemoglu (2024) similarly estimates aggregate GDP gains of 1.1–1.6% over a decade and a marginal annual TFP boost of approximately 0.05%.^[37]

The dispersion reflects genuine uncertainty about how fast the J-Curve resolves.

The Carlota Perez framework—installation phase → financial crash → deployment phase—historically required a structural separation between speculative financial capital and operational production capital. Telecom equity holders went bankrupt while consolidator-platforms (Level 3, Equinix, Digital Realty predecessors) acquired the physical infrastructure at salvage and re-leased into the deployment-phase Golden Age.

The AI cycle’s most credible modification of that pattern is the convergence of these two capital types inside hyperscaler balance sheets. Trillion-dollar platforms fund their own builds from organic cash flows plus low-cost corporate debt ($121B of hyperscaler bonds issued in 2025 per the Moody’s / Bank of America aggregation referenced in §I; see Note [2]). They absorb depreciating silicon internally through frontier-to-inference cascades rather than clearing it through the secondary market. The deployment phase still arrives. The bankruptcy-driven clearing event runs through the mid-market specialist-cloud and integrated-developer layer rather than through the trillion-dollar platforms. The access layer—energised slot, plateau silicon, master lease—is captured at the slot level rather than at the equipment level.

K.5. Implications for the cascade

Three structural facts follow.

First, the deployment window is calibrated as a 5–10 year primary phase with continued macroeconomic absorption through 2035+. The J-Curve’s intangible accumulation takes years, not quarters, and the failure-rate evidence indicates overwhelmingly that the friction is structural rather than transient.

Second, the cascade mechanism described across §III–§V is robust to which point in the macro projection band turns out right. In the Goldman optimistic case, workloads scale fast and plateau-tier capacity re-leases into a rising-demand environment as the deployment phase compresses. In the Penn Wharton / Acemoglu conservative case, plateau silicon remains uncontested by frontier displacement for materially longer. Aggregate demand never reaches the threshold that would justify the next-generation chip displacing the current one across the plateau workload base. Different mechanisms, both favourable to the cascade.

Third, the access layer is the structurally underwritable surface across the dispersion band. The macro outcome is uncertain. The access constraint is not. The cascade resolves either way. The access surface gates how the value distributes. The plateau is where civilisational intelligence will be deployed because that is where the absorption-phase economy can afford to deploy it.

K.6. A note on method

The forward predictions in this appendix are dated and observable on known timetables. The framework updates if any of them is falsified.

L. Domain-Specific Diffusion and the Stockfish Multiplier

§VI argues that the deployable shape of intelligence is bounded learned components inside engineered scaffolding—Stockfish for everything. This appendix quantifies the implication: the deployment timeline under that architecture is wave-shaped across 2026–2040, governed by a workload-specific reliability-bar lattice and a scaffolding multiplier that stacks on top of base-model capability. The shorthand “AI will reach 99.99% reliability in N years” is correct on the headline number and misleading on the economic substance. Most addressable value crosses its threshold long before the terminal bar is reached.

L.1: The reliability-bar lattice. Autonomous deployment of an AI workflow requires effective system reliability $R_{\text{eff}}(w)$ to exceed a workload-specific bar $R_{\text{bar}}(w)$, set by cost-of-error, recourse availability, and regulatory regime. Measured as tolerated failure rate $1 - R_{\text{bar}}$, the bar spans more than four orders of magnitude across the economy, from roughly 0.3 for drafting to $10^{-5}$ for autonomous surgery. The values below are approximate, intended as order-of-magnitude thresholds rather than measured universal constants. The strong claim is the spread across workload classes, not the exact decimal assigned to any single task:

Task class	$R_{\text{bar}}$	Rationale
Brainstorming, drafting, outlines	~0.70	Human review default
Translation, summarisation	~0.92	Direct user consumption
Code completion (suggested)	~0.80	Developer accepts/rejects
Customer support (assisted deflection)	~0.85	Human escalation handles tail
Code generation with tests-in-loop	~0.90	Test suite catches failures
Junior knowledge work (supervised)	~0.85	Supervisor catches failures
Research synthesis with citations	~0.92	Citations enable verification
Tier-2 autonomous customer resolution	~0.92	Human escalation tail
Sales prospecting and outreach	~0.85	Low cost of error
Tax preparation (autonomous, standard cases)	~0.98	Liability-bearing
Legal contract drafting (standard)	~0.98	Liability-bearing
Code review with autonomous merge	~0.98	Production risk
Medical diagnosis (autonomous, routine)	~0.99	Patient safety
Surgical decision support (high autonomy)	~0.999	Life-critical
Financial trading (autonomous)	~0.9999	Cascading capital risk
Self-driving (L4+, all edge cases)	~0.99999	Public safety, regulatory
Autonomous surgery	~0.99999	Patient mortality

The shape of the diffusion follows from the shape of the bar lattice, not from any single benchmark trajectory.
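
The “four orders of magnitude” claim refers to the tolerated failure rate, $1 - R_{\text{bar}}$, rather than to the bar itself. A quick check using a hand-copied subset of the table, including its lowest and highest bars (the helper names are illustrative):

```python
import math

# Subset of the reliability-bar lattice, copied from the table above.
R_BAR = {
    "Brainstorming, drafting, outlines": 0.70,
    "Code generation with tests-in-loop": 0.90,
    "Tax preparation (autonomous, standard cases)": 0.98,
    "Medical diagnosis (autonomous, routine)": 0.99,
    "Financial trading (autonomous)": 0.9999,
    "Autonomous surgery": 0.99999,
}


def failure_budget(r_bar: float) -> float:
    """Tolerated failure rate implied by a reliability bar."""
    return 1.0 - r_bar


def orders_of_magnitude(bars: dict) -> float:
    """log10 of the ratio between the loosest and tightest failure budgets."""
    budgets = [failure_budget(r) for r in bars.values()]
    return math.log10(max(budgets) / min(budgets))


print(f"{orders_of_magnitude(R_BAR):.2f} orders of magnitude")
```

On these values the ratio between the loosest and tightest failure budgets is about 30,000, or roughly four and a half orders of magnitude.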

L.2: Base capability growth. Frontier-model headline benchmark improvement, annualised across 2024–2026 releases (Anthropic, OpenAI, Google):

Benchmark family	Annualised improvement	Notes
Hard reasoning (HLE, GPQA)	~10–15 pp/yr	Diminishing returns near ceiling
Hard coding (SWE-Bench Pro)	~15–20 pp/yr	Largest current gains
Saturating agentic tasks (OSWorld)	~3–6 pp/yr	Approaching benchmark ceiling
Practical knowledge work (Finance Agent, GPQA-AA)	~8–12 pp/yr	Mid-saturation

Capability scales approximately logarithmically in compute (§H.1). Benchmark improvement saturates as scores approach 1.0. The naïve linear extrapolation that produces a 2038–2042 estimate for 99.99% across hard agentic tasks treats the base-model rate as the only input. It is not.

L.3: The Stockfish multiplier. Each scaffolding layer in the §VI architecture catches an independent fraction of base-model failures. Treat each layer as a Bernoulli catcher with catch probability $c_i$, applied independently to residual failures. Effective system reliability for workload $w$:

$R_{\text{eff}}(w, t) \;=\; 1 \,-\, \big(1 - R_{\text{base}}(t)\big) \cdot \prod_{i \in \text{stack}(w)} \big(1 - c_i\big)$

Illustrative catch probabilities from observed deployment systems:

Scaffolding layer	$c_i$ (illustrative)	Mechanism
Test-driven generation loop	~0.60	Failing tests reject candidate outputs
Multi-agent disagreement detection	~0.50	Two-agent disagreement triggers tiebreaker
Formal verification on covered surface	~0.95	Type system, proof checker, schema conformance
Retrieval-grounded generation	~0.70 on factual claims	Citation anchoring eliminates a hallucination class
Tool augmentation (calc, code exec, search)	~1.0 on the augmented sub-task	Deterministic computation
Specialised fine-tuning on domain	~0.30 lift on residual error	Domain-specific failure-mode coverage
Human-in-the-loop on calibrated abstention	~0.95 on flagged cases	Uncertainty triggers escalation

Layers stack multiplicatively on independent failure modes. An 85% base model + 60% test-loop catcher + 70% retrieval catcher yields:

$R_{\text{eff}} \;=\; 1 - (1 - 0.85)(1 - 0.60)(1 - 0.70) \;=\; 1 - 0.018 \;\approx\; 0.982$

Roughly 98% effective system reliability emerges from a base model that one-shots at 85%. This is the empirical pattern observed when Claude Code, Cursor, or comparable agentic harnesses wrap a frontier model in test-loop scaffolding: codebase-scale operations succeed at materially higher rates than the underlying model’s one-shot benchmark would predict.^[38]
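
The multiplier formula and the worked example above translate directly into a few lines of Python. This is a minimal sketch under the same independence assumption; the function name is illustrative and the catch probabilities are the illustrative values from the table.

```python
from math import prod
from typing import Iterable


def effective_reliability(r_base: float, catch_probs: Iterable[float]) -> float:
    """R_eff = 1 - (1 - R_base) * prod(1 - c_i), assuming independent layers."""
    probs = list(catch_probs)
    if not 0.0 <= r_base <= 1.0 or any(not 0.0 <= c <= 1.0 for c in probs):
        raise ValueError("reliabilities and catch probabilities must lie in [0, 1]")
    residual = (1.0 - r_base) * prod(1.0 - c for c in probs)
    return 1.0 - residual


# Worked example: 85% base model, test loop (0.60), retrieval grounding (0.70).
r_eff = effective_reliability(0.85, [0.60, 0.70])
print(f"R_eff = {r_eff:.3f}")  # R_eff = 0.982
```

With an empty stack the function returns the base reliability unchanged, which makes it easy to compare a bare model against any candidate scaffolding stack.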

The independence assumption is an upper-bound simplification. Scaffolding layers have correlated failure modes: bad retrieval, ambiguous instructions, shared model priors, common blind spots, or domain misunderstandings can cause multiple layers to fail together. So the deployment question is not whether every layer is statistically independent, but whether enough layers are orthogonal—tests, retrieval, formal checks, human escalation, sandboxed execution, audit logs—to reduce residual failure below the workload bar. Three to five well-chosen orthogonal layers clear most production reliability bars. The operative work is selecting layers whose failure modes do not stack, not adding more layers of the same kind.
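
One way to see why independence is an upper bound is to add a common-mode term. The sketch below is an illustrative model introduced here, not part of the framework: it assumes a fraction $\rho$ of base-model failures (`rho` in the code) slips past every layer at once because of shared blind spots, while the remainder is caught independently as before. The value 0.10 is a placeholder, not an estimate.

```python
from math import prod
from typing import Sequence


def reliability_with_common_mode(r_base: float, catch_probs: Sequence[float], rho: float) -> float:
    """Hypothetical model: a share rho of failures evades all layers together."""
    independent_residual = prod(1.0 - c for c in catch_probs)
    residual = (1.0 - r_base) * (rho + (1.0 - rho) * independent_residual)
    return 1.0 - residual


def reliability_ceiling(r_base: float, rho: float) -> float:
    """Limit as more correlated layers are added: only common-mode failures remain."""
    return 1.0 - (1.0 - r_base) * rho


base, layers = 0.85, [0.60, 0.70]
print(f"independent:  {reliability_with_common_mode(base, layers, 0.0):.4f}")
print(f"rho = 0.10:   {reliability_with_common_mode(base, layers, 0.10):.4f}")
print(f"ceiling:      {reliability_ceiling(base, 0.10):.4f}")
```

With these placeholder inputs the estimate drops from about 0.982 to about 0.969, and no number of additional layers sharing the same blind spot lifts it past 0.985. Only a layer with a different failure mode, which lowers $\rho$ itself, moves the ceiling.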

The Stockfish multiplier is not metaphor. It is the architecture of the deployed system and the lever the next decade of effective intelligence runs on.

L.4: Deployment-phase timeline under the multiplier. Combining base capability growth (10–15 pp/yr on hard tasks, saturating logarithmically) with empirically achievable Stockfish multipliers, task classes cross their respective bars in the following windows. Each window assumes the relevant scaffolding stack is built and deployed. Absent the stack, the crossing slips by years.

Window	Task classes crossing their bar
2024–2026 (crossed)	Brainstorming and drafting, translation, summarisation, code completion, email drafting, Tier-1 customer-support deflection, research drafting with citations
2026–2028	Code generation with tests-in-loop, Tier-2 autonomous customer resolution, document review (legal, financial, audit, with human sign-off), personal-assistant task handling, sales prospecting automation, standardised tax preparation, educational tutoring
2028–2031	Autonomous coding for non-critical systems, medical diagnosis support (assisted), legal contract analysis and standard drafting, low-risk autonomous code-review merge, financial advice (assisted, with liability framework), junior engineering design, geofenced robotaxi expansion
2031–2035	Autonomous medical diagnosis (routine), autonomous standard legal drafting, autonomous coding for critical systems (with formal verification), high-autonomy surgical decision support, L4 robotaxi across most environments, autonomous portfolio management
2035–2040	Autonomous surgery (with safety backup), binding autonomous legal counsel, fully autonomous self-driving across all edge cases, model-led novel scientific discovery
Possibly not under current paradigm	Cross-domain executive judgement, fully autonomous novel cross-disciplinary theorising, some long-tail physical-world autonomy edge cases

The economically dense windows are 2026–2028 and 2028–2031. The 2031–2040 windows clear the critical-domain autonomy bars but capture a smaller fraction of total economic surface than the earlier waves. Most addressable value reorganises through the access layer during 2026–2031, well before the headline 99.99% bar is cleared on any domain.

L.5: Wave-shaped inference demand and the lease term. The diffusion is not concentrated at the terminal date. Each row of the L.4 table represents a new tenant class entering the inference compute market on plateau-tier silicon. The compounding produces monotonically rising inference demand across the 2026–2040 window:

2026–2028: knowledge-work and customer-facing AI wave. Largest tenant count, lowest per-unit inference, broad geographic distribution.
2028–2031: professional-services and supervised-autonomy wave. Mid tenant count, mid per-unit inference, regulated-vertical concentration.
2031–2035: critical-domain autonomy wave. Smallest tenant count, largest per-unit inference per critical decision, highest tenant credit quality.

The agentic compounding factor of §VI (dynamic workflows, hundreds of parallel subagents per user-facing action, longer-running tasks) is approximately orthogonal to the bar-crossing schedule.^[38] The same model running 10–100× more agentic complexity per action multiplies inference demand regardless of which bar has been crossed. The two effects compound: wave-shaped tenant growth × an agentic-complexity multiplier on every active tenant.
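The compounding can be sketched in a few lines of Python. Only the wave start years and the 10–100× range come from the text. The tenant counts and per-tenant inference units are hypothetical placeholders chosen to respect the ordering in L.5 (tenant count falling, per-unit inference rising), and each wave is assumed to switch on fully in its first year, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wave:
    name: str
    start_year: int          # first year of the L.5 window (from the text)
    tenants: float           # HYPOTHETICAL tenant count, arbitrary units
    per_tenant_units: float  # HYPOTHETICAL inference units per tenant


# Illustrative values only: they follow the L.5 ordering, not measured data.
WAVES = (
    Wave("knowledge-work", 2026, tenants=100.0, per_tenant_units=1.0),
    Wave("professional-services", 2028, tenants=30.0, per_tenant_units=5.0),
    Wave("critical-domain", 2031, tenants=5.0, per_tenant_units=50.0),
)


def base_demand(year: int) -> float:
    """Sum demand from every wave whose window has opened by `year`."""
    return sum(w.tenants * w.per_tenant_units for w in WAVES if year >= w.start_year)


def total_demand(year: int, agentic_multiplier: float) -> float:
    """Apply the agentic-complexity multiplier to every active tenant."""
    return base_demand(year) * agentic_multiplier


if __name__ == "__main__":
    for year in (2026, 2028, 2031):
        low, high = total_demand(year, 10.0), total_demand(year, 100.0)
        print(f"{year}: {low:,.0f} to {high:,.0f} units")
```

Under these placeholder inputs, base demand steps from 100 to 250 to 500 units as each wave opens, and the multiplier scales every step by the same factor. That is what "approximately orthogonal" means here: the multiplier changes the level of demand, not the timing of the waves.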

For an access-layer operator, this wave shape supports multi-year underwriting. A position originated in 2027 captures the full 2028–2031 professional-services wave and the early 2031–2035 critical-domain wave inside a 5–10 year horizon. The underwriting does not depend on any single wave materialising on schedule. It depends on enough waves materialising across the horizon to keep utilisation above the underwriting floor. The wave shape converts deployment-timeline uncertainty into portfolio-style diversification across the access-layer position.

L.6: Three falsifiers. Risky predictions on observable timetables:

The first runs on the early waves: by the end of 2027, the share of customer-support, sales-outreach, document-drafting, and code-completion work-hours served by AI scaffolding in autonomous-execution mode should exceed 25% across measured enterprise deployments. Autonomous-execution mode means the AI system executes the task without per-action human pre-approval, while preserving human escalation, sampled review, or post-hoc audit. This distinguishes it from suggestion mode, where each output requires acceptance before it has effect. If the share remains below 10% (that is, AI output is still reviewed suggestion by suggestion and has not moved to autonomous-execution mode at scale), the L.4 2026–2028 crossing window is empirically incorrect and the diffusion is materially slower than the framework predicts.

The second runs on the mid-wave: by the end of 2029, the autonomous-merge rate on code-review benchmarks—PR review systems acting without human approval on flagged-low-risk changes—should exceed 30% on the production codebases of mid-sized engineering organisations. If the rate remains below 5%, the L.4 2028–2031 window is empirically incorrect and the multiplier model overstates achievable scaffolding gains.

The third runs on multiplier mechanics directly: by the end of 2028, peer-reviewed work on multi-component scaffolding stacks (test loops, multi-agent verification, retrieval grounding, formal verification on covered surface) should report combined effective reliability of ≥99% on tasks where base-model one-shot reliability sits in the 80–90% range. If the published evidence converges on combined effective reliability that stays below ~95% on the same baseline, the L.3 multiplier identity is overstated and the L.4 timeline shifts later by two to four years.
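The three falsifiers can be written down as a small threshold check. This is a sketch, not a measurement protocol: the thresholds and deadlines are taken from the text above, while the dictionary keys, the `verdict` function, and the "indeterminate" label for values that fall between the two thresholds are assumptions added for illustration. The text says only what supports and what falsifies, and the third falsifier's lower bound is stated as approximate (~95%).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Falsifier:
    description: str
    deadline: str
    support_at: float      # threshold the observation should clear
    falsify_below: float   # observation below this counts against the framework
    inclusive: bool = False  # True when the text says ">=" rather than "exceed"


FALSIFIERS = {
    "early_wave": Falsifier(
        "share of work-hours in autonomous-execution mode", "end of 2027", 0.25, 0.10
    ),
    "mid_wave": Falsifier(
        "autonomous-merge rate on low-risk changes", "end of 2029", 0.30, 0.05
    ),
    "multiplier": Falsifier(
        "combined effective reliability on an 80-90% baseline",
        "end of 2028", 0.99, 0.95, inclusive=True,  # lower bound is "~95%" in the text
    ),
}


def verdict(f: Falsifier, observed: float) -> str:
    """Classify an observed value against one falsifier's two thresholds."""
    if observed < f.falsify_below:
        return "falsified"
    cleared = observed >= f.support_at if f.inclusive else observed > f.support_at
    # "indeterminate" is an ASSUMED label: the text does not name the middle band.
    return "supported" if cleared else "indeterminate"


if __name__ == "__main__":
    print(verdict(FALSIFIERS["early_wave"], 0.18))   # between 10% and 25%
    print(verdict(FALSIFIERS["multiplier"], 0.99))   # meets the >= 99% bar
```

The middle band matters in practice: an observed autonomous-execution share of 18% at the end of 2027 would neither confirm nor refute the 2026–2028 window under this reading.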

The deeper point. The headline is the long march. The economic substance is the wave. The essay’s §VI argues that the deployable shape of intelligence is bounded learned judgement inside engineered scaffolding. This appendix quantifies the consequence: the deployment timeline under that architecture is wave-shaped across 2026–2040, governed by the reliability-bar lattice and the Stockfish multiplier. Each wave passes inference demand through to the access layer continuously. The “decade-plus to 99.99% everything” framing is correct on the terminal date, irrelevant to the economic story between now and that date, and structurally favourable to a multi-year lease-underwritten access-layer position. The slower the base-model curve, the more value migrates to whoever built the scaffolding around it, and the more durable the demand for the inference compute that hosts it.

M. The Q1 2026 Register

The body carries only facts that cannot stop being true. These can. They are recorded here as reported at the time—a dated snapshot preserved as a falsifiable register, per reporting across Reuters, Bloomberg, and The Information.^[11]

OpenAI: Q1 revenue ≈ $5.7 billion against an operating loss ≈ $6.95 billion (−122% margin), tracking toward a $36.6 billion annual loss. ChatGPT weekly active users stalled at 905 million against a 1 billion target. Sora shut down. The Disney partnership ended. The Walmart pilot shuttered.
xAI: $2.47 billion Q1 loss on $818 million of revenue—roughly $1 billion a month of burn. Grok absent from the top-25 App Store downloads.
Anthropic: $30 billion annualised revenue run rate by April, an 80× annualised Q1 growth rate against a 10× target, most plausibly a short-run anomaly of coding-adoption scaling rather than a rate that holds for the full year. Claude Code past $1 billion ARR. Company on track for ≈ $11 billion of Q2 revenue at ≈ $600 million operating profit.
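The derived figures in the register can be checked with simple arithmetic. The sketch below uses only the numbers reported above (in billions of U.S. dollars); the function and variable names are illustrative. The flat annualisations are a naive baseline that assumes every quarter repeats the first, which is an assumption of this sketch and not a method stated in the reporting.

```python
def operating_margin(revenue: float, operating_loss: float) -> float:
    """Operating margin as a fraction of revenue; negative when loss-making."""
    return -operating_loss / revenue


# OpenAI Q1: revenue ~5.7, operating loss ~6.95 (USD billions, as reported above).
openai_margin = operating_margin(5.7, 6.95)
openai_flat_annual_loss = 6.95 * 4       # naive baseline: four identical quarters

# xAI Q1: loss of 2.47 spread evenly over three months.
xai_monthly_loss = 2.47 / 3

# Anthropic: a 30 annualised run rate implies this much revenue per quarter.
anthropic_quarter_at_run_rate = 30.0 / 4

if __name__ == "__main__":
    print(f"OpenAI operating margin: {openai_margin:.0%}")
    print(f"OpenAI flat-annualised loss: {openai_flat_annual_loss:.1f}")
    print(f"xAI average monthly loss: {xai_monthly_loss:.2f}")
    print(f"Anthropic quarter at run rate: {anthropic_quarter_at_run_rate:.1f}")
```

The margin reproduces the −122% in the register. The flat baselines come out lower than the register's forward figures (27.8 against a $36.6 billion annual loss, 7.5 against ≈ $11 billion of Q2 revenue), which is consistent with those figures assuming continued growth through the year. The xAI average monthly loss of about 0.82 sits below "roughly $1 billion a month of burn"; reported loss and cash burn are different measures and need not match.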

Falsifier. The §II claim these figures support is that profitability follows the distribution layer, not the capability gradient. If, by 2028, the lab ranking by operating margin does not track the lab ranking by deployment-scaffolding maturity, the deployment-rails argument is wrong.

References

[1] Reuters / Breakingviews (2026). “How Big Tech’s USD 630 billion AI splurge will fall short.” See also Reuters market commentary: “Investors stay calm as AI capex boom eclipses dotcom mania.” (Industry capex aggregation across hyperscaler AI and cloud infrastructure guidance.)

[2] Bank of America Global Research / Moody’s Ratings / Reuters-linked market reporting (2026). “Amazon’s record USD 54 billion bond sale shows AI’s staggering costs.” See also Reuters commentary on AI infrastructure funding: “Investors stay calm as AI capex boom eclipses dotcom mania.” (Top-five U.S. hyperscaler bond issuance, 2026 issuance projections, and Amazon’s record debt transaction; a composite market-data reference standing in for the original Moody’s / BofA notes.)

[3] PitchBook (2026). “Q1 2026 AI VC Trends.” See also PitchBook coverage: “Q1 2026 AI funding blows past 2025 total with three deals accounting for 67% of capital.” (First-quarter AI venture funding of approximately USD 255 billion, dominated by mega-financings.)

[4] Reuters (2026). “Power grid delays challenge Amazon’s data center expansion in Europe.” (Amazon European infrastructure briefing on grid interconnection timelines, citing seven-year horizons against roughly two-year data-centre construction cycles.)

[5] CME Group / ECB / market quote pages (2026 snapshot). CME Henry Hub Natural Gas futures quotes. ECB euro foreign-exchange reference rates. ECB reference-rate archive for 15 May 2026. (Commodity and FX prices are dynamic; the essay’s arithmetic uses the May 2026 Henry Hub, TTF, and JKM observations recorded in Appendix B.)

[6] Amazon Web Services / Data Center Dynamics / InfoQ (2026). “Amazon EC2 Capacity Blocks for ML pricing.” “AWS quietly increases prices for H200 EC2 instances by 15%.” “AWS hikes EC2 Capacity Block rates by 15% in uniform ML pricing update.” (AWS H200 Capacity Block price increase in January 2026; interpreted as evidence of scarce energised, permitted capacity.)

[7] Cast AI (2026). “State of Kubernetes Optimization Report.” See also Cast AI summary: “2026 State of Kubernetes Resource Optimization: CPU at 8%, Memory at 20%, and getting worse.” (Average sustained GPU utilisation of approximately 5% across surveyed production clusters; not necessarily representative of frontier training clusters.)

[8] NIST / ISO (2023–2024). “NIST AI Risk Management Framework.” “Artificial Intelligence Risk Management Framework (AI RMF 1.0), NIST AI 100-1.” “Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, NIST AI 600-1.” “ISO/IEC 42001:2023—Artificial intelligence management system.” (Foundational AI governance frameworks; sector-specific extensions remain necessary for autonomous-agent deployment.)

[9] SAE International / Waymo / Reuters (2025–2026). “SAE J3016: Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles.” Waymo service areas. “Waymo launches robotaxi freeway service in San Francisco, LA, Phoenix.” (Level 4 autonomy as an analogue for the verification, regulation, and integration lag between impressive demonstrations and broad commercial deployment.)

[10] U.S. Federal Reserve / Office of the Comptroller of the Currency (2011–2026). “SR 11-7: Guidance on Model Risk Management.” “SR 26-2: Revised Interagency Guidance on Model Risk Management.” “OCC Bulletin 2026-13: Model Risk Management.” (Model-risk governance as a deployment gate for banking, capital-markets, and credit-decision systems.)

[11] The Information / Reuters / Axios / Wired / Anthropic (2026). The Information reporting on OpenAI and Anthropic Q1 2026 revenue. Axios: “Anthropic’s unprecedented revenue growth.” Wired: “OpenAI shuts down Sora.” Anthropic Series G funding announcement. (Composite Q1–Q2 2026 reference for private-company revenue, loss, ARR, and product-status claims; some source material is paywalled.)

[12] Stack Overflow / JetBrains / GitHub (2025). “Stack Overflow Developer Survey 2025: AI.” “JetBrains State of Developer Ecosystem 2025.” “GitHub Octoverse 2025.” (Professional-developer AI-tool adoption across independent industry surveys.)

[13] xAI / Reuters (2026). xAI: “New Compute Partnership with Anthropic.” Reuters reporting on Colossus 1 and Anthropic compute access. (Anthropic access to Colossus 1 and xAI / SpaceXAI compute reallocation; precise financial terms per the SpaceX S-1.)

[14] Data Center Dynamics / OpenAI / Reuters-linked reporting (2026). “Oracle/OpenAI drop plans to expand flagship Abilene Stargate site.” OpenAI: “Five new Stargate sites.” (Abilene expansion, tenant reallocation, financing complexity, and Microsoft / Crusoe / Nvidia-related reporting; composite reporting rather than a single transaction document.)

[15] Stanford HAI / Epoch AI / Artificial Analysis (2026). “Stanford HAI AI Index Report 2026.” “AI Index Report 2026: Economy Chapter.” “The Price of Progress: Price Performance and the Future of AI.” (Inference cost and fixed-capability cost declines over the 2022–2026 window.)

[16] Age of Wonders. “Create an Age of Wonders.” (Internal cross-reference for intelligence as infrastructure and the broader Age of Wonders thesis.)

[17] David, P. A. (1990). “The Dynamo and the Computer: An Historical Perspective on the Modern Productivity Paradox.” American Economic Review, 80(2). RePEc record. (The canonical account of the multi-decade lag between electrification and its measured productivity payoff.)

[18] S&P Global Market Intelligence (2026). “Private equity investment surge sends US data center deals to 5-year high.” (Private equity deployment into U.S. datacentres during 2025.)

[19] U.S. Energy Information Administration (2024). “Average operating heat rate for selected energy sources.” (7,754 Btu/kWh used as the conversion basis for fuel-only generation cost.)

[20] Reuters (2025–2026). “Musk’s xAI buys third building to expand AI compute power.” See also Reuters reporting on Anthropic’s access to Colossus 1: “SpaceX signs cloud deal with Google.” (Colossus 1 reported holding more than 220,000 Nvidia processors; 300 MW expansion capacity and one-million-accelerator programme target referenced in xAI / SpaceXAI reporting.)

[21] Financial Times / Data Center Dynamics (2025–2026). “Oracle to buy USD 40 billion of Nvidia chips for OpenAI data centre.” “Oracle to spend USD 40 billion on Nvidia chips for OpenAI Texas data center.” (Abilene, Texas campus: 400,000 GB200-class chips, 1.2 GW design power, Oracle / OpenAI / Crusoe structure, and project financing.)

[22] NVIDIA / U.S. Department of Energy (2025). “NVIDIA and Oracle to Build U.S. Department of Energy’s Largest AI Supercomputer for Scientific Discovery.” “Energy Department Announces New Partnership with NVIDIA and Oracle to Build Largest DOE AI Supercomputer.” (Combined 110,000 Blackwell GPUs planned for Solstice and Equinox at Argonne National Laboratory.)

[23] NVIDIA / Meta Platforms (2026). “Meta Builds AI Infrastructure With NVIDIA.” See also Reuters: “Nvidia to sell Meta millions of chips in multiyear deal.” (Infrastructure roadmap referencing large-scale Blackwell- and Rubin-class deployments.)

[24] Reuters / industry power-density references. “Reuters reporting on Colossus 1 capacity and processor count.” (1.3 kW all-in facility load per top-end accelerator used as an order-of-magnitude heuristic, consistent with reported Colossus processor count and associated facility power. Real deployments vary with cooling architecture, networking density, and silicon mix.)

[25] Blackstone / VoltaGrid / Halliburton (2026). “VoltaGrid Announces USD 1 Billion Strategic Equity Investment from Blackstone and Halliburton.” VoltaGrid official announcement. (Modular natural-gas generation systems, Propell acquisition, 7.5 GW forward order book, and data-centre / industrial-site deployment.)

[26] Epoch AI (2026). “How much does it cost to produce an Nvidia B200 GPU?” “AI chip component cost shares.” (Component-level teardown of Nvidia B200 production cost; HBM and advanced packaging as dominant variable-cost drivers.)

[27] SemiAnalysis / Epoch AI / Tom’s Hardware (2025–2026). “Memory Mania—How a Once in Four Decades Memory Cycle Is Emerging.” SemiAnalysis datacenter industry model. “Memory will consume 30% of hyperscaler spending this year.” Epoch AI: “AI chip supply chain constraints.” (HBM, packaging, memory-share, and supply-chain constraints across the AI accelerator stack.)

[28] OpenRouter / BusinessWire (2026). “OpenRouter Raises USD 113 Million CapitalG-led Series B as Weekly Volume Explodes to 25T Tokens.” (OpenRouter as an inference-routing and workload-aggregation layer across heterogeneous model providers.)

[29] Brynjolfsson, E., Rock, D., & Syverson, C. (2021). “The Productivity J-Curve: How Intangibles Complement General Purpose Technologies.” American Economic Journal: Macroeconomics. NBER Working Paper PDF. See also Atlanta Fed / Richmond Fed / Duke CFO Survey: “Artificial Intelligence, Productivity, and the Workforce.” Working paper PDF. (The structural lag between GPT arrival and measured productivity gains, mediated by complementary intangible capital.)

[30] PricewaterhouseCoopers (2026). “PwC 2026 AI Performance Study.” (Executive survey on AI value concentration, business-model reinvention, and autonomous decision-making.)

[31] SXSW CMO Survey / Brynjolfsson, Hitt & Yang (2002). “CMOs at SXSW 2026: For every AI dollar, invest USD 2–3 in training.” “Intangible Assets: Computers and Organizational Capital.” Brookings paper PDF. (On complementary intangible capital, training, change management, workflow redesign, and organisational adaptation.)

[32] S&P Global Market Intelligence / Gartner / Reuters (2025–2026). “Generative AI shows rapid growth but yields mixed results.” “Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027.” Reuters coverage of Gartner’s agentic-AI forecast. (Enterprise AI implementation failure, project abandonment, and agentic-AI cancellation forecasts.)

[33] METR / Reuters / Google DORA / Google Cloud / GitClear (2024–2026). “Measuring the impact of early-2025 AI on experienced open-source developer productivity.” METR working paper on arXiv. Reuters coverage of the METR study. Google DORA Report 2024. “When AI writes the code, who reviews it?” GitClear AI Copilot Code Quality Report 2025. (Productivity-reliability paradox, review burden, approval fatigue, code churn, and delivery-stability concerns.)

[34] Stanford Human-Centered AI Institute / Stanford Digital Economy Lab (2026). “AI Index Report 2026.” “AI Index Report 2026: Economy Chapter.” “Canaries in the Coal Mine? Six Facts about the Recent Employment Effects of Artificial Intelligence.” (Early-career software-developer employment effects and seniority-skewed labour-market impacts.)

[35] Goldman Sachs Global Macro Research (2023–2026). “Generative AI could raise global GDP by 7%.” (Goldman Sachs optimistic productivity and GDP uplift projections for generative AI.)

[36] Penn Wharton Budget Model (2025). “The Projected Impact of Generative AI on Future Productivity Growth.” (Conservative long-run estimates for generative AI’s effect on U.S. GDP and productivity.)

[37] Acemoglu, D. (2024). “The Simple Macroeconomics of AI.” NBER Working Paper No. 32487. RePEc record. (Conservative macroeconomic estimate of generative AI’s GDP and productivity effects.)

[38] Anthropic (2026). “Introducing Claude Opus 4.8.” (Product announcement, system-card material, dynamic workflows, subagents, and benchmark comparisons.)

Background references

DeepMind / OpenAI / Anthropic technical references on distillation, quantisation, inference efficiency, and model-system optimisation over the 2022–2026 window. DeepMind AlphaFold. OpenAI model optimisation documentation. Anthropic research.

BlackRock (2026). BlackRock Investment Institute. (EMEA client survey and AI infrastructure / energy-investment commentary.)

Reuters / U.S. Department of Energy (2026). DOE Loan Programs Office. DOE Office of Nuclear Energy. (U.S. grid-side and nuclear-capacity responses to AI power demand.)

Brynjolfsson, E. & McAfee, A. (2014). “The Second Machine Age.” W.W. Norton.

Smil, V. (2017). “Energy and Civilization: A History.” MIT Press.

Perez, C. (2002). “Technological Revolutions and Financial Capital.” Edward Elgar.

NIST AI Risk Management Framework. https://www.nist.gov/itl/ai-risk-management-framework

International Energy Agency (2024). “Electricity 2024.” (Data-centre electricity demand projections through 2026.)
