The AI Decoupling: How the Tech Economy Split in Two

Source: The AI Decoupling
Date Published: 2026-05-26
Author: Pierre-Carl Langlais, Vintage Data

TL;DR

Starting in May 2025, the tech economy split in two: SaaS and cloud suffered their largest sell-off since the pandemic, while AI labs and infrastructure soared into a separate high-margin ecosystem. The article argues that this decoupling is driven not by hype but by technical breakthroughs in sparse Mixture-of-Experts (MoE) inference, synthetic data pipelines, and Model-IP protections, which broke the economic assumptions of the previous software era.

The Great Decoupling

"It started exactly one year ago: in May 2025, the tech economy split into software and AI."

The Event: SaaS and cloud services saw the biggest sell-off since the pandemic began. AI labs and associated infrastructure soared.
Root Cause: MoE inference economics and synthetic pretraining changed what a "product" is.

The Engine: High-Margin MoE Inference Economics

"The MoE market is high margin by design: inference gets cheaper by several orders of magnitude provided you have simultaneously enough compute and enough demand."

Architecture: Highly sparse Mixture-of-Experts with native quantization. Expert routing is a form of economic optimization.
Context Management: Long context is affordable because models learned to manage their own attention.
Barrier to Entry: Medium-sized dense models are no longer competitive, and the technical and financial barriers to entry are very high.
Chip Autonomy: Google, Meta (MTIA), and DeepSeek are pushing for hardware/model co-development, entrenching concentration.
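
The two mechanisms named above, sparse expert routing and cost amortization over demand, can be illustrated with a minimal sketch. The source does not say which routing scheme the labs use; the code assumes plain top-k routing with softmax weights renormalized over the selected experts, which is one common variant. All function names, parameter counts, and prices are hypothetical and chosen only to make the arithmetic visible.

```python
import math


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts for one token.

    Returns a list of (expert_index, weight) pairs, highest score first,
    with weights renormalized so that they sum to 1.
    """
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    peak = max(router_logits)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in router_logits]
    top = sorted(range(len(exps)), key=lambda i: exps[i], reverse=True)[:k]
    norm = sum(exps[i] for i in top)
    return [(i, exps[i] / norm) for i in top]


def active_fraction(num_experts, k, expert_params, shared_params):
    """Share of all parameters that a single token actually uses."""
    total = shared_params + num_experts * expert_params
    active = shared_params + k * expert_params
    return active / total


def cost_per_million_tokens(hourly_cluster_cost, capacity_per_hour, demand_per_hour):
    """Amortized cost per million tokens for a cluster with a fixed hourly cost.

    The cluster cannot serve more than its capacity, so the cost per token
    falls as demand rises and then flattens once capacity is reached.
    """
    if demand_per_hour <= 0:
        raise ValueError("demand must be positive")
    served = min(capacity_per_hour, demand_per_hour)
    return hourly_cluster_cost / served * 1_000_000


# Hypothetical example: 8 experts, 2 active per token.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.9]
print(route_top_k(logits, k=2))

# Hypothetical sizes: 64 experts of 1 unit each plus 8 units of shared weights.
print(active_fraction(num_experts=64, k=2, expert_params=1.0, shared_params=8.0))

# Hypothetical cluster: 1,000 per hour, capacity of 10 billion tokens per hour.
for demand in (10**7, 10**9, 10**10, 10**11):
    print(demand, cost_per_million_tokens(1_000.0, 10**10, demand))
```

With these made-up numbers, each token touches only 10 of 72 parameter units, and the amortized cost falls from about 100 to about 0.1 per million tokens as demand grows from ten million to ten billion tokens per hour, then stops falling once capacity is saturated. This is only a toy reading of the "enough compute and enough demand" condition in the quote, not a model of any lab's actual serving costs.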

The Asset: How Models Stopped Commoditizing

The End of Web-Scale Training

"Models are no longer trained on the web but through large scale synthetic pipelines."

Anthropic's latest models are rumored to be trained on up to 150 trillion tokens, far more than the high-quality text available online. Synthetic pretraining lets labs select precisely what the model memorizes, redefine data shapes, and model much more than language.

The Appropriability Regime (Model-IP)

"You wouldn't steal a reasoning trace: how models stopped commoditizing."

Paradox of Open Weights: Labs release advanced models because they are confident no one can replicate their internal custom kernels.
China Closes Access: Qwen-Max is closed. Kimi 2.6 and Minimax 2.7 are open under strict conditions.
US Aggression: Retraining on a model's synthetic output is now labeled a "distillation attack." Labs are pivoting away from the position that "all training data is fair use."

The Crisis: Broken Tokenomics

The Fundamental Pricing Problem

"Claude Code does not price on a per-seat basis... token-based consumption pricing does not behave like the software line items CFOs know how to model."

Real-World Failure: Uber burned its 2026 AI budget in four months on Claude Code.
Opaque Outputs: Hidden reasoning traces and subagent delegation break billing and ownership contracts.
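
The budgeting mismatch described above can be made concrete with a small sketch. Every number here is hypothetical: the source gives no figures for the Uber case beyond "four months", so the budget, price, and usage curve are invented purely to show how a flat per-seat line item and a usage-driven token bill diverge.

```python
def month_budget_exhausted(annual_budget, monthly_spend):
    """Return the 1-based month in which cumulative spend first reaches
    the annual budget, or None if the budget lasts the whole period."""
    total = 0.0
    for month, spend in enumerate(monthly_spend, start=1):
        total += spend
        if total >= annual_budget:
            return month
    return None


def run_rate_multiple(planned_months, actual_months):
    """How many times faster than planned the budget was consumed."""
    return planned_months / actual_months


ANNUAL_BUDGET = 1_200_000.0  # hypothetical annual AI budget

# Per-seat pricing: the same amount every month, easy to forecast.
seat_spend = [100_000.0] * 12

# Token pricing: spend follows usage. Assumed here: usage doubles each
# month from 10,000 million tokens, at a hypothetical 8.0 per million tokens.
PRICE_PER_MILLION = 8.0
token_spend = [10_000 * (2 ** m) * PRICE_PER_MILLION for m in range(12)]

print(month_budget_exhausted(ANNUAL_BUDGET, seat_spend))
print(month_budget_exhausted(ANNUAL_BUDGET, token_spend))
print(run_rate_multiple(planned_months=12, actual_months=4))
```

Under these assumptions the per-seat plan lasts the full twelve months, while the token bill reaches the same budget in month four. Exhausting a twelve-month budget in four months means the average run rate was three times the planned one, which is the kind of gap the quote says CFOs are not equipped to model.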

The Emerging Solution: Outcome-Based Pricing

OpenAI CFO Sarah Friar (Jan 2026): "As intelligence moves into scientific research, drug discovery, energy systems, we need outcome-based pricing models."

The Ultimate Question: Is SaaS in Terminal Decline?

The best-performing tech companies are now either AI labs (OpenAI, Anthropic, DeepSeek) or AI infrastructure providers (NVIDIA, CoreWeave). The software layer, once the highest-margin part of tech, is being compressed between powerful frontier models and commoditized interfaces.

Key Takeaways

The tech economy structurally decoupled in May 2025: SaaS and cloud sold off while AI labs and infrastructure soared.
MoE inference economics create a high-margin, high-barrier AI market that traditional SaaS cannot compete with.
Synthetic pretraining (rumored to reach 150T tokens) and Model-IP protections have stopped model commoditization.
Token-based pricing is mismatched with enterprise budgeting, and outcome-based pricing is emerging as a response.
The software layer is being compressed between powerful frontier models and commoditized interfaces.