The AI Decoupling: How the Tech Economy Split in Two
Source: The AI Decoupling
Date Published: 2026-05-26
Author: Pierre-Carl Langlais, Vintage Data
TL;DR
Starting in May 2025, the tech economy structurally split into two: SaaS/Cloud suffered its largest sell-off since the pandemic, while AI Labs and Infrastructure soared into a separate high-margin ecosystem. This decoupling is driven not by hype but by hard technological breakthroughs in sparse Mixture-of-Experts (MoE) inference, synthetic data pipelines, and Model-IP protections that fundamentally broke the economic assumptions of the previous software era.
The Great Decoupling
"It started exactly one year ago: in May 2025, the tech economy split into software and AI."
- The Event: SaaS and cloud services saw the biggest sell-off since the pandemic began. AI labs and associated infrastructure soared.
- Root Cause: MoE inference economics and synthetic pretraining changed what a "product" is.
The Engine: High-Margin MoE Inference Economics
"The MoE market is high margin by design: inference gets cheaper by several orders of magnitude provided you have simultaneously enough compute and enough demand."
- Architecture: Highly sparse Mixture-of-Experts with native quantization. Expert routing is a form of economic optimization.
- Context Management: Long context is affordable because models learned to manage their own attention.
- Barrier to Entry: Medium-sized dense models are no longer competitive, and the technical and financial barriers to entry are very high.
- Chip Autonomy: Google, Meta (MTIA), and DeepSeek are pushing for hardware/model co-development, entrenching concentration.
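The economics in the points above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's own model: `top_k_route` shows sparse routing, where each token activates only k experts, and `cost_per_million_tokens` shows why a fixed-cost cluster only becomes cheap per token once demand fills its capacity. All function names and figures are hypothetical.

```python
import math


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def active_fraction(num_experts, k):
    """Share of expert parameters touched per token (ignores shared layers)."""
    return k / num_experts


def cost_per_million_tokens(cluster_cost_per_hour, capacity_tokens_per_hour,
                            demand_tokens_per_hour):
    """Amortize a fixed hourly cluster cost over the tokens actually served."""
    served = min(capacity_tokens_per_hour, demand_tokens_per_hour)
    if served <= 0:
        raise ValueError("no tokens served; cost per token is undefined")
    return cluster_cost_per_hour * 1_000_000 / served


# Hypothetical figures, chosen only to illustrate the shape of the curve.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
print([i for i, _ in routes])                                 # selected experts
print(active_fraction(num_experts=256, k=8))                  # sparsity per token
print(cost_per_million_tokens(100.0, 1_000_000, 10_000))      # low demand
print(cost_per_million_tokens(100.0, 1_000_000, 10_000_000))  # saturated
```

With these made-up numbers, the same cluster costs 100 times more per token at low demand than when saturated, which is the "enough compute and enough demand" condition in the quote.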
The Asset: How Models Stopped Commoditizing
The End of Web-Scale Training
"Models are no longer trained on the web but through large scale synthetic pipelines."
Anthropic's latest models are rumored to be trained on up to 150 trillion tokens, far more than the quality text available online. Synthetic pretraining lets labs select precisely what the model memorizes, redefine data shapes, and model much more than language.
The Appropriability Regime (Model-IP)
"You wouldn't steal a reasoning trace: how models stopped commoditizing."
- Paradox of Open Weights: Labs release advanced models because they are confident no one can replicate their internal custom kernels.
- China Closes Access: Qwen-Max is closed. Kimi 2.6 and Minimax 2.7 are open under strict conditions.
- US Aggression: Retraining on another model's synthetic output is now called a "distillation attack." Labs are pivoting away from the position that "all training data is fair use."
The Crisis: Broken Tokenomics
The Fundamental Pricing Problem
"Claude Code does not price on a per-seat basis... token-based consumption pricing does not behave like the software line items CFOs know how to model."
- Real-World Failure: Uber burned its 2026 AI budget in four months on Claude Code.
- Opaque Outputs: Hidden reasoning traces and subagent delegation break billing/ownership contracts.
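The mismatch between per-seat and token-based pricing can be made concrete with a small calculation. The sketch below is illustrative only: the function names, prices, and usage figures are hypothetical assumptions. The one number taken from the text is the four-month budget exhaustion, which implies spending at three times the planned rate.

```python
def per_seat_monthly_cost(seats: int, price_per_seat: float) -> float:
    """Seat pricing: cost depends only on headcount, so it is easy to forecast."""
    return seats * price_per_seat


def token_monthly_cost(tokens_per_developer: list[int],
                       price_per_million: float) -> float:
    """Consumption pricing: cost depends on how much each developer uses."""
    return sum(tokens_per_developer) * price_per_million / 1_000_000


def overrun_factor(planned_months: float, actual_months: float) -> float:
    """How many times faster than planned the budget was spent."""
    return planned_months / actual_months


# Hypothetical usage: one heavy agentic user dominates the bill.
usage = [5_000_000, 20_000_000, 400_000_000]

print(per_seat_monthly_cost(seats=3, price_per_seat=30.0))   # 90.0
print(token_monthly_cost(usage, price_per_million=10.0))     # 4250.0
print(overrun_factor(planned_months=12, actual_months=4))    # 3.0
```

The seat-based figure is known as soon as headcount is fixed, while the token-based figure swings with the heaviest user, which is why it does not fit the line items CFOs are used to modeling.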
The Emerging Solution: Outcome-Based Pricing
OpenAI CFO Sarah Friar (Jan 2026): "As intelligence moves into scientific research, drug discovery, energy systems, we need outcome-based pricing models."
The Ultimate Question: Is SaaS in Terminal Decline?
The best-performing tech companies are now either AI labs (OpenAI, Anthropic, DeepSeek) or AI infrastructure (NVIDIA, CoreWeave). The software layer, once the highest-margin part of tech, is being compressed between powerful frontier models and commoditized interfaces.
Key Takeaways
- The tech economy structurally decoupled in May 2025: SaaS declined while AI infrastructure soared
- MoE inference economics create a high-margin, high-barrier AI market that traditional SaaS cannot compete with
- Synthetic pretraining (up to 150T tokens) and Model-IP protections have stopped model commoditization
- Token-based pricing creates a mismatch with enterprise budgeting; outcome-based pricing is emerging
- The software layer is being compressed between powerful frontier models and commoditized interfaces