The AI Decoupling: How the Tech Economy Split in Two

Source: The AI Decoupling
Date Published: 2026-05-26
Author: Pierre-Carl Langlais, Vintage Data


TL;DR

Starting in May 2025, the tech economy split structurally in two: SaaS and cloud suffered their largest sell-off since the pandemic, while AI labs and infrastructure soared into a separate high-margin ecosystem. The decoupling is driven not by hype but by technological breakthroughs in sparse Mixture-of-Experts (MoE) inference, synthetic data pipelines, and Model-IP protections that broke the economic assumptions of the previous software era.

The Great Decoupling

"It started exactly one year ago: in May 2025, the tech economy split into software and AI."

  • The Event: SaaS and cloud services saw the biggest sell-off since the pandemic began. AI labs and associated infrastructure soared.
  • Root Cause: MoE inference economics and synthetic pretraining changed what a "product" is.

The Engine: High-Margin MoE Inference Economics

"The MoE market is high margin by design: inference gets cheaper by several orders of magnitude provided you have simultaneously enough compute and enough demand."

  • Architecture: Highly sparse Mixture-of-Experts with native quantization. Expert routing is a form of economic optimization.
  • Context Management: Long context is affordable because models learned to manage their own attention.
  • Barrier to Entry: Medium-sized dense models are no longer competitive, and the technical and financial barriers to entry are very high.
  • Chip Autonomy: Google, Meta (MTIA), and DeepSeek are pushing for hardware/model co-development, entrenching concentration.
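
To make the sparsity argument concrete, here is a minimal sketch of top-k expert routing, assuming a generic softmax router. The function names, the eight-expert setup, and the router scores are illustrative assumptions, not details from the article or from any specific lab's architecture. The point it shows: each token only touches k of the available experts, so the compute per token scales with k rather than with the total parameter count.

```python
import math

def top_k_route(logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest router scores.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def active_fraction(num_experts, k):
    """Share of expert parameters used per token under top-k routing."""
    return k / num_experts

# Hypothetical router scores for one token over 8 experts.
router_logits = [0.1, 2.0, -1.0, 1.5, 0.3, 0.0, -0.5, 0.9]
routes = top_k_route(router_logits, k=2)
fraction = active_fraction(num_experts=8, k=2)
```

With these made-up scores, the token is sent to experts 1 and 3 and uses a quarter of the expert parameters. The economic claim in the quote adds a further condition that this sketch does not model: the idle experts only pay for themselves when there is enough hardware to host them and enough traffic to keep them busy.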

The Asset: How Models Stopped Commoditizing

The End of Web-Scale Training

"Models are no longer trained on the web but through large scale synthetic pipelines."

Anthropic's latest models are rumored to be trained on up to 150 trillion tokens, far more than the quality text available online. Synthetic pretraining lets labs select precisely what the model memorizes, redefine data shapes, and model much more than language.

The Appropriability Regime (Model-IP)

"You wouldn't steal a reasoning trace: how models stopped commoditizing."

  • Paradox of Open Weights: Labs release advanced models because they are confident no one can replicate their internal custom kernels.
  • China Closes Access: Qwen-Max is closed. Kimi 2.6 and Minimax 2.7 are open under strict conditions.
  • US Aggression: Retraining on a model's synthetic output is now framed as a "distillation attack," and labs are pivoting away from the position that "all training data is fair use."

The Crisis: Broken Tokenomics

The Fundamental Pricing Problem

"Claude Code does not price on a per-seat basis... token-based consumption pricing does not behave like the software line items CFOs know how to model."

  • Real-World Failure: Uber burned its 2026 AI budget in four months on Claude Code.
  • Opaque Outputs: Hidden reasoning traces and subagent delegation break billing and ownership contracts.
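
The mismatch between per-seat and token-based pricing can be sketched with a toy budget model. Every number below (headcount, seat price, token price, usage growth) is a hypothetical chosen for illustration; none of it comes from the article or from any company's actual spending. The sketch only shows why a budget sized like a per-seat line item can run out early when consumption grows month over month.

```python
def seat_cost(seats, usd_per_seat_month, months=12):
    """Per-seat pricing: cost is fixed once headcount is known."""
    return seats * usd_per_seat_month * months

def token_spend(developers, tokens_per_dev, usd_per_million_tokens):
    """Consumption pricing: cost follows usage, not headcount."""
    return developers * tokens_per_dev / 1_000_000 * usd_per_million_tokens

def months_until_exhausted(budget, monthly_spend):
    """Return the month in which the budget runs out, or None if it lasts."""
    remaining = budget
    for month, spend in enumerate(monthly_spend, start=1):
        remaining -= spend
        if remaining <= 0:
            return month
    return None

# Hypothetical inputs: 1,000 developers, budget sized as if it were $100/seat/month.
annual_budget = seat_cost(seats=1_000, usd_per_seat_month=100)

# Hypothetical usage: 20M tokens per developer in month 1, growing 1.5x per month,
# at an assumed blended price of $10 per million tokens.
spend_by_month = [
    token_spend(1_000, 20_000_000 * 1.5 ** m, usd_per_million_tokens=10)
    for m in range(12)
]
exhausted_in = months_until_exhausted(annual_budget, spend_by_month)
```

Under these assumptions the first month costs less than twice the per-seat equivalent, yet compounding usage exhausts the annual budget well before year end. A finance team would have to forecast the growth rate of usage itself, which is the variable the bullets above describe as hard to observe.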

The Emerging Solution: Outcome-Based Pricing

OpenAI CFO Sarah Friar (Jan 2026): "As intelligence moves into scientific research, drug discovery, energy systems, we need outcome-based pricing models."

The Ultimate Question: Is SaaS in Terminal Decline?

The best-performing tech companies are now either AI labs (OpenAI, Anthropic, DeepSeek) or AI infrastructure (NVIDIA, CoreWeave). The software layer — once the highest-margin part of tech — is being compressed between powerful frontier models and commoditized interfaces.

Key Takeaways

  1. The tech economy structurally decoupled in May 2025 — SaaS declined while AI infrastructure soared
  2. MoE inference economics create a high-margin, high-barrier AI market that traditional SaaS cannot compete with
  3. Synthetic pretraining (up to 150T tokens) and Model-IP protections have stopped model commoditization
  4. Token-based pricing creates a mismatch with enterprise budgeting — outcome-based pricing is emerging
  5. The software layer is being compressed between powerful frontier models and commoditized interfaces