How AI Agents Reshape Knowledge Work

Paper: arXiv 2606.07489 · Authors: Jeremy Yang, Kate Zyskowski, Noah Yonack, Jerry Ma · Institution: Harvard / Perplexity

Problem & Motivation

The shift from conversational AI assistants (chatbots) to autonomous agents (systems that independently execute multi-step workflows) is widely hyped, but empirical evidence for its effects is scarce. Speculation about how agents will transform knowledge work is abundant, yet rigorous field data quantifying the actual behavioral and economic shifts have been missing. The central question: when knowledge workers move from chat-based AI interaction to autonomous agent delegation, what changes in work patterns, productivity, and task scope?

Method / Approach

The authors leverage a natural experiment: Perplexity's transition from offering a search-only product (Perplexity Search) to an autonomous agent product (Perplexity Computer). This provides a clean comparison between conversational AI assistance and autonomous agent delegation within the same platform and user base.

Analytical framework: Agents are modeled as having a higher fixed delegation cost (setup, specification) but a lower marginal cost per step (autonomous execution) than chatbots. This implies a break-even task-complexity threshold: below it, chat is cheaper; above it, agents win.
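
The fixed-versus-marginal cost trade-off can be written as a short derivation. This is a sketch that assumes costs are linear in the number of steps n; the symbols F (fixed cost), m (marginal cost per step), and n* (break-even step count) are introduced here for illustration, and the paper's own functional form may differ.

```latex
\begin{align}
C_{\text{chat}}(n)  &= F_c + m_c\, n, \\
C_{\text{agent}}(n) &= F_a + m_a\, n, \qquad F_a > F_c,\; m_a < m_c, \\
C_{\text{agent}}(n) < C_{\text{chat}}(n)
  &\iff n > n^{*} = \frac{F_a - F_c}{m_c - m_a}.
\end{align}
```

Under these assumptions the threshold grows with the extra setup cost of delegation and shrinks as the per-step advantage of autonomous execution widens.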

Three families of findings:

Autonomy — Machine work per session increases 48× (26 minutes vs 33 seconds), and user dissatisfaction drops 55%.
Efficiency — Task completion time drops 87% (269 minutes → 36 minutes), and cost drops 94%.
Scope — Horizontal expansion: cross-occupation work increases by +9 percentage points. Vertical expansion: 50% of agent queries are "Create"-level tasks vs 26% for chat.
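
The headline ratios can be re-derived from the rounded figures quoted above. A minimal Python sketch follows; the helper names are illustrative and not from the paper. The rounded inputs give a fold change of roughly 47×, so the reported 48× presumably reflects unrounded session durations (an assumption, not something the summary states).

```python
def fold_change(new: float, old: float) -> float:
    """Ratio of the new value to the old value."""
    return new / old


def percent_change(old: float, new: float) -> float:
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0


# Machine work per session: 26 minutes (agent) vs 33 seconds (chat).
machine_work_fold = fold_change(26 * 60, 33)

# Task completion time: 269 minutes -> 36 minutes.
completion_time_change = percent_change(269, 36)

# Share of "Create"-level queries: 50% (agent) vs 26% (chat),
# expressed as a gap in percentage points.
create_gap_pp = 50 - 26

print(f"machine work: {machine_work_fold:.1f}x")
print(f"completion time: {completion_time_change:.1f}%")
print(f"Create-level gap: {create_gap_pp} pp")
```

The completion-time figure reproduces the reported 87% drop after rounding; the cost and dissatisfaction percentages cannot be checked this way because the summary gives no underlying levels.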

Key Results

Finding family	Metric	Result
Autonomy	Machine work per session	48× (26 min vs 33 sec)
Autonomy	User dissatisfaction	-55%
Efficiency	Task completion time	-87% (269→36 min)
Efficiency	Cost per task	-94%
Scope (horizontal)	Cross-occupation work	+9pp
Scope (vertical)	"Create"-level queries	50% vs 26%

The fixed-cost model successfully predicts a complexity threshold: simple queries (<2 steps) are cheaper with chat; complex tasks (>4 steps) strongly favor agents.
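
The threshold behaviour can be illustrated with a small Python sketch. It assumes a linear cost model (fixed cost plus a per-step cost); the `CostModel` class, the function names, and every parameter value are hypothetical, chosen only so that the break-even point lands between 2 and 4 steps, and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    fixed: float     # one-time delegation cost (setup, specification)
    marginal: float  # cost of each additional step

    def cost(self, steps: int) -> float:
        return self.fixed + self.marginal * steps


def breakeven_steps(chat: CostModel, agent: CostModel) -> float:
    """Step count at which agent and chat cost the same."""
    if chat.marginal <= agent.marginal:
        raise ValueError("agent needs a lower marginal cost to ever break even")
    return (agent.fixed - chat.fixed) / (chat.marginal - agent.marginal)


def cheaper_mode(chat: CostModel, agent: CostModel, steps: int) -> str:
    """Return the cheaper mode; ties go to chat."""
    return "agent" if agent.cost(steps) < chat.cost(steps) else "chat"


# Hypothetical parameters in arbitrary cost units (illustration only).
CHAT = CostModel(fixed=0.0, marginal=1.0)
AGENT = CostModel(fixed=2.25, marginal=0.25)

print("break-even steps:", breakeven_steps(CHAT, AGENT))
for n in (1, 2, 4, 5):
    print(n, "steps ->", cheaper_mode(CHAT, AGENT, n))
```

With these illustrative values, chat is cheaper for 1- and 2-step queries and the agent is cheaper from 4 steps upward, which mirrors the qualitative pattern reported above without reproducing the paper's estimates.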

Contributions

Provides the first field evidence on the economic and behavioral impact of autonomous agents vs conversational AI, using real-world usage data from a major platform.
Offers a simple economic model (fixed vs marginal cost) that explains when agents outperform chat and quantifies the threshold.
Documents both vertical expansion (agents tackle harder tasks) and horizontal expansion (agents spread across more occupations), two distinct mechanisms for productivity gains.
Uses a large, ecologically valid dataset (the Perplexity user base) rather than a controlled lab study.

Strengths

Real-world data: This is not a toy experiment or curated benchmark but actual user behavior on a major platform. The ecological validity is a major strength over lab studies.
Clean natural experiment: The Perplexity Search → Perplexity Computer transition compares chat and agent usage within the same platform and user base, which gives an unusually clean counterfactual.
Economic framework: The fixed-cost / marginal-cost model is simple, intuitive, and yields testable predictions (the complexity threshold) that are empirically confirmed.
Quantifies shifts in task scope: The vertical (harder tasks) and horizontal (cross-occupation) expansion findings are novel and important — agents don't just accelerate existing work, they change what work is done.

Weaknesses / Limitations

Platform-specific: Perplexity's user base may not generalize to enterprise knowledge workers or other agent platforms (e.g., Codex, Claude Code, ChatGPT Tasks).
Selection bias: Users who adopt agent features may be systematically different from those who don't, potentially confounding the comparison.
Task categorization: The distinction between "Create"-level and other task levels relies on Perplexity's internal task taxonomy, which may not transfer to other settings.
Cost measurement: The 94% cost reduction is platform-internal (compute cost); the total cost of ownership including user time, setup, and validation is not captured.
Novelty effects: The data covers the early transition period; long-run equilibrium effects may differ once novelty wears off.

Connections & Follow-ups

Related to the economic literature on task automation (Autor, Levy, Murnane), the productivity paradox (Solow), and recent work on AI and knowledge worker productivity. Complements controlled studies like Peng et al. (GitHub Copilot) and Noy & Zhang (ChatGPT writing tasks) by providing field evidence at platform scale. The horizontal expansion finding connects to the "task bundling" literature: AI doesn't just substitute for individual tasks, it reorganizes work boundaries.

My Take

This is one of the more important empirical papers in the AI agents space this year. The key contribution isn't the productivity numbers (though 87% time reduction is striking) — it's the economic framework showing that the fixed cost of delegation creates a threshold below which chat is optimal and above which agents dominate. This has immediate practical implications: organizations should audit their task portfolio and identify tasks above the complexity threshold. The vertical expansion finding is also underappreciated — agents don't just make existing workflows faster, they enable new categories of work that were previously not attempted. The 48× increase in machine work per session is the headline number, but the 55% reduction in dissatisfaction is arguably more interesting — it suggests users prefer this paradigm, not just that it's more efficient.