Project Glasswing: What Mythos Showed Us¶

Source: Cloudflare Blog
Author: Grant Bourzikas
Date: May 18, 2026

What Changed with Mythos Preview¶

Cloudflare tested Anthropic's Mythos Preview (via Project Glasswing) against 50+ of its own repositories. The core finding: Mythos is not just a better vulnerability scanner, but a system capable of reasoning like a senior security researcher.

Two standout capabilities:

Exploit Chain Construction: Combines multiple low-severity primitives (e.g., use-after-free → arbitrary read/write → ROP chain) into a working multi-step exploit. Low-severity bugs that would traditionally sit invisible in a backlog become actionable.
Proof Generation: Writes code to trigger suspected bugs, compiles and runs it in a scratch environment, iterating on failures autonomously. "A suspected flaw without a working proof is speculation, and Mythos Preview closes that gap on its own."

Model Refusals: Inconsistent Guardrails¶

The Glasswing version lacked the safety locks of generally available models (e.g., Opus 4.7), but displayed "organic" guardrails that were highly inconsistent. Semantically equivalent tasks produced opposite outcomes depending on framing and timing. Conclusion: Organic refusals cannot serve as a complete safety boundary.

The Signal-to-Noise Problem¶

Language matters: C/C++ projects produced consistently more false positives than memory-safe languages like Rust.
Model bias: "Ask a model to find bugs, and it will find them, whether the code has any or not." Hedged findings ("possibly," "could in theory") vastly outnumber solid ones — but Mythos's PoC generation dramatically improves triage.

Why Generic Coding Agents Fail¶

Problem	Detail
Context	A single agent session against a 100k LOC repo covers ~0.1% of the surface before context compaction discards earlier findings.
Throughput	Security research requires narrow, parallel hypotheses. Generic coding agents are tuned for single-stream feature work.

Conclusion: The harness around the model matters far more than raw model capability.

4 Core Lessons for a Security Harness¶

Narrow scope produces better findings — specific function + trust boundaries + architecture doc >> "find vulnerabilities in this repository."
Adversarial review reduces noise — a second agent prompted to disprove the original finding catches far more noise than asking the hunter to check its own work. "Putting two agents in deliberate disagreement is way more effective than just telling one agent to be careful."
Split the chain across agents — ask "Is this buggy?" and "Is this reachable from an attacker?" as separate questions.
Parallel narrow tasks beat one exhaustive agent — many concurrent agents, then deduplicate afterward.

Cloudflare's Vulnerability Discovery Harness¶

Stage	What It Does
Recon	Reads repo top-down, fans out to subagents per subsystem. Produces architecture doc (build commands, trust boundaries, entry points, attack surface).
Hunt	~50 concurrent agents, each with one attack class + scope hint. Compiles and runs PoCs in per-task scratch directories.
Validate	Independent agent re-reads code and tries to disprove the original finding. Different prompt, no ability to emit new findings.
Report	Deduplicates surviving findings, writes advisory with PoC, CVSS score, and recommended fix.

The Industry Picture¶

Cloudflare also tested Codex CLI, Copilot Agent Mode, Gemini Code Assist, and various fine-tuned models. None approached Mythos Preview's exploit-chain capability. For proactive security, frontier models are now viable but demand a proper harness.