To Have Machines Make Math Proofs, Turn Them Into a Puzzle¶

Source: Quanta Magazine
Interview with: Marijn Heule (Carnegie Mellon University)
Date: November 10, 2025

Core Idea¶

Marijn Heule uses SAT (Satisfiability) — a symbolic AI technique that turns math problems into giant binary constraint puzzles (think Sudoku with millions of cells) — to solve long-standing open problems in pure mathematics. His track record includes the Empty Hexagon, Schur Number 5, and Keller's Conjecture (dimension 7), problems that resisted proof for 90+ years.

His vision is a three-part pipeline that could produce the first mathematical proof ever discovered by AI that humans cannot verify independently:

LLM — Carves a big mathematical statement into smaller, plausible lemmas (high-level "big picture" work).
SAT Solver — Proves or refutes each lemma, returning minimal counterexamples that act like a human learning from failure.
Lean — A formal proof checker that glues all certified pieces together into a watertight whole.

"A SAT tool is not computing with zeros and ones. Instead, it is searching for a combination of them that satisfies all the constraints."

The "Understanding vs. Trust" Debate¶

The philosophical heart of the piece. Timothy Gowers (Fields Medalist) called Heule's Pythagorean triples proof "the most disgusting proof ever" because it offered no human-comprehensible insight.

Heule's counter: Understanding in mathematics is highly overrated. No single mathematician understands all of math — we rely on chains of trust. Automated reasoning can produce proofs more trustworthy than most pen-and-paper proofs.

"LLMs can do all of their bullshitting, but as soon as automated reasoning is able to say, 'OK, but this part is actually correct, and here's a proof,' this is actually more trustworthy than most of the pen-and-paper proofs out there."

AI as Co-Author, Not Replacement¶

Heule emphasizes humans remain essential — his successes came from collaborating with mathematicians who spent years developing conceptual insights, which he then encoded for the SAT solver. The future is LLMs helping more mathematicians learn to encode problems, not removing humans from the loop.

"The creative intuition, the conceptual reframing, that's still something people are uniquely good at. The magic comes from the collaboration."

Key Takeaways¶


SAT ≠ Neural Networks	SAT is symbolic GOFAI — hard-coded logical rules, not pattern matching. It searches rather than computes.
Minimal counterexamples	SAT solvers return small, interpretable refutations, providing insight into why a conjecture fails.
The bottleneck	Encoding a math problem for SAT is currently an expert skill. Heule wants LLMs to automate this, opening the pipeline to more mathematicians.
Trust > Understanding	Heule provocatively flips the traditional mathematical value system — certified correctness matters more than human-comprehensible narrative.
Future goal	The first AI-discovered proof of a problem that no human can independently verify.