GPT-5.2 Solves Erdős Problems: A New Era for AI in Mathematics – AI News – #3 January 2026

19 January 2026

A historic milestone in mathematics and artificial intelligence occurred this weekend: GPT-5.2 autonomously resolved Erdős Problem #397, a conjecture that had remained open for decades, by disproving it. The proof was confirmed by Terence Tao, one of the world's greatest living mathematicians. Unlike in previous controversies, this is not a case of AI simply finding a forgotten paper in an archive; it is an original solution, fully formalized and verified in Lean.

Over the weekend, news circulated through the math and AI communities that may mark a turning point in the human-machine relationship. Neel Somani, a software engineer and former quant researcher, tested the mathematical capabilities of OpenAI’s newest model, GPT-5.2, on the “Erdős problems,” a collection of over 1,000 conjectures posed by the Hungarian mathematician Paul Erdős that have long challenged mathematicians.

Somani gave the model Problem #397, which asks whether a specific equation involving central binomial coefficients has infinitely many solutions. After approximately 15 minutes of “thinking,” GPT-5.2 generated a full proof that disproved the conjecture by providing an infinite family of counterexamples.
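
For readers unfamiliar with the objects involved: the central binomial coefficient is C(2n, n) = (2n)! / (n!)^2. The short sketch below only illustrates that definition and one standard identity relating consecutive values. It does not reproduce the equation from Problem #397 or GPT-5.2's construction, and the function name central_binomial is a hypothetical helper chosen for this example.

```python
from math import comb


def central_binomial(n: int) -> int:
    """Return the central binomial coefficient C(2n, n)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return comb(2 * n, n)


# First few values, for n = 0..6.
first_values = [central_binomial(n) for n in range(7)]
print(first_values)  # [1, 2, 6, 20, 70, 252, 924]

# Standard identity: (n + 1) * C(2n + 2, n + 1) == 2 * (2n + 1) * C(2n, n).
# It shows that each value is a bit less than 4 times the previous one.
identity_holds = all(
    (n + 1) * central_binomial(n + 1) == 2 * (2 * n + 1) * central_binomial(n)
    for n in range(50)
)
print(identity_holds)  # True
```

Because these numbers grow roughly like powers of 4, searching for solutions by brute force quickly becomes impractical, which is one reason a structural argument is needed rather than enumeration.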

Crucially, the process didn’t end with chatbot text. Somani used Harmonic’s Aristotle tool to formalize the proof in Lean, a proof assistant in which every step of a proof is checked mechanically. Only after this machine-verified proof was established did it reach Terence Tao, who confirmed its correctness. On the erdosproblems.com database, the status of Problem #397 was updated to “DISPROVED (LEAN).”
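
To give a sense of what "verified in Lean" means in practice, here is a toy example in Lean 4 using only the core library. It is unrelated to the actual proof of Problem #397, and the theorem name add_zero_example is made up for illustration. The point is simply that Lean accepts a statement only when the supplied proof checks, and reports an error otherwise.

```lean
-- Toy statements only; these have nothing to do with Problem #397.

-- Accepted: both sides evaluate to the same natural number.
example : 2 + 2 = 4 := rfl

-- Accepted: holds for every n, directly from the definition of addition on Nat.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl

-- Rejected if uncommented: Lean reports an error, because 2 + 2 is not 5.
-- example : 2 + 2 = 5 := rfl
```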

Neel Somani's Twitter post about the solved Erdős problem
Source: https://x.com/neelsomani/status/201021516214660712

Why This Isn’t “Fake News” (Like October 2025)

Many industry observers may recall the hype from October 2025, when headlines claimed GPT-5 had solved ten Erdős problems. As Thomas Bloom, the maintainer of the problem database, later clarified, the AI had made no actual discovery. The models had simply performed an effective literature search, locating forgotten but existing solutions in old academic papers.

This time, the situation is fundamentally different. The solutions for Problem #397, as well as for the recently solved #728 and #729, are original proofs. While they use known techniques, the specific arguments did not exist in the mathematical literature prior to this week. We are witnessing the generation of new knowledge, not just advanced data synthesis.

The New Pipeline: From Prompt to Formalization

This event highlights a new, highly effective workflow for AI-assisted mathematical research. It relies not on blind trust in a chatbot, but on a rigorous, multi-stage verification pipeline:

  1. Prompting: A human researcher precisely formulates the problem.
  2. Candidate Generation: GPT-5.2 proposes a proof or counterexample.
  3. Auto-Formalization: Tools like Aristotle (Harmonic) translate the proof into the Lean language.
  4. Machine Verification: Lean checks the logical validity of every step; a proof containing any error is rejected.
  5. Expert Review: A top-tier mathematician (like Terence Tao) accepts the result.

This pipeline drastically compresses the time required to solve such problems from years or months to mere days.
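
The shape of such a workflow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling Somani used: every name below (Candidate, run_pipeline, and the fake_* functions) is hypothetical, and the stand-in functions do not call GPT-5.2, Aristotle, or Lean. The one idea it captures is that verification acts as a gate, so only machine-checked candidates ever reach a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    problem_id: str
    informal_proof: str                 # natural-language proof from the model
    formal_proof: Optional[str] = None  # filled in by the formalizer


def run_pipeline(
    problem_id: str,
    statement: str,
    generate: Callable[[str], str],   # hypothetical stand-in for the LLM call
    formalize: Callable[[str], str],  # hypothetical stand-in for auto-formalization
    verify: Callable[[str], bool],    # hypothetical stand-in for the proof checker
) -> Optional[Candidate]:
    """Return a candidate for expert review, or None if verification fails."""
    candidate = Candidate(problem_id, generate(statement))
    candidate.formal_proof = formalize(candidate.informal_proof)
    if not verify(candidate.formal_proof):
        return None  # rejected before any human spends time on it
    return candidate


# Toy stand-ins so the sketch runs; none of these contact a real service.
def fake_generate(statement: str) -> str:
    return f"informal proof of: {statement}"


def fake_formalize(informal: str) -> str:
    return f"theorem t : True := trivial  -- from: {informal}"


def fake_verify(formal: str) -> bool:
    return formal.startswith("theorem")


accepted = run_pipeline("#397", "toy statement", fake_generate, fake_formalize, fake_verify)
rejected = run_pipeline("#397", "toy statement", fake_generate, lambda s: "garbage", fake_verify)
print(accepted is not None, rejected is None)  # True True
```

Passing the three stages in as functions keeps the gate logic independent of any particular model or prover, which matches the article's point that trust comes from the verification step rather than from the chatbot.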

The Long Tail and “Low Hanging Fruit”

Are mathematicians becoming obsolete? Terence Tao advises a nuanced view. On Mastodon, he noted that current AI models are best suited for the “long tail” of Erdős problems. These are challenges that:

  • Require the application of standard techniques in novel ways.
  • Were too niche or time-consuming for top mathematicians to prioritize.
  • Are relatively “simple” (in academic terms), but no one had previously bothered to write out a formal proof.

Tao predicts that many of these “easier” problems will soon be solved by purely machine-based or hybrid methods. GPT-5.2 currently scores 77% on competition-level math, making it a powerful tool for clearing the backlog of unsolved mathematical conjectures.

Implications for AI

For professionals tracking AI development, this case demonstrates the transition of Large Language Models (LLMs) from pattern-matching to genuine reasoning. The ability to autonomously generate logically sound, multi-step proofs has implications far beyond mathematics.

In the coming months, we can expect a surge of applications utilizing this type of “watertight reasoning” in other fields requiring high rigor, such as legal contract analysis, engineering optimization, and regulatory compliance.

The Erdős problem database has unexpectedly become the most important benchmark for AI systems. Watching Terence Tao’s repository and the rate at which “open” problems disappear will be the best indicator of real progress in artificial intelligence.

Author
Maciej Jakubiec

SEO Specialist

A marketing graduate specializing in e-commerce from the University of Economics in Kraków – part of Delante’s SEO team since 2022. A firm believer in the importance of well-crafted content, and apart from being an SEO, a passionate music producer crafting sounds since his early teens.
