Show Notes
Picture Dr. Katharina Hess — she runs the Computational Chemistry Group at one of the big pharma companies in the Novartis corridor. 11 postdocs and data scientists under her. Not 3 projects — 30 open projects, research cycles of 5, 10, 20 years.
Five days ago she opens Nature. The headline grabs her:
"AI cracks an 80-year-old mathematical challenge."
She reads it. Reads it again. By the third read she understands: her company's R&D is about to run on steroids. Not because of the math problem itself — but because of the method.
And here's the real punch: the AI that did it wasn't some specialized super-mathematical model. It was ChatGPT. Yes, your ChatGPT. (OK, the reasoning model, GPT-5.4 Pro — but still.)
🧮 Who the hell was Paul Erdős?
Hungarian mathematician, born 1913. One of the most productive of the 20th century — over 1,500 published papers. Restless. No apartment. No fixed office. Today we'd call him a digital nomad — back then, an analog one. He went from university to university with two suitcases.
His passion wasn't solving problems. It was formulating them. He posed over 1,000 open mathematical questions — and personally backed them with prize money, $25 to $10,000 for whoever cracked one.
📐 The 1,000 thumbtacks problem (Planar Unit Distance)
Imagine a giant board and 1,000 thumbtacks. What is the largest number of pairs of tacks you can place at exactly the same distance from each other, say 1 centimeter? Sounds simple. It isn't.
In 1984, Spencer, Szemerédi and Trotter proved the upper bound: with n tacks, the number of such pairs grows at most like n to the 4/3 power. That ceiling hasn't moved in 40 years. Noga Alon (Princeton): "It was one of Erdős's favorite problems."
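To make the thumbtack question concrete, here is a minimal Python sketch that counts unit-distance pairs by brute force. The function names and the square-grid layout are illustrative assumptions, not anything from the episode, and a plain grid is nowhere near the best known arrangement.

```python
from itertools import combinations


def count_unit_pairs(points, squared_distance=1):
    """Count unordered pairs of points whose squared distance matches exactly.

    Integer coordinates keep the comparison exact (no floating-point error).
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == squared_distance
    )


def square_grid(k):
    """Return the k*k points of an integer grid with spacing 1 (hypothetical layout)."""
    return [(x, y) for x in range(k) for y in range(k)]


grid = square_grid(4)                # n = 16 tacks
unit_pairs = count_unit_pairs(grid)  # horizontal plus vertical neighbours
growth_term = len(grid) ** (4 / 3)   # the n^(4/3) term, without its constant factor

print(f"n = {len(grid)}, unit-distance pairs = {unit_pairs}, n^(4/3) = {growth_term:.1f}")
```

The n^(4/3) bound carries a constant factor, so the printed number is only a feel for the growth rate, not a hard limit for this particular grid.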
💸 How ChatGPT solved it — for ~$1,000 in tokens
Step one — which ChatGPT? Not the one that messes up your email. The reasoning model — GPT-5.4 Pro. You actually have to click the model selector. Don't use Auto.
The prompt was almost unassuming: "Could Erdős be wrong? Could the reasoning behind this bound be flawed?"
And then the model worked. Completely autonomously. 125 pages. Around 100,000 tokens. Cost: somewhere between $100 and $1,000.
Reality check: tomorrow I'm flying to an oil & gas company in Hannover. Zurich → Hannover one-way: $800. So the token cost of solving an 80-year-old mathematical problem is on the order of a single business trip.
🔧 The trick: not a better screwdriver — a different wrench entirely
For 40 years mathematicians attacked this with geometric tools: incidence geometry, Szemerédi-Trotter, crossing number method. Those tools hit a natural ceiling — the n^(4/3) bound.
The AI did something else. It pulled a completely different key out of the toolbox: algebraic number theory. CM fields. Complex multiplication. Infinite Galois towers.
It didn't attack the problem head-on. It reformulated it, from a geometric problem into a number-theoretic one. And suddenly the answer became much more tractable.
🤖 The DeepMind counter-punch: AlphaProof Nexus + Lean
Then Google DeepMind dropped the receipts. Their system AlphaProof Nexus claims to have solved:
- 9 open Erdős problems
- 44 additional open conjectures
- A 15-year-old problem in algebraic geometry
And here's where it gets architectural. AlphaProof Nexus combines AI reasoning with a formal verification tool called Lean. The AI doesn't just spit out an answer — it produces a step-by-step proof, and Lean mechanically verifies every single step. Every logical leap is checked. Incorrect assumptions are rejected. The final proof meets strict mathematical standards.
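For a feel of what "Lean mechanically verifies every step" means, here is a toy Lean 4 example. It is an illustration only and has nothing to do with the actual AlphaProof Nexus proofs; the theorem name is made up for this sketch.

```lean
-- Lean accepts this theorem only because the supplied proof type-checks.
-- `Nat.add_comm` is the standard library lemma that addition on Nat commutes.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A false claim has no valid proof. Uncommenting the line below makes Lean
-- report an error instead of silently accepting the statement:
-- theorem bogus (a : Nat) : a + 1 = a := rfl
```

The point is the workflow, not the math: a statement counts as proven only once the checker accepts it, which is why a hallucinated step cannot slip through.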
Cost per problem: a few hundred dollars in compute.
⚖️ Two religions: human-verified vs machine-verified
This is now a genuine philosophical split in the AI math community:
- OpenAI's approach: let the LLM produce the proof, then send it to 9 of the world's top mathematicians (including Noga Alon, Daniel Litt and Melanie Wood) to verify by hand. Slow. Authoritative.
- DeepMind's approach: let the AI prove it AND let the machine (Lean) verify it. Fast. Reproducible. But — you have to trust Lean.
Both approaches address the hallucination problem: AI models can invent unproven statements, skip difficult parts, present incomplete proofs as finished. Human review and machine verification are two different solutions to the same fundamental risk.
🛑 The Hassabis caveat: AGI is still far
Demis Hassabis (DeepMind CEO) reminds everyone: "For an AI, this wasn't actually that hard." The problem is extremely difficult to solve, but it's bounded. AGI would require:
- Creativity across multiple fields simultaneously
- Independent reasoning
- Original idea generation
Today's systems are powerful specialized tools — not minds.
But here's the catch: the most clever thing the AI did wasn't the solution. It was the cross-domain reformulation. And that's exactly where your R&D department needs to wake up.
🧬 Why your R&D needs this — silos, Da Vinci, AlphaFold
Pharma R&D is the textbook silo problem:
- Medicinal chemists define and find targets
- Biologists know the pathways
- Statisticians wade through the data
They work in their silos. They don't talk on the level where breakthroughs happen.
Leonardo da Vinci could. Math + chemistry + physics + anatomy — all in one head, all connected. Today that's impossible for a human because of information overload. But an AI? An AI has exactly that cross-domain synthesis ability.
Side note: Demis Hassabis and John Jumper of Google DeepMind shared the 2024 Nobel Prize in Chemistry for AlphaFold and its protein structure predictions. Pure cross-domain AI. If pharma had taken AlphaFold seriously when it first appeared, they'd be years ahead today.
🦴 The uncomfortable truth about your senior researchers
Who are the most expensive people in any R&D department? Not the juniors. The 30-year veterans earning three-quarters of a million euros a year.
And they are the worst AI users. Because they fundamentally say: "I've done research like this for 40 years. I don't need ChatGPT."
When you hire a postdoc in 2026, "is he good in his domain?" is no longer the only question. The new questions:
- Can he prompt a reasoning model correctly?
- Can he ask cross-domain questions? "How would a biologist see this? How would an economist see this?"
- Does he click "Auto" or does he deliberately choose the reasoning model, GPT-5.4 Pro?
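The cross-domain questions above can be turned into a reusable prompt template. This is a minimal sketch under stated assumptions: the function name, the wording of the template and the sample problem are all hypothetical, and nothing here calls a real model API.

```python
def cross_domain_prompts(problem, disciplines):
    """Return one reframing prompt per discipline for the same problem."""
    template = (
        "You are an experienced {discipline}. "
        "How would you look at the following problem, and which tools "
        "from your field would you try first?\n\nProblem: {problem}"
    )
    return [template.format(discipline=d, problem=problem) for d in disciplines]


# Hypothetical example problem, purely for illustration.
prompts = cross_domain_prompts(
    "Our lead compound loses potency after two weeks in storage.",
    ["biologist", "economist", "mathematician"],
)

for prompt in prompts:
    print(prompt)
    print("---")
```

Each prompt would then be pasted into (or sent to) the reasoning model of your choice. The value is in asking the same question from several disciplines, not in the exact wording.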
⚖️ The legal department will be your next blocker
Imagine: you've found something genius with ChatGPT. You want to patent it. Who stops you first? Legal.
- Does it belong to us? Or to OpenAI?
- Does it belong to Microsoft (if you used Copilot)?
- Who holds the patent?
None of these questions is settled yet. Your discoveries may sit in legal review for 2 years. Plan for it.
🎯 Three Monday Actions for every R&D leader
- Roll out reasoning models NOW. Not "Auto" mode. Train your researchers to deliberately pick GPT-5.4 Pro or Claude Opus. Quality triples.
- Dramatically raise per-researcher token budgets. €100/month is 2020 thinking. Give serious researchers €100,000 per year. If solving an 80-year-old math problem cost around $1,000, that's on the order of 100 such attempts per researcher per year.
- Make cross-domain prompting a mandatory skill. Every researcher must learn to say: "How would a mathematician see this? How would a biologist? How would an economist?" That's the tool-swap question that cracked the Erdős problem.
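The budget math behind the second action is simple enough to check in a few lines. A rough sketch, assuming the budget and the per-attempt cost are in the same currency unit and using the $100 to $1,000 range quoted earlier in these notes; the function name is made up.

```python
def attempts_per_year(annual_budget, cost_per_attempt):
    """How many full attempts fit into an annual token budget (whole attempts only)."""
    return annual_budget // cost_per_attempt


annual_budget = 100_000  # per-researcher budget from the notes

expensive_runs = attempts_per_year(annual_budget, 1_000)  # upper end of the quoted cost
cheap_runs = attempts_per_year(annual_budget, 100)        # lower end of the quoted cost

print(f"Between {expensive_runs} and {cheap_runs} attempts per researcher per year")
```

Real token costs depend on the model and the length of the run, so treat these numbers as an order-of-magnitude guide.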
🎬 Bottom line
Paul Erdős posed a problem. For 80 years the best mathematicians in the world couldn't break through. Today, for around $1,000, a ChatGPT reasoning model solved it by reformulating the entire problem.
Your R&D problems are probably stuck at some disciplinary boundary right now. Don't let a single process in your company run without AI support.
⏱️ Timestamps
- 00:00 — Cold open: Dr. Katharina Hess and the Nature headline
- 03:00 — Who was Paul Erdős? 1,500 papers, digital nomad, prize money
- 06:00 — The 1,000 thumbtacks problem — the 40-year n^(4/3) ceiling
- 09:30 — How ChatGPT cracked it — 125 pages, 100k tokens, $1,000
- 13:30 — The DeepMind counter-punch — AlphaProof Nexus + Lean (9 Erdős problems, 44 conjectures)
- 17:00 — Human-verified vs machine-verified — two religions
- 19:00 — The Hassabis caveat — AGI still far
- 20:30 — Why R&D needs this — silos, Da Vinci, AlphaFold
- 24:00 — Your senior researchers are the worst AI users
- 26:00 — The legal department headache
🎙️ About the Host
Malcolm Werchota runs AI adoption programs for companies across Europe. After 15+ years at Novartis and Schlumberger, today's focus: AI without the bullshit. Lecturer at ESADE and HSLU. Studied at Leoben — applied physics / materials science.
📰 Sources
- Nature — "AI cracks an 80-year-old mathematical challenge" (May 2026)
- OpenAI — Reasoning model GPT-5.4 Pro
- Google DeepMind — AlphaProof Nexus + Lean verification announcement
- Noga Alon (Princeton), Daniel Litt (Toronto), Melanie Wood (Harvard) — verification paper
- Demis Hassabis / Google DeepMind — AGI commentary
- Spencer, Szemerédi & Trotter (1984), original Planar Unit Distance upper bound
Tags: #ErdosProblem #ChatGPT #ReasoningModels #AlphaProofNexus #DeepMind #LeanProver #Pharma #ResearchAndDevelopment #CrossDomainAI #AlgebraicNumberTheory #FieldsMedal #DemisHassabis #AGI #werchota #TheAICookbookShow