The AI Cookbook Show by Malcolm Werchota

Nature headline 5 days ago: 'AI cracks an 80-year-old mathematical challenge.' But the wild part isn't the problem — it's the method. ChatGPT (yes, YOUR ChatGPT) cracked Paul Erdős's Planar Unit Distance problem. Not with better geometry — by reformulating the entire problem into algebraic number theory. Cross-domain synthesis. Cost: ~$1,000 in tokens (= a business trip Zurich-Hannover). Verified by 9 Fields Medal-level mathematicians. Plus: DeepMind's AlphaProof Nexus + Lean counter-punch (9 Erdős problems, 44 conjectures), and what this means for your R&D department.

Show Notes

Picture Dr. Katharina Hess — she runs the Computational Chemistry Group at one of the big pharma companies in the Novartis corridor. 11 postdocs and data scientists under her. Not 3 projects — 30 open projects, research cycles of 5, 10, 20 years.
Five days ago she opens Nature. The headline grabs her:
"AI cracks an 80-year-old mathematical challenge."
She reads it. Reads it again. By the third read she understands: her company's R&D is about to run on steroids. Not because of the math problem itself — but because of the method.
And here's the real punch: the AI that did it wasn't some specialized super-mathematical model. It was ChatGPT. Yes, your ChatGPT. (OK, the reasoning model, GPT-5.4 Pro — but still.)
🧮 Who the hell was Paul Erdős?
Hungarian mathematician, born 1913. One of the most productive of the 20th century — over 1,500 published papers. Restless. No apartment. No fixed office. Today we'd call him a digital nomad — back then, an analog one. He went from university to university with two suitcases.
His passion wasn't solving problems. It was formulating them. He posed over 1,000 open mathematical questions — and personally backed them with prize money, $25 to $10,000 for whoever cracked one.
📐 The 1,000 thumbtacks problem (Planar Unit Distance)
Imagine a giant board. You take 1,000 thumbtacks. What is the largest number of pairs of tacks that can sit at exactly the same distance from each other, say 1 centimeter? Sounds simple. It isn't.
In 1984, Spencer, Szemerédi & Trotter proved the upper bound: on the order of n to the 4/3 power. That ceiling hasn't moved in 40 years. Noga Alon (Princeton): "It was one of Erdős's favorite problems."
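To get a feel for the numbers, here is a minimal Python sketch that counts unit-distance pairs by brute force and prints the n^(4/3) growth rate next to the count. The function names are made up for this illustration, the constant factor in the bound is left out, and a plain square grid is just an easy arrangement to count, not the best-known one.

```python
from itertools import combinations


def count_unit_pairs(points):
    """Count unordered pairs of integer points at distance exactly 1."""
    # Compare squared distances so the arithmetic stays exact.
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == 1
    )


def square_grid(side):
    """All integer points (x, y) with 0 <= x, y < side."""
    return [(x, y) for x in range(side) for y in range(side)]


def upper_bound_scale(n):
    """n^(4/3): the growth rate of the 1984 ceiling, constant factor omitted."""
    return n ** (4 / 3)


points = square_grid(10)  # 100 tacks on a 10 x 10 grid
pairs = count_unit_pairs(points)
print(len(points), pairs, round(upper_bound_scale(len(points))))
# prints: 100 180 464
```

For the 1,000 thumbtacks in the story, n^(4/3) works out to 10,000, so the ceiling says the count of same-distance pairs can grow no faster than that scale.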
💸 How ChatGPT solved it — for ~$1,000 in tokens
Step one — which ChatGPT? Not the one that messes up your email. The reasoning model — GPT-5.4 Pro. You actually have to click the model selector. Don't use Auto.
The prompt was almost unassuming: "Could Erdős be wrong? Could the reasoning behind this bound be flawed?"
And then the model worked. Completely autonomously. 125 pages. Around 100,000 tokens. Cost: somewhere between $100 and $1,000.
Reality check: tomorrow I'm flying to an oil & gas company in Hannover. Zurich → Hannover one-way: $800. So the token cost of solving an 80-year-old mathematical problem is on the order of a single business trip.
🔧 The trick: not a better screwdriver — a different wrench entirely
For 40 years mathematicians attacked this with geometric tools: incidence geometry, Szemerédi-Trotter, crossing number method. Those tools hit a natural ceiling — the n^(4/3) bound.
The AI did something else. It pulled a completely different key out of the toolbox: algebraic number theory. CM fields. Complex multiplication. Infinite Galois towers.
It didn't attack the problem head-on. It reformulated it, turning a geometric problem into a number-theoretic one. And suddenly the answer became much more concrete.
🤖 The DeepMind counter-punch: AlphaProof Nexus + Lean
Then Google DeepMind dropped the receipts. Their system AlphaProof Nexus claims to have solved:
  • 9 open Erdős problems
  • 44 additional open conjectures
  • A 15-year-old problem in algebraic geometry
And here's where it gets architectural. AlphaProof Nexus combines AI reasoning with a formal verification tool called Lean. The AI doesn't just spit out an answer — it produces a step-by-step proof, and Lean mechanically verifies every single step. Every logical leap is checked. Incorrect assumptions are rejected. The final proof meets strict mathematical standards.
Cost per problem: a few hundred dollars in compute.
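What "Lean mechanically verifies every single step" looks like in practice, as a toy Lean 4 sketch. This is not output from AlphaProof Nexus and has nothing to do with the Erdős results; the theorem names are invented for illustration. The point is only that the file compiles if and only if the proof really establishes the statement.

```lean
-- A toy Lean 4 theorem, unrelated to the Erdős results.
-- Lean accepts this only because the proof term matches the statement.
theorem swap_sum (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A false claim gets rejected. Uncommenting the two lines below
-- makes Lean report an error instead of accepting the "proof".
-- theorem bogus (a : Nat) : a + 1 = a :=
--   rfl
```

That rejection is the whole value proposition: a hallucinated step doesn't slip through, it fails to compile.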
⚖️ Two religions: human-verified vs machine-verified
This is now a genuine philosophical split in the AI math community:
  • OpenAI's approach: let the LLM produce the proof, then send it to 9 of the world's top mathematicians, among them Noga Alon, Daniel Litt and Melanie Wood, to verify by hand. Slow. Authoritative.
  • DeepMind's approach: let the AI prove it AND let the machine (Lean) verify it. Fast. Reproducible. But — you have to trust Lean.
Both approaches address the hallucination problem: AI models can invent unproven statements, skip difficult parts, present incomplete proofs as finished. Human review and machine verification are two different solutions to the same fundamental risk.
🛑 The Hassabis caveat: AGI is still far
Demis Hassabis (DeepMind CEO) reminds everyone: "For an AI, this wasn't actually that hard." The problem is extremely difficult to solve, but it's bounded. AGI would require:
  • Creativity across multiple fields simultaneously
  • Independent reasoning
  • Original idea generation
Today's systems are powerful specialized tools — not minds.
But here's the catch: the most clever thing the AI did wasn't the solution. It was the cross-domain reformulation. And that's exactly where your R&D department needs to wake up.
🧬 Why your R&D needs this — silos, Da Vinci, AlphaFold
Pharma R&D is the textbook silo problem:
  • Medicinal chemists define and find targets
  • Biologists know the pathways
  • Statisticians wade through the data
They work in their silos. They don't talk on the level where breakthroughs happen.
Leonardo da Vinci could. Math + chemistry + physics + anatomy — all in one head, all connected. Today that's impossible for a human because of information overload. But an AI? An AI has exactly that cross-domain synthesis ability.
Side note: Google DeepMind already has a Nobel Prize. Demis Hassabis and John Jumper shared the 2024 Chemistry prize for AlphaFold and its protein structure predictions. Pure cross-domain AI. If pharma had taken that seriously from the start, they'd be years ahead today.
🦴 The uncomfortable truth about your senior researchers
Who are the most expensive people in any R&D department? Not the juniors. The 30-year veterans earning three-quarters of a million euros a year.
And they are the worst AI users. Because they fundamentally say: "I've done research like this for 40 years. I don't need ChatGPT."
When you hire a postdoc in 2026, "is he good in his domain?" is no longer the only question. The new questions:
  • Can he prompt a reasoning model correctly?
  • Can he ask cross-domain questions? "How would a biologist see this? How would an economist see this?"
  • Does he click "Auto" or does he deliberately choose the reasoning model, GPT-5.4 Pro?
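The cross-domain question from the list above can be turned into a reusable template. A minimal Python sketch, assuming nothing about any vendor API: it only builds the prompt text, and the function name, the wording of the template and the sample problem are all hypothetical.

```python
def cross_domain_prompts(problem, disciplines):
    """Return one reframing prompt per discipline for the same problem."""
    template = (
        "Here is an open problem from my field:\n{problem}\n\n"
        "How would a {discipline} see this? Restate the problem in the "
        "language of that field and name the tools they would try first."
    )
    return [template.format(problem=problem, discipline=d) for d in disciplines]


prompts = cross_domain_prompts(
    "Why does our lead compound lose potency after scale-up?",  # hypothetical example
    ["biologist", "economist", "mathematician"],
)
for prompt in prompts:
    print(prompt, end="\n\n")
```

Paste each prompt into the reasoning model separately, so every discipline gets its own uninterrupted line of thought.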
⚖️ The legal department will be your next blocker
Imagine: you've found something genius with ChatGPT. You want to patent it. Who stops you first? Legal.
  • Does it belong to us? Or to OpenAI?
  • Does it belong to Microsoft (if you used Copilot)?
  • Who holds the patent?
The answers aren't settled yet. Your discoveries may sit in legal review for 2 years. Plan for it.
🎯 Three Monday Actions for every R&D leader
  1. Roll out reasoning models NOW. Not "Auto" mode. Train your researchers to deliberately pick GPT-5.4 Pro or Claude Opus. Quality triples.
  2. Dramatically raise per-researcher token budgets. €100/month is 2020 thinking. Give serious researchers €100,000 per year. If an 80-year-old math problem cost around €1,000, that's 100 such attempts per researcher per year.
  3. Make cross-domain prompting a mandatory skill. Every researcher must learn to say: "How would a mathematician see this? How would a biologist? How would an economist?" That's the tool-swap question that cracked the Erdős problem.
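The budget math from action 2, as a quick Python sketch. The figures are the ones from these show notes; the function name is made up, and the €1,000 per attempt is the rough Erdős-scale cost, not a quoted price.

```python
def attempts_per_year(annual_budget_eur, cost_per_attempt_eur):
    """How many full attempts of a given cost fit into a yearly budget."""
    return annual_budget_eur // cost_per_attempt_eur


COST_PER_ATTEMPT_EUR = 1_000  # rough cost of one Erdős-scale run, per the notes

old_budget = attempts_per_year(100 * 12, COST_PER_ATTEMPT_EUR)  # €100/month
new_budget = attempts_per_year(100_000, COST_PER_ATTEMPT_EUR)   # €100,000/year

print(f"€100/month:    {old_budget} attempt per year")
print(f"€100,000/year: {new_budget} attempts per year")
```

One shot per year versus a hundred: that's the difference between a lottery ticket and a research program.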
🎬 Bottom line
Paul Erdős posed a problem. For 80 years the best mathematicians in the world couldn't break through. Today, for around €1,000, a ChatGPT reasoning model solved it — by reformulating the entire problem.
Your R&D problems are probably stuck at some disciplinary boundary right now. Don't let a single process in your company run without AI support.
⏱️ Timestamps
  • 00:00 — Cold open: Dr. Katharina Hess and the Nature headline
  • 03:00 — Who was Paul Erdős? 1,500 papers, digital nomad, prize money
  • 06:00 — The 1,000 thumbtacks problem — the 40-year n^(4/3) ceiling
  • 09:30 — How ChatGPT cracked it — 125 pages, 100k tokens, $1,000
  • 13:30 — The DeepMind counter-punch — AlphaProof Nexus + Lean (9 Erdős problems, 44 conjectures)
  • 17:00 — Human-verified vs machine-verified — two religions
  • 19:00 — The Hassabis caveat — AGI still far
  • 20:30 — Why R&D needs this — silos, Da Vinci, AlphaFold
  • 24:00 — Your senior researchers are the worst AI users
  • 26:00 — The legal department headache
🎙️ About the Host
Malcolm Werchota runs AI adoption programs for companies across Europe. After 15+ years at Novartis and Schlumberger, today's focus: AI without the bullshit. Lecturer at ESADE and HSLU. Studied at Leoben — applied physics / materials science.
📰 Sources
  • Nature — "AI cracks an 80-year-old mathematical challenge" (May 2026)
  • OpenAI — Reasoning model GPT-5.4 Pro
  • Google DeepMind — AlphaProof Nexus + Lean verification announcement
  • Noga Alon (Princeton), Daniel Litt (Toronto), Melanie Wood (Harvard) — verification paper
  • Demis Hassabis / Google DeepMind — AGI commentary
  • Spencer, Szemerédi & Trotter (1984): original Planar Unit Distance upper bound
Tags: #ErdosProblem #ChatGPT #ReasoningModels #AlphaProofNexus #DeepMind #LeanProver #Pharma #ResearchAndDevelopment #CrossDomainAI #AlgebraicNumberTheory #FieldsMedal #DemisHassabis #AGI #werchota #TheAICookbookShow

What is The AI Cookbook Show by Malcolm Werchota?

Malcolm Werchota's AI Cookbook Show is where artificial intelligence meets authentic business transformation. Known for his direct style and willingness to show AI in action—even during live presentations—Malcolm helps organizations understand that AI isn't about replacing humans but amplifying their capabilities. From voice-note productivity hacks to real-time meeting intelligence, this podcast delivers actionable insights for immediate implementation.