LLMs and Napkin Problems
by Pradipta Mitra
On May 20th, Tim Gowers advised fellow mathematicians to sit down before reading the tweet that was to follow. In it, he declared that AI had solved Erdős’s unit-distance problem, a celebrated problem in discrete geometry first posed by Paul Erdős.
The OpenAI announcement, the proof write-up, the companion remarks, and even a rewritten chain of thought are available online.
Encouraged by this, I tried GPT-5.5 with extra-high reasoning on a couple of open problems from my own old papers on network algorithms. I had tried this sort of thing before, with no real success, but after the Erdős result it seemed worth another attempt. The model could not solve them.
For context, here is a GPT-rephrasing of the two questions I threw at it (I did write up precise conjectures for the model prompt):
GPT-rephrased, not a formal theorem statement.
- In Towards Tight Bounds for Local Broadcasting, can we really pin down the right complexity of local broadcast in the physical SINR model, especially whether a stubborn
log^2 nterm is inherent rather than an artifact of the algorithm or proof?- In Wireless Connectivity and Capacity, is the paper’s
O(log n)algorithm for strong connectivity in the SINR model actually tight, or is there a faster topology/scheduling construction hiding somewhere?
This failure is interesting because although OpenAI apparently used an internal model for the Erdős result, it is hard to believe that, even after accounting for that, my problems are anywhere near as deep or difficult as the unit-distance problem. So what explains the difference?
Prompt engineering skills are unlikely to be the difference, since the Erdős prompt is available and is as clear as day – it is hardly more complex than “Solve this problem please”. In any case, I asked my GPT to look at their prompt and rewrite the prompt to match its style and content.
Ok, so what is it then?
I asked ChatGPT. In cringey LLM-ese, it suggested that the Erdős problem was “culturally saturated” while mine wasn’t. While looking down my nose at this phrase, I rather dejectedly realized that this may simply be a euphemism for “Noga Alon cares about the Unit Distance problem and doesn’t give two hoots about yours”. As far as I am concerned, ChatGPT is not sycophantic enough.
Jilted ego aside, there’s something in what is said – as ChatGPT expounded – “Model difficulty is about whether the right idea lies in a high-probability path through the model’s learned space of arguments.”
This seems plausible. The deep jump to algebraic number theory that has impressed many mathematicians may not have been an alien jump in the training distribution. Examples, analogies, and partial results may already sit near one another in the model’s learned space. And my problem being “shallower” may precisely be the issue – the eventual solution might involve basic mathematics: a clever observation, a careful construction, some high-school calculus, and college-level discrete math. But that also means there may not be a long trail of deep analogies and field-crossing proofs for the model to exploit.
Perhaps this is of consolation to those who are concerned about the imminent death of that great intellectual endeavor?
Those mathematicians who have sat down, perhaps would like stand up ala Slim Shady, and help solve some of our more humble problems.
tags: