tl;dr: Bruss and Paindaveine did all of this already in December 2025, and better.
The initial title of this post was “If This Does Not Blow Your Mind, Nothing Will”, but we didn’t want to be accused of producing clickbait. Still, if this does not blow your mind, nothing will. Hang on to your hats: there are several twists and turns to this story.
First of all, we recently became interested in a simple coin tossing scenario. Toss a fair coin until the number of heads first exceeds the number of tails. At that stopping time, record the proportion of heads. The expected value of that proportion is exactly $\pi/4$ (!). This is already a mind-blowing result, as $\pi$ makes a sudden appearance with no circle in sight. This result was proved in a recent preprint by Jim Propp. In an accompanying blog post, Propp provides more detail and describes how he obtained the result from ChatGPT. The ChatGPT derivation wasn’t optimal, some of the references did not exist, and it also turned out that the key result had been derived before. This is all honestly noted in Propp’s post and the revision of his preprint.
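The scenario above is easy to try out numerically. What follows is a minimal sketch, not the code used by Propp or by us: the function name, its parameters, and the toss cap are all assumptions made for illustration. The cap is needed because the expected number of tosses until heads first leads is infinite for a fair coin. A run that is still going after `max_tosses` tosses is scored as $1/2$, which is off by at most `lead / (2 * max_tosses)` for that run, since a walk that eventually stops after $T$ tosses has proportion $1/2 + \text{lead}/(2T)$.

```python
import math
import random


def mean_heads_proportion(trials=20_000, lead=1, max_tosses=2_000, seed=0):
    """Estimate E[heads / tosses] when tossing stops once heads lead by `lead`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        heads = tails = 0
        # Toss until heads lead by `lead`, or until the (assumed) cap is hit.
        while heads - tails < lead and heads + tails < max_tosses:
            if rng.random() < 0.5:
                heads += 1
            else:
                tails += 1
        if heads - tails == lead:
            total += heads / (heads + tails)
        else:
            # Capped run: scored as 1/2 by assumption (see the text above).
            total += 0.5
    return total / trials


if __name__ == "__main__":
    estimate = mean_heads_proportion()
    print(f"simulated: {estimate:.4f}   pi/4: {math.pi / 4:.4f}")
```

With the default settings the Monte Carlo standard error is roughly $0.002$, so agreement with $\pi/4 \approx 0.7854$ to two decimals is about what one should expect.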
Secondly, the revision of the Propp preprint describes how a referee had asked a natural follow-up question: what if we stop not when heads first leads by $1$, but when it first leads by a general surplus $m$? Propp responds:
“It turns out that if you toss a coin until the number of heads first exceeds the number of tails by +2 rather than +1, then the expected proportion of heads is ln 2 rather than $\pi/4$. Replacing +2 by a generic positive integer a appears to give rise to expressions involving $\pi$ when a is odd and ln 2 when a is even, but we have not checked ChatGPT’s derivations and therefore regard its claims and formulas as conjectural. We leave further investigation to others.”
What!? This result blew our minds even more. To attempt a proof of this fascinating conjecture, we decided to let Claude Opus 4.6 have a go, with ChatGPT-5.4 Pro occasionally playing the role of a critic. Through a series of simple prompts, Claude handled the task in minutes. Claude also produced a table summarizing the results, with a supporting simulation to boot:
| $m$ | Exact value | Decimal | Simulation |
|---|---|---|---|
| $1$ | $\pi/4$ | 0.785398 | 0.7858 |
| $2$ | $\ln 2$ | 0.693147 | 0.6939 |
| $3$ | $3-\tfrac{3}{4}\pi$ | 0.643806 | 0.6442 |
| $4$ | $2-2\ln 2$ | 0.613706 | 0.6144 |
| $5$ | $-\tfrac{10}{3}+\tfrac{5}{4}\pi$ | 0.593657 | 0.5946 |
| $6$ | $-\tfrac{3}{2}+3\ln 2$ | 0.579442 | 0.5817 |
| $7$ | $\tfrac{91}{15}-\tfrac{7}{4}\pi$ | 0.568880 | 0.5696 |
| $8$ | $\tfrac{10}{3}-4\ln 2$ | 0.560745 | 0.5617 |
Exact values, decimal approximations, and Monte Carlo simulation results for the expected proportion of heads if tossing continues until the surplus of heads over tails first equals $m$.
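The closed forms in the table can be checked numerically without any simulation. The sketch below is an illustration, not the derivation from the technical appendix. It assumes the standard hitting-time theorem for the simple symmetric random walk, $P(T = n) = \frac{m}{n}\binom{n}{(n+m)/2}2^{-n}$, which is not stated in this post. Because the proportion of heads at the stopping time $T$ is $(T+m)/(2T)$, the expectation equals $\tfrac12 + \tfrac{m}{2}\,E[1/T]$, and the code sums that series. The function name and the `terms` parameter are hypothetical.

```python
import math


def exact_expected_proportion(m, terms=200_000):
    """Partial sum of 1/2 + (m/2) * sum_n P(T = n) / n for surplus m."""
    # k counts tails, so the walk stops at toss n = m + 2k with m + k heads.
    binom_prob = 0.5 ** m        # C(n, k) / 2**n at k = 0
    inv_time = 0.0               # running partial sum for E[1/T]
    for k in range(terms):
        n = m + 2 * k
        # Hitting-time theorem (assumed): P(T = n) = (m / n) * C(n, k) / 2**n.
        inv_time += (m / n) * binom_prob / n
        # Advance C(n, k) / 2**n to C(n + 2, k + 1) / 2**(n + 2).
        binom_prob *= (n + 2) * (n + 1) / (4 * (k + 1) * (m + k + 1))
    return 0.5 + 0.5 * m * inv_time


# Closed forms copied from the table above.
closed_forms = {
    1: math.pi / 4,
    2: math.log(2),
    3: 3 - 3 * math.pi / 4,
    4: 2 - 2 * math.log(2),
    5: -10 / 3 + 5 * math.pi / 4,
    6: -3 / 2 + 3 * math.log(2),
    7: 91 / 15 - 7 * math.pi / 4,
    8: 10 / 3 - 4 * math.log(2),
}

for m, closed in closed_forms.items():
    series = exact_expected_proportion(m)
    print(f"m={m}: series {series:.6f}   closed form {closed:.6f}")
```

Writing the expectation in terms of $E[1/T]$ rather than summing the proportions directly is a deliberate choice: the terms then decay like $n^{-5/2}$ instead of $n^{-3/2}$, so a few hundred thousand terms already give many correct digits.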
We will not elaborate on the mathematical underpinning of this result, and why it is that the $\pi/4$ term and the $\ln 2$ term alternate with $m$. All of these “details” can be found in a technical appendix. This appendix is almost entirely a joint product of Claude and ChatGPT. We did steer the process and prompted the LLMs to explain and explore certain avenues, but on the whole our impact was relatively minimal. Creating the document and checking its contents took about two days from start to finish. To begin with, we fed Claude the Propp paper and blog post; when Claude was happy with a particular derivation, we gave it to ChatGPT for feedback.
There is one final, crucial twist. After the derivation was in hand, we asked both Claude and ChatGPT whether their proof of Propp’s conjectures was new. Claude could not find any relevant prior work, but after some searching ChatGPT pointed us to a December 2025 preprint by Bruss and Paindaveine. These authors had already obtained the formulas in a more general biased-random-walk setting, and they also derived the equations for the variance. What is more, the route that Bruss and Paindaveine take to get to the result is highly similar to the one proposed by ChatGPT, suggesting that the LLMs may not have derived the result independently. When asked, Claude Opus 4.6 indicates that it was trained on data up to May 2025, whereas ChatGPT Pro says it was trained on data up to August 2025. Both cutoffs predate the December 2025 preprint, so we cautiously suggest that the LLMs did derive the result without access to it. Regardless, we find it fascinating that a simple coin tossing scenario can spawn an alternating dance between $\pi$ and $\ln 2$ the way it does. Mind = blown!
References
Bruss, F. T. and Paindaveine, D. (2025). Win rates at first-passage times for biased simple random walks. Available at https://arxiv.org/abs/2512.21254. December 26, 2025.
Propp, J. (2026). Estimating $\pi$ with a coin. Available at https://arxiv.org/abs/2602.14487. November 24, 2025; revised March 10, 2026.
Propp, J. (2026). In praise of stupid questions. Blog post available at https://mathenchant.wordpress.com/2026/03/12/in-praise-of-stupid-questions/.
Eric-Jan Wagenmakers and Lourens Waldorp
Psychological Methods Group, University of Amsterdam.



