82000: Episode III — Revenge of the Nerd

This is the third post in a series about the sequence 2, 3, 4, 82000, …

Post 1 — what the sequence is
Post 2 — the search algorithm
Code on GitHub

How the search scales

The candidates I’m checking are now around 14 million digits long, so I’m using the GNU Multiple Precision Arithmetic Library and running 20 threads via OpenMP.

One diagnostic I track is the number of decimal digits in the current candidate versus wall-clock time. Fitting a curve to this data gives:

$$n(t) \propto \sqrt{t}$$

Taking the derivative with respect to time and converting to a function of $n$:

$$\frac{dn}{dt}(n) \propto \frac{1}{n}$$

The search rate is inversely proportional to the digit count — the problem scales linearly in digits. Two factors explain this:

Checking a single candidate is a linear operation in digit count.
The skip size is proportional to the size of the candidate.

The second point is the key insight. In base 6, each digit independently has a probability $P = 2/3$ of being a “bad” digit (2, 3, 4, or 5). The expected number of digits we check before finding a bad one is:

$$\langle n \rangle = \sum_{i=1}^{\infty} i \cdot P^i = 6$$

So even for huge numbers, we only expect to inspect ~6 digits before triggering a skip. The skip jumps us forward by roughly $6^{n-6}$ — a number proportional to the candidate itself. This is why adding more digits doesn’t make the search geometrically harder.

Threading

The threading strategy:

Maintain a shared counter $n_\text{current}$ (current digit count).
Each thread grabs a range of n_check_range digit-counts, increments the counter, and checks all candidates in that range.
If a thread generates a candidate outside its range, it ignores it and grabs a new range.

I set n_check_range to 1000 binary bits (~300 decimal digits). The number of candidate guesses within any 300-digit window stays roughly constant as the search progresses — consistent with the linear scaling analysis. This means threads spend the right amount of time per range and aren’t thrashing the shared counter.

How I check individual digits

Rather than converting the number to a different base, I use a mask technique that stays in base-10 throughout.

Example: is 48 (= 120 in base-6) valid?

Start mask at $6^2 = 36$ (the leading digit position in base-6).
Compare mask to guess (36 ≤ 48): the leading digit is ≥ 1. Subtract: guess = 48 − 36 = 12, mask still 36.
Compare again (36 > 12): the leading digit was exactly 1. ✓
Divide mask by 6: mask = 6. Compare to guess (6 ≤ 12): digit is ≥ 1. Subtract: guess = 12 − 6 = 6, mask still 6.
Compare again (6 ≤ 6): digit is ≥ 1. Subtract: guess = 0. Compare (6 > 0): digit was exactly 1… wait, that gives 120 which has a 2.

Let me redo with 48 = 120 in base-6 properly: digits are 1, 2, 0. At step 4, after the first digit check succeeds, we do the second digit: mask = 6, guess = 12. Subtract once: guess = 6. Mask ≤ guess still: subtract again: guess = 0. Mask is now equal to original guess twice — meaning the digit was 2. Fail immediately. Skip to next candidate.

This mask approach uses only fast integer division and subtraction on GMP integers — no base conversion required.

Status

By the time I finished writing this, the search had reached ~15 million digits. I’ll post again if I find something.