I recently became obsessed with a fairly unique integer sequence that I
introduced and explained in an earlier post.
The sequence is **2, 3, 4, 82000** and, to my knowledge, no one has found
the next number in the sequence.
It’s quite possible that there does not exist a larger number!

I’m presenting some code I wrote to try and find the next number of the
sequence in a semi-intelligent brute-force manner.
For anyone interested, the code can be found
here, and utilizes the
GNU Multiple Precision Arithmetic Library.
Unfortunately, the library was written in straight C.
I wrote a wrapper for the integer part of the library called **BigInt**,
which utilizes C++11 operations and resource management, so that I didn’t
lose my mind.
**BigInt** does not wrap all the possible integer functions, just many
normal arithmetic operations and some other useful functions.
It can be found in the github repository.

Enough chit-chat, let’s talk about the code, eh?
The core algorithm for finding the nth integer of the sequence is to
take a guess and check if it satisfies all the bases from **n** down to **3**.
Note that checking base **2** is unnecessary, since integers can only be
constructed from 0s and 1s in base **2** by definition.
(Also note a definitional subtlety of the sequence:
the first number in the sequence is
2, but this corresponds to the n=2 number by the
sequence definition.
From here on out, I will call 2 the 2nd number of the sequence)
As soon as we find a base for which the integer representation of our guess
contains a digit which is not 0 or 1, we do not need to check other bases and
can generate our next guess.

Let’s take a look at an example, trying to verify if the number 10 is the 5th number in our sequence:

Base 10 | Base 5 | Base 4 | Base 3 |
---|---|---|---|

10 | 20 | 22 | 101 |

I have colored in green the representations that only use 0s and 1s, and red
(and underlined)
those that are not only 0s and 1s.
In base **5**, we start with the most significant digit, and see that it’s
a 2.
Immediately we can stop our search and generate the next number in base **5**
that satisfies our criterion.
That number, in its base **5** representation is 100, or the decimal number 25.

With this method, we do not check every number, but skip a number roughly
proportional to the size of the number (This will be explained in the next
post).
What happens if base **5** passes?
We move on to base **4** and do the same procedure, checking if it satisfies,
or else generating the next number in base **4** that does satisfy the
criterion.
If we move through base **3**, then we’re done!
It’s worth noting that, if we used base **4** to generate the next guess,
we do not need to check base **4** for that guess.
This optimization saves a considerable amount of time.

Let’s look at all the guesses that my code visited before finding the 5th number in the sequence, 82000:

Base 10 | Base 5 | Base 4 | Base 3 |
---|---|---|---|

10 | 20 | 22 | 101 |

25 | 100 | 121 | 221 |

64 | 224 | 1000 | 2101 |

125 | 1000 | 1331 | 11122 |

256 | 2011 | 10000 | 100111 |

625 | 10000 | 21301 | 212011 |

1024 | 13044 | 100000 | 1101221 |

3125 | 100000 | 300311 | 11021202 |

4096 | 112341 | 1000000 | 12121201 |

15625 | 1000000 | 3310021 | 210102201 |

16384 | 1011014 | 10000000 | 211110211 |

16400 | 1011100 | 10000100 | 211111102 |

19683 | 1112213 | 10303203 | 1000000000 |

78125 | 10000000 | 103010231 | 10222011112 |

81920 | 10110140 | 110000000 | 11011101002 |

82000 | 10111000 | 110001100 | 11011111001 |

Starting with the largest base, we immediately abort the guess as soon as
a base fails the criterion (despite the fact that all the bases are colored
in the above table).
When checking the guess 25, we know that base **5** is satisfied, so we
skip that check and move onto base **4**, which fails.
This causes us to guess the next larger number that satisfies the base **4**
test, being 1000 in base **4**, or 64.
Continuing the process, we quickly converge to the solution 82000 and stop.

So how about the next number in the sequence? I haven’t found it yet, and my code has been running for a week on my desktop. I’m currently checking numbers that have almost 14 million digits (as of September 7th, 2015), whereas the previous number 82000 only had 5 digits!

Before this post gets much longer, I’ll end. My next post will highlight some pieces of the code, show some diagnostic plots, and give an update on my progress.