I recently became obsessed with a fairly unique integer sequence that I introduced and explained in an earlier post. The sequence is 2, 3, 4, 82000 and, to my knowledge, no one has found the next number in the sequence. It’s quite possible that there does not exist a larger number!
I’m presenting some code I wrote to try and find the next number of the sequence in a semi-intelligent brute-force manner. For anyone interested, the code can be found here, and utilizes the GNU Multiple Precision Arithmetic Library. Unfortunately, the library was written in straight C. I wrote a wrapper for the integer part of the library called BigInt, which utilizes C++11 operations and resource management, so that I didn’t lose my mind. BigInt does not wrap all the possible integer functions, just many normal arithmetic operations and some other useful functions. It can be found in the github repository.
Enough chit-chat, let’s talk about the code, eh? The core algorithm for finding the nth integer of the sequence is to take a guess and check if it satisfies all the bases from n down to 3. Note that checking base 2 is unnecessary, since integers can only be constructed from 0s and 1s in base 2 by definition. (Also note a definitional subtlety of the sequence: the first number in the sequence is 2, but this corresponds to the n=2 number by the sequence definition. From here on out, I will call 2 the 2nd number of the sequence) As soon as we find a base for which the integer representation of our guess contains a digit which is not 0 or 1, we do not need to check other bases and can generate our next guess.
Let’s take a look at an example, trying to verify if the number 10 is the 5th number in our sequence:
|Base 10||Base 5||Base 4||Base 3|
I have colored in green the representations that only use 0s and 1s, and red (and underlined) those that are not only 0s and 1s. In base 5, we start with the most significant digit, and see that it’s a 2. Immediately we can stop our search and generate the next number in base 5 that satisfies our criterion. That number, in its base 5 representation is 100, or the decimal number 25.
With this method, we do not check every number, but skip a number roughly proportional to the size of the number (This will be explained in the next post). What happens if base 5 passes? We move on to base 4 and do the same procedure, checking if it satisfies, or else generating the next number in base 4 that does satisfy the criterion. If we move through base 3, then we’re done! It’s worth noting that, if we used base 4 to generate the next guess, we do not need to check base 4 for that guess. This optimization saves a considerable amount of time.
Let’s look at all the guesses that my code visited before finding the 5th number in the sequence, 82000:
|Base 10||Base 5||Base 4||Base 3|
Starting with the largest base, we immediately abort the guess as soon as a base fails the criterion (despite the fact that all the bases are colored in the above table). When checking the guess 25, we know that base 5 is satisfied, so we skip that check and move onto base 4, which fails. This causes us to guess the next larger number that satisfies the base 4 test, being 1000 in base 4, or 64. Continuing the process, we quickly converge to the solution 82000 and stop.
So how about the next number in the sequence? I haven’t found it yet, and my code has been running for a week on my desktop. I’m currently checking numbers that have almost 14 million digits (as of September 7th, 2015), whereas the previous number 82000 only had 5 digits!
Before this post gets much longer, I’ll end. My next post will highlight some pieces of the code, show some diagnostic plots, and give an update on my progress.