Cache And Carry

Calculating sequences like the Fibonacci numbers sounds straightforward enough. If we call the nth Fibonacci number F(n), then they’re defined as F(0)=F(1)=1, and F(n)=F(n-1)+F(n-2) for n greater than 1. So F(2)=F(1)+F(0)=2, F(3)=F(2)+F(1)=3, and so on.

The sequence is recursively defined, but if you code it up that way, you run into a problem. Naïve use of the recursive Fibonacci algorithm is horrendously inefficient. Say you’re computing F(5). Well, that’s F(4)+F(3). Easy. Except you need F(4) and F(3). F(4) is F(3)+F(2) and F(3) is F(2)+F(1). With each level of recursion, the number of calls to the function roughly doubles.
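
Sketched in Python, purely for illustration, the naïve version looks something like this:

    def fib(n):
        """Naive recursive Fibonacci, using the F(0) = F(1) = 1 convention above."""
        if n < 2:
            return 1
        # Each call spawns two more recursive calls, so the total number of
        # calls blows up exponentially with n.
        return fib(n - 1) + fib(n - 2)

    print(fib(5))   # 8
    print(fib(35))  # already noticeably slow; fib(40) and up will leave you waiting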

This is O(2^n) growth, which means the computation becomes prohibitively slow at quite modest values of n, even if you have an amazingly fast computer. Recursive Fibonaccis are manageable for F(5), hairy for F(10), and F(40) and above will slow down even a modern Core i9 workstation to a crawl.

The solution is to cache your results. Whenever you calculate a Fibonacci number, store its result in memory. Now, when calculating F(5), you already have F(4) and F(3) in memory, and merely have to perform an addition to get F(5), instead of the whole pyramid of recursive lookups. Recursive O(2^n) calculations like this are simply not feasible if done in the usual straightforward way: calculating one term of the sequence could take longer than the projected lifetime of the universe. With caching, each term is simply two array lookups and an addition, and happens in almost no time at all.
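
A minimal cached version, again in Python for illustration, with a plain dictionary standing in for the cache:

    cache = {0: 1, 1: 1}   # seed with F(0) = F(1) = 1

    def fib(n):
        """Fibonacci with an explicit cache: each term is computed once, then stored."""
        if n not in cache:
            # The two sub-results are (or soon will be) in the cache, so each
            # new term costs two lookups and an addition.
            cache[n] = fib(n - 1) + fib(n - 2)
        return cache[n]

    print(fib(90))   # instant; the naive version would effectively never finish

(functools.lru_cache would do the same job as a decorator, but the dictionary keeps the mechanism visible.)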

The drawback is that these arrays can get very large. Exploring the Hofstadter Q sequence, I read that Q(n) has been proven to exist for every n up to 10^10. Ten billion is a fairly low number as these things go, but every term in the sequence has to be stored, and each one takes eight bytes as a 64-bit unsigned integer (32-bit numbers aren’t large enough), so that’s still some 80GB of memory.
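
In other words: 10^10 terms × 8 bytes per term = 8×10^10 bytes, or roughly 80GB, and doubling the range to 2×10^10 terms doubles that to roughly 160GB before any overhead.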

The Hofstadter Q sequence is more convoluted than the Fibonaccis. It is defined as Q(1)=Q(2)=1, and Q(n)=Q(n-Q(n-1))+Q(n-Q(n-2)) for n greater than 2. So the preceding two terms tell the algorithm how far back in the sequence to go to find the two terms to add. With Fibonaccis, it’s enough to keep the previous two terms in memory. Not so for the Hofstadter Q sequence: it’s unclear as yet just how far back you would need to go, so you may need the whole sequence.
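
One way to compute the sequence with a full cache, sketched in Python (the array module’s 'Q' typecode holds 64-bit unsigned integers; at this toy scale any container would do):

    from array import array

    def hofstadter_q(n_max):
        """Compute Q(1)..Q(n_max), keeping every term as a 64-bit unsigned integer."""
        q = array('Q', [0, 1, 1])              # index 0 unused; Q(1) = Q(2) = 1
        for n in range(3, n_max + 1):
            # The two preceding terms say how far back to reach for the addends.
            q.append(q[n - q[n - 1]] + q[n - q[n - 2]])
        return q

    print(list(hofstadter_q(10))[1:])   # [1, 1, 2, 3, 3, 4, 5, 5, 6, 6]

Filling the table bottom-up like this makes each new term two lookups and an add, but nothing is ever evicted, which is exactly where the memory goes.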

It’s apparently not known whether Q(n) is even defined for all n (although it most likely is). According to the sequence’s page at the Online Encyclopedia of Integer Sequences (OEIS), Roman Pearce had computed that Q(n) exists for n <= 10^10 as of August 2014.
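
“Exists” here just means the back-references n-Q(n-1) and n-Q(n-2) never reach before the start of the sequence, so one way to check it, again as a rough Python sketch, is the same bottom-up loop with a guard on those indices:

    from array import array

    def q_defined_up_to(n_max):
        """Return True if Q(n) is defined for all n <= n_max, i.e. the
        back-references never fall below index 1."""
        q = array('Q', [0, 1, 1])              # index 0 unused; Q(1) = Q(2) = 1
        for n in range(3, n_max + 1):
            i, j = n - q[n - 1], n - q[n - 2]
            if i < 1 or j < 1:
                return False                   # the sequence would die at this n
            q.append(q[i] + q[j])
        return True

    print(q_defined_up_to(1_000_000))   # True, at a cost of roughly 8MB of cache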

I was able to push this result out to 2×10^10, and confirm that Q(n) is defined at least that far. It may be possible to go somewhat farther, but the cache is getting unwieldy for a workstation with “only” 64GB of physical memory: the commit size is over 200GB.

I’m reminded of Seymour Cray’s (NSFW but 100% true) quote about memory.
