[Cuis-dev] Really fast Fibonacci function

Andres Valloud ten at smallinteger.com
Fri Dec 6 22:11:00 PST 2019


I played with these things quite a bit over the years.  The identity 
F(a+b) = F(a+1) F(b) + F(a) F(b-1) can be used to speed up calculations. 
  In particular, the idea is to split the index n = a+b so that a >= b 
and a = 2^k.  Then you can cache F(2^k + 1), F(2^k), and F(2^k - 1) to 
speed things up (once you have two in any of these triplets, the other 
one is essentially free).  Since these indices are known in advance, no 
dictionary is necessary.  Note this approach requires very little 
storage space, as opposed to the more naive way of breaking n into e.g. 
(n / 2) ceiling and (n / 2) floor.

I saw this about 20 years ago and implemented it in Squeak 1.x or 2.x. 
Another bit that came from that work: look at F(5^n) mod 100 and 1000. 
Fascinating, no?

Calculating Fn using that recursion plus caching was a bit faster than 
Dijkstra's recursion (which, by virtue of its shape, is not as cache 
friendly).  However, when starting from scratch, Dijsktra's recursion 
was a bit faster.  I did some of this around 2008.

A very important implementation strategy for this kind of work is to 
implement Karatsuba's multiplication.  I did that in the HPS world, and 
it paid off once you got past 384 bit x 384 bit multiplications or so. 
Very quickly, Karatsuba's multiplication complexity of about n^1.6 
soundly beats the usual elementary n^2 multiplication algorithm, even if 
Karatsuba is implemented in the image while the n^2 algorithm is a VM 
primitive (that, in the HPS case, I also (re)wrote).  I have no idea 
what the breakeven point is in the Cog world.

By the way, the simple grade school multiplication algorithm is just 
fine as a default VM implementation because the usual multiplication 
case involves relatively small large integers, and in that case the 
simple algorithm wins because of its simplicity.

As a suggestion, you can reimplement LargeInteger>>* to do a size test 
then delegate to the usual primitives if the sizes are small enough, 
otherwise you do Karatsuba to get smaller integers then try again. 
However, I think it's cleaner to implement something like 
Integer>>karatsubaTimes: (refining it in the subclasses appropriately), 
then you can send that new message from the places you know matter.

A VM implementation of Karatsuba's algorithm comes with subtle problems 
because it needs to allocate multiple intermediate objects.  This means 
such multiplications will tax the memory manager.  It's possible that 
allocations will fail at any point, and having to stop the calculation 
in the middle will induce stability problems (e.g. Karatsuba can almost 
fill memory completely and then fail --- now the image doesn't have 
enough space to even react to the memory emergency, and your system 
crashes).  It's not that trivial to make a robust VM implementation.  In 
contrast, the grade school multiplication has optimal memory allocation 
behavior, so if it fails due to an out of memory condition then you know 
why and what to do.  Moreover, it's not that hard to implement Karatsuba 
in the image, and again the big performance gain comes from n^1.6 vs 
n^2.  Soon enough even the constants won't matter.

On 12/6/19 20:26, Agustín Sansone via Cuis-dev wrote:
> Hi all,
> 
> I was exploring the /mathematical functions/ category of the class 
> /Integer/ and a I didn't find the very well-known Fibonacci function 
> ("/f(n)=f(n-1)+f(n-2)"/) , so I decided to implement it. I think this 
> will be interesting due to the complexity I found (wich is O(log(n))!!).
> 
>   * A simple way to implement it is by writing the direct recursive
>     mathematical recurrence relation. This would be exponential.
>   * An iterative solution wouldn't do all this repeated work, therefore
>     it would be lineal.
>   * I used a dynamic-programming-divide-and-conquer-aproach wich
>     computes the n-th Fibonacci number in O(log(n)). I think from this
>     comment you can see why it works:
>     https://math.stackexchange.com/questions/1834242/derivation-of-fibonacci-logn-time-sequence.
> 
> Here are some running times:
> exponential --> [1 to: 40 do: [:elem | elem fibonacciNumber]] timeToRun 
>    38536
> lineal -->           [1 to: 10000 do: [:elem | elem fibonacciNumber]] 
> timeToRun.   25558
> logarithmic -->  [1 to: 10000 do: [:elem | elem fibonacciNumber]] 
> timeToRun.   1471
> 
> I leave you all three implementations, let me know what you think,
> Agustín
> 
> 
> 
> 


More information about the Cuis-dev mailing list