This is based on the previous dojo-20150131-memoization notebook.


In [1]:
from __future__ import print_function

We start with a function that calculates Fibonacci numbers. Its code closely mirrors the mathematical definition of the Fibonacci numbers.


In [2]:
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

It is correct. Check the output in the following cell.


In [3]:
[fib(i) for i in range(10)]


Out[3]:
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

However, it takes an exponential amount of time to calculate Fibonacci numbers. That is not so bad for small values, but exponential running time becomes painful quickly.


In [4]:
n = 33

In [5]:
%timeit fib(n)
fib(n)


1 loops, best of 3: 3.92 s per loop
Out[5]:
3524578

The problem is that the naïve function recalculates the same Fibonacci numbers over and over. The next version adds some code to keep track of how many times the function is called for each input value.


In [6]:
from collections import Counter

In [7]:
def fib(n):
    global c
    
    c[n] += 1
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

In [8]:
c = Counter()
fib(n)


Out[8]:
3524578

In [9]:
for i in sorted(c, reverse=True):
    print(i, c[i])
print()
print(sum(c.values()))


33 1
32 1
31 2
30 3
29 5
28 8
27 13
26 21
25 34
24 55
23 89
22 144
21 233
20 377
19 610
18 987
17 1597
16 2584
15 4181
14 6765
13 10946
12 17711
11 28657
10 46368
9 75025
8 121393
7 196418
6 317811
5 514229
4 832040
3 1346269
2 2178309
1 3524578
0 2178309

11405773

No wonder it is so slow. By the way, do you see a pattern in the numbers?
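
A hedged aside (my own, not from the original notebook): the total of 11405773 calls can be predicted without running fib() at all. If C(n) is the number of calls that the naïve fib(n) makes, then C(0) = C(1) = 1 and C(n) = C(n-1) + C(n-2) + 1, which grows roughly like 1.618**n.

def calls(n):
    # Total number of calls the naive fib(n) makes, including the top-level call.
    if n < 2:
        return 1
    return calls(n-1) + calls(n-2) + 1

calls(33)    # 11405773, matching the sum of the counts above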

Here's another way of sorting and iterating through the counts.


In [10]:
for item in sorted(c.items(), key=(lambda x: x[0]), reverse=True):
    print(item)
print()
print(sum(c.values()))


(33, 1)
(32, 1)
(31, 2)
(30, 3)
(29, 5)
(28, 8)
(27, 13)
(26, 21)
(25, 34)
(24, 55)
(23, 89)
(22, 144)
(21, 233)
(20, 377)
(19, 610)
(18, 987)
(17, 1597)
(16, 2584)
(15, 4181)
(14, 6765)
(13, 10946)
(12, 17711)
(11, 28657)
(10, 46368)
(9, 75025)
(8, 121393)
(7, 196418)
(6, 317811)
(5, 514229)
(4, 832040)
(3, 1346269)
(2, 2178309)
(1, 3524578)
(0, 2178309)

11405773

The above output has parentheses because each item is a tuple. We can avoid the parentheses by explicitly indexing the elements of the tuple, as in the next cell.


In [11]:
for item in sorted(c.items(), key=(lambda x: x[0]), reverse=True):
    print(item[0], item[1])
print()
print(sum(c.values()))


33 1
32 1
31 2
30 3
29 5
28 8
27 13
26 21
25 34
24 55
23 89
22 144
21 233
20 377
19 610
18 987
17 1597
16 2584
15 4181
14 6765
13 10946
12 17711
11 28657
10 46368
9 75025
8 121393
7 196418
6 317811
5 514229
4 832040
3 1346269
2 2178309
1 3524578
0 2178309

11405773

Another way to avoid the parentheses, without having to explicitly index each element of the tuple, is to use a leading '*' to unpack the tuple into separate arguments to the print() function. For example, print(*(33, 1)) is equivalent to print(33, 1).


In [12]:
for item in sorted(c.items(), key=(lambda x: x[0]), reverse=True):
    print(*item)
print()
print(sum(c.values()))


33 1
32 1
31 2
30 3
29 5
28 8
27 13
26 21
25 34
24 55
23 89
22 144
21 233
20 377
19 610
18 987
17 1597
16 2584
15 4181
14 6765
13 10946
12 17711
11 28657
10 46368
9 75025
8 121393
7 196418
6 317811
5 514229
4 832040
3 1346269
2 2178309
1 3524578
0 2178309

11405773

We can make the function much faster by calculating each Fibonacci number only once. When a previously calculated value is needed again, we simply return it from a cache.


In [13]:
def fib(n):
    global c
    global cache
    
    c[n] += 1
    if n in cache:
        return cache[n]
    if n == 0:
        f = 0
    elif n == 1:
        f = 1
    else:
        f = fib(n-1) + fib(n-2)
    cache[n] = f
    return f

In [14]:
# just to make them global
c = None
cache = None

In [15]:
def foo(n):
    global c
    global cache
    
    c = Counter()
    cache = {}
    fib(n)
%timeit foo(n)


10000 loops, best of 3: 94.8 us per loop

That is over 40,000 times faster than the naïve version. Let's look at the details.


In [16]:
c = Counter()
cache = {}
fib(n)
for i in sorted(c.keys(), reverse=True):
    print(i, cache[i], c[i])
print()
print(sum(c.values()))


33 3524578 1
32 2178309 1
31 1346269 2
30 832040 2
29 514229 2
28 317811 2
27 196418 2
26 121393 2
25 75025 2
24 46368 2
23 28657 2
22 17711 2
21 10946 2
20 6765 2
19 4181 2
18 2584 2
17 1597 2
16 987 2
15 610 2
14 377 2
13 233 2
12 144 2
11 89 2
10 55 2
9 34 2
8 21 2
7 13 2
6 8 2
5 5 2
4 3 2
3 2 2
2 1 2
1 1 2
0 0 1

65

Now let's clean up the code. First, let's remove the counting code, since we now know what it tells us.


In [17]:
def fib(n):
    global cache
    
    if n in cache:
        return cache[n]
    if n == 0:
        f = 0
    elif n == 1:
        f = 1
    else:
        f = fib(n-1) + fib(n-2)
    cache[n] = f
    return f

In [18]:
def foo(n):
    global cache
    
    cache = {}
    fib(n)
%timeit foo(n)


10000 loops, best of 3: 35.9 us per loop

Taking out the diagnostic counting code made it faster still. The caching code makes the fib() function ugly, and I don't like needing to remember to initialize the cache, so let's move the caching into a separate memoize() function. Note that memoize() doesn't know anything about the fib() function, except that it takes a single argument; memoize() can be applied to many single-argument functions.


In [19]:
def memoize(f):
    results = {}
    def helper(x):
        if x not in results:
            results[x] = f(x)
        return results[x]
    return helper

In [20]:
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

In [21]:
fib = memoize(fib)

In [22]:
def foo(n):
    def fib(n):
        if n == 0:
            return 0
        elif n == 1:
            return 1
        else:
            return fib(n-1) + fib(n-2)
    fib = memoize(fib)
    %timeit -n 1 fib(n)

In [23]:
for i in range(20):
    foo(n)


1 loops, best of 3: 3.1 us per loop
1 loops, best of 3: 3.1 us per loop
1 loops, best of 3: 3.1 us per loop
1 loops, best of 3: 3.1 us per loop
1 loops, best of 3: 2.15 us per loop
1 loops, best of 3: 2.86 us per loop
1 loops, best of 3: 954 ns per loop
1 loops, best of 3: 1.91 us per loop
1 loops, best of 3: 2.15 us per loop
1 loops, best of 3: 1.91 us per loop
1 loops, best of 3: 1.91 us per loop
1 loops, best of 3: 954 ns per loop
1 loops, best of 3: 1.91 us per loop
1 loops, best of 3: 2.15 us per loop
1 loops, best of 3: 1.91 us per loop
1 loops, best of 3: 1.91 us per loop
1 loops, best of 3: 1.91 us per loop
1 loops, best of 3: 954 ns per loop
1 loops, best of 3: 954 ns per loop
1 loops, best of 3: 1.91 us per loop

That yielded yet another big speed increase. It's now over a million times faster than the original function.


In [24]:
%timeit fib(n)


1000000 loops, best of 3: 399 ns per loop

That's faster still, but perhaps deceptively so: %timeit ran fib(n) many times, but initialized the results cache only once, so every run after the first was just a cache lookup.

Next, let's use the syntactic sugar of a function decorator to eliminate the fib = memoize(fib) statement. The @memoize decorator can be applied to many single-argument functions.


In [25]:
@memoize
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

In [26]:
%timeit fib(n)


1000000 loops, best of 3: 400 ns per loop

That yielded about the same speed. The code is now both simple and fast.

How would one measure the speed of starting with an empty cache each time?
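
One approach (a sketch of my own, not part of the original notebook): rebuild and re-memoize fib inside a wrapper function, as foo() did earlier, so every call starts with an empty cache, and time that wrapper with the timeit module.

import timeit

def fresh_fib(n):
    # Define and memoize a fresh copy of fib, so the cache starts empty.
    def fib(k):
        if k == 0:
            return 0
        elif k == 1:
            return 1
        else:
            return fib(k-1) + fib(k-2)
    fib = memoize(fib)
    return fib(n)

# Each run pays the full cost of filling an empty cache.
print(timeit.timeit('fresh_fib(33)', setup='from __main__ import fresh_fib', number=1000))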

Review

What are the limitations of memoize() and @memoize?

  • Only works for functions that accept a single argument.
  • The function argument must be acceptable as a dictionary key.
  • Only works for functions, not generators.
  • Size of results dictionary might become too large.

How would you work around those limitations?

  • Only works for functions that accept a single argument.
    • Try the *args and **kwargs approach (see the sketch after this list).
  • The function argument(s) must be acceptable as a dictionary key.
    • Might try a partial solution: save the result when the input can be used as a dictionary key; otherwise, just calculate the result (also sketched below).
  • Only works for functions, not generators.
    • Could use a cache internal to the generator (but that would lack an elegant, general wrapper function or decorator like memoize()).
  • Size of results dictionary might become too large.
    • Might limit the size of the dictionary.
      • Beyond that, just calculate the result.
      • Might also save only the results for the most common inputs, or the results that are the most "expensive" to calculate.
    • Might have multi-stage caching, where the first cache is a dictionary within Python, and secondary or tertiary caches are saved outside of Python, such as in a file, a database, or a server.
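
As a sketch of the first two workarounds (the name memoize_args is my own, not from the notebook): build a cache key from *args and **kwargs, and fall back to calling the function directly whenever the arguments are not hashable.

def memoize_args(f):
    results = {}
    def helper(*args, **kwargs):
        # kwargs is a dict, which is unhashable, so turn it into a sorted tuple.
        key = (args, tuple(sorted(kwargs.items())))
        try:
            return results[key]
        except KeyError:
            results[key] = f(*args, **kwargs)
            return results[key]
        except TypeError:
            # An unhashable argument (e.g. a list): just calculate the result.
            return f(*args, **kwargs)
    return helper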

In [27]:
# By the way, this computer is modest.
!lscpu


Architecture:          i686
CPU op-mode(s):        32-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    2
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 14
Stepping:              8
CPU MHz:               2000.000
BogoMIPS:              3994.62
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              2048K

After the 2015-02-23 CohPy presentation of the above, it was suggested to use the .get() method of dictionaries to avoid the if statement inside helper(). The point was to make the code simpler and more elegant. Unfortunately, the .get() method does not change the dictionary, so its return value has to be saved back explicitly. That new memoize function would look like the following. It does succeed in getting rid of the if statement, but the 'results[x] = results.get(x, f(x))' line looks ugly. Even worse, f(x) is always called, defeating the memoization.


In [28]:
def memoize(f):
    results = {}
    def helper(x):
        results[x] = results.get(x, f(x))
        return results[x]
    return helper

In [29]:
@memoize
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

In [30]:
%timeit fib(n)
fib(n)


1 loops, best of 3: 10.9 s per loop
Out[30]:
3524578

Using the .setdefault() method instead of the .get() method looks better, but it suffers from the same flaw: f(x) is always called, defeating the memoization.


In [31]:
def memoize(f):
    results = {}
    def helper(x):
        return results.setdefault(x, f(x))
    return helper

In [32]:
@memoize
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

In [33]:
%timeit fib(n)
fib(n)


1 loops, best of 3: 9.16 s per loop
Out[33]:
3524578
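
The root cause is that Python evaluates function arguments eagerly: f(x) runs before .get() or .setdefault() ever looks at the cache. A minimal illustration (my own example, not from the notebook):

def expensive(x):
    print('calculating', x)   # the side effect makes each call visible
    return x * x

squares = {4: 16}
# 'calculating 4' is printed even though 4 is already in squares, because
# the default argument is evaluated before .get() runs.
squares.get(4, expensive(4))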

2017-08-04 Addendum

My memoize decorator is fun and good for exploring closures and decorators. For real projects, use functools.lru_cache instead of my memoize decorator.
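
For example, a minimal sketch (lru_cache is in the standard library from Python 3.2 onward):

from functools import lru_cache

@lru_cache(maxsize=None)    # maxsize=None lets the cache grow without bound
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

fib(33)    # 3524578, with each fib(i) calculated only once

lru_cache also addresses some of the limitations listed above: it builds its cache key from all positional and keyword arguments, and a finite maxsize bounds the cache by discarding the least recently used entries.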