Recursion, Memoization and Dynamic Programming

Remember how we talk about using recursion and dynamic programming. One interesting thing to do is to implement the solution to a common problem called Fibonnaci numbers on these two styles and compare the compute time.

The Fibonacci series looks something like: 0, 1, 1, 2, 3, 5, 8, 13, 21 … and so on. Any person can quickly notice the pattern. f(n) = f(n-1) + f(n-2) So, let's walk through a recursive implementation that solves this problem.



In [1]:

    
def fib(n):
    if n < 2:
        return n
    return fib(n-2) + fib(n-1)



In [2]:

    
%time fib(30)









    



CPU times: user 296 ms, sys: 0 ns, total: 296 ms
Wall time: 295 ms






    Out[2]:





832040

Now, the main problem of this algorithm is that we are computing some of the subproblems more than once. For instance, to compute fib(4) we would compute fib(3) and fib(2). However, to compute fib(3) we also have to compute fib(2). Say hello to memoization.

A technique called memoization we are cache the results of previously computed sub problems to avoid unnecessary computations.



In [3]:

    
m = {}
def fibm(n):
    if n in m:
        return m[n]
    m[n] = n if n < 2 else fibm(n-2) + fibm(n-1)
    return m[n]



In [4]:

    
%time fibm(30)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 17.6 µs






    Out[4]:





832040

But the question is, can we do better than this? The use of the array is helpful, but when calculating very large numbers, or perhaps on memory contraint environments it might not be desirable. This is where Dynamic Programming fits the bill.

In DP we take a bottom-up approach. Meaning, we solve the next Fibonacci number we can with the information we already have.



In [13]:

    
def fibdp(n):
    if n == 0: return 0
    prev, curr = (0, 1)
    for i in range(2, n+1):
        newf = prev + curr
        prev = curr
        curr = newf
    return curr



In [25]:

    
%time fibdp(30)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 6.44 µs






    Out[25]:





832040

In this format, we don’t need to recurse or keep up with the memory intensive cache dictionary. These, add up to an even better performance.

Let's now give it a try with factorials. Remember 4! = 4 * 3 * 2 * 1 = 24. Can you give it try?



In [1]:

    
def factr(n):
    if n < 3:
        return n
    return n * factr(n - 1)



In [5]:

    
%time factr(30)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 43.2 µs






    Out[5]:





265252859812191058636308480000000



In [99]:

    
m = {}
def factm(n):
    if n in m:
        return m[n]
    m[n] = n if n < 3 else n * factr(n - 1)
    return m[n]



In [100]:

    
%time factm(30)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 47.2 µs






    Out[100]:





265252859812191058636308480000000



In [10]:

    
def factdp(n):
    if n < 3: return n
    fact = 2
    for i in range(3, n + 1):
        fact *= i
    return fact



In [11]:

    
%time factdp(30)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 7.87 µs






    Out[11]:





265252859812191058636308480000000

Let's think of a slightly different problem. Imagine that you want to find the cheapest way to go from city A to city B, but when you are about to buy your ticket, you see that you could hop in different combinations of route and get a much cheaper price than if you go directly. How do you efficiently calculate the best possible combination of tickets and come up with the cheapest route? We will start with basic recursion and work on improving it until we reach dynamic programming.

For this last problem in dynamic programming, create 2 functions that calculates the cheapest route from city A to B. I will give you the recursive solution, you will build one with memoization and the one with dynamic programming.



In [1]:

    
import numpy as np

Utility function to get fares between cities



In [2]:

    
def get_fares(n_cities, max_fare):
    np.random.seed(123456)
    fares = np.sort(np.random.random((n_cities, n_cities)) * max_fare).astype(int)
    for i in range(len(fares)):
        fares[i] = np.roll(fares[i], i + 1)
    np.fill_diagonal(fares, 0)
    for i in range(1, len(fares)):
        for j in range(0, i):
            fares[i][j] = -1
    return fares

Let's try it out with 4 cities and random fares with a max of 1000.



In [5]:

    
n_cities = 4
max_fare = 1000
fares = get_fares(n_cities, max_fare)
fares[1][2] = 50
fares









    Out[5]:





array([[  0, 126, 260, 897],
       [ -1,   0,  50, 376],
       [ -1,  -1,   0, 123],
       [ -1,  -1,  -1,   0]])

Here is the recursive solution:



In [7]:

    
def cheapestr(s, d, c):
    if s == d or s == d - 1:
        return c[s][d]
    
    cheapest = c[s][d]
    for i in range(s + 1, d):
        tmp = cheapestr(s, i, c) + cheapestr(i, d, c)
        cheapest = tmp if tmp < cheapest else cheapest
    return cheapest



In [8]:

    
%time cheapestr(0, len(fares[0]) - 1, fares)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 68.7 µs






    Out[8]:





299

Now, you build the memoization one:



In [9]:

    
m = {}
def cheapestm(s, d, c):
    """ YOU WRITE THIS FUNCTION """
    return 0



In [10]:

    
%time cheapestm(0, len(fares[0]) - 1, fares)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 22.6 µs






    Out[10]:





299

Faster, you see?

Now, do the dynamic programming version.



In [11]:

    
def cheapestdp(s, d, c):
    """ YOU WRITE THIS FUNCTION """
    return 0



In [12]:

    
%time cheapestdp(0, len(fares[0]) - 1, fares)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 61.5 µs






    Out[12]:





299

Let's now try with a larger example:



In [337]:

    
n_cities = 18 # this will take a little before 20 seconds. Try not to make it any larger :)
max_fare = 1000
fares = get_fares(n_cities, max_fare)
fares









    Out[337]:





array([[  0, 123, 126, 129, 228, 260, 336, 352, 373, 376, 447, 451, 543,
        776, 820, 840, 859, 897],
       [ -1,   0,  37,  61, 137, 146, 235, 245, 340, 343, 405, 574, 589,
        590, 594, 753, 852, 861],
       [ -1,  -1,   0,  16,  99, 117, 170, 199, 274, 342, 394, 401, 414,
        462, 481, 595, 610, 641],
       [ -1,  -1,  -1,   0,  94,  95, 134, 138, 155, 433, 471, 497, 560,
        630, 639, 683, 732, 758],
       [ -1,  -1,  -1,  -1,   0,  85, 140, 149, 329, 370, 386, 395, 477,
        544, 562, 566, 619, 634],
       [ -1,  -1,  -1,  -1,  -1,   0,  29,  30,  44, 113, 187, 207, 247,
        249, 249, 356, 409, 630],
       [ -1,  -1,  -1,  -1,  -1,  -1,   0,  22,  60, 168, 216, 277, 279,
        372, 419, 449, 606, 690],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,   0,  30,  36, 273, 321, 355,
        415, 419, 421, 497, 500],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,   0,  19, 123, 400, 412,
        418, 535, 547, 559, 591],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,   0, 110, 120, 140,
        204, 220, 395, 454, 488],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,   0,  87, 169,
        219, 257, 312, 487, 527],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,   0,   7,
         11,  22, 244, 250, 281],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,   0,
         10, 116, 123, 259, 287],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,
          0,  22,  93, 109, 236],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,
         -1,   0,  29, 157, 185],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,
         -1,  -1,   0,  18,  78],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,
         -1,  -1,  -1,   0,   4],
       [ -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,  -1,
         -1,  -1,  -1,  -1,   0]])



In [338]:

    
%time cheapestr(0, len(fares[0]) - 1, fares)









    



CPU times: user 18.3 s, sys: 6.67 ms, total: 18.3 s
Wall time: 18.4 s






    Out[338]:





480



In [339]:

    
%time cheapestm(0, len(fares[0]) - 1, fares)









    



CPU times: user 14.7 s, sys: 3.33 ms, total: 14.7 s
Wall time: 14.7 s






    Out[339]:





480



In [340]:

    
%time cheapestdp(0, len(fares[0]) - 1, fares)









    



CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 75.8 µs






    Out[340]:





480

BAAAAAM! See how much faster dynamic programming is?

Well, there you have it!!! This is the power of dynamic programming.

As mentioned in the tutorials, reinforcement learning leverages the power of dynamic programming in many algorithms. Value Iteration, Q-Learning, etc have a similar take on calculation. The bottom line is to think sequentially instead of recursively. And bottom-up instead of top-down. Let's continue this journey.