rtviz Tutorial

Introduction

rtviz is a small library for visualizing the runtimes of certain functions in python.

rtviz works by mapping a function over a list of lists and plotting the runtime of each function call agains the length of the list element.

Quickstart


In [4]:
from rtviz import rtviz
%matplotlib inline

In [3]:
help(rtviz)


Help on function rtviz in module rtviz.main:

rtviz(func, *args, max_size=1000, num_samples=500, viz=True, verbose=True)
    Takes in a function that receives an iterable as input.
    Returns a plot of the runtimes over iterables of random integers of increasing length.
    
    func: a function that acts on an iterable


In [4]:
rtviz(sorted)


sorted took 110.733ms.

Extended explanation

*The below table is borrowed from the python time complexity page.

The big O runtime values correspond to increasing the input size that we run our algorithms against. We get the average case of runtimes if we run our algorithms on randomly generated lists.


In [2]:



list

Operation

Average Case

Amortized Worst Case

Copy

O(n)

O(n)

Append[1]

O(1)

O(1)

Insert

O(n)

O(n)

Get Item

O(1)

O(1)

Set Item

O(1)

O(1)

Delete Item

O(n)

O(n)

Iteration

O(n)

O(n)

Get Slice

O(k)

O(k)

Del Slice

O(n)

O(n)

Set Slice

O(k+n)

O(k+n)

Extend[1]

O(k)

O(k)

Sort

O(n log n)

O(n log n)

Multiply

O(nk)

O(nk)

x in s

O(n)

min(s), max(s)

O(n)

Get Length

O(1)

O(1)

The big O runtime values can be visualized as the runtime of an algorithm vs. the size of a data structure (in our case a list) the algorithm runs on.

We can check that the big O runtime values are true by explicitly plotting the runtime of an algorithm against the length of randomly generated lists.

For instance, from the above table we see calling the max function on a list runs on the order of the length of the list ($O(n)$), while calling the len function (which returns the length of a list) should be constant time ($O(1)$) no matter what size list we call it on.

Let's see if this is true:

We first call rtviz with the function max.


In [3]:
rtviz(max) # Linear


max took 30.693ms.

Sure enough, we see a linear relationship between runtime of the max function and list length.

Next we call len, which we would expect to be constant time ($O(1)$).


In [11]:
rtviz(len)


len took 30.307ms.

We see that the runtimes of the len method are indeed flat, therefore the runtime is constant.

There is however something a bit unexpected about our graph of the len function: we see multiple lines!

If we instead try calling __len__ which returns the length of an object, we still get the same behavior.


In [6]:
def length(lst):
    return lst.__len__()

assert length([1,2,3]) == 3

In [7]:
rtviz(length)


length took 17.286ms.

Use of rtviz often reveals interesting insights into python's actual implementation. In this case, by visualizing the len method, we are actually seeing the lookup time of a hash table implemented in C. Although the average runtime is constant (as evidenced by the flat lines), depending on how good the hash of our __len__ attribute is we get faster or slower performance (as evidenced by the multiple stacked lines). See this blog post and the original python object implementation in c for more details.


In [ ]: