In this sample script we present a simple benchmark that compares three ways of computing the cumulative sum of an array: NumPy, sci20, and a native Python loop.

Let's start with imports. Make sure you have installed sci20 correctly.


In [7]:
import functools
import timeit

import numpy as np

from sci20.core import Array, FromNumPy
from sci20.core.unittest import AccumulateArrayReturning

The following blocks define the functions that compute the cumulative sum with each of the three approaches.

Native implementation using a Python for loop


In [2]:
def native_acc(x):
    """
    Calculate the cumulative sum using a native Python loop
    :param x: NumPy array
    :return: Cumulative sum (list)
    """
    native_acc_sum = []

    sum_aux = 0
    for val in x:
        native_acc_sum.append(val + sum_aux)
        sum_aux += val

    return native_acc_sum

A thin wrapper around numpy.cumsum


In [3]:
def np_acc(x):
    """
    Calculate the cumulative sum using numpy.cumsum
    :param x: NumPy array
    :return: Cumulative sum (NumPy array)
    """
    return np.cumsum(x)
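
Before timing anything, it helps to confirm that the two implementations defined so far return the same values. The block below is a minimal, self-contained sketch: it repeats the running-sum loop in compact form so it can run on its own, and the array size of 1000 is an arbitrary choice. sci20 is left out because this notebook does not show how to convert a sci20 array back for comparison.

```python
import numpy as np


def native_acc(x):
    # Same running-sum loop as the native implementation, written compactly.
    total = 0
    out = []
    for val in x:
        total += val
        out.append(total)
    return out


# Arbitrary test input; the benchmark below builds its arrays the same way.
x = np.linspace(1, 100, 1000)

native_result = native_acc(x)
numpy_result = np.cumsum(x)

# allclose tolerates tiny floating-point differences between the two results.
results_match = np.allclose(native_result, numpy_result)
print("Native loop and numpy.cumsum agree:", results_match)
```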

Now the sci20 implementation


In [4]:
def sci20_acc(x):
    """
    Calculate the cumulative sum using a sci20 Array
    :param x: NumPy array
    :return: Cumulative sum (sci20 array)
    """
    x_array = Array(FromNumPy(x))
    return AccumulateArrayReturning(x_array)

Here we use the timeit module from the standard library to measure the elapsed time of each method


In [5]:
def do_benchmark(n=1000, k=10):
    """
    Compute the elapsed time for each cumulative sum implementation (native loop, NumPy and sci20).
    :param n: Number of array elements. Default is 1000.
    :param k: Number of times each function is executed for timing. Default is 10.
    :return: A tuple (dt_native, dt_sci20, dt_np) containing the total elapsed time, in seconds, of the k executions of each method.
    """

    x = np.linspace(1, 100, n)

    dt_native = timeit.Timer(functools.partial(native_acc, x)).timeit(k)
    dt_sci20 = timeit.Timer(functools.partial(sci20_acc, x)).timeit(k)
    dt_np = timeit.Timer(functools.partial(np_acc, x)).timeit(k)

    return dt_native, dt_sci20, dt_np
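
Note that `Timer.timeit(k)` returns the total time for `k` executions, not an average. If a per-call figure is preferred, one option is to repeat the measurement several times and divide the best total by the number of calls. The sketch below assumes a helper named `per_call_time`, which is hypothetical and not part of this benchmark, and it times the built-in `sum` only as a stand-in workload.

```python
import functools
import timeit


def per_call_time(func, *args, number=10, repeat=5):
    """Hypothetical helper: best observed time per call, in seconds."""
    timer = timeit.Timer(functools.partial(func, *args))
    # repeat() returns one total per repetition, each covering `number` calls.
    totals = timer.repeat(repeat=repeat, number=number)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(totals) / number


# Stand-in workload: the built-in sum over a list of 1000 integers.
best = per_call_time(sum, list(range(1000)))
print("sum over 1000 ints: {:.8f}s per call".format(best))
```

Taking the minimum rather than the mean follows the advice in the timeit documentation, since higher values usually reflect interference from other processes rather than the code being measured.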

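And finally we have the results. Each time is the total for 10 calls. From n = 100000 upward the sci20 and NumPy timings are within about 10% of each other, while the native loop is more than 100 times slower than either. For small arrays sci20 is slower than NumPy, and at n = 10 it is the slowest of the three.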


In [6]:
"""
Computes and prints the elapsed time for each cumulative sum implementation (native loop, sci20 and NumPy).
"""
n_values = [10**x for x in range(1, 8)]  # [10, 100, ..., 10^7]

for n in n_values:
    print("Computing benchmark for n={}...".format(n))
    dt_native, dt_sci20, dt_np = do_benchmark(n)
    print("Native: {:.8f}s / sci20: {:.8f}s / Numpy: {:.8f}s.".format(dt_native, dt_sci20, dt_np))


Computing benchmark for n=10...
Native: 0.00044343s / sci20: 0.00137527s / Numpy: 0.00004437s.
Computing benchmark for n=100...
Native: 0.00087931s / sci20: 0.00009992s / Numpy: 0.00003139s.
Computing benchmark for n=1000...
Native: 0.00690351s / sci20: 0.00017659s / Numpy: 0.00006822s.
Computing benchmark for n=10000...
Native: 0.05423526s / sci20: 0.00039181s / Numpy: 0.00030850s.
Computing benchmark for n=100000...
Native: 0.56866048s / sci20: 0.00286645s / Numpy: 0.00264851s.
Computing benchmark for n=1000000...
Native: 5.58011361s / sci20: 0.04359414s / Numpy: 0.04290680s.
Computing benchmark for n=10000000...
Native: 56.68339517s / sci20: 0.45964209s / Numpy: 0.46603757s.
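
To make the comparison easier to read, the timings can be expressed relative to NumPy. The sketch below copies the numbers from the output above into a list and computes the ratios; the `speedups` function and the variable names are illustrative and not part of the original notebook.

```python
# (n, native, sci20, numpy) totals in seconds, copied from the output above.
results = [
    (10, 0.00044343, 0.00137527, 0.00004437),
    (100, 0.00087931, 0.00009992, 0.00003139),
    (1000, 0.00690351, 0.00017659, 0.00006822),
    (10000, 0.05423526, 0.00039181, 0.00030850),
    (100000, 0.56866048, 0.00286645, 0.00264851),
    (1000000, 5.58011361, 0.04359414, 0.04290680),
    (10000000, 56.68339517, 0.45964209, 0.46603757),
]


def speedups(rows):
    """Return (n, native / numpy, sci20 / numpy) for each row."""
    return [(n, native / numpy, sci20 / numpy) for n, native, sci20, numpy in rows]


ratios = speedups(results)

for n, native_ratio, sci20_ratio in ratios:
    print("n={:>8}: native is {:7.1f}x NumPy, sci20 is {:5.2f}x NumPy".format(
        n, native_ratio, sci20_ratio))
```

A ratio above 1 means the method took longer than NumPy for that array size.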