In [1]:
%load_ext cython

Plain Python: modulo (%) in a loop


In [2]:
def p(n, m):
    output = 0
    for i in range(n):
        output += i % m
    return output

In [3]:
%timeit p(1000000, 42)


10 loops, best of 3: 112 ms per loop

Still Python, but inside a Cython cell

There are still no types declared, but you get a surprising speedup.


In [4]:
%%cython

def f(n, m):
    output = 0
    for i in range(n):
        output += i % m
    return output

In [5]:
%timeit f(1000000, 42)


10 loops, best of 3: 82.7 ms per loop

Cython: give types to all variables


In [6]:
%%cython

def c(int n):
    cdef int i, output
    for i in range(n):
        output += i % 42
    return output

In [7]:
%timeit c(1000000)


1000 loops, best of 3: 1.89 ms per loop

Example of a while loop


In [8]:
%%cython

def w(int n):
    cdef int i = 0, output
    while i < n:
        output += i % 42
        i += 1
    return output

In [9]:
%timeit w(1000000)


100 loops, best of 3: 1.85 ms per loop

Application: custom random number generator

Numpy only generates 32-bit random integers. Let's make a 64-bit random integer generator!


In [10]:
%%cython
cimport cython

# This code was translated to Cython from a 
# wikipedia article at:
#
# https://en.wikipedia.org/wiki/Xorshift
#
#         "Xorshift random number generators are a 
#         class of pseudorandom number generators that 
#         was discovered by George Marsaglia.[1] They 
#         generate the next number in their sequence 
#         by repeatedly taking the exclusive or of a 
#         number with a bit shifted version of itself. 
#         This makes them extremely fast on modern 
#         computer architectures"
#
#         "A naive C implementation of a xorshift+ generator 
#         that passes all tests from the BigCrush suite 
#         (with an order of magnitude fewer failures than 
#         Mersenne Twister or WELL) typically takes fewer 
#         than 10 clock cycles on x86 to generate a random 
#         number, thanks to instruction pipelining."


cdef unsigned long long s[2]  # Seed: initialize to nonzero
cdef inline unsigned long long xorshiftplus():
    """ Direct translation from Wikipedia page """
    cdef unsigned long long x = s[0]
    cdef unsigned long long y = s[1]
    s[0] = y
    x ^= x << 23  # a
    x ^= x >> 17  # b
    x ^= y^(y>>26) # c
    s[1] = x
    return x + y

@cython.boundscheck(False)
def random_array(unsigned long long[:] output):
    """ Array must be already be sized """
    s[0] = 1  # Set the seed
    s[1] = 2
    cdef int i, n = output.shape[0] 
    for i in range(n):
        output[i] = xorshiftplus()

Quick demo


In [11]:
# Create storage for our 8-byte random numbers
import numpy
output = numpy.zeros(10, dtype=numpy.uint64)
random_array(output)
print(output)


[            8388677            33554692      70368777736387
     211106267148357     281509366091972     360777324180299
  288538377073858266  865509272901433454 1155350481850751274
 1191592583033944276]

Speed test - compare Cython and Numpy

Note: Numpy generates only 32-bit integers, but uses the Mersenne Twister algorithm. Our Cython function generates 64-bit integers, but uses the Xorshift+ algorithm.


In [12]:
n = int(1e8)  # 100 million random numbers
output = numpy.zeros(n, dtype=numpy.uint64)
%timeit random_array(output)
%timeit y = numpy.random.randint(low=0, high=2**31 - 1, size=n)


1 loop, best of 3: 143 ms per loop
1 loop, best of 3: 715 ms per loop

In [ ]: