Cython Demo


In [ ]:
import timeit

In [ ]:
# There are two packages, one containing regular Python modules and
# the other containing corresponding Cython modules

Cythonization


In [ ]:
# Let's create a C extension from the `hello` module
! rm -f awesome_cython_stuff/hello.c awesome_cython_stuff/hello*.so awesome_cython_stuff/hello.html
! ls awesome_cython_stuff/hello*

In [ ]:
# In this case, the Cython module (.pyx) contains unmodified Python code
! more awesome_cython_stuff/hello.pyx

C Extensions

There are a couple of ways to generate the C extension:

  1. Run cython manually to generate the .c file, then run gcc to generate the .so (PYTHON_HEADER_DIR is the directory containing Python.h):

    cython -a awesome_cython_stuff/hello.pyx
    gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I${PYTHON_HEADER_DIR} -o awesome_cython_stuff/hello.so awesome_cython_stuff/hello.c

  2. Run setup.py, which generates the extensions using distutils.extension.Extension together with Cython.Distutils.build_ext, Cython.Build.cythonize, etc.:

    python setup.py install

In [ ]:
# Example setup.py
! head -25 setup.py

In [ ]:
# After generating the C extension, there will be a .c file and a .so file,
# the latter being the more important since that is what actually gets
# imported

In [ ]:
%%bash
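# Locate the Python headers inside the conda root environment.
# The package path below is specific to one installation (Python 3.4.3);
# adjust it to match your own environment.
ROOTENV=$(conda info | grep "root environment :" | awk '{print $4}')
PYTHON_HEADER_DIR=${ROOTENV}/pkgs/python-3.4.3-2/include/python3.4m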
cython -a awesome_cython_stuff/hello.pyx
gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I${PYTHON_HEADER_DIR} -o awesome_cython_stuff/hello.so awesome_cython_stuff/hello.c
ls awesome_cython_stuff/hello*
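
The header directory in the cell above is hardcoded to one conda layout. As a minimal sketch, the running interpreter can report the same information through the standard-library `sysconfig` module. The variable names `include_dir` and `ext_suffix` are illustrative and are not part of the demo.

```python
import sysconfig

# Directory where the running interpreter expects Python.h to live.
# (The headers themselves may still need to be installed separately.)
include_dir = sysconfig.get_paths()["include"]

# File suffix the interpreter uses for compiled extension modules.
# Fall back to ".so" if the variable is not defined on this platform.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX") or ".so"

# Values that could stand in for -I${PYTHON_HEADER_DIR} and the output name
print("-I" + include_dir)
print("awesome_cython_stuff/hello" + ext_suffix)
```

Using these values instead of a fixed path should make the manual gcc step less dependent on where Python happens to be installed.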

Hello World in Python vs. Cython


In [ ]:
from regular_old_yet_fine_python_stuff.hello import say_hello as say_hello_python
from awesome_cython_stuff.hello import say_hello as say_hello_cython

In [ ]:
say_hello_python()

In [ ]:
say_hello_cython()

In [ ]:
# Let's see what kind of difference there is in terms of speed

In [ ]:
t = timeit.Timer("say_hello_python()", "from regular_old_yet_fine_python_stuff.hello import say_hello as say_hello_python")
print("Python function: {} seconds".format(t.timeit(100000)))

In [ ]:
t = timeit.Timer("say_hello_cython()", "from awesome_cython_stuff.hello import say_hello as say_hello_cython")
print("Cython function: {} seconds".format(t.timeit(100000)))

In [ ]:
# The C extension version is *usually* at least marginally faster even though
# the original code is unchanged
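
Because the difference here is small, a single timing run can be dominated by noise. The sketch below shows one way to get a steadier number with `timeit.repeat`, taking the best of several runs. The function `say_hello` is a hypothetical stand-in for either imported version, since the demo's modules are not shown here.

```python
import timeit


def say_hello():
    # Hypothetical stand-in for say_hello_python / say_hello_cython
    return "Hello, world!"


number = 100000  # calls per run, matching the cells above

# repeat() returns one total time per run; the minimum is usually the
# least affected by other activity on the machine.
runs = timeit.repeat(say_hello, number=number, repeat=5)
best = min(runs)

print("best of {} runs: {:.6f} seconds total, {:.3e} seconds per call".format(
    len(runs), best, best / number))
```

Reporting the per-call time alongside the total also makes results comparable when the number of calls changes.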

"Great Circle" Function in Python vs. Cython


In [ ]:
# The "great circle" function calculates the distance between two points on
# the surface of the earth
# Source: http://blog.perrygeo.net/2008/04/19/a-quick-cython-introduction/
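
For readers who want to see the kind of computation involved before looking at the files, here is a minimal sketch of a great-circle distance using the spherical law of cosines. It is not the demo's `great_circle.py`: the name `great_circle_sketch`, the default radius (roughly the Earth's mean radius in miles), and the clamping step are all assumptions made for illustration.

```python
import math


def great_circle_sketch(lon1, lat1, lon2, lat2, radius=3959.0):
    """Distance between two (lon, lat) points given in degrees.

    radius is an assumed approximate Earth radius in miles; the result
    is in the same unit as radius.
    """
    to_rad = math.pi / 180.0
    a = (90.0 - lat1) * to_rad       # colatitude of point 1
    b = (90.0 - lat2) * to_rad       # colatitude of point 2
    theta = (lon2 - lon1) * to_rad   # difference in longitude

    # Spherical law of cosines for the central angle
    cos_c = (math.cos(a) * math.cos(b)
             + math.sin(a) * math.sin(b) * math.cos(theta))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain
    cos_c = max(-1.0, min(1.0, cos_c))
    return radius * math.acos(cos_c)


print(great_circle_sketch(-72.345, 34.323, -61.823, 54.826))
```

Every call goes through several `math` functions on Python floats, which is the kind of numeric work where C data types can be expected to help.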

In [ ]:
# Python version
! cat regular_old_yet_fine_python_stuff/great_circle.py

In [ ]:
# Cython version (with modified Python code)
# The only difference is that it uses C data types
! cat awesome_cython_stuff/great_circle.pyx

In [ ]:
from regular_old_yet_fine_python_stuff.great_circle import great_circle as great_circle_python
from awesome_cython_stuff.great_circle import great_circle as great_circle_cython

lon1, lat1, lon2, lat2 = -72.345, 34.323, -61.823, 54.826
args = "lon1, lat1, lon2, lat2"

print("great_circle_python({1}) = {0}".format(great_circle_python(lon1, lat1, lon2, lat2),
                                              args))
print("great_circle_cython({1}) = {0}".format(great_circle_cython(lon1, lat1, lon2, lat2),
                                              args))

In [ ]:
# Test
num = 100000

t1 = timeit.Timer("great_circle_python(%f, %f, %f, %f)" % (lon1, lat1, lon2, lat2), 
                  "from regular_old_yet_fine_python_stuff.great_circle import great_circle as great_circle_python")
t2 = timeit.Timer("great_circle_cython(%f, %f, %f, %f)" % (lon1, lat1, lon2, lat2), 
                  "from awesome_cython_stuff.great_circle import great_circle as great_circle_cython")
print("Pure python function: {} seconds".format(t1.timeit(num)))
print("Cython function: {} seconds".format(t2.timeit(num)))

In [ ]:
# The C extension function is faster by a small but consistent amount
# (though the difference decreases as num increases)

In [ ]:
# Let's try to speed up the C extension by using more C code

In [ ]:
! cat awesome_cython_stuff/great_circle_2.pyx

In [ ]:
# This version does not use the Python `math` library at all for the calculations

In [ ]:
from awesome_cython_stuff.great_circle_2 import great_circle_2

print("great_circle_2({1}) = {0}".format(great_circle_2(lon1, lat1, lon2, lat2),
                                         args))

In [ ]:
# Test
t3 = timeit.Timer("great_circle_2(%f, %f, %f, %f)" % (lon1, lat1, lon2, lat2), 
                  "from awesome_cython_stuff.great_circle_2 import great_circle_2")
print("Pure python function: {} seconds".format(t1.timeit(num)))
print("Cython function #2: {} seconds".format(t3.timeit(num)))

In [ ]:
# Now the difference really shows: the Cython module is about 10x faster
# than the pure Python function

End