Cython tutorial

If you are using Anaconda, Cython usually comes with the distribution. If you don't have it, there are two easy ways to install it. To find out whether you have it, open IPython and type:


In [ ]:
import cython

If that raised an ImportError, you don't have it. In the terminal, type one of the following two commands, depending on your setup. The first, if you aren't using Anaconda:


In [ ]:
pip install cython

The second, if you are using Anaconda but Cython is not available:


In [ ]:
conda install cython
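
If you would rather check for Cython without triggering a traceback, a minimal sketch along these lines should work. The helper name has_module is made up for this example; importlib.util.find_spec is part of the standard library.

```python
import importlib.util

def has_module(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None

if has_module("cython"):
    import cython
    print("Cython version:", cython.__version__)
else:
    print("Cython not found; install it with pip or conda.")
```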

Housekeeping with setup.py

Save this first piece of code in the same directory as the script you're working from, in a file named 'setup.py'.


In [ ]:
# setuptools replaces distutils, which was removed from the standard library in Python 3.12
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("<name_of_script.pyx>")
)

Getting to the script itself

The first step is to make the file itself. Over the next few steps, you will be making variations of a file called circle. Let's start with circle.py. In the circle.py file, type the following:


In [ ]:
import math

def great_circle(lon1,lat1,lon2,lat2):
    radius = 3956 #miles
    x = math.pi/180.0

    a = (90.0-lat1)*(x)
    b = (90.0-lat2)*(x)
    theta = (lon2-lon1)*(x)
    c = math.acos((math.cos(a)*math.cos(b)) +
                  (math.sin(a)*math.sin(b)*math.cos(theta)))
    return radius*c
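
Before timing anything, it may be worth confirming that the function gives sensible answers. The sketch below repeats the function so that it runs on its own, then checks two cases that can be worked out by hand: two points on the equator 90 degrees of longitude apart should be a quarter of the circumference apart, and the two poles should be half of it apart. The variable names quarter and half are only for illustration.

```python
import math

def great_circle(lon1, lat1, lon2, lat2):
    radius = 3956  # miles
    x = math.pi / 180.0  # degrees-to-radians factor
    a = (90.0 - lat1) * x
    b = (90.0 - lat2) * x
    theta = (lon2 - lon1) * x
    c = math.acos(math.cos(a) * math.cos(b) +
                  math.sin(a) * math.sin(b) * math.cos(theta))
    return radius * c

# Quarter of the circumference: same latitude (equator), 90 degrees apart
quarter = great_circle(0.0, 0.0, 90.0, 0.0)
# Half of the circumference: north pole to south pole
half = great_circle(0.0, 90.0, 0.0, -90.0)

print(quarter, 3956 * math.pi / 2)
print(half, 3956 * math.pi)
```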

Open an IPython console and type the following:


In [ ]:
import os
# this is only needed if you didn't open ipython from the directory that contains circle.py
os.chdir("<path/to/circle.py>") 
import circle

In [ ]:
lon1, lat1, lon2, lat2 = -72.345, 34.323, -61.823, 54.826
num = 500000  # not used yet; needed later, once great_circle takes a loop count

In [ ]:
# Through the magic of ipython magics, let's test out our circle.py
%timeit -n3 -r100 circle.great_circle(lon1, lat1, lon2, lat2)

How did it go?
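
If you are not working in IPython, a rough equivalent can be sketched with the standard-library timeit module. This version uses the same loop counts as the %timeit call (3 calls per run, 100 runs) but reports only the best run, so the number will not match the IPython output exactly. The function is repeated here so the snippet runs on its own; in practice you would import it from circle.

```python
import math
import timeit

def great_circle(lon1, lat1, lon2, lat2):
    radius = 3956  # miles
    x = math.pi / 180.0
    a = (90.0 - lat1) * x
    b = (90.0 - lat2) * x
    theta = (lon2 - lon1) * x
    c = math.acos(math.cos(a) * math.cos(b) +
                  math.sin(a) * math.sin(b) * math.cos(theta))
    return radius * c

lon1, lat1, lon2, lat2 = -72.345, 34.323, -61.823, 54.826

# 100 runs of 3 calls each; each entry in totals is the time for 3 calls
timer = timeit.Timer(lambda: great_circle(lon1, lat1, lon2, lat2))
totals = timer.repeat(repeat=100, number=3)

best_per_call = min(totals) / 3
print(f"best of 100 runs: {best_per_call * 1e6:.3f} microseconds per call")
```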

Now let's compile it. First, rename circle.py to circle.pyx. This can be done in the terminal with this command:


In [ ]:
mv ./circle.py ./circle.pyx

Open the setup.py file you made earlier and replace the placeholder script name with circle.pyx. Save & exit, and type the following into the terminal:


In [ ]:
python setup.py build_ext --inplace

You just compiled your first Python script...kind of. Now restart your IPython console (an already-imported module is cached, so the compiled version will not be picked up otherwise), import circle again, and rerun the timing. Anything change? This was just cythonizing your pure Python code. Often, this step alone can speed things up...but we can do better.
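
To check that Python is really loading the compiled extension rather than a leftover source file, one option is to look at the file the module was loaded from. This is a minimal sketch; is_compiled is a name invented for this example, and json is used only as a stand-in pure-Python module so the snippet runs anywhere.

```python
import importlib.machinery
import json  # stand-in pure-Python module for the demonstration

def is_compiled(module):
    """Return True if `module` was loaded from a compiled extension file."""
    path = getattr(module, "__file__", None) or ""
    # EXTENSION_SUFFIXES lists the platform's extension endings, e.g. ".so" or ".pyd"
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))

print(is_compiled(json))  # json is plain Python, so this prints False
```

In the tutorial session you would call is_compiled(circle) after importing circle. A result of False would suggest that Python picked up something other than the extension you just built.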

To make things a little faster, we can add static type declarations. Don't know what that is? Don't worry, all it means is that we explicitly tell the compiler what kind of data each variable holds, so it can generate plain C operations instead of checking the type of every Python object at run time. So, to do that, make the following changes to your circle.pyx file:


In [ ]:
# ...
def great_circle(float lon1,float lat1,float lon2,float lat2):
    cdef float radius = 3956.0 
    cdef float pi = 3.14159265
    cdef float x = pi/180.0
    cdef float a,b,theta,c
# ...
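
One consequence of these declarations is worth knowing: a C float is single precision (roughly seven significant decimal digits), whereas a Python float is a C double. The sketch below uses the standard-library struct module to mimic that rounding; as_c_float is a name invented for this example. If the lost digits matter for your application, declaring the variables as double instead is an option.

```python
import struct

def as_c_float(value):
    """Round a Python float (a C double) to the nearest 32-bit C float."""
    return struct.unpack("f", struct.pack("f", value))[0]

pi_double = 3.14159265
pi_float = as_c_float(pi_double)

print(pi_double, pi_float)         # the two values differ after the 7th digit or so
print(abs(pi_float - pi_double))   # size of the rounding error
```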

Save & exit, recompile, go back to your ipython console, and run it again. Anything change?

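There is still a bottleneck: every call into the math module goes through Python's function-call machinery, which converts the C values to Python objects and back. We can avoid this by calling the math functions from the C standard library directly. To do this, replace the import math line at the top of your circle.pyx file with the following declarations, change the math.cos, math.sin, and math.acos calls in the function body to cosf, sinf, and acosf, and save: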


In [ ]:
# Replace import math with this code
cdef extern from "math.h":
    float cosf(float theta)
    float sinf(float theta)
    float acosf(float theta)
# ...

Compile and run again from the ipython console. Anything change?

There is really only one more thing to do to REALLY speed things up. Since we made the data more 'C'-like, and we are using the C standard library, the only thing left is to make the function itself more 'C'-like. Make & save the following changes to your circle.pyx file:


In [ ]:
cdef float _great_circle(float lon1,float lat1,float lon2,float lat2):
    cdef float radius = 3956.0 
    cdef float pi = 3.14159265
    cdef float x = pi/180.0
    cdef float a,b,theta,c

    a = (90.0-lat1)*(x)
    b = (90.0-lat2)*(x)
    theta = (lon2-lon1)*(x)
    c = acosf((cosf(a)*cosf(b)) + (sinf(a)*sinf(b)*cosf(theta)))
    return radius*c

def great_circle(float lon1,float lat1,float lon2,float lat2,int num):
    cdef int i
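    cdef float x = 0.0  # initialized so the return value is defined when num is 0
    for i in range(num):  # compiled to a plain C loop because i is declared as a C int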
        x = _great_circle(lon1,lat1,lon2,lat2)
    return x

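Compile and run in the IPython console. Note that great_circle now takes num as a fifth argument and runs the calculation num times per call, so divide the measured time by num to compare it with the earlier runs. Did anything change? The only thing we can really do past this is to write the code strictly in C. That is a bit outside the scope of this class, let alone this lecture. Suffice it to say, the difference between our code and the C code is unnoticeably small.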
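
Because this last version loops internally, a fair comparison needs the time per inner iteration rather than the time per call. A minimal sketch of that bookkeeping follows. Both per_call_seconds and looped_stand_in are hypothetical names, and the stand-in function only takes the place of the compiled circle.great_circle so that the snippet runs without a build step.

```python
import time

def per_call_seconds(looped_func, args, num):
    """Time one call of a function that loops `num` times internally
    and return the average time per inner iteration."""
    start = time.perf_counter()
    looped_func(*args, num)
    elapsed = time.perf_counter() - start
    return elapsed / num

# Hypothetical stand-in for the compiled circle.great_circle(..., num)
def looped_stand_in(a, b, num):
    x = 0.0
    for _ in range(num):
        x = a * b
    return x

print(per_call_seconds(looped_stand_in, (2.0, 3.0), 500000))
```

With the extension built, per_call_seconds(circle.great_circle, (lon1, lat1, lon2, lat2), num) should give a figure that can be compared with the earlier %timeit results.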

