Scientific Applications

Following are important components of a typical scientific application:

  • Numerical computations, matrix operations
  • Data visualization through 2D and 3D graphs
  • GUI with event-driven architecture
  • Reports and output in print-friendly format
  • Access to databases and datastores
  • Access to Internet services

While all of the above are important components, the speed of numerical computations determine the success of a scientific application. While Python is quite strong in the rest of the requirements, it is severely deficient in speed. There are many reasons for this:

  • Python is interpreted
  • Python is a weakly typed language. Hence, it is acceptable to dynamically change the data type of an object at run-time. This makes determining Run Time Type Information (RTTI) necessary to perform operations on the data, thereby slowing the performance

Improving Performance of Python Programs

It is possible to improve performance of a program by choosing an efficient algorithm. But improvement in performance that can be achieved by writing code in a faster programming languae than Python, such as, C/C++ can enhance performance even of the best algorithm. Therefore the best approach to improving performance is by writing code in C, in such a way that it can be called in Python. The typical workflow for this is as follows:

  • Write code in C/C++
  • Include required Python header files and write required wrapper code so that it can be called in Python
  • Compile the C/C++ code to a importable shared library using a C/C++ compiler such as gcc
  • Import the module in Python and test the code

Cython

The Cython project takes a slightly different approach but essentially does the same. Cython is a programming language based on Python but with additional syntax allowing for optional static type declarations, essentially making it a strongly typed language. The typical workflow is as follows:

  • Code is written in Cython, which is a superset of Python, with additional static type declarations which the programmer thinks will speed up performance. Files have the extension .pyx
  • Cython program is compiled to a C program by the Cython compiler. Files have the same first name as the .pyx file from which they are derived and an extension .c
  • The resulting C program is compiled to a shared importable Python library using gcc C compiler. The resulting file has the same first name as the original .pyx file with an extension .so (GNU/Linux), .pyd (Windows).
  • The resulting Python is imported into Python and tested

Except for the first step, the workflow is the same as before. But there is a definite advantage in writing code in Python itself. As a result, any Python program is also a Cython program and is a potential candidate for performance enhancement. As anadded advantage, Cython makes it easy to integrate Python with external C libraries.

Cython Requirements

To use Cython, the following are required:

  • Cython module
  • Python development package containing header files required to compile C/C++ code to be compatible with Python
  • GNU C compiler gcc or GNU C++ compiler g++
  • Depending on how you intend to use Cython, one of the following may be required:
    • Python distutils module - The steps of compiling Cython code to C and then compiling it to importable Python module can be automated by writing a distutils setup.py file
    • Python pyximport module - Automation can be further be simplified using pyximport module

Building a Cython Module using distutils

Following are the typical steps to use distutils:

  • Code is written in Cython, which is a superset of Python. The programmer may choose to include static type declarations. Let us assume that a Cython file hello.pyx contains a simple say_hello() function
  • A distutils setup.py file is written which contains the name to be used for the importable Python module and the Cython file where code resides
  • The Python importable module is built

Here is the Cython file hello.pyx containing the famous funtion that greets the World:

def say_hello():
    print "Hello world!"

The corresponding setup.py file that uses distutils is given below:

from distutils.core import setup
from Cython.Build import cythonize

setup(
  ext_modules = cythonize("helloworld.pyx")
)

The setup.py file imports the required setup function and Extension class from the submodules distutils.core and distutils.extension, respectively. It imports build_ext class from Cython.Distutils submodule.

The name of the Cython source file hello.pyx and the name to be assigned to the Python importable module, namely, hello are also specified.

The hello module is built using the following command:

$ python setup.py build_ext --inplace

This produces the hello.so file which can be imported into Python and used with as follows:

import hello
hello.say_hello()

This produces the output:

Hello, World!

Building a Cython Module from the Command Line

Run the Cython compiler to convert the .pyx file to generate a .c file

$ cython -a hello.pyx

The -a switch, which is optional, generates the hello.html HTML file which can be opened in a browser. This contains the code with the lines highlighted in different shades of yellow. Lines in bright yellow highlighting require optimisation while lines with white background translate into direct C code. This generates the C file hello.c.

The generated C code hello.c must be compiled to an importable Python module hello.so (hello.pyd in Windows) using the C compiler:

gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing \
  -I/usr/include/python2.7 -o hello.so hello.c

It is important to tell where the Python.h header file is located using the -I switch. The generated shared object file hello.so can then be imported into a Python program using the import hello statement and used just as you would, any other module.

import hello

hello.say_hello('World')

Running this Python program generates the following output on the console:

Hello, World!

Building a Cython Module using pyximport

Using pyximport module makes it unnecessary to write a setup.py file. The typical import statements while using pyximport are as follows:

import pyximport
pyximport.install()

import hello
print hello.say_hello('World')

The import hello statement actually carries out the task of compiling the .pyx file to a .c file. This C file is then compiled to a .so (or a .pyd) file by the import hello statement. In fact, everythin happens transparently and you will not see the .c and .so files at all. If the first two lines where you import pyximport and the next line where we call the install() function, you would not realise that several things are happening behind the scenes and you would think you were running an ordinary Python script.


In [1]:
import pyximport
pyximport.install()

import hello
hello.say_hello()


Hello world!

Cython Example

Let us demonstrate how Cython can improve performance of Python code by starting with a simple pure Python function and bring in speed improvements in phases.

Pure Python Code

Let us first write a pure Python script file quad.py, containing two functions. The first function def f(a, b, N):

The IPython shell provides a magic function %timeit to measure the time of execution of selected functions. It executes the specified function a certain number of times and reports the best time. This feature of IPython is used as shown on the last line above.


In [2]:
# file: quad.py - Pure Python code
from math import sin

def f(x):
  return sin(x**2)

def integrate(a, b, N):
  s = 0
  dx = float(b - a) / N
  for i in range(N+1):
    s += f(a + i*dx)
  return s * dx

%timeit integrate(0, 1, 100000)


10 loops, best of 3: 50.3 ms per loop

Cythonize the Python Script (No Changes to Python Code)

As afirst attempt, let us simply Cythonize this script without attempting any optimizations. Let us copy this Python file quad.py as it is into quad0.pyx file and using pyximport, import and time the function and compare the execution speeds:


In [3]:
import pyximport
pyximport.install()

import quad0
%timeit quad0.integrate(0, 1, 100000)


10 loops, best of 3: 24.3 ms per loop

The execution time reduced from 50.6 ms to 26.8 ms, a reduction of 47% compared to the pure Python code.

Static Type Declaration of Function Parameters and Function Local Variables

Performance can be further improved if the data types of the variables can be specified. Here is the quad0.pyx modified and named as quad1.pyx

# file: quad1.pyx - With type declaration
from math import sin

def f(double x):
  return sin(x*x)

def integrate(double a, double b, int N):
  cdef int i
  cdef double s, dx
  s = 0.0
  dx = float(b - a) / N
  for i in range(N):
    s += f(a + i*dx)
  return s * dx

The only changes to the pure Python code that we have made are the static data type declarations for the function parameters and local variables inside the function integrate(). To compile and execute this script, we will again use pyximport


In [4]:
import pyximport
pyximport.install()

import quad1
%timeit quad1.integrate(0, 1, 100000)


100 loops, best of 3: 9.51 ms per loop

The execution time is now reduced to 10.2 ms, a reduction of 79% compared to the pure Python script.

Identify What to Speed Up

Cython can help us in identifying which part of the script needs improvement. To do this, run the cython compiler with the -a switch on the command line as follows:

cython -a quad1.pyx

This generates an HTML file quad1.html which you can open in a browser. Lines marked with yellow background are potential candidates where speed improvements can be achieved. Thus, the return type of the function f() can help speed up the program.


In [5]:
from IPython.display import HTML
HTML(filename='quad1.html')


Out[5]:

Generated by Cython 0.21.1

Raw output: quad1.c

+01: from math import sin
  __pyx_t_1 = PyList_New(1); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_1);
  __Pyx_INCREF(__pyx_n_s_sin);
  PyList_SET_ITEM(__pyx_t_1, 0, __pyx_n_s_sin);
  __Pyx_GIVEREF(__pyx_n_s_sin);
  __pyx_t_2 = __Pyx_Import(__pyx_n_s_math, __pyx_t_1, -1); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_2);
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __pyx_t_1 = __Pyx_ImportFrom(__pyx_t_2, __pyx_n_s_sin); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_1);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_sin, __pyx_t_1) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 1; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
 02: 
+03: def f(double x):
/* Python wrapper */
static PyObject *__pyx_pw_5quad1_1f(PyObject *__pyx_self, PyObject *__pyx_arg_x); /*proto*/
static PyMethodDef __pyx_mdef_5quad1_1f = {"f", (PyCFunction)__pyx_pw_5quad1_1f, METH_O, 0};
static PyObject *__pyx_pw_5quad1_1f(PyObject *__pyx_self, PyObject *__pyx_arg_x) {
  double __pyx_v_x;
  PyObject *__pyx_r = 0;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("f (wrapper)", 0);
  assert(__pyx_arg_x); {
    __pyx_v_x = __pyx_PyFloat_AsDouble(__pyx_arg_x); if (unlikely((__pyx_v_x == (double)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 3; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
  }
  goto __pyx_L4_argument_unpacking_done;
  __pyx_L3_error:;
  __Pyx_AddTraceback("quad1.f", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __Pyx_RefNannyFinishContext();
  return NULL;
  __pyx_L4_argument_unpacking_done:;
  __pyx_r = __pyx_pf_5quad1_f(__pyx_self, ((double)__pyx_v_x));
  int __pyx_lineno = 0;
  const char *__pyx_filename = NULL;
  int __pyx_clineno = 0;

  /* function exit code */
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

static PyObject *__pyx_pf_5quad1_f(CYTHON_UNUSED PyObject *__pyx_self, double __pyx_v_x) {
  PyObject *__pyx_r = NULL;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("f", 0);
/* … */
  /* function exit code */
  __pyx_L1_error:;
  __Pyx_XDECREF(__pyx_t_1);
  __Pyx_XDECREF(__pyx_t_2);
  __Pyx_XDECREF(__pyx_t_3);
  __Pyx_XDECREF(__pyx_t_4);
  __Pyx_XDECREF(__pyx_t_5);
  __Pyx_AddTraceback("quad1.f", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __pyx_r = NULL;
  __pyx_L0:;
  __Pyx_XGIVEREF(__pyx_r);
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
/* … */
  __pyx_tuple_ = PyTuple_Pack(2, __pyx_n_s_x, __pyx_n_s_x); if (unlikely(!__pyx_tuple_)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 3; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_tuple_);
  __Pyx_GIVEREF(__pyx_tuple_);
/* … */
  __pyx_t_2 = PyCFunction_NewEx(&__pyx_mdef_5quad1_1f, NULL, __pyx_n_s_quad1); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 3; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_2);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_f, __pyx_t_2) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 3; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
  __pyx_codeobj__2 = (PyObject*)__Pyx_PyCode_New(1, 0, 2, 0, 0, __pyx_empty_bytes, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_tuple_, __pyx_empty_tuple, __pyx_empty_tuple, __pyx_kp_s_home_satish_Documents_pybelgaum, __pyx_n_s_f, 3, __pyx_empty_bytes); if (unlikely(!__pyx_codeobj__2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 3; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
+04:   return sin(x*x)
  __Pyx_XDECREF(__pyx_r);
  __pyx_t_2 = __Pyx_GetModuleGlobalName(__pyx_n_s_sin); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_2);
  __pyx_t_3 = PyFloat_FromDouble((__pyx_v_x * __pyx_v_x)); if (unlikely(!__pyx_t_3)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_3);
  __pyx_t_4 = NULL;
  if (CYTHON_COMPILING_IN_CPYTHON && unlikely(PyMethod_Check(__pyx_t_2))) {
    __pyx_t_4 = PyMethod_GET_SELF(__pyx_t_2);
    if (likely(__pyx_t_4)) {
      PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_2);
      __Pyx_INCREF(__pyx_t_4);
      __Pyx_INCREF(function);
      __Pyx_DECREF_SET(__pyx_t_2, function);
    }
  }
  if (!__pyx_t_4) {
    __pyx_t_1 = __Pyx_PyObject_CallOneArg(__pyx_t_2, __pyx_t_3); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
    __Pyx_GOTREF(__pyx_t_1);
  } else {
    __pyx_t_5 = PyTuple_New(1+1); if (unlikely(!__pyx_t_5)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_5);
    PyTuple_SET_ITEM(__pyx_t_5, 0, __pyx_t_4); __Pyx_GIVEREF(__pyx_t_4); __pyx_t_4 = NULL;
    PyTuple_SET_ITEM(__pyx_t_5, 0+1, __pyx_t_3);
    __Pyx_GIVEREF(__pyx_t_3);
    __pyx_t_3 = 0;
    __pyx_t_1 = __Pyx_PyObject_Call(__pyx_t_2, __pyx_t_5, NULL); if (unlikely(!__pyx_t_1)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 4; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_1);
    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;
  }
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
  __pyx_r = __pyx_t_1;
  __pyx_t_1 = 0;
  goto __pyx_L0;
 05: 
+06: def integrate(double a, double b, int N):
/* Python wrapper */
static PyObject *__pyx_pw_5quad1_3integrate(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds); /*proto*/
static PyMethodDef __pyx_mdef_5quad1_3integrate = {"integrate", (PyCFunction)__pyx_pw_5quad1_3integrate, METH_VARARGS|METH_KEYWORDS, 0};
static PyObject *__pyx_pw_5quad1_3integrate(PyObject *__pyx_self, PyObject *__pyx_args, PyObject *__pyx_kwds) {
  double __pyx_v_a;
  double __pyx_v_b;
  int __pyx_v_N;
  PyObject *__pyx_r = 0;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("integrate (wrapper)", 0);
  {
    static PyObject **__pyx_pyargnames[] = {&__pyx_n_s_a,&__pyx_n_s_b,&__pyx_n_s_N,0};
    PyObject* values[3] = {0,0,0};
    if (unlikely(__pyx_kwds)) {
      Py_ssize_t kw_args;
      const Py_ssize_t pos_args = PyTuple_GET_SIZE(__pyx_args);
      switch (pos_args) {
        case  3: values[2] = PyTuple_GET_ITEM(__pyx_args, 2);
        case  2: values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
        case  1: values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
        case  0: break;
        default: goto __pyx_L5_argtuple_error;
      }
      kw_args = PyDict_Size(__pyx_kwds);
      switch (pos_args) {
        case  0:
        if (likely((values[0] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_a)) != 0)) kw_args--;
        else goto __pyx_L5_argtuple_error;
        case  1:
        if (likely((values[1] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_b)) != 0)) kw_args--;
        else {
          __Pyx_RaiseArgtupleInvalid("integrate", 1, 3, 3, 1); {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
        }
        case  2:
        if (likely((values[2] = PyDict_GetItem(__pyx_kwds, __pyx_n_s_N)) != 0)) kw_args--;
        else {
          __Pyx_RaiseArgtupleInvalid("integrate", 1, 3, 3, 2); {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
        }
      }
      if (unlikely(kw_args > 0)) {
        if (unlikely(__Pyx_ParseOptionalKeywords(__pyx_kwds, __pyx_pyargnames, 0, values, pos_args, "integrate") < 0)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
      }
    } else if (PyTuple_GET_SIZE(__pyx_args) != 3) {
      goto __pyx_L5_argtuple_error;
    } else {
      values[0] = PyTuple_GET_ITEM(__pyx_args, 0);
      values[1] = PyTuple_GET_ITEM(__pyx_args, 1);
      values[2] = PyTuple_GET_ITEM(__pyx_args, 2);
    }
    __pyx_v_a = __pyx_PyFloat_AsDouble(values[0]); if (unlikely((__pyx_v_a == (double)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
    __pyx_v_b = __pyx_PyFloat_AsDouble(values[1]); if (unlikely((__pyx_v_b == (double)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
    __pyx_v_N = __Pyx_PyInt_As_int(values[2]); if (unlikely((__pyx_v_N == (int)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
  }
  goto __pyx_L4_argument_unpacking_done;
  __pyx_L5_argtuple_error:;
  __Pyx_RaiseArgtupleInvalid("integrate", 1, 3, 3, PyTuple_GET_SIZE(__pyx_args)); {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L3_error;}
  __pyx_L3_error:;
  __Pyx_AddTraceback("quad1.integrate", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __Pyx_RefNannyFinishContext();
  return NULL;
  __pyx_L4_argument_unpacking_done:;
  __pyx_r = __pyx_pf_5quad1_2integrate(__pyx_self, __pyx_v_a, __pyx_v_b, __pyx_v_N);
  int __pyx_lineno = 0;
  const char *__pyx_filename = NULL;
  int __pyx_clineno = 0;

  /* function exit code */
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}

static PyObject *__pyx_pf_5quad1_2integrate(CYTHON_UNUSED PyObject *__pyx_self, double __pyx_v_a, double __pyx_v_b, int __pyx_v_N) {
  int __pyx_v_i;
  double __pyx_v_s;
  double __pyx_v_dx;
  PyObject *__pyx_r = NULL;
  __Pyx_RefNannyDeclarations
  __Pyx_RefNannySetupContext("integrate", 0);
/* … */
  /* function exit code */
  __pyx_L1_error:;
  __Pyx_XDECREF(__pyx_t_3);
  __Pyx_XDECREF(__pyx_t_4);
  __Pyx_XDECREF(__pyx_t_5);
  __Pyx_XDECREF(__pyx_t_6);
  __Pyx_XDECREF(__pyx_t_7);
  __Pyx_XDECREF(__pyx_t_8);
  __Pyx_AddTraceback("quad1.integrate", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __pyx_r = NULL;
  __pyx_L0:;
  __Pyx_XGIVEREF(__pyx_r);
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
/* … */
  __pyx_tuple__3 = PyTuple_Pack(6, __pyx_n_s_a, __pyx_n_s_b, __pyx_n_s_N, __pyx_n_s_i, __pyx_n_s_s, __pyx_n_s_dx); if (unlikely(!__pyx_tuple__3)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_tuple__3);
  __Pyx_GIVEREF(__pyx_tuple__3);
/* … */
  __pyx_t_2 = PyCFunction_NewEx(&__pyx_mdef_5quad1_3integrate, NULL, __pyx_n_s_quad1); if (unlikely(!__pyx_t_2)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_2);
  if (PyDict_SetItem(__pyx_d, __pyx_n_s_integrate, __pyx_t_2) < 0) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 6; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_DECREF(__pyx_t_2); __pyx_t_2 = 0;
 07:   cdef int i
 08:   cdef double s, dx
+09:   s = 0.0
  __pyx_v_s = 0.0;
+10:   dx = float(b - a) / N
  if (unlikely(__pyx_v_N == 0)) {
    #ifdef WITH_THREAD
    PyGILState_STATE __pyx_gilstate_save = PyGILState_Ensure();
    #endif
    PyErr_SetString(PyExc_ZeroDivisionError, "float division");
    #ifdef WITH_THREAD
    PyGILState_Release(__pyx_gilstate_save);
    #endif
    {__pyx_filename = __pyx_f[0]; __pyx_lineno = 10; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  }
  __pyx_v_dx = ((__pyx_v_b - __pyx_v_a) / __pyx_v_N);
+11:   for i in range(N):
  __pyx_t_1 = __pyx_v_N;
  for (__pyx_t_2 = 0; __pyx_t_2 < __pyx_t_1; __pyx_t_2+=1) {
    __pyx_v_i = __pyx_t_2;
+12:     s += f(a + i*dx)
    __pyx_t_3 = PyFloat_FromDouble(__pyx_v_s); if (unlikely(!__pyx_t_3)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_3);
    __pyx_t_5 = __Pyx_GetModuleGlobalName(__pyx_n_s_f); if (unlikely(!__pyx_t_5)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_5);
    __pyx_t_6 = PyFloat_FromDouble((__pyx_v_a + (__pyx_v_i * __pyx_v_dx))); if (unlikely(!__pyx_t_6)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_6);
    __pyx_t_7 = NULL;
    if (CYTHON_COMPILING_IN_CPYTHON && unlikely(PyMethod_Check(__pyx_t_5))) {
      __pyx_t_7 = PyMethod_GET_SELF(__pyx_t_5);
      if (likely(__pyx_t_7)) {
        PyObject* function = PyMethod_GET_FUNCTION(__pyx_t_5);
        __Pyx_INCREF(__pyx_t_7);
        __Pyx_INCREF(function);
        __Pyx_DECREF_SET(__pyx_t_5, function);
      }
    }
    if (!__pyx_t_7) {
      __pyx_t_4 = __Pyx_PyObject_CallOneArg(__pyx_t_5, __pyx_t_6); if (unlikely(!__pyx_t_4)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
      __Pyx_DECREF(__pyx_t_6); __pyx_t_6 = 0;
      __Pyx_GOTREF(__pyx_t_4);
    } else {
      __pyx_t_8 = PyTuple_New(1+1); if (unlikely(!__pyx_t_8)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
      __Pyx_GOTREF(__pyx_t_8);
      PyTuple_SET_ITEM(__pyx_t_8, 0, __pyx_t_7); __Pyx_GIVEREF(__pyx_t_7); __pyx_t_7 = NULL;
      PyTuple_SET_ITEM(__pyx_t_8, 0+1, __pyx_t_6);
      __Pyx_GIVEREF(__pyx_t_6);
      __pyx_t_6 = 0;
      __pyx_t_4 = __Pyx_PyObject_Call(__pyx_t_5, __pyx_t_8, NULL); if (unlikely(!__pyx_t_4)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
      __Pyx_GOTREF(__pyx_t_4);
      __Pyx_DECREF(__pyx_t_8); __pyx_t_8 = 0;
    }
    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;
    __pyx_t_5 = PyNumber_InPlaceAdd(__pyx_t_3, __pyx_t_4); if (unlikely(!__pyx_t_5)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_GOTREF(__pyx_t_5);
    __Pyx_DECREF(__pyx_t_3); __pyx_t_3 = 0;
    __Pyx_DECREF(__pyx_t_4); __pyx_t_4 = 0;
    __pyx_t_9 = __pyx_PyFloat_AsDouble(__pyx_t_5); if (unlikely((__pyx_t_9 == (double)-1) && PyErr_Occurred())) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 12; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
    __Pyx_DECREF(__pyx_t_5); __pyx_t_5 = 0;
    __pyx_v_s = __pyx_t_9;
  }
+13:   return s * dx
  __Pyx_XDECREF(__pyx_r);
  __pyx_t_5 = PyFloat_FromDouble((__pyx_v_s * __pyx_v_dx)); if (unlikely(!__pyx_t_5)) {__pyx_filename = __pyx_f[0]; __pyx_lineno = 13; __pyx_clineno = __LINE__; goto __pyx_L1_error;}
  __Pyx_GOTREF(__pyx_t_5);
  __pyx_r = __pyx_t_5;
  __pyx_t_5 = 0;
  goto __pyx_L0;
 14: 

Static Type Declaration of Function Return Value

The modified script is now named quad2.pyx and is shown below:

# file: quad2.pyx - Type declaration and function return value

from math import sin

cpdef double f(double x):
  return sin(x*x)

def integrate(double a, double b, int N):
  cdef int i
  cdef double s, dx
  s = 0.0
  dx = float(b - a) / N
  for i in range(N):
    s += f(a + i*dx)
  return s * dx

Let us now execute this script and see what improvement it can make to the code:


In [6]:
import pyximport
pyximport.install()

import quad2
%timeit quad1.integrate(0, 1, 100000)


100 loops, best of 3: 9.56 ms per loop

The execution time is now down to 9.86 ms, a reduction of 80.5% compared to the pure Python script.

Import Functions from C Standard Library

Another possibility is to call the sin function from the C math library directly. Here is the modified quad3.pyx file with the changes made to the import statement on the first line:

# file: quad3.pyx - Import function from C standard library

from libc.math cimport sin

cpdef double f(double x):
  return sin(x*x)

def integrate(double a, double b, int N):
  cdef int i
  cdef double s, dx
  s = 0.0
  dx = float(b - a) / N
  for i in range(N):
    s += f(a + i*dx)
  return s * dx

Let us test this modified script and see what happens:


In [7]:
import pyximport
pyximport.install()

import quad3
%timeit quad1.integrate(0, 1, 100000)


100 loops, best of 3: 9.51 ms per loop

The execution time does not show any significant improvement. It can be seen that speed can be increased dramatically by using Cython by statically declaring the types of function parameters, function local variables, function return values and calling functions directly from the C library.

Cython Magic Functions in IPython

IPython provices a cythonmagic extension that contains a number of magic functions to simplify working with Cython. This extension is loaded with the %load_ext magic as follows:

%load_ext cythonmagic

Once this extension is loaded, several Cython related magic become available.


In [8]:
%load_ext cythonmagic

In [9]:
%%cython_pyximport foo
def f(x):
    return 4.0 * x

In [10]:
print f(10)


40.0

Points to Remember Before Attempting to Improve Performance of Code

  • Test and debug the code completely. Spending time on improving performance and then realising that the code has bugs can be frustrating. One way to ensure this is to automate testing and running all test cases to ensure that all aspects of the code have been tested
  • Study the algorithm and see if it can be improved in any way
  • Profile the code and understand which functions are called more often and how much time your program spends in each of the functions
  • Use the cython -a switch to identify potential candidates for code optimization

A Python program can be profiled with the following command from the command line:

python -m cProfile -s time pythonfile.py

For this to work, cProfile Python module must be installed.

References

Further Refrences


In [ ]: