Glue compiled libraries to Python

How to speed up loops?

Ctypes

Ctypes, available from Python's standard library, provides C-compatible data types and allows calling functions in shared libraries. It can be used to wrap these libraries in pure Python, but requires (some) additional work, since argument and return types have to be declared by hand.
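
As a minimal sketch of that extra work, the snippet below loads the C math library and calls its `cos` function. It assumes a POSIX system (Linux or macOS); the library name "m" and the fallback behaviour are assumptions of this example and may need adjusting on other platforms.

```python
import ctypes
import ctypes.util

# Locate the C math library; find_library may return None if it cannot be found.
libm_path = ctypes.util.find_library("m")
# On POSIX, CDLL(None) exposes the symbols already loaded in the current process.
libm = ctypes.CDLL(libm_path)

# ctypes assumes int arguments and an int return value unless told otherwise,
# so the C prototype `double cos(double)` has to be declared explicitly.
libm.cos.argtypes = [ctypes.c_double]
libm.cos.restype = ctypes.c_double

cos_zero = libm.cos(0.0)
print(cos_zero)
```

Without the `argtypes` and `restype` declarations, the call would pass and return the wrong C types, which is typically where ctypes wrappers go wrong.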

NumPy arrays have a ctypes attribute which contains the address of the underlying buffer. It is the user's responsibility to ensure that the data are contiguous and properly aligned!


In [1]:
import numpy
a = numpy.arange(10)
print(a)
print(a.strides)
print("Address of the buffer: %s" % a.ctypes.data)
print(a.flags)
b = a[::2]  # strided view on the same buffer: every second element, no copy
print(b)
print(b.strides)
print("Address of the buffer: %s" % b.ctypes.data)
print(b.flags)


[0 1 2 3 4 5 6 7 8 9]
(8,)
Address of the buffer: 40848944
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
[0 2 4 6 8]
(16,)
Address of the buffer: 40848944
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
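
The output above shows that the view `b` reports the same buffer address as `a` while no longer being contiguous. The following sketch illustrates why this matters, assuming 64-bit integers (the dtype is set explicitly here): reading through the raw pointer ignores the strides, so a contiguous copy is needed before handing the data to C code.

```python
import ctypes
import numpy

a = numpy.arange(10, dtype=numpy.int64)
b = a[::2]  # strided view: shares the buffer of `a`

# A C function receiving this pointer knows nothing about NumPy strides:
# it simply walks through consecutive memory locations.
ptr_view = b.ctypes.data_as(ctypes.POINTER(ctypes.c_int64))
raw_view = [ptr_view[i] for i in range(b.size)]

# Make a packed copy first, so that memory order matches the logical order.
c = numpy.ascontiguousarray(b)
ptr_copy = c.ctypes.data_as(ctypes.POINTER(ctypes.c_int64))
raw_copy = [ptr_copy[i] for i in range(c.size)]

print(raw_view)  # first five elements of `a`, not the elements of `b`
print(raw_copy)  # the elements of `b`
```

Calling `numpy.ascontiguousarray` on an array that is already contiguous returns it without copying, so it is cheap to apply systematically before passing a buffer to compiled code.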

F2Py

Fortran to Python interface generator, provided by NumPy, which allows:

  • Calling Fortran 77/90/95, Fortran 90/95 module, and C functions from Python
  • Accessing Fortran 77 COMMON blocks and Fortran 90/95 module data from Python
  • Calling Python functions back from Fortran or C
  • Automatically handling the difference in the data storage order of multi-dimensional Fortran and Numerical Python (i.e. C) arrays.

Of course, F2Py needs a Fortran compiler to build the Fortran code, which can be an issue on some platforms.
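
The storage-order difference mentioned in the list above can be inspected from NumPy alone, without a Fortran compiler. The sketch below is only an illustration (the array names are arbitrary): it builds the same 2x3 array in C order and in Fortran order and compares their memory layout.

```python
import numpy

# Same logical content, two different memory layouts.
c_arr = numpy.arange(6, dtype=numpy.int32).reshape(2, 3)  # row-major (C)
f_arr = numpy.asfortranarray(c_arr)                        # column-major (Fortran)

print(c_arr.strides, c_arr.flags["C_CONTIGUOUS"])
print(f_arr.strides, f_arr.flags["F_CONTIGUOUS"])

# Both arrays compare equal element by element...
same_values = numpy.array_equal(c_arr, f_arr)

# ...but the elements are stored in a different order in memory.
c_memory_order = c_arr.ravel(order="C").tolist()
f_memory_order = f_arr.ravel(order="F").tolist()
print(same_values, c_memory_order, f_memory_order)
```

Passing an array that is already Fortran-contiguous to a wrapped Fortran routine is one way to avoid a conversion copy at the interface.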


In [2]:
src = """
C FILE: FIB3.F
      SUBROUTINE FIB(A,N)
C
C     CALCULATE FIRST N FIBONACCI NUMBERS
C
      INTEGER N
      INTEGER A(N)
Cf2py intent(in) n
Cf2py intent(out) a
Cf2py depend(n) a
      DO I=1,N
         IF (I.EQ.1) THEN
            A(I) = 0
         ELSEIF (I.EQ.2) THEN
            A(I) = 1
         ELSE 
            A(I) = A(I-1) + A(I-2)
         ENDIF
      ENDDO
      END
C END FILE FIB3.F"""

Note: this example uses a Fortran subroutine which, unlike a Fortran function, returns no value, so every output has to be passed back through an argument of the call.

The three comment lines starting with Cf2py declare which variables are inputs and which are outputs.


In [3]:
from numpy import f2py

In [4]:
f2py.compile(src, "fibo")


Out[4]:
0

In [5]:
import fibo
#reload(fibo)

In [6]:
fibo.fib(19)


Out[6]:
array([   0,    1,    1,    2,    3,    5,    8,   13,   21,   34,   55,
         89,  144,  233,  377,  610,  987, 1597, 2584], dtype=int32)
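
Because `a` is declared `intent(out)` and `n` is declared `intent(in)`, the generated wrapper takes only `n` and returns the array. A pure-Python sketch of the same subroutine (the name `fib_reference` is hypothetical) makes this calling convention explicit and can be used to cross-check the compiled result:

```python
import numpy


def fib_reference(n):
    """Return the first n Fibonacci numbers, mirroring the FIB subroutine."""
    a = numpy.zeros(n, dtype=numpy.int32)  # plays the role of the intent(out) array
    for i in range(n):                     # Fortran loops over 1..N, Python over 0..n-1
        if i == 0:
            a[i] = 0
        elif i == 1:
            a[i] = 1
        else:
            a[i] = a[i - 1] + a[i - 2]
    return a


reference = fib_reference(19)
print(reference)
```

If the `fibo` module was built successfully, `numpy.array_equal(fibo.fib(19), fib_reference(19))` should hold.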

Other binary module interfaces

Weave

Weave is part of SciPy. It is a runtime compiler for C/C++ code that makes loops run fast, but it is now deprecated and remains only for compatibility reasons. While it works well under UNIX, weave is hard to set up on Windows computers (because no compiler is installed by default).

Cython

As Python is written in C, any line of Python can be translated into equivalent C code using metaprogramming. The Pyrex project aimed at inferring types to make the generated C code easier for the compiler to optimize. Cython is the continuation of the Pyrex project, with support for NumPy nd-arrays.

Cython is the weave killer: many projects have replaced their hand-written bindings or C code with Cython code. It is used by lxml and most of the scikits.

Boost-Python

C++ bindings for Python from the famous C++ Boost library. Very large but also very efficient. It is used by many projects, such as PyOpenCL and PyCUDA.

SWIG

General-purpose binding generator for many interpreted programming languages.

SIP

The Python-C++ binding from PyQt.

Shiboken

The Python-C++ binding from PySide.

CFFI

The Python-C interface developed by the PyPy project.

Number crunching is exciting for geeks, but most data analysts spend their time looking at data and thinking about how to represent them. Is there a "good" visualization toolkit in Python? Matplotlib.

