Optimized scientific modules of NumPy

NumPy provides most of the numerical methods on ndarrays like:

  • addition / subtraction
  • multiplication
  • division / modulus

In addition to this, there are also some optimized modules, all written in C, which provides it some number-crunching capabilities. This is why a C-compiler is needed for NumPy building.

Random number generator

NumPy has a random sub-module which handles the random number generation.

Uniform distribution are obtained with the random function


In [1]:
%pylab inline


Populating the interactive namespace from numpy and matplotlib

In [2]:
s = numpy.random.random(10000)
hist(s, 10, normed=True)


Out[2]:
(array([ 0.99223655,  1.03424656,  0.92722105,  0.99723774,  0.98423464,
         1.03624704,  1.02124346,  0.96923106,  1.00123869,  1.03924775]),
 array([  6.81801181e-05,   1.00044340e-01,   2.00020501e-01,
          2.99996661e-01,   3.99972821e-01,   4.99948981e-01,
          5.99925141e-01,   6.99901302e-01,   7.99877462e-01,
          8.99853622e-01,   9.99829782e-01]),
 <a list of 10 Patch objects>)

In [3]:
i = numpy.random.random_integers(0, 65000, size=128*128).reshape((128,128))
imshow(i)


Out[3]:
<matplotlib.image.AxesImage at 0x7f8524fd2810>

In [4]:
hist(i.flat, 10, normed=True)


Out[4]:
(array([  1.54860556e-05,   1.45093729e-05,   1.53827526e-05,
          1.50164966e-05,   1.55236203e-05,   1.54015350e-05,
          1.53921438e-05,   1.60307440e-05,   1.60683088e-05,
          1.50540613e-05]),
 array([  1.00000000e+00,   6.50020000e+03,   1.29994000e+04,
          1.94986000e+04,   2.59978000e+04,   3.24970000e+04,
          3.89962000e+04,   4.54954000e+04,   5.19946000e+04,
          5.84938000e+04,   6.49930000e+04]),
 <a list of 10 Patch objects>)

Many univariate or multivariate distribution are available ... The best is just to refer to the numpy.random documentation.

Here is an example of Poisson distribution


In [5]:
p = numpy.random.poisson(10, 10000)
hist(p, 10, normed=True)


Out[5]:
(array([ 0.00113043,  0.01134783,  0.044     ,  0.14491304,  0.10204348,
         0.07143478,  0.04843478,  0.00852174,  0.00230435,  0.00065217]),
 array([  0. ,   2.3,   4.6,   6.9,   9.2,  11.5,  13.8,  16.1,  18.4,
         20.7,  23. ]),
 <a list of 10 Patch objects>)

Linear Algebra

Linear algebra function are available in the numpy.linalg submodule which is basically a Python wrapping of the BLAS and LAPACK Fortran packages which have been translated to C using f2c. Once again no Fortran compiler is needed but Fortran memory alignment may be needed.

The version shipped by default with NumPy is based on the BLAS and LAPACK from netlib and rely on the cleverness of your compiler to get good performances. As the API of BLAS and LAPACK is fixed, it is possible to link against highly optimized libraries like the MKL from Intel ...

Linalg contains tools to calculate:

  • Determinant and norm of a matrix
  • Inverse a matrix
  • Eigenvalues and vectors of a matrix

In [6]:
import numpy
numpy.linalg?

Fast Fourier Transform

Fourier analysis is fundamentally a method for expressing a function as a sum of periodic components, and for recovering the signal from those components. When both the function and its Fourier transform are replaced with discretized counterparts, it is called the discrete Fourier transform (DFT). The DFT has become a mainstay of numerical computing in part because of a very fast algorithm for computing it, called the Fast Fourier Transform (FFT), which was known to Gauss (1805) and was brought to light in its current form by Cooley and Tukey (1965).

NumPy's FFT is based on the FFTPACK ForTran code, translated to C using f2c and wrapped in Pythonic way. It provides 1d, 2d and nd, direct and inverse functions.


In [7]:
numpy.fft?

Load an image and calculate its power spectrum:


In [8]:
import scipy.misc
img = scipy.misc.lena()
imshow(img, cmap="gray")


Out[8]:
<matplotlib.image.AxesImage at 0x7f8524115110>

In [9]:
f2 = numpy.fft.fft2(img)
psd = numpy.fft.fftshift(10 * numpy.log10(abs(f2) ** 2))
imshow(psd)


Out[9]:
<matplotlib.image.AxesImage at 0x7f85240c4e90>

Note that the FFTw library is not (no more) used is NumPy for licensing issues. While faster and multi-threaded, FFTw is also harder to use as it requires the generation of a plan ...

My personnal experience with FFT in high perfomance computer is that it is worth going to GPU using cuFFT when performances matters.

What can we do if we need to access to other optimized, compiled libraries? f2py


In [ ]: