SciPy

The SciPy library is a collection of mathematical algorithms and convenience functions built on top of the NumPy library. It is done by the same people but the aim is slightly different:

  • Numpy needs to be available everywhere and easy to compile
  • Scipy wrapps many fortran libraries (ARPACK, FITPACK, LAPACK): a Fortran compiler and many external libraries are needed.

Because SciPy is a very large compilation of numerical routines, there are many sub-modules, which are all loaded separately (just avoid too long import time). To allow faster development, some developer made scikits which are focusing on a specific subject rather then staying generic.

The default behavour of SciPy is sometimes different from Numpy:


In [1]:
import numpy
numpy.sqrt(-1)


-c:2: RuntimeWarning: invalid value encountered in sqrt
Out[1]:
nan

In [2]:
import scipy
scipy.sqrt(-1)


Out[2]:
1j

In [3]:
%pylab inline


Populating the interactive namespace from numpy and matplotlib

Interpolate

While numpy's interpolater is limited to (multi-)linear interpolation, scipy extends to cubic or higher order interpolation, thanks to the FITPACK ForTran library.


In [5]:
#The exect function
f = lambda x:(x-1.)*(x-0.5)*(x+0.5)

#My sampling:
x_sparse = scipy.linspace(-1,1,num=5)
x_dense  = scipy.linspace(-1 , 1 , num=40)

from scipy.interpolate import interp1d
f0 = interp1d(x_sparse, f(x_sparse), kind='zero')
f1 = interp1d(x_sparse, f(x_sparse), kind='linear')
f2 = interp1d(x_sparse, f(x_sparse), kind='quadratic')
f3 = interp1d(x_sparse, f(x_sparse), kind='cubic')
f4 = interp1d(x_sparse, f(x_sparse), kind='nearest')
plot(x_sparse, f(x_sparse) , 'D', label='data')
plot(x_dense, f0(x_dense), ':', label='zero')
plot(x_dense, f1(x_dense), '-.', label='linear')
plot(x_dense, f2(x_dense), '-.', label='quadratic')
plot(x_dense, f3(x_dense), 's--', label='cubic')
plot(x_dense, f4(x_dense), '-' , label='nearest')
plot(x_dense, f(x_dense), linewidth = 2, label='exact')
legend(loc='lower right')
axis([x_dense.min(), x_dense.max(), f(x_dense).min(), f(x_dense).max()+0.1])


Out[5]:
[-1.0, 1.0, -1.5, 0.36350747652522797]

Integration

scipy.integrate provides integration and differential equation solvers. Those functions are wrappers from the QUADPACK ForTran library. It provides 3 kinds of functions:

  • Integrating functions, given function object, like quad(func, a, b)
  • Integrating functions, given fixed samples, like Cumulatively integrate cumtrapz(y)
  • Integrators of ODE systems, like odeint(func, y0, t)

In [6]:
from scipy import integrate
f = lambda x: 2 / (1 + x**2)
integrate.quad(f, 0, inf)


Out[6]:
(3.141592653589793, 5.155583041103855e-10)

In [7]:
f = lambda x: exp(-x**2)
x = linspace(-10, 10 , 1001)
integrate.quad(f,-10,10)[0] - integrate.trapz(f(x),x)


Out[7]:
-4.4408920985006262e-16

Optimization

The scipy.optimize package provides several commonly used optimization algorithms:

  • Unconstrained and constrained minimization of multivariate scalar functions (minimize) using a variety of algorithms (e.g. BFGS, Nelder-Mead simplex, Newton Conjugate Gradient, COBYLA or SLSQP)
  • Global (brute-force) optimization routines (e.g., anneal, basinhopping)
  • Least-squares minimization (leastsq) and curve fitting (curve_fit) algorithms
  • Scalar univariate functions minimizers (minimize_scalar) and root finders (newton)
  • Multivariate equation system solvers (root) using a variety of algorithms (e.g. hybrid Powell, Levenberg-Marquardt or large-scale methods such as Newton-Krylov).

Fourier transform

scipy.fftpack contains Python bindings to the FFTPACK ForTran library. Like NumPy 1d, 2d and nd transform at available in forward and reverse directions. In addition to NumPy, SciPy offers convolution in Fourier space, Discrete Sine and Cosine Transforms (used in JPEG encoding) and some caching of the prime factorization of length of the array.


In [8]:
from scipy.fftpack import fft
# Number of samplepoints
N = 600
# sample spacing
T = 1.0 / 800.0
x = linspace(0.0, N*T, N)
y = sin(50.0 * 2.0*pi*x) + 0.5*sin(80.0 * 2.0*pi*x)
yf = fft(y)
xf = linspace(0.0, 1.0/(2.0*T), N/2)
figure()
subplot(1,2,1)
plot(x,y,label="direct space")
subplot(1,2,2)
plot(xf, 2.0/N * abs(yf[0:N/2]),label="frequency space")
grid()
legend()


Out[8]:
<matplotlib.legend.Legend at 0x7fc3f3aa3510>

signal processing toolbox

The signal processing toolbox, scipy.signal, contains some filtering functions which can be classified in:

  • B-spline continuous representation of the discrete function
  • Convolution filters, including separable 2D filtering
  • Other kind of filter (median, Wiener, ...)
  • Spectral analysis

In [9]:
from scipy import misc, signal
image = misc.lena()
w = signal.gaussian(50, 5.0)
image_new = signal.sepfir2d(image, w, w)
figure()
subplot(1,2,1)
imshow(image, cmap="gray")
title("Original image")
subplot(1,2,2)
imshow(image_new, cmap="gray")
title("Blured image")


Out[9]:
<matplotlib.text.Text at 0x7fc3eef7bf50>

Linear Algebra

Scipy.linalg contains the linear algebra routines from LAPACK wrapped in Python. Unlike Numpy, it need this library to be installed on your computer. One of the most common optimized version is ATLAS but OpenBLAS is also popular and the MKL (from Intel) is the fastest. So linalg from Scipy can be 20x faster than the default implementation provided by NumPy.

sparse matrices

Sparse matrix representation and diagonalization are also handled in scipy using the ForTran ARPACK library.

Multidimensional image processing

The scipy.ndimage is the image manipulation toolkit for n-dimentional images.

Today, ndimage tends to be replaces by scikit-image which is easyer to maintain (thanks to Cython)

Spatial data structures

scipy.spatial can compute triangulations, Voronoi diagrams, and convex hulls of a set of points, by leveraging the Qhull library.

Moreover, it contains KDTree implementations for nearest-neighbor point queries, and utilities for distance computations in various metrics.

Statistical functions

The scipy.stats module contains a large number of probability distributions as well as a growing library of statistical functions.

Input / Output

Scipy.io can read and write files in Matlab/IDL/netcdf format.

Conclusion

Scipy is a huge compilation of numerical libraries. It is too large to fit into a single brain, this is why there is an excellent documentation.

More advanced toolkits are available, those are often research subjects of the respective authors.


In [ ]: