Scipy

Whereas numpy contains the basic building blocks for numerical computing, scipy contains routines built upon numpy.

You should be trying to avoid programming algorithms as much as possible. The first step to check if an algorithm you want to use already exists is to check scipy.

scipy consists of several modules. It is not usually imported all at once like numpy; rather, import the specific module or function that you need.

  • scipy.special
  • scipy.integrate
  • scipy.optimize
  • scipy.interpolate
  • scipy.fftpack
  • scipy.signal
  • scipy.linalg
  • scipy.sparse
  • scipy.spatial
  • scipy.stats
  • scipy.ndimage
  • scipy.io
  • scipy.weave

scipy.stats

scipy.stats contains at core a set of distributions with a common interface. The main methods on these are:

  • rvs: random variables
  • pdf: probability density function
  • cdf: cumulative distribution function
  • sf: survival function (1 - cdf)
  • ppf: percent point function (inverse of cdf)
  • isf: inverse survival function (inverse of sf)
  • stats: returns mean, variance, skew, kurtosis
  • moment: non-central moments of the distribution

In [ ]:
from scipy.stats import norm

In [ ]:
norm.cdf(0)

In [ ]:
norm.cdf([-1, 0, 1])

In [ ]:
norm.mean()

In [ ]:
norm.std()

In [ ]:
norm.rvs(size=10)

The default norm, we see, has mean 0 and standard deviation 1. We can also create our own "frozen" norm. The two primary keyword arguments are standardized across all distributions as loc and scale.


In [ ]:
big_norm = norm(scale=10, loc=0.5)

In [ ]:
big_norm

In [ ]:
big_norm.rvs(size=10)

In [ ]:
big_norm.cdf([-3, 0, 3])

We can get the same behavior by supplying the keyword arguments to the first call as well.


In [ ]:
norm.rvs(loc=50, scale=10, size=5)

In [ ]:
from scipy.stats import uniform
uniform.cdf(range(6), loc=1, scale=4)

It is recommended to always use keyword arguments with scipy.stats or you may get unexpected results:


In [ ]:
norm.rvs(5)

What's happening is that 5 is getting passed to the scale parameter, not the shape parameter as you might have expected.

Some distributions have parameters other than scale and loc. The gamma distribution, for example, is

$$ \gamma(x, a) = \frac{\lambda(\lambda x)^{a-1}} {\Gamma (a)} e^{-\gamma x}. $$

For gamma, the scale keyword sets $1/\lambda$. $a$ is set as an addiional argument.


In [ ]:
norm.numargs

In [ ]:
from scipy.stats import gamma

gamma.numargs

In [ ]:
gamma.shapes

In [ ]:
gamma(1, scale=2).stats(moments='mvs')  # mean, variance, skew

You can also specify the additional shape parameter(s) as keyword arguments.


In [ ]:
gamma(a=1, scale=2).stats(moments='mvs')

scipy.integrate

scipy.integrate contains numerical integration routines. The one I use most often is odeint. This integrates a system of differential equations

$$ \frac{d \mathbf{y} } {dt} = \mathbf{f} (\mathbf{y} ,t). $$

You are required to supply the function $\mathbf{f}(\mathbf{y},t)$, so there is a functional programming element here.

The signature of this function should be

def f(y, t, *params):

Let's do an example, a master equation with $n$ slots and two parameters: the rate at which stuff is moving to the left, and the rate at which stuff is moving to the right.


In [ ]:
import numpy as np

def master_equation(probability_dist, t, rate_left, rate_right):
    n_move_left = rate_left * probability_dist
    n_move_left[0] = 0  # Nothing can move left from the left-most spot.
    
    n_move_right = rate_right * probability_dist
    n_move_right[-1] = 0  # Ditto.
    
    additions = np.roll(n_move_left, -1) + np.roll(n_move_right, 1)
    subtractions = n_move_left + n_move_right
    
    return additions - subtractions

We can simulate our model using scipy.integrate.odeint. It takes four arguments:

  • The function that defines the rates of change.
  • The initial conditions.
  • A vector of time steps.
  • A tuple of any additional parameters the function needs (in our case, rate_left and rate_right).

In [ ]:
from scipy.integrate import odeint

y = odeint(master_equation, np.ones(5)/5, np.arange(200), (2, 8))

In [ ]:
y.shape

In [ ]:
y[-1]

In [ ]:
import matplotlib.pyplot as plt
%matplotlib inline

In [ ]:
plt.bar(np.arange(5), y[-1], align='center')

In [ ]:
y = odeint(master_equation, np.ones(5)/5, np.arange(200), (2, 2))
plt.bar(np.arange(5), y[-1], align='center')

In [ ]:
def simulate_master_equation(n, rate_left, rate_right):
    y = odeint(master_equation, np.ones(n)/n, np.arange(200),
               (rate_left, rate_right))
    plt.bar(np.arange(n), y[-1], align='center')

In [ ]:
simulate_master_equation(10, 3, 2)

In [ ]:
simulate_master_equation(7, 0, 2)