Scipy

Whereas numpy contains the basic building blocks for numerical computing, scipy contains routines built upon numpy.

You should be trying to avoid programming algorithms as much as possible. The first step to check if an algorithm you want to use already exists is to check scipy.

scipy consists of several modules. It is not usually imported all at once like numpy; rather, import the specific module or function that you need.

scipy.special
scipy.integrate
scipy.optimize
scipy.interpolate
scipy.fftpack
scipy.signal
scipy.linalg
scipy.sparse
scipy.spatial
scipy.stats
scipy.ndimage
scipy.io
scipy.weave

`scipy.stats`

scipy.stats contains at core a set of distributions with a common interface. The main methods on these are:

rvs: random variables
pdf: probability density function
cdf: cumulative distribution function
sf: survival function (1 - cdf)
ppf: percent point function (inverse of cdf)
isf: inverse survival function (inverse of sf)
stats: returns mean, variance, skew, kurtosis
moment: non-central moments of the distribution



In [ ]:

    
from scipy.stats import norm



In [ ]:

    
norm.cdf(0)



In [ ]:

    
norm.cdf([-1, 0, 1])



In [ ]:

    
norm.mean()



In [ ]:

    
norm.std()



In [ ]:

    
norm.rvs(size=10)

The default norm, we see, has mean 0 and standard deviation 1. We can also create our own "frozen" norm. The two primary keyword arguments are standardized across all distributions as loc and scale.



In [ ]:

    
big_norm = norm(scale=10, loc=0.5)



In [ ]:

    
big_norm



In [ ]:

    
big_norm.rvs(size=10)



In [ ]:

    
big_norm.cdf([-3, 0, 3])

We can get the same behavior by supplying the keyword arguments to the first call as well.



In [ ]:

    
norm.rvs(loc=50, scale=10, size=5)



In [ ]:

    
from scipy.stats import uniform
uniform.cdf(range(6), loc=1, scale=4)

It is recommended to always use keyword arguments with scipy.stats or you may get unexpected results:



In [ ]:

    
norm.rvs(5)

What's happening is that 5 is getting passed to the scale parameter, not the shape parameter as you might have expected.

Some distributions have parameters other than scale and loc. The gamma distribution, for example, is

$$ \gamma(x, a) = \frac{\lambda(\lambda x)^{a-1}} {\Gamma (a)} e^{-\gamma x}. $$

For gamma, the scale keyword sets $1/\lambda$. $a$ is set as an addiional argument.



In [ ]:

    
norm.numargs



In [ ]:

    
from scipy.stats import gamma

gamma.numargs



In [ ]:

    
gamma.shapes



In [ ]:

    
gamma(1, scale=2).stats(moments='mvs')  # mean, variance, skew

You can also specify the additional shape parameter(s) as keyword arguments.



In [ ]:

    
gamma(a=1, scale=2).stats(moments='mvs')

`scipy.integrate`

scipy.integrate contains numerical integration routines. The one I use most often is odeint. This integrates a system of differential equations

$$ \frac{d \mathbf{y} } {dt} = \mathbf{f} (\mathbf{y} ,t). $$

You are required to supply the function $\mathbf{f}(\mathbf{y},t)$, so there is a functional programming element here.

The signature of this function should be

def f(y, t, *params):

Let's do an example, a master equation with $n$ slots and two parameters: the rate at which stuff is moving to the left, and the rate at which stuff is moving to the right.



In [ ]:

    
import numpy as np

def master_equation(probability_dist, t, rate_left, rate_right):
    n_move_left = rate_left * probability_dist
    n_move_left[0] = 0  # Nothing can move left from the left-most spot.
    
    n_move_right = rate_right * probability_dist
    n_move_right[-1] = 0  # Ditto.
    
    additions = np.roll(n_move_left, -1) + np.roll(n_move_right, 1)
    subtractions = n_move_left + n_move_right
    
    return additions - subtractions

We can simulate our model using scipy.integrate.odeint. It takes four arguments:

The function that defines the rates of change.
The initial conditions.
A vector of time steps.
A tuple of any additional parameters the function needs (in our case, rate_left and rate_right).



In [ ]:

    
from scipy.integrate import odeint

y = odeint(master_equation, np.ones(5)/5, np.arange(200), (2, 8))



In [ ]:

    
y.shape



In [ ]:

    
y[-1]



In [ ]:

    
import matplotlib.pyplot as plt
%matplotlib inline



In [ ]:

    
plt.bar(np.arange(5), y[-1], align='center')



In [ ]:

    
y = odeint(master_equation, np.ones(5)/5, np.arange(200), (2, 2))
plt.bar(np.arange(5), y[-1], align='center')



In [ ]:

    
def simulate_master_equation(n, rate_left, rate_right):
    y = odeint(master_equation, np.ones(n)/n, np.arange(200),
               (rate_left, rate_right))
    plt.bar(np.arange(n), y[-1], align='center')



In [ ]:

    
simulate_master_equation(10, 3, 2)



In [ ]:

    
simulate_master_equation(7, 0, 2)