Alpha-stable distributions are a fascinating family. Some basic practices of statistics, like hypothesis testing, likely wouldn't exist in their current form if there were no $\alpha$-stable distributions (we'll see why as we continue). This notebook will serve as an extremely brief introduction and "learning by doing" exercise, focused in particular on Python implementations of this family of distributions.
Whenever possible I will draw discussion from external (open) sources, and focus energy and time on code.
Let's start with a quick description from Wikipedia:
In probability theory, a distribution or a random variable is said to be stable if a linear combination of two independent copies of a random sample has the same distribution, up to location and scale parameters. The stable distribution family is also sometimes referred to as the Lévy alpha-stable distribution, after Paul Lévy, the first mathematician to have studied it.
More description from U Huntsville's great Random online notes, specifically the stable distribution section:
Stable distributions are an important general class of probability distributions on $\mathbb{R}$ that are defined in terms of location-scale transformations. Stable distributions occur as limits (in distribution) of scaled and centered sums of independent, identically distributed variables. Such limits generalize the central limit theorem, and so stable distributions generalize the normal distribution in a sense. The pioneering work on stable distributions was done by Paul Lévy.
...
Recall that two distributions on $\mathbb{R}$ that are related by a location-scale transformation are said to be of the same type, and that being of the same type defines an equivalence relation on the class of distributions on $\mathbb{R}$. With this terminology, the definition of stability has a more elegant expression: X has a stable distribution if the sum of a finite number of independent copies of X is of the same type as X.
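Here is a minimal numerical sketch of that stability property (my own illustration; the $\alpha$ value and sample sizes are arbitrary choices): for a symmetric, centered $\alpha$-stable $X$, the sum of two independent copies should have the same distribution as $2^{1/\alpha} X$.

# A quick check of the stability property (illustrative values only):
# for symmetric, centered alpha-stable X, X1 + X2 should be distributed as 2**(1/alpha) * X.
import numpy as np
import scipy.stats as stats

alpha_demo = 1.5    # any value in (0, 2]; chosen arbitrarily
n_demo = 5000
x1 = stats.levy_stable.rvs(alpha=alpha_demo, beta=0, size=n_demo)
x2 = stats.levy_stable.rvs(alpha=alpha_demo, beta=0, size=n_demo)
rescaled_copy = 2**(1/alpha_demo) * stats.levy_stable.rvs(alpha=alpha_demo, beta=0, size=n_demo)
# Two-sample Kolmogorov-Smirnov test: a large p-value means no evidence the two samples differ
print("KS p-value:", stats.ks_2samp(x1 + x2, rescaled_copy).pvalue)

For $\alpha = 2$ this reduces to the familiar fact that the sum of two i.i.d. Gaussians is another Gaussian with its standard deviation multiplied by $\sqrt{2}$.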
This family of distributions is defined by four parameters, which go by a variety of names depending on the text you reference. To be consistent with Dominicy and Veredas, I'll use the following:

- $\alpha \in (0, 2]$: the index of stability, which governs how heavy the tails are;
- $\beta \in [-1, 1]$: the skewness parameter;
- $\sigma > 0$: the scale parameter;
- $\mu \in \mathbb{R}$: the location parameter.
An instance of this family may be denoted $S(\alpha, \beta, \sigma, \mu)$. Be careful with the last two parameters, $\sigma$ and $\mu$: they do not correspond directly to the usual standard deviation and mean. For example, in the normal special case, $2 \sigma^{2}_{stable} = \sigma^{2}_{normal}$, so the normal standard deviation is $\sqrt{2}\,\sigma_{stable}$.
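As a quick sanity check on that scale convention (an illustrative snippet of mine; the scale value is arbitrary), the sample variance of draws from the $\alpha = 2$ case should be close to $2\sigma^{2}_{stable}$:

import numpy as np
import scipy.stats as stats

sigma_stable = 1.3    # arbitrary scale for the check
draws = stats.levy_stable.rvs(alpha=2, beta=0, loc=0.0, scale=sigma_stable, size=20000)
print("sample variance:     ", np.var(draws))
print("predicted 2*sigma^2: ", 2 * sigma_stable**2)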
There are three special cases of the family of stable distributions with closed-form densities:

- the normal (Gaussian) distribution: $\alpha = 2$ (here $\beta$ has no effect);
- the Cauchy distribution: $\alpha = 1$, $\beta = 0$;
- the Lévy distribution: $\alpha = 1/2$, $\beta = 1$.
Directly quoting Dominicy and Veredas again:
"A related property of the $\alpha$-stable distribution is that it is a domain of attraction – this is often known as the Generalized Central Limit Theorem – which states that the only distribution that arises as limit from sums of i.i.d. random variables (suitably scaled and centered) is the $\alpha$-stable distribution."
Combine this with the observation that the normal distribution is the only stable distribution with finite variance. Might it be that many distributions we care about have no second moment? (This should be of particular interest to anyone wishing to use agent-based models (ABMs), which typically involve convolutions of multiple distributions.)
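A minimal sketch of the generalized CLT idea (my own illustration; the tail index, sample sizes, and scaling below are arbitrary choices): scaled, centered sums of Pareto variables with tail index below 2, which have infinite variance, stay far from Gaussian in shape, while the same construction with uniform variables behaves classically.

import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)
tail_index = 1.5              # Pareto tail index < 2, so the variance is infinite
n_terms, n_sums = 1000, 5000

# Heavy-tailed case: center and apply the stable scaling n**(1/alpha)
pareto_draws = rng.pareto(tail_index, size=(n_sums, n_terms))
heavy_sums = (pareto_draws - pareto_draws.mean()).sum(axis=1) / n_terms**(1/tail_index)

# Light-tailed case: classical CLT scaling sqrt(n)
uniform_draws = rng.uniform(size=(n_sums, n_terms))
light_sums = (uniform_draws - 0.5).sum(axis=1) / np.sqrt(n_terms)

# Sample excess kurtosis: near 0 for the Gaussian-like sums, large and erratic
# for the heavy-tailed sums (their limiting stable law has no fourth moment)
print("excess kurtosis, heavy-tailed sums:", stats.kurtosis(heavy_sums))
print("excess kurtosis, uniform sums:     ", stats.kurtosis(light_sums))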
Fortunately SciPy has a (somewhat limited) levy_stable distribution in its stats module. Let's test out the special cases of levy_stable.
In [1]:
# Import libraries
from __future__ import print_function, division
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
In [3]:
# Let's draw some rvs from the Gaussian special case, and compare to the
# appropriate Gaussian from the stats library
# Form: rvs(alpha, beta, loc=0, scale=1, size=1, random_state=None)
n = 10000
mu = 0.0
sigma = 1.0
# alpha=2 is the Gaussian case (beta has no effect there, so use the symmetric beta=0);
# a stable scale of sigma/sqrt(2) corresponds to a normal standard deviation of sigma
gaussian_special_case_rvs = stats.levy_stable.rvs(alpha=2, beta=0, loc=mu, scale=sigma/np.sqrt(2), size=n)
gaussian_mean = np.mean(gaussian_special_case_rvs)
gaussian_std = np.std(gaussian_special_case_rvs)
print("mean:", gaussian_mean)
print("stdev:", gaussian_std)
x_range = np.linspace(-4, 4, 1000)
plt.hist(gaussian_special_case_rvs, bins=21, density=True)
plt.plot(x_range, stats.norm.pdf(x_range, loc=gaussian_mean, scale=gaussian_std))
plt.show()
In [4]:
# Let's look at a Cauchy:
cauchy_special_case_rvs = stats.levy_stable.rvs(alpha=1, beta=0, loc=mu, scale=sigma/2.0, size=n)
# Note: the Cauchy distribution has no mean or variance, so the sample mean below
# does not converge to anything as n grows (and the sample stdev is meaningless)
cauchy_mean = np.mean(cauchy_special_case_rvs)
#cauchy_std = np.std(cauchy_special_case_rvs)
print("mean:", cauchy_mean)
#print("stdev:", cauchy_std)
# Let's trim the heavy tails so the histogram is readable:
trimval = 10
trimndx = abs(cauchy_special_case_rvs) < trimval
print("fraction vals < trimval="+str(trimval)+": ", sum(trimndx)/len(trimndx))
plot_rvs = cauchy_special_case_rvs[trimndx]
x_range = np.linspace(-trimval, trimval, 1000)
plt.hist(plot_rvs, bins=51, density=True)
plt.plot(x_range, stats.cauchy.pdf(x_range, loc=mu, scale=sigma/2.0))
plt.show()
In [5]:
# Let's look at a Levy:
levy_special_case_rvs = stats.levy_stable.rvs(alpha=0.5, beta=1.0, loc=mu, scale=sigma/2.0, size=n)
# Note: with alpha=0.5 the mean does not exist either, so this sample mean is
# dominated by a few extreme draws
levy_mean = np.mean(levy_special_case_rvs)
print("mean:", levy_mean)
# Let's trim the heavy right tail so the histogram is readable:
trimval = 10
trimndx = abs(levy_special_case_rvs) < trimval
print("fraction vals < trimval="+str(trimval)+": ", sum(trimndx)/len(trimndx))
plot_rvs = levy_special_case_rvs[trimndx]
x_range = np.linspace(0.0, trimval, 1000)
plt.hist(plot_rvs, bins=51, density=True)
plt.plot(x_range, stats.levy.pdf(x_range, loc=mu, scale=sigma/2.0))
plt.show()
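To back up the visual comparisons with a number, here is a small Kolmogorov-Smirnov check (a sketch I am adding, with the parameters re-declared locally) of fresh draws from each special case against the corresponding closed-form distribution in scipy.stats:

# One-sample KS tests of levy_stable draws against the closed-form special cases
import numpy as np
import scipy.stats as stats

n_check, mu_check, sigma_check = 10000, 0.0, 1.0
checks = [
    ("normal", stats.levy_stable.rvs(2, 0, loc=mu_check, scale=sigma_check/np.sqrt(2), size=n_check),
     stats.norm(loc=mu_check, scale=sigma_check).cdf),
    ("cauchy", stats.levy_stable.rvs(1, 0, loc=mu_check, scale=sigma_check/2.0, size=n_check),
     stats.cauchy(loc=mu_check, scale=sigma_check/2.0).cdf),
    ("levy",   stats.levy_stable.rvs(0.5, 1, loc=mu_check, scale=sigma_check/2.0, size=n_check),
     stats.levy(loc=mu_check, scale=sigma_check/2.0).cdf),
]
for name, sample, cdf in checks:
    result = stats.kstest(sample, cdf)
    print(name, "KS statistic:", round(result.statistic, 4), "p-value:", round(result.pvalue, 4))

p-values well away from zero would indicate no detectable mismatch between the stable draws and the closed-form densities.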
Excellent. Small perturbations of the $\alpha$ and $\beta$ parameters show how the distributions "move away" from the special cases as the parameterization moves away from them; a rough sketch of this is below.
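Here is one such illustration (the alpha values are chosen arbitrarily; evaluating levy_stable.pdf is numerically intensive, so the grid is kept coarse):

# How the symmetric density fattens its tails as alpha moves below the Gaussian case
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

x_grid = np.linspace(-5, 5, 101)
for a in [2.0, 1.8, 1.5, 1.2]:
    plt.plot(x_grid, stats.levy_stable.pdf(x_grid, a, 0), label="alpha=%.1f" % a)
plt.legend()
plt.show()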
Next steps: generate the data needed for the MSQ (method of simulated quantiles) estimation experiments.
To explore further: