paramnormal.In addition to explicitly creating distributions from known parameters, paramnormal.[dist].fit provides a similar, interface to scipy.stats maximum-likelihood estimatation methods.
Again, we'll demonstrate with a lognormal distribution and compare parameter estimatation with scipy.
In [ ]:
%matplotlib inline
In [ ]:
import warnings
warnings.simplefilter('ignore')
import numpy as np
import matplotlib.pyplot as plt
import seaborn
import paramnormal
clean_bkgd = {'axes.facecolor':'none', 'figure.facecolor':'none'}
seaborn.set(style='ticks', rc=clean_bkgd)
Let's start by generating a reasonably-sized random dataset and plotting a histogram.
The primary method of creating a distribution from named parameters is shown below.
The call to paramnormal.lognornal translates the parameter to be compatible with scipy. We then chain a call to the rvs (random variates) method of the returned scipy distribution.
In [ ]:
np.random.seed(0)
x = paramnormal.lognormal(mu=1.75, sigma=0.75).rvs(370)
Here's a histogram to illustrate the distribution.
In [ ]:
bins = np.logspace(-0.5, 1.75, num=25)
fig, ax = plt.subplots()
_ = ax.hist(x, bins=bins, normed=True)
ax.set_xscale('log')
ax.set_xlabel('$X$')
ax.set_ylabel('Probability')
seaborn.despine()
fig
Pretending for a moment that we didn't generate this dataset with explicit distribution parameters, how would we go about estimating them?
Scipy provides a maximum-likelihood estimation for estimating parameters:
In [ ]:
from scipy import stats
print(stats.lognorm.fit(x))
Unfortunately those parameters don't really make any sense based on what we know about our articifical dataset.
That's where paramnormal comes in:
In [ ]:
params = paramnormal.lognormal.fit(x)
print(params)
This matches well with our understanding of the distribution.
The returned params variable is a namedtuple that we can easily use to create a distribution via the .from_params methods. From there, we can create a nice plot of the probability distribution function with our histogram.
In [ ]:
dist = paramnormal.lognormal.from_params(params)
# theoretical PDF
x_hat = np.logspace(-0.5, 1.75, num=100)
y_hat = dist.pdf(x_hat)
bins = np.logspace(-0.5, 1.75, num=25)
fig, ax = plt.subplots()
_ = ax.hist(x, bins=bins, normed=True, alpha=0.375)
ax.plot(x_hat, y_hat, zorder=2, color='g')
ax.set_xscale('log')
ax.set_xlabel('$X$')
ax.set_ylabel('Probability')
seaborn.despine()
In [ ]:
params = paramnormal.lognormal.fit(x)
print(params)
In [ ]:
paramnormal.lognormal(mu=1.75, sigma=0.75, offset=0)
From fit parameters:
In [ ]:
paramnormal.lognormal.from_params(params)