In [ ]:
from scipy.stats import norm, uniform
from iis import IIS, Model
def mymodel(params):
    """User-defined model with two parameters

    Parameters
    ----------
    params : numpy.ndarray 1-D

    Returns
    -------
    state : float
        return value (could also be an array)
    """
    return params[0] + params[1]*2
likelihood = norm(loc=1, scale=1) # normal, univariate distribution mean 1, s.d. 1
prior = [norm(loc=0, scale=10), uniform(loc=-10, scale=20)]
model = Model(mymodel, likelihood, prior=prior) # define the model
In [ ]:
solver = IIS(model)
ensemble = solver.estimate(size=500, maxiter=10)
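To give a feel for what an importance-sampling solver does with the model above, here is a minimal sketch of a single weight-and-resample pass, written with the standard library only. It assumes the solver follows a scheme of this kind; it is not the iis implementation, and the helper names (`normal_pdf`, `forward`, `resample_once`) are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forward(params):
    """Same toy model as above: p0 + 2 * p1."""
    return params[0] + params[1] * 2


def resample_once(size, rng):
    """One weight-and-resample pass (illustrative only, not the iis code)."""
    # Draw parameters from the priors: N(0, 10) and U(-10, 10)
    params = [(rng.gauss(0.0, 10.0), rng.uniform(-10.0, 10.0))
              for _ in range(size)]
    # Weight each sample by the likelihood of its model state, N(1, 1)
    weights = [normal_pdf(forward(p), 1.0, 1.0) for p in params]
    # Resample with replacement, proportionally to the weights
    return rng.choices(params, weights=weights, k=size)


rng = random.Random(0)
posterior = resample_once(5000, rng)
states = [forward(p) for p in posterior]
mean_state = sum(states) / len(states)
print(round(mean_state, 2))
```

After resampling, the model state should cluster around the likelihood mean of 1, because the priors are broad compared with the likelihood. An iterative scheme would repeat such a pass several times, which is presumably what the `maxiter` argument controls.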
The IIS class has two attributes of interest:

ensemble : current ensemble
history : list of previous ensembles

It also has a to_panel method to visualize the data as a pandas Panel.
The Ensemble class has the following attributes of interest:

state : 2-D ndarray (samples x state variables)
params : 2-D ndarray (samples x parameters)
model : the model defined above, with target distribution and forward integration functions

For convenience, these fields can be extracted as a pandas DataFrame or Panel, combining params and state. See the in-line help for the methods Ensemble.to_dataframe and IIS.to_panel. This feature requires pandas to be installed.
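The idea of combining params and state into one table can be sketched with plain NumPy and pandas. This is an illustration under assumptions, not the code behind Ensemble.to_dataframe: the arrays are synthetic stand-ins for `ensemble.params` and `ensemble.state`, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)
params = rng.normal(size=(500, 2))            # stand-in for ensemble.params
state = params[:, [0]] + 2 * params[:, [1]]   # stand-in for ensemble.state, shape (500, 1)

# Hypothetical column names; those used by to_dataframe may differ
columns = ["p0", "p1", "state0"]
df = pd.DataFrame(np.hstack([params, state]), columns=columns)

# One row per requested quantile, one column per parameter or state variable
summary = df.quantile([0.05, 0.5, 0.95])
print(summary)
```

Keeping parameters and state side by side in one table means a single call summarizes both.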
Two plotting methods are also provided: Ensemble.scatter_matrix and IIS.plot_history.
The first is simply a wrapper around the pandas function of the same name, but it is used so
frequently that it is provided as a method.
In [ ]:
# Use pandas to check out the quantiles of the final ensemble
ensemble.to_dataframe().quantile([0.5, 0.05, 0.95])
In [ ]:
# or the iteration history
solver.to_panel(quantiles=[0.5, 0.05, 0.95])
In [ ]:
# Plotting methods
%matplotlib inline
solver.plot_history(overlay_dists=True)
In [ ]:
ensemble.scatter_matrix() # result
In [ ]:
# pandas >= 0.20 exposes these in pandas.plotting (formerly pandas.tools.plotting)
from pandas.plotting import parallel_coordinates, radviz, andrews_curves
import matplotlib.pyplot as plt

# create clusters of data
categories = []
for i in range(ensemble.size):
    if ensemble.params[i, 0] > 0:
        cat = 'p0 > 0'
    elif ensemble.params[i, 0] > -5:
        cat = 'p0 < 0 and |p0| < 5'
    else:
        cat = 'rest'
    categories.append(cat)
# Create a DataFrame with a category name
class_column = '_CatName'
df = ensemble.to_dataframe(categories=categories, class_column=class_column)
plt.figure()
parallel_coordinates(df, class_column)
plt.title("parallel_coordinates")
plt.figure()
radviz(df, class_column)
plt.title("radviz")
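The category labels built in the loop above can also be computed without an explicit Python loop. The sketch below uses `numpy.select`, which checks the conditions in order and takes the first one that matches, mirroring the if/elif/else chain. The array `p0` is a small synthetic stand-in for `ensemble.params[:, 0]`.

```python
import numpy as np

# Stand-in for ensemble.params[:, 0]
p0 = np.array([3.2, -1.5, -7.0, 0.0, -5.0])

# Conditions are evaluated in order; the first match wins
categories = np.select(
    [p0 > 0, p0 > -5],
    ['p0 > 0', 'p0 < 0 and |p0| < 5'],
    default='rest',
).tolist()

print(categories)
```

The resulting list has one label per sample and can be used wherever the loop-built `categories` list is used.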