In [1]:
%matplotlib inline
from pymc3 import Normal, Model
import pymc3 as pm
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['axes.labelsize'] = 22
import seaborn
This is a follow-up to a previous post, extending the model to the case where we have multiple responses from multiple respondents. I will be intentionally brief, with the plan to follow up with a complete and more thought-out post.
In [2]:
Nrespondants = 5
Nresponces = 10
a_val = 45
mu_b_val = 0.1
sigma_b_val = 3
b = np.random.normal(mu_b_val, sigma_b_val, Nrespondants)
xobs_stacked = np.random.uniform(0, 10, (Nresponces, Nrespondants))
yobs_stacked = a_val + b * xobs_stacked + np.random.normal(0, 1.0, (Nresponces, Nrespondants))
plt.plot(xobs_stacked, yobs_stacked, "o")
plt.ylabel(r"$y_\mathrm{obs}$")
plt.xlabel(r"$x_\mathrm{obs}$")
plt.show()
In the following code we flatten the data, but create a set of indices which maps each response back to its respondent. For example, if our data consisted of 2 respondents, with 3 responses from the first and 2 from the second, then the data above would be:
xobs_stacked = [[1.1, 2.2, 4.5],
                [0.5, 0.4]]
yobs_stacked = [[10.2, 10.3, 10.8],
                [12.5, 12.5]]
Then we flatten these to get
xobs = [1.1, 2.2, 4.5, 0.5, 0.4]
yobs = [10.2, 10.3, 10.8, 12.5, 12.5]
and create an index as follows
idxs = [0, 0, 0, 1, 1]
which says the first, second, and third entries belong to the 0th respondent, while the last two belong to the 1st. The importance of this will become apparent in a moment. In this instance we always have the same number of responses from each respondent, so we can use the following trick:
In [3]:
idxs = np.array(list(np.arange(Nrespondants)) * Nresponces)
xobs = xobs_stacked.flatten()
yobs = yobs_stacked.flatten()
idxs
Out[3]:
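The trick above relies on every respondent contributing the same number of responses. For the general, ragged case (different numbers of responses per respondent) the same flat arrays and index can be built with np.concatenate and np.repeat. Here is a minimal sketch, assuming the raw data are held as lists of per-respondent arrays (the toy numbers below are the ones from the example above):
x_per_respondent = [np.array([1.1, 2.2, 4.5]), np.array([0.5, 0.4])]
y_per_respondent = [np.array([10.2, 10.3, 10.8]), np.array([12.5, 12.5])]

# Flatten the data and build the matching respondent index for each observation
xobs_ragged = np.concatenate(x_per_respondent)
yobs_ragged = np.concatenate(y_per_respondent)
idxs_ragged = np.repeat(np.arange(len(x_per_respondent)),
                        [len(x) for x in x_per_respondent])
# idxs_ragged -> array([0, 0, 0, 1, 1])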
In [4]:
with pm.Model() as hierarchical_model:
    # Hyperparameters
    mu_b = pm.Normal('mu_b', mu=0., sd=100)
    sigma_b = pm.Uniform('sigma_b', lower=0, upper=100)

    # Common effects
    a = pm.Normal('a', mu=45, sd=100, testval=10)
    eps = pm.Uniform('eps', lower=0, upper=10)

    # Group effects: one slope per respondent
    b = pm.Normal('b', mu=mu_b, sd=sigma_b, shape=Nrespondants)

    # Expected value: each observation picks out its respondent's slope
    mu = a + b[idxs] * xobs

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=eps, observed=yobs)
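The index array built earlier is what makes the line mu = a + b[idxs] * xobs work: indexing the vector of slopes b with idxs expands it to one slope per observation, so each flattened response is multiplied by its own respondent's slope. A plain-NumPy sketch of the same idea, with made-up numbers:
b_demo = np.array([2.0, 5.0])          # one slope per respondent
idxs_demo = np.array([0, 0, 0, 1, 1])  # respondent index for each observation
x_demo = np.array([1.0, 2.0, 3.0, 1.0, 2.0])
print(b_demo[idxs_demo] * x_demo)      # each x multiplied by its respondent's slope: 2, 4, 6, 5, 10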
Now we generate samples using the Metropolis algorithm:
In [10]:
nsamples = 100000
nburn = 50000
with hierarchical_model:
    step = pm.Metropolis()
    hierarchical_trace = pm.sample(nsamples, step, progressbar=True)
Now let's use the handy traceplot to inspect the chains and the posteriors, having discarded the first half of the samples as burn-in.
In [9]:
pm.traceplot(hierarchical_trace[nburn:],
             vars=['mu_b', 'sigma_b', 'a', 'eps'],
             lines={'mu_b': mu_b_val,
                    'sigma_b': sigma_b_val,
                    'a': a_val})
plt.show()
The posterior distributions (in blue) can be compared with the vertical (red) lines indicating the "true" values used to generate the data. This shows that we have not fully captured the features of the model, but compared to the diffuse priors we have learnt a great deal. Note that the data were generated with observation noise of standard deviation 1.0, which gives the point of comparison for the posterior of $\epsilon$; the remaining offsets support our understanding that we have not fully converged on the ideal solution.
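To put rough numbers on this comparison, the posterior means (after discarding the burn-in) can be printed alongside the true values. This is a minimal sketch that works directly from the trace object, without relying on any additional PyMC3 summary helpers:
# Compare posterior means (after burn-in) with the values used to simulate the data
trace_burned = hierarchical_trace[nburn:]
for name, true_val in [('mu_b', mu_b_val), ('sigma_b', sigma_b_val), ('a', a_val)]:
    print(name, "posterior mean:", trace_burned[name].mean(), "true value:", true_val)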
Finally, we plot the data for a few respondents along with straight lines drawn from the posterior. We colour-code 5 randomly chosen respondents, then for each we draw sets of parameters from the posterior (the number of draws is set by repeats below) and plot the corresponding straight lines alongside their data. This shows that the posterior is doing an excellent job of inferring the individual $b_i$ values.
In [7]:
npp = 5      # number of respondents to plot
repeats = 1  # posterior draws per respondent

fig, ax = plt.subplots(figsize=(10, 10))
xfit = np.linspace(0, 10, 100)
for i in np.random.randint(0, Nrespondants, npp):
    color = np.random.uniform(0, 1, 3)
    for j in range(repeats):
        s = hierarchical_trace[np.random.randint(nburn, nsamples)]
        yfit = s['a'] + xfit * s['b'][i]
        ax.plot(xfit, yfit, "-", lw=1.1, color=color)
    ax.plot(xobs_stacked[:, i], yobs_stacked[:, i], "o", color=color, markersize=10)
plt.show()