Toy model (Chapter 3, page 82)


In [1]:
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
import seaborn as sns
import pymc3 as pm
import pandas as pd

%matplotlib inline
sns.set(font_scale=1.5)

In [2]:
N_samples = [30, 30, 30]
G_samples = [18, 18, 18]

In [3]:
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)  # group label for every observation
data = []
for i in range(0, len(N_samples)):
    # per group: G_samples[i] ones (successes) followed by N_samples[i] - G_samples[i] zeros
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
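
A quick sanity check on the constructed arrays (a minimal sketch; the commented values are what we expect given N_samples and G_samples above):

In [ ]:
print(len(data), len(group_idx))                       # 90 90
print(np.bincount(group_idx))                          # [30 30 30] observations per group
print([sum(data[i*30:(i+1)*30]) for i in range(3)])    # [18, 18, 18] successes per group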

In [6]:
with pm.Model() as model_h:
    # hyperpriors for the shared Beta prior over the group-level thetas
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    # one theta per group, all drawn from the same Beta(alpha, beta)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h = trace_j[200:]  # discard the first 200 draws as burn-in


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:07<00:00, 1252.18draws/s]
There were 59 divergences after tuning. Increase `target_accept` or reparameterize.
There were 87 divergences after tuning. Increase `target_accept` or reparameterize.
There were 66 divergences after tuning. Increase `target_accept` or reparameterize.
There were 91 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.
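
The divergence warnings above suggest raising the NUTS target acceptance rate. A hedged sketch of how that could be done (in this PyMC3 version the setting goes through `nuts_kwargs`; more recent releases accept `target_accept` directly in `pm.sample`):

In [ ]:
with model_h:
    # a higher target acceptance forces smaller step sizes, which usually reduces divergences
    trace_ta = pm.sample(2000, chains=4, nuts_kwargs={'target_accept': 0.95})
    # on newer PyMC3 versions: pm.sample(2000, chains=4, target_accept=0.95)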

In [8]:
pm.traceplot(chain_h);



In [13]:
N_samples = [30, 30, 30]
G_samples = [18, 18, 18]
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
data = []
for i in range(0, len(N_samples)):
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
    
with pm.Model() as model_h:
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h1 = trace_j[200:]

N_samples = [30, 30, 30]
G_samples = [3, 3, 3]
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
data = []
for i in range(0, len(N_samples)):
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
    
with pm.Model() as model_h:
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h2 = trace_j[200:]

N_samples = [30, 30, 30]
G_samples = [18, 3, 3]
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
data = []
for i in range(0, len(N_samples)):
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
    
with pm.Model() as model_h:
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h3 = trace_j[200:]


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:07<00:00, 1262.50draws/s]
There were 56 divergences after tuning. Increase `target_accept` or reparameterize.
There were 119 divergences after tuning. Increase `target_accept` or reparameterize.
There were 54 divergences after tuning. Increase `target_accept` or reparameterize.
There were 227 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6815369954624018, but should be close to 0.8. Try to increase the number of tuning steps.
The gelman-rubin statistic is larger than 1.05 for some parameters. This indicates slight problems during sampling.
The estimated number of effective samples is smaller than 200 for some parameters.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:07<00:00, 1269.47draws/s]
There were 130 divergences after tuning. Increase `target_accept` or reparameterize.
There were 139 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6822183959027014, but should be close to 0.8. Try to increase the number of tuning steps.
There were 182 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.7127859848786474, but should be close to 0.8. Try to increase the number of tuning steps.
There were 207 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6756399019649242, but should be close to 0.8. Try to increase the number of tuning steps.
The estimated number of effective samples is smaller than 200 for some parameters.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:05<00:00, 1853.19draws/s]
There were 6 divergences after tuning. Increase `target_accept` or reparameterize.
There were 15 divergences after tuning. Increase `target_accept` or reparameterize.
There were 10 divergences after tuning. Increase `target_accept` or reparameterize.
There were 24 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.
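
The three runs above repeat the same model with different data. A minimal sketch of a helper that factors out the duplication (the function name fit_hierarchical is illustrative, not from the book):

In [ ]:
def fit_hierarchical(N_samples, G_samples, draws=2000, chains=4, burn=200):
    """Build the data, fit the hierarchical Beta-Bernoulli model, return the trimmed trace."""
    group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
    data = []
    for i in range(len(N_samples)):
        data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i] - G_samples[i]]))
    with pm.Model():
        alpha = pm.HalfCauchy('alpha', beta=10)
        beta = pm.HalfCauchy('beta', beta=10)
        theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
        pm.Bernoulli('y', p=theta[group_idx], observed=data)
        trace = pm.sample(draws, chains=chains)
    return trace[burn:]

# chain_h1 = fit_hierarchical([30, 30, 30], [18, 18, 18])
# chain_h2 = fit_hierarchical([30, 30, 30], [3, 3, 3])
# chain_h3 = fit_hierarchical([30, 30, 30], [18, 3, 3])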

In [14]:
pm.summary(chain_h1)


Out[14]:
mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
alpha 29.539383 33.167079 2.730963 1.255616 97.147452 32.225035 1.047596
beta 21.245659 27.726566 2.480212 0.687856 71.947771 24.414707 1.071072
theta__0 0.593542 0.072822 0.002191 0.449409 0.729651 373.206477 1.008268
theta__1 0.595040 0.072599 0.002412 0.453384 0.732143 287.165044 1.015320
theta__2 0.595279 0.072772 0.002071 0.454269 0.735834 681.864551 1.006583

In [15]:
pm.summary(chain_h2)


Out[15]:
mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
alpha 7.780281 7.156001 0.439887 0.368760 23.115267 224.885163 1.002464
beta 67.243630 68.216357 4.264855 0.666425 208.832413 205.989137 1.001227
theta__0 0.107099 0.042570 0.000800 0.031901 0.192322 3201.615947 1.001163
theta__1 0.106358 0.042824 0.000936 0.030853 0.191940 2227.658647 1.003464
theta__2 0.106682 0.042869 0.000952 0.028014 0.188118 2099.571609 1.002173

In [16]:
pm.summary(chain_h3)


Out[16]:
mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
alpha 2.752115 2.211112 0.088040 0.231809 6.636957 616.009567 1.004581
beta 7.763315 7.230675 0.346449 0.381257 19.051035 427.041322 1.007023
theta__0 0.521428 0.091596 0.001980 0.347865 0.707953 2239.278934 1.000525
theta__1 0.138719 0.059558 0.000809 0.036621 0.259363 5047.910410 0.999746
theta__2 0.138695 0.059737 0.000835 0.029596 0.251135 5028.159793 1.000390
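
The point of the third scenario is shrinkage: the 18/30 group's estimate is pulled below its raw proportion (0.60 → about 0.52) and the 3/30 groups are pulled up (0.10 → about 0.14). A minimal sketch comparing the posterior means with the empirical rates, using only the trace already computed:

In [ ]:
empirical = np.array([18, 3, 3]) / 30          # raw success rates: 0.60, 0.10, 0.10
posterior = chain_h3['theta'].mean(axis=0)     # posterior means, shrunk toward each other
for g, (e, p) in enumerate(zip(empirical, posterior)):
    print('group {}: empirical {:.2f} -> posterior mean {:.2f}'.format(g, e, p))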

Look at the estimated prior: each posterior draw of α and β defines a Beta(α, β) prior over θ, so these curves show what the model has inferred about the group-level distribution.


In [24]:
fig, ax = plt.subplots(figsize=(10, 5))

x = np.linspace(0, 1, 100)

# 100 Beta priors, each built from a random posterior draw of (alpha, beta)
for i in np.random.randint(0, len(chain_h), size=100):
    pdf = stats.beta(chain_h['alpha'][i], chain_h['beta'][i]).pdf(x)
    plt.plot(x, pdf, 'g', alpha=0.1)

# the Beta prior evaluated at the posterior means of alpha and beta
dist = stats.beta(chain_h['alpha'].mean(), chain_h['beta'].mean())
pdf = dist.pdf(x)
mode = x[np.argmax(pdf)]
mean = dist.moment(1)
plt.plot(x, pdf, label='mode = {:.2f}\nmean = {:.2f}'.format(mode, mean), lw=3)
plt.legend()
plt.xlabel(r'$\theta_{prior}$')


Out[24]:
Text(0.5, 0, '$\\theta_{prior}$')
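
The same plot can be repeated for the three scenarios fitted above to see how the inferred prior shifts with the data. A hedged sketch using the already-computed chains (mean α and β per chain, as in the single-chain plot above):

In [ ]:
fig, axes = plt.subplots(1, 3, figsize=(15, 4), sharey=True)
x = np.linspace(0, 1, 100)
for ax, chain, title in zip(axes, [chain_h1, chain_h2, chain_h3],
                            ['G = [18, 18, 18]', 'G = [3, 3, 3]', 'G = [18, 3, 3]']):
    # Beta prior at the posterior means of alpha and beta for this scenario
    dist = stats.beta(chain['alpha'].mean(), chain['beta'].mean())
    ax.plot(x, dist.pdf(x))
    ax.set_title(title)
    ax.set_xlabel(r'$\theta_{prior}$')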

In [36]:
plt.figure(figsize=(8,5))
pm.plot_posterior(chain_h,)


Out[36]:
array([<matplotlib.axes._subplots.AxesSubplot object at 0x7f87f36a1d90>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87f3676cd0>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87f36735d0>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87a208fa90>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87f4ee6d90>],
      dtype=object)
<Figure size 576x360 with 0 Axes>
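
A forest plot is a compact way to compare the three group-level θ estimates and their credible intervals. A sketch, assuming this PyMC3 version's forestplot signature (the varnames keyword; newer versions delegate to ArviZ's plot_forest with var_names):

In [ ]:
pm.forestplot(chain_h, varnames=['theta']);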

In [ ]: