Toy model (Chapter 3, page 82)


In [1]:
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
import seaborn as sns
import pymc3 as pm
import pandas as pd

%matplotlib inline
sns.set(font_scale=1.5)

In [2]:
N_samples = [30, 30, 30]
G_samples = [18, 18, 18]

In [3]:
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)  # group label for every observation
data = []
for i in range(0, len(N_samples)):
    # per group: G_samples[i] ones (successes) followed by N_samples[i] - G_samples[i] zeros
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
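
A quick sanity check on the constructed arrays (a minimal sketch; the commented values are what we expect given N_samples and G_samples above):

In [ ]:
print(len(data), len(group_idx))                       # 90 90
print(np.bincount(group_idx))                          # [30 30 30] observations per group
print([sum(data[i*30:(i+1)*30]) for i in range(3)])    # [18, 18, 18] successes per group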

In [6]:
with pm.Model() as model_h:
    # hyperpriors for the shared Beta prior over the group-level thetas
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    # one theta per group, all drawn from the same Beta(alpha, beta)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h = trace_j[200:]  # discard the first 200 draws as burn-in


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:07<00:00, 1252.18draws/s]
There were 59 divergences after tuning. Increase `target_accept` or reparameterize.
There were 87 divergences after tuning. Increase `target_accept` or reparameterize.
There were 66 divergences after tuning. Increase `target_accept` or reparameterize.
There were 91 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.
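
The divergence warnings above suggest raising the NUTS target acceptance rate. A hedged sketch of how that could be done (in this PyMC3 version the setting goes through `nuts_kwargs`; more recent releases accept `target_accept` directly in `pm.sample`):

In [ ]:
with model_h:
    # a higher target acceptance forces smaller step sizes, which usually reduces divergences
    trace_ta = pm.sample(2000, chains=4, nuts_kwargs={'target_accept': 0.95})
    # on newer PyMC3 versions: pm.sample(2000, chains=4, target_accept=0.95)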

In [8]:
pm.traceplot(chain_h);



In [13]:
N_samples = [30, 30, 30]
G_samples = [18, 18, 18]
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
data = []
for i in range(0, len(N_samples)):
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
    
with pm.Model() as model_h:
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h1 = trace_j[200:]

N_samples = [30, 30, 30]
G_samples = [3, 3, 3]
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
data = []
for i in range(0, len(N_samples)):
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
    
with pm.Model() as model_h:
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h2 = trace_j[200:]

N_samples = [30, 30, 30]
G_samples = [18, 3, 3]
group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
data = []
for i in range(0, len(N_samples)):
    data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i]-G_samples[i]]))
    
with pm.Model() as model_h:
    alpha = pm.HalfCauchy('alpha', beta=10)
    beta = pm.HalfCauchy('beta', beta=10)
    theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
    y = pm.Bernoulli('y', p=theta[group_idx], observed=data)
    trace_j = pm.sample(2000, chains=4)
chain_h3 = trace_j[200:]


Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:07<00:00, 1262.50draws/s]
There were 56 divergences after tuning. Increase `target_accept` or reparameterize.
There were 119 divergences after tuning. Increase `target_accept` or reparameterize.
There were 54 divergences after tuning. Increase `target_accept` or reparameterize.
There were 227 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6815369954624018, but should be close to 0.8. Try to increase the number of tuning steps.
The gelman-rubin statistic is larger than 1.05 for some parameters. This indicates slight problems during sampling.
The estimated number of effective samples is smaller than 200 for some parameters.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:07<00:00, 1269.47draws/s]
There were 130 divergences after tuning. Increase `target_accept` or reparameterize.
There were 139 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6822183959027014, but should be close to 0.8. Try to increase the number of tuning steps.
There were 182 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.7127859848786474, but should be close to 0.8. Try to increase the number of tuning steps.
There were 207 divergences after tuning. Increase `target_accept` or reparameterize.
The acceptance probability does not match the target. It is 0.6756399019649242, but should be close to 0.8. Try to increase the number of tuning steps.
The estimated number of effective samples is smaller than 200 for some parameters.
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [theta, beta, alpha]
Sampling 4 chains: 100%|██████████| 10000/10000 [00:05<00:00, 1853.19draws/s]
There were 6 divergences after tuning. Increase `target_accept` or reparameterize.
There were 15 divergences after tuning. Increase `target_accept` or reparameterize.
There were 10 divergences after tuning. Increase `target_accept` or reparameterize.
There were 24 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.
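
The three runs above repeat the same model with different data. A minimal sketch of a helper that factors out the duplication (the function name fit_hierarchical is illustrative, not from the book):

In [ ]:
def fit_hierarchical(N_samples, G_samples, draws=2000, chains=4, burn=200):
    """Build the data, fit the hierarchical Beta-Bernoulli model, return the trimmed trace."""
    group_idx = np.repeat(np.arange(len(N_samples)), N_samples)
    data = []
    for i in range(len(N_samples)):
        data.extend(np.repeat([1, 0], [G_samples[i], N_samples[i] - G_samples[i]]))
    with pm.Model():
        alpha = pm.HalfCauchy('alpha', beta=10)
        beta = pm.HalfCauchy('beta', beta=10)
        theta = pm.Beta('theta', alpha, beta, shape=len(N_samples))
        pm.Bernoulli('y', p=theta[group_idx], observed=data)
        trace = pm.sample(draws, chains=chains)
    return trace[burn:]

# chain_h1 = fit_hierarchical([30, 30, 30], [18, 18, 18])
# chain_h2 = fit_hierarchical([30, 30, 30], [3, 3, 3])
# chain_h3 = fit_hierarchical([30, 30, 30], [18, 3, 3])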

In [14]:
pm.summary(chain_h1)


Out[14]:
mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
alpha 29.539383 33.167079 2.730963 1.255616 97.147452 32.225035 1.047596
beta 21.245659 27.726566 2.480212 0.687856 71.947771 24.414707 1.071072
theta__0 0.593542 0.072822 0.002191 0.449409 0.729651 373.206477 1.008268
theta__1 0.595040 0.072599 0.002412 0.453384 0.732143 287.165044 1.015320
theta__2 0.595279 0.072772 0.002071 0.454269 0.735834 681.864551 1.006583

In [15]:
pm.summary(chain_h2)


Out[15]:
mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
alpha 7.780281 7.156001 0.439887 0.368760 23.115267 224.885163 1.002464
beta 67.243630 68.216357 4.264855 0.666425 208.832413 205.989137 1.001227
theta__0 0.107099 0.042570 0.000800 0.031901 0.192322 3201.615947 1.001163
theta__1 0.106358 0.042824 0.000936 0.030853 0.191940 2227.658647 1.003464
theta__2 0.106682 0.042869 0.000952 0.028014 0.188118 2099.571609 1.002173

In [16]:
pm.summary(chain_h3)


Out[16]:
mean sd mc_error hpd_2.5 hpd_97.5 n_eff Rhat
alpha 2.752115 2.211112 0.088040 0.231809 6.636957 616.009567 1.004581
beta 7.763315 7.230675 0.346449 0.381257 19.051035 427.041322 1.007023
theta__0 0.521428 0.091596 0.001980 0.347865 0.707953 2239.278934 1.000525
theta__1 0.138719 0.059558 0.000809 0.036621 0.259363 5047.910410 0.999746
theta__2 0.138695 0.059737 0.000835 0.029596 0.251135 5028.159793 1.000390
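
The point of the third scenario is shrinkage: the 18/30 group's estimate is pulled below its raw proportion (0.60 → about 0.52) and the 3/30 groups are pulled up (0.10 → about 0.14). A minimal sketch comparing the posterior means with the empirical rates, using only the trace already computed:

In [ ]:
empirical = np.array([18, 3, 3]) / 30          # raw success rates: 0.60, 0.10, 0.10
posterior = chain_h3['theta'].mean(axis=0)     # posterior means, shrunk toward each other
for g, (e, p) in enumerate(zip(empirical, posterior)):
    print('group {}: empirical {:.2f} -> posterior mean {:.2f}'.format(g, e, p))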

Look at the estimated prior: each posterior draw of α and β defines a Beta(α, β) prior over θ, so these curves show what the model has inferred about the group-level distribution.


In [24]:
fig, ax = plt.subplots(figsize=(10, 5))

x = np.linspace(0, 1, 100)

# 100 Beta priors, each built from a random posterior draw of (alpha, beta)
for i in np.random.randint(0, len(chain_h), size=100):
    pdf = stats.beta(chain_h['alpha'][i], chain_h['beta'][i]).pdf(x)
    plt.plot(x, pdf, 'g', alpha=0.1)

# the Beta prior evaluated at the posterior means of alpha and beta
dist = stats.beta(chain_h['alpha'].mean(), chain_h['beta'].mean())
pdf = dist.pdf(x)
mode = x[np.argmax(pdf)]
mean = dist.moment(1)
plt.plot(x, pdf, label='mode = {:.2f}\nmean = {:.2f}'.format(mode, mean), lw=3)
plt.legend()
plt.xlabel(r'$\theta_{prior}$')


Out[24]:
Text(0.5, 0, '$\\theta_{prior}$')
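
The same plot can be repeated for the three scenarios fitted above to see how the inferred prior shifts with the data. A hedged sketch using the already-computed chains (mean α and β per chain, as in the single-chain plot above):

In [ ]:
fig, axes = plt.subplots(1, 3, figsize=(15, 4), sharey=True)
x = np.linspace(0, 1, 100)
for ax, chain, title in zip(axes, [chain_h1, chain_h2, chain_h3],
                            ['G = [18, 18, 18]', 'G = [3, 3, 3]', 'G = [18, 3, 3]']):
    # Beta prior at the posterior means of alpha and beta for this scenario
    dist = stats.beta(chain['alpha'].mean(), chain['beta'].mean())
    ax.plot(x, dist.pdf(x))
    ax.set_title(title)
    ax.set_xlabel(r'$\theta_{prior}$')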

In [36]:
plt.figure(figsize=(8,5))
pm.plot_posterior(chain_h,)


Out[36]:
array([<matplotlib.axes._subplots.AxesSubplot object at 0x7f87f36a1d90>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87f3676cd0>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87f36735d0>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87a208fa90>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x7f87f4ee6d90>],
      dtype=object)
<Figure size 576x360 with 0 Axes>
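
A forest plot is a compact way to compare the three group-level θ estimates and their credible intervals. A sketch, assuming this PyMC3 version's forestplot signature (the varnames keyword; newer versions delegate to ArviZ's plot_forest with var_names):

In [ ]:
pm.forestplot(chain_h, varnames=['theta']);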

In [ ]: