Developing probabilistic models using grid methods and MCMC.

Thanks to Chris Fonnesback for his help with this example, and to Colin Carroll, who added features to pymc3 to support some of these examples.

To install the most current version of pymc3 from source, run

`pip3 install -U git+https://github.com/pymc-devs/pymc3.git`

Copyright 2018 Allen Downey

MIT License: https://opensource.org/licenses/MIT

```
In [1]:
```from __future__ import print_function, division
%matplotlib inline
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'
import numpy as np
import pymc3 as pm
import matplotlib.pyplot as plt

```
```

I'll model hockey as a Poisson process, where each team has some long-term average scoring rate, `lambda`

, in goals per game.

For the first example, we'll assume that `lambda`

is known (somehow) to be 2.7. Since regulation play (as opposed to overtime) is 60 minutes, we can compute the goal scoring rate per minute.

```
In [2]:
```lam_per_game = 2.7
min_per_game = 60
lam_per_min = lam_per_game / min_per_game
lam_per_min, lam_per_min**2

```
Out[2]:
```

```
In [3]:
```np.random.random(min_per_game)

```
Out[3]:
```

If the random value is less than `lam_per_min`

, that means we score a goal during that minute.

```
In [4]:
```np.random.random(min_per_game) < lam_per_min

```
Out[4]:
```

So we can get the number of goals scored by one team like this:

```
In [5]:
```np.sum(np.random.random(min_per_game) < lam_per_min)

```
Out[5]:
```

I'll wrap that in a function.

```
In [6]:
```def half_game(lam_per_min, min_per_game=60):
return np.sum(np.random.random(min_per_game) < lam_per_min)

And simulate 10 games.

```
In [7]:
```size = 10
sample = [half_game(lam_per_min) for i in range(size)]

```
Out[7]:
```

`lam_per_game`

.

```
In [8]:
```size = 1000
sample_sim = [half_game(lam_per_min) for i in range(size)]
np.mean(sample_sim), lam_per_game

```
Out[8]:
```

```
In [9]:
```from collections import Counter
class Pmf(Counter):
def normalize(self):
"""Normalizes the PMF so the probabilities add to 1."""
total = sum(self.values())
for key in self:
self[key] /= total
def sorted_items(self):
"""Returns the outcomes and their probabilities."""
return zip(*sorted(self.items()))

Here are some functions for plotting PMFs.

```
In [10]:
```plot_options = dict(linewidth=3, alpha=0.6)
def underride(options):
"""Add key-value pairs to d only if key is not in d.
options: dictionary
"""
for key, val in plot_options.items():
options.setdefault(key, val)
return options
def plot(xs, ys, **options):
"""Line plot with plot_options."""
plt.plot(xs, ys, **underride(options))
def bar(xs, ys, **options):
"""Bar plot with plot_options."""
plt.bar(xs, ys, **underride(options))
def plot_pmf(sample, **options):
"""Compute and plot a PMF."""
pmf = Pmf(sample)
pmf.normalize()
xs, ps = pmf.sorted_items()
bar(xs, ps, **options)
def pmf_goals():
"""Decorate the axes."""
plt.xlabel('Number of goals')
plt.ylabel('PMF')
plt.title('Distribution of goals scored')
legend()
def legend(**options):
"""Draw a legend only if there are labeled items.
"""
ax = plt.gca()
handles, labels = ax.get_legend_handles_labels()
if len(labels):
plt.legend(**options)

Here's what the results from the simulation look like.

```
In [11]:
```plot_pmf(sample_sim, label='simulation')
pmf_goals()

```
```

For the simulation we just did, we can figure out the distribution analytically: it's a binomial distribution with parameters `n`

and `p`

, where `n`

is the number of minutes and `p`

is the probability of scoring a goal during any minute.

We can use NumPy to generate a sample from a binomial distribution.

```
In [12]:
```n = min_per_game
p = lam_per_min
sample_bin = np.random.binomial(n, p, size)
np.mean(sample_bin)

```
Out[12]:
```

And confirm that the results are similar to what we got from the model.

```
In [13]:
```plot_pmf(sample_sim, label='simulation')
plot_pmf(sample_bin, label='binomial')
pmf_goals()

```
```

```
In [14]:
```def plot_cdf(sample, **options):
"""Compute and plot the CDF of a sample."""
pmf = Pmf(sample)
xs, freqs = pmf.sorted_items()
ps = np.cumsum(freqs, dtype=np.float)
ps /= ps[-1]
plot(xs, ps, **options)
def cdf_rates():
"""Decorate the axes."""
plt.xlabel('Goal scoring rate (mu)')
plt.ylabel('CDF')
plt.title('Distribution of goal scoring rate')
legend()
def cdf_goals():
"""Decorate the axes."""
plt.xlabel('Number of goals')
plt.ylabel('CDF')
plt.title('Distribution of goals scored')
legend()
def plot_cdfs(*sample_seq, **options):
"""Plot multiple CDFs."""
for sample in sample_seq:
plot_cdf(sample, **options)
cdf_goals()

Now we can compare the results from the simulation and the sample from the biomial distribution.

```
In [15]:
```plot_cdf(sample_sim, label='simulation')
plot_cdf(sample_bin, label='binomial')
cdf_goals()

```
```

```
In [16]:
```mu = lam_per_game
sample_poisson = np.random.poisson(mu, size)
np.mean(sample_poisson)

```
Out[16]:
```

```
In [17]:
```plot_cdfs(sample_sim, sample_bin)
plot_cdf(sample_poisson, label='poisson', linestyle='dashed')
legend()

```
```

```
In [18]:
```model = pm.Model()
with model:
goals = pm.Poisson('goals', mu)
trace = pm.sample_prior_predictive(1000)

```
In [19]:
```len(trace['goals'])

```
Out[19]:
```

```
In [20]:
```sample_pm = trace['goals']
np.mean(sample_pm)

```
Out[20]:
```

This example is like using a cannon to kill a fly. But it help us learn to use the cannon.

```
In [21]:
```plot_cdfs(sample_sim, sample_bin, sample_poisson)
plot_cdf(sample_pm, label='poisson pymc', linestyle='dashed')
legend()

```
```

```
In [22]:
```import scipy.stats as st
xs = np.arange(11)
ps = st.poisson.cdf(xs, mu)
plot_cdfs(sample_sim, sample_bin, sample_poisson, sample_pm)
plt.plot(xs, ps, label='analytic', linestyle='dashed')
legend()

```
```

```
In [23]:
```xs = np.arange(11)
ps = st.poisson.pmf(xs, mu)
bar(xs, ps, label='analytic PMF')
pmf_goals()

```
```

`mu`

.

```
In [24]:
```def poisson_likelihood(goals, mu):
"""Probability of goals given scoring rate.
goals: observed number of goals (scalar or sequence)
mu: hypothetical goals per game
returns: probability
"""
return np.prod(st.poisson.pmf(goals, mu))

Here's the probability of scoring 6 goals in a game if the long-term rate is 2.7 goals per game.

```
In [25]:
```poisson_likelihood(goals=6, mu=2.7)

```
Out[25]:
```

Here's the probability of scoring 3 goals.

```
In [26]:
```poisson_likelihood(goals=3, mu=2.7)

```
Out[26]:
```

```
In [27]:
```poisson_likelihood(goals=[6, 2], mu=2.7)

```
Out[27]:
```

Ok, it's finally time to do some inference! The function we just wrote computes the likelihood of the data, given a hypothetical value of `mu`

:

$\mathrm{Prob}~(x ~|~ \mu)$

But what we really want is the distribution of `mu`

, given the data:

$\mathrm{Prob}~(\mu ~|~ x)$

If only there were some theorem that relates these probabilities!

The following class implements Bayes's theorem.

```
In [28]:
```class Suite(Pmf):
"""Represents a set of hypotheses and their probabilities."""
def bayes_update(self, data, like_func):
"""Perform a Bayesian update.
data: some representation of observed data
like_func: likelihood function that takes (data, hypo), where
hypo is the hypothetical value of some parameter,
and returns P(data | hypo)
"""
for hypo in self:
self[hypo] *= like_func(data, hypo)
self.normalize()
def plot(self, **options):
"""Plot the hypotheses and their probabilities."""
xs, ps = self.sorted_items()
plot(xs, ps, **options)
def pdf_rate():
"""Decorate the axes."""
plt.xlabel('Goals per game (mu)')
plt.ylabel('PDF')
plt.title('Distribution of goal scoring rate')
legend()

I'll start with a uniform prior just to keep things simple. We'll choose a better prior later.

```
In [29]:
```hypo_mu = np.linspace(0, 20, num=51)
hypo_mu

```
Out[29]:
```

Initially `suite`

represents the prior distribution of `mu`

.

```
In [30]:
```suite = Suite(hypo_mu)
suite.normalize()
suite.plot(label='prior')
pdf_rate()

```
```

Now we can update it with the data and plot the posterior.

```
In [31]:
```suite.bayes_update(data=6, like_func=poisson_likelihood)
suite.plot(label='posterior')
pdf_rate()

```
```

With a uniform prior, the posterior is the likelihood function, and the MAP is the value of `mu`

that maximizes likelihood, which is the observed number of goals, 6.

This result is probably not reasonable, because the prior was not reasonable.

```
In [32]:
```xs = [13/6, 19/6, 8/4, 4/4, 10/6, 13/6, 2/2, 4/2, 5/3, 6/3]

```
Out[32]:
```

`k`

and `theta`

.

```
In [33]:
```def estimate_gamma_params(xs):
"""Estimate the parameters of a gamma distribution.
See https://en.wikipedia.org/wiki/Gamma_distribution#Parameter_estimation
"""
s = np.log(np.mean(xs)) - np.mean(np.log(xs))
k = (3 - s + np.sqrt((s-3)**2 + 24*s)) / 12 / s
theta = np.mean(xs) / k
alpha = k
beta = 1 / theta
return alpha, beta

Here are the estimates.

```
In [34]:
```alpha, beta = estimate_gamma_params(xs)
print(alpha, beta)

```
```

`alpha`

and `beta`

and returns a "frozen" distribution from SciPy's stats module:

```
In [35]:
```def make_gamma_dist(alpha, beta):
"""Returns a frozen distribution with given parameters.
"""
return st.gamma(a=alpha, scale=1/beta)

The frozen distribution knows how to compute its mean and standard deviation:

```
In [36]:
```dist = make_gamma_dist(alpha, beta)
print(dist.mean(), dist.std())

```
```

And it can compute its PDF.

```
In [37]:
```hypo_mu = np.linspace(0, 10, num=101)
ps = dist.pdf(hypo_mu)

```
Out[37]:
```

```
In [38]:
```plot(hypo_mu, ps, label='gamma(9.6, 5.1)')
pdf_rate()

```
```

We can use `make_gamma_dist`

to construct a prior suite with the given parameters.

```
In [39]:
```def make_gamma_suite(xs, alpha, beta):
"""Makes a suite based on a gamma distribution.
xs: places to evaluate the PDF
alpha, beta: parameters of the distribution
returns: Suite
"""
dist = make_gamma_dist(alpha, beta)
ps = dist.pdf(xs)
prior = Suite(dict(zip(xs, ps)))
prior.normalize()
return prior

Here's what it looks like.

```
In [40]:
```prior = make_gamma_suite(hypo_mu, alpha, beta)
prior.plot(label='gamma prior')
pdf_rate()

```
```

And we can update this prior using the observed data.

```
In [41]:
```posterior = prior.copy()
posterior.bayes_update(data=6, like_func=poisson_likelihood)
prior.plot(label='prior')
posterior.plot(label='posterior')
pdf_rate()

```
```

The results are substantially different from what we got with the uniform prior.

```
In [42]:
```suite.plot(label='posterior with uniform prior', color='gray')
posterior.plot(label='posterior with gamma prior', color='C1')
pdf_rate()

```
```

```
In [43]:
```posterior2 = posterior.copy()
posterior2.bayes_update(data=2, like_func=poisson_likelihood)
prior.plot(label='prior')
posterior.plot(label='posterior')
posterior2.plot(label='posterior2')
pdf_rate()

```
```

Or, starting with the original prior, we can update with both pieces of data at the same time.

```
In [44]:
```posterior3 = prior.copy()
posterior3.bayes_update(data=[6, 2], like_func=poisson_likelihood)
prior.plot(label='prior')
posterior.plot(label='posterior')
posterior2.plot(label='posterior2')
posterior3.plot(label='posterior3', linestyle='dashed')
pdf_rate()

```
```

I'm using a gamma distribution as a prior in part because it has a shape that seems credible based on what I know about hockey.

But it is also useful because it happens to be the conjugate prior of the Poisson distribution, which means that if the prior is gamma and we update with a Poisson likelihood function, the posterior is also gamma.

See https://en.wikipedia.org/wiki/Conjugate_prior#Discrete_distributions

And often we can compute the parameters of the posterior with very little computation. If we observe `x`

goals in `1`

game, the new parameters are `alpha+x`

and `beta+1`

.

```
In [45]:
```class GammaSuite:
"""Represents a gamma conjugate prior/posterior."""
def __init__(self, alpha, beta):
"""Initialize.
alpha, beta: parameters
dist: frozen distribution from scipy.stats
"""
self.alpha = alpha
self.beta = beta
self.dist = make_gamma_dist(alpha, beta)
def plot(self, xs, **options):
"""Plot the suite.
xs: locations where we should evaluate the PDF.
"""
ps = self.dist.pdf(xs)
ps /= np.sum(ps)
plot(xs, ps, **options)
def bayes_update(self, data):
return GammaSuite(self.alpha+data, self.beta+1)

Here's what the prior looks like using a `GammaSuite`

:

```
In [46]:
```gamma_prior = GammaSuite(alpha, beta)
gamma_prior.plot(hypo_mu, label='prior')
pdf_rate()
gamma_prior.dist.mean()

```
Out[46]:
```

And here's the posterior after one update.

```
In [47]:
```gamma_posterior = gamma_prior.bayes_update(6)
gamma_prior.plot(hypo_mu, label='prior')
gamma_posterior.plot(hypo_mu, label='posterior')
pdf_rate()
gamma_posterior.dist.mean()

```
Out[47]:
```

```
In [48]:
```gamma_prior.plot(hypo_mu, label='prior')
gamma_posterior.plot(hypo_mu, label='posterior conjugate')
posterior.plot(label='posterior grid', linestyle='dashed')
pdf_rate()

```
```

Ok, let's get to what is usually the point of this whole exercise, making predictions.

The prior represents what we believe about the distribution of `mu`

based on the data (and our prior beliefs).

Each value of `mu`

is a possible goal scoring rate.

For a given value of `mu`

, we can generate a distribution of goals scored in a particular game, which is Poisson.

But we don't have a given value of `mu`

, we have a whole bunch of values for `mu`

, with different probabilities.

So the posterior predictive distribution is a mixture of Poissons with different weights.

The simplest way to generate the posterior predictive distribution is to

Draw a random

`mu`

from the posterior distribution.Draw a random number of goals from

`Poisson(mu)`

.Repeat.

Here's a function that draws a sample from a posterior `Suite`

(the grid approximation, not `GammaSuite`

).

```
In [49]:
```def sample_suite(suite, size):
"""Draw a random sample from a Suite
suite: Suite object
size: sample size
"""
xs, ps = zip(*suite.items())
return np.random.choice(xs, size, replace=True, p=ps)

Here's a sample of `mu`

drawn from the posterior distribution (after one game).

```
In [50]:
```size = 10000
sample_post = sample_suite(posterior, size)
np.mean(sample_post)

```
Out[50]:
```

Here's what the posterior distribution looks like.

```
In [51]:
```plot_cdf(sample_post, label='posterior sample')
cdf_rates()

```
```

Now for each value of `mu`

in the posterior sample we draw one sample from `Poisson(mu)`

```
In [52]:
```sample_post_pred = np.random.poisson(sample_post)
np.mean(sample_post_pred)

```
Out[52]:
```

Here's what the posterior predictive distribution looks like.

```
In [53]:
```plot_pmf(sample_post_pred, label='posterior predictive sample')
pmf_goals()

```
```

The posterior predictive distribution represents uncertainty from two sources:

We don't know

`mu`

Even if we knew

`mu`

, we would not know the score of the next game.

It is tempting, but wrong, to generate a posterior prediction by taking the mean of the posterior distribution and drawing samples from `Poisson(mu)`

with just a single value of `mu`

.

That's wrong because it eliminates one of our sources of uncertainty.

Here's an example:

```
In [54]:
```mu_mean = np.mean(sample_post)
sample_post_pred_wrong = np.random.poisson(mu_mean, size)
np.mean(sample_post_pred_wrong)

```
Out[54]:
```

Here's what the samples looks like:

```
In [55]:
```plot_cdf(sample_post_pred, label='posterior predictive sample')
plot_cdf(sample_post_pred_wrong, label='incorrect posterior predictive')
cdf_goals()

```
```

In the incorrect predictive sample, low values and high values are slightly less likely.

The means are about the same:

```
In [56]:
```print(np.mean(sample_post_pred), np.mean(sample_post_pred_wrong))

```
```

But the standard deviation of the incorrect distribution is lower.

```
In [57]:
```print(np.std(sample_post_pred), np.std(sample_post_pred_wrong))

```
```

Ok, we are almost ready to use PyMC for its intended purpose, but first we are going to abuse it a little more.

Previously we used PyMC to draw a sample from a Poisson distribution with known `mu`

.

Now we'll use it to draw a sample from the prior distribution of `mu`

, with known `alpha`

and `beta`

.

We still have the values I estimated based on previous playoff finals:

```
In [58]:
```print(alpha, beta)

```
```

Now we can draw a sample from the prior predictive distribution:

```
In [59]:
```model = pm.Model()
with model:
mu = pm.Gamma('mu', alpha, beta)
trace = pm.sample_prior_predictive(1000)

This might not be a sensible way to use PyMC. If we just want to sample from the prior predictive distribution, we could use NumPy or SciPy just as well. We're doing this to develop and test the model incrementally.

So let's see if the sample looks right.

```
In [60]:
```sample_prior_pm = trace['mu']
np.mean(sample_prior_pm)

```
Out[60]:
```

```
In [61]:
```sample_prior = sample_suite(prior, 2000)
np.mean(sample_prior)

```
Out[61]:
```

```
In [62]:
```plot_cdf(sample_prior, label='prior')
plot_cdf(sample_prior_pm, label='prior pymc')
cdf_rates()

```
```

It looks pretty good (although not actually as close as I expected).

```
In [63]:
```model = pm.Model()
with model:
mu = pm.Gamma('mu', alpha, beta)
goals = pm.Poisson('goals', mu, observed=[6])
trace = pm.sample_prior_predictive(2000)

```
In [64]:
```sample_prior_pred_pm = trace['goals'].flatten()
np.mean(sample_prior_pred_pm)

```
Out[64]:
```

```
In [65]:
```sample_prior_pred = np.random.poisson(sample_prior)
np.mean(sample_prior_pred)

```
Out[65]:
```

Looks good.

```
In [66]:
```plot_cdf(sample_prior_pred, label='prior pred')
plot_cdf(sample_prior_pred_pm, label='prior pred pymc')
cdf_goals()

```
```

```
In [67]:
```model = pm.Model()
with model:
mu = pm.Gamma('mu', alpha, beta)
goals = pm.Poisson('goals', mu, observed=6)
trace = pm.sample(2000, tune=1000)

```
```

`goals`

fixed, the only unknown is `mu`

, so `trace`

contains a sample drawn from the posterior distribution of `mu`

. We can plot the posterior using a function provided by PyMC:

```
In [68]:
```pm.plot_posterior(trace)
pdf_rate()

```
```

And we can extract a sample from the posterior of `mu`

```
In [69]:
```sample_post_pm = trace['mu']
np.mean(sample_post_pm)

```
Out[69]:
```

And compare it to the sample we drew from the grid approximation:

```
In [70]:
```plot_cdf(sample_post, label='posterior grid')
plot_cdf(sample_post_pm, label='posterior pymc')
cdf_rates()

```
```

Again, it looks pretty good.

To generate a posterior predictive distribution, we can use `sample_ppc`

```
In [71]:
```with model:
post_pred = pm.sample_ppc(trace, samples=2000)

```
```

Here's what it looks like:

```
In [72]:
```sample_post_pred_pm = post_pred['goals']

```
Out[72]:
```

```
In [73]:
```sample_post_pred_pm = post_pred['goals']
np.mean(sample_post_pred_pm)

```
Out[73]:
```

```
In [74]:
```plot_cdf(sample_post_pred, label='posterior pred grid')
plot_cdf(sample_post_pred_pm, label='posterior pred pm')
cdf_goals()

```
```

Look's pretty good!

So far, all of this is based on a gamma prior. To choose the parameters of the prior, I used data from previous Stanley Cup finals and computed a maximum likelihood estimate (MLE). But that's not correct, because

- It assumes that the observed goal counts are the long-term goal-scoring rates.
- It treats
`alpha`

and`beta`

as known values rather than parameters to estimate.

In other words, I have ignored two important sources of uncertainty. As a result, my predictions are almost certainly too confident.

The solution is a hierarchical model, where `alpha`

and `beta`

are the parameters that control `mu`

and `mu`

is the parameter that controls `goals`

. Then we can use observed `goals`

to update the distributions of all three unknown parameters.

Of course, now we need a prior distribution for `alpha`

and `beta`

. A common choice is the half Cauchy distribution (see Gelman), but on advice of counsel, I'm going with exponential.

```
In [75]:
```sample = pm.Exponential.dist(lam=1).random(size=1000)
plot_cdf(sample)
plt.xscale('log')
plt.xlabel('Parameter of a gamma distribution')
plt.ylabel('CDF')
np.mean(sample)

```
Out[75]:
```

This distribution represents radical uncertainty about the value of this distribution: it's probably between 0.1 and 10, but it could be really big or really small.

Here's a PyMC model that generates `alpha`

and `beta`

from an exponential distribution.

```
In [76]:
```model = pm.Model()
with model:
alpha = pm.Exponential('alpha', lam=1)
beta = pm.Exponential('beta', lam=1)
trace = pm.sample_prior_predictive(1000)

Here's what the distributions of `alpha`

and `beta`

look like.

```
In [77]:
```sample_prior_alpha = trace['alpha']
plot_cdf(sample_prior_alpha, label='alpha prior')
sample_prior_beta = trace['beta']
plot_cdf(sample_prior_beta, label='beta prior')
plt.xscale('log')
plt.xlabel('Parameter of a gamma distribution')
plt.ylabel('CDF')
np.mean(sample_prior_alpha)

```
Out[77]:
```

Now that we have `alpha`

and `beta`

, we can generate `mu`

.

```
In [78]:
```model = pm.Model()
with model:
alpha = pm.Exponential('alpha', lam=1)
beta = pm.Exponential('beta', lam=1)
mu = pm.Gamma('mu', alpha, beta)
trace = pm.sample_prior_predictive(1000)

Here's what the prior distribution of `mu`

looks like.

```
In [79]:
```sample_prior_mu = trace['mu']
plot_cdf(sample_prior_mu, label='mu prior hierarchical')
cdf_rates()
np.mean(sample_prior_mu)

```
Out[79]:
```

In effect, the model is saying "I have never seen a hockey game before. As far as I know, it could be soccer, could be basketball, could be pinball."

If we zoom in on the range 0 to 10, we can compare the prior implied by the hierarchical model with the gamma prior I hand picked.

```
In [80]:
```plot_cdf(sample_prior_mu, label='mu prior hierarchical')
plot_cdf(sample_prior, label='mu prior', color='gray')
plt.xlim(0, 10)
cdf_rates()

```
```

Obviously, they are very different. They agree that the most likely values are less than 10, but the hierarchical model admits the possibility that `mu`

could be orders of magnitude bigger.

Crazy as it sounds, that's probably what we want in a non-committal prior.

Ok, last step of the forward process, let's generate some goals.

```
In [81]:
```model = pm.Model()
with model:
alpha = pm.Exponential('alpha', lam=1)
beta = pm.Exponential('beta', lam=1)
mu = pm.Gamma('mu', alpha, beta)
goals = pm.Poisson('goals', mu)
trace = pm.sample_prior_predictive(1000)

Here's the prior predictive distribution of goals.

```
In [82]:
```sample_prior_goals = trace['goals']
plot_cdf(sample_prior_goals, label='goals prior')
cdf_goals()
np.mean(sample_prior_goals)

```
Out[82]:
```

To see whether that distribution is right, I ran samples using SciPy.

```
In [83]:
```def forward_hierarchical(size=1):
alpha = st.expon().rvs(size=size)
beta = st.expon().rvs(size=size)
mu = st.gamma(a=alpha, scale=1/beta).rvs(size=size)
goals = st.poisson(mu).rvs(size=size)
return goals[0]
sample_prior_goals_st = [forward_hierarchical() for i in range(1000)];

```
In [84]:
```plot_cdf(sample_prior_goals, label='goals prior')
plot_cdf(sample_prior_goals_st, label='goals prior scipy')
cdf_goals()
plt.xlim(0, 50)
plt.legend(loc='lower right')
np.mean(sample_prior_goals_st)

```
Out[84]:
```

```
In [85]:
```model = pm.Model()
with model:
alpha = pm.Exponential('alpha', lam=1)
beta = pm.Exponential('beta', lam=1)
mu = pm.Gamma('mu', alpha, beta)
goals = pm.Poisson('goals', mu, observed=[6])
trace = pm.sample(1000, tune=2000, nuts_kwargs=dict(target_accept=0.99))

```
```

`mu`

. The posterior mean is close to the observed value, which is what we expect with a weakly informative prior.

```
In [86]:
```sample_post_mu = trace['mu']
plot_cdf(sample_post_mu, label='mu posterior')
cdf_rates()
np.mean(sample_post_mu)

```
Out[86]:
```

```
In [87]:
```model = pm.Model()
with model:
alpha = pm.Exponential('alpha', lam=1)
beta = pm.Exponential('beta', lam=1)
mu_VGK = pm.Gamma('mu_VGK', alpha, beta)
mu_WSH = pm.Gamma('mu_WSH', alpha, beta)
goals_VGK = pm.Poisson('goals_VGK', mu_VGK, observed=[6])
goals_WSH = pm.Poisson('goals_WSH', mu_WSH, observed=[4])
trace = pm.sample(1000, tune=2000, nuts_kwargs=dict(target_accept=0.95))

```
```

We can use `traceplot`

to review the results and do some visual diagnostics.

```
In [88]:
```pm.traceplot(trace);

```
```

Here are the posterior distribitions for `mu_WSH`

and `mu_VGK`

.

```
In [89]:
```sample_post_mu_WSH = trace['mu_WSH']
plot_cdf(sample_post_mu_WSH, label='mu_WSH posterior')
sample_post_mu_VGK = trace['mu_VGK']
plot_cdf(sample_post_mu_VGK, label='mu_VGK posterior')
cdf_rates()
np.mean(sample_post_mu_WSH), np.mean(sample_post_mu_VGK)

```
Out[89]:
```

```
In [90]:
```np.mean(sample_post_mu_VGK > sample_post_mu_WSH)

```
Out[90]:
```

```
In [91]:
```data = dict(BOS13 = [2, 1, 2],
CHI13 = [0, 3, 3],
NYR14 = [0, 2],
LAK14 = [3, 1],
TBL15 = [1, 4, 3, 1, 1, 0],
CHI15 = [2, 3, 2, 2, 2, 2],
SJS16 = [2, 1, 4, 1],
PIT16 = [3, 3, 2, 3],
NSH17 = [3, 1, 5, 4, 0, 0],
PIT17 = [5, 4, 1, 1, 6, 2],
VGK18 = [6,2,1],
WSH18 = [4,3,3],
)

```
Out[91]:
```

Here's how we can get the data into the model.

```
In [92]:
```model = pm.Model()
with model:
alpha = pm.Exponential('alpha', lam=1)
beta = pm.Exponential('beta', lam=1)
mu = dict()
goals = dict()
for name, observed in data.items():
mu[name] = pm.Gamma('mu_'+name, alpha, beta)
goals[name] = pm.Poisson(name, mu[name], observed=observed)
trace = pm.sample(1000, tune=2000, nuts_kwargs=dict(target_accept=0.95))

```
```

And here are the results.

```
In [93]:
```pm.traceplot(trace);

```
```