# Think Bayes

Second Edition

``````

In [1]:

# If we're running on Colab, install empiricaldist
# https://pypi.org/project/empiricaldist/

import sys

IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    !pip install empiricaldist

``````
``````

In [2]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from empiricaldist import Pmf, Cdf
from utils import decorate

``````

## The Beta distribution

The Beta distribution is a conjugate prior for the binomial likelihood function, which means that if you start with a Beta distribution and update with a binomial likelihood, the posterior is also Beta.

Also, given the parameters of the prior and the data, we can compute the parameters of the posterior directly. The following class represents a Beta distribution and provides a constant-time `update` method.
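Concretely, with a $\mathrm{Beta}(\alpha, \beta)$ prior and data consisting of $h$ heads and $t$ tails, the binomial likelihood multiplies the prior density:

$$
x^{\alpha-1}(1-x)^{\beta-1} \cdot x^{h}(1-x)^{t} = x^{\alpha+h-1}(1-x)^{\beta+t-1},
$$

which is proportional to a $\mathrm{Beta}(\alpha+h,\ \beta+t)$ density. So the update just adds the data to the parameters.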

``````

In [23]:

import random
import scipy.stats

class Beta:
    """Represents a Beta distribution.

    See http://en.wikipedia.org/wiki/Beta_distribution
    """
    def __init__(self, alpha=1, beta=1):
        """Initializes a Beta distribution."""
        self.alpha = alpha
        self.beta = beta

    def update(self, data):
        """Updates a Beta distribution.

        data: pair of int (heads, tails)
        """
        heads, tails = data
        self.alpha += heads
        self.beta += tails

    def mean(self):
        """Computes the mean of this distribution."""
        return self.alpha / (self.alpha + self.beta)

    def mode(self):
        """Computes the value with maximum a posteriori probability."""
        a = self.alpha - 1
        b = self.beta - 1
        return a / (a + b)

    def random(self):
        """Generates a random variate from this distribution."""
        return random.betavariate(self.alpha, self.beta)

    def sample(self, n):
        """Generates a random sample from this distribution.

        n: int sample size
        """
        return np.random.beta(self.alpha, self.beta, size=n)

    def eval_pdf(self, qs):
        """Evaluates the PDF at qs."""
        return scipy.stats.beta.pdf(qs, self.alpha, self.beta)

    def eval_cdf(self, qs):
        """Evaluates the CDF at qs."""
        return scipy.stats.beta.cdf(qs, self.alpha, self.beta)

    def quantile(self, ps):
        """Returns the given percentiles from this distribution.

        ps: scalar, array, or list of [0-1]
        """
        return scipy.stats.beta.ppf(ps, self.alpha, self.beta)

    def make_pmf(self, num=101):
        """Returns a Pmf of this distribution.

        Note: Normally, we just evaluate the PDF at a sequence
        of points and treat the probability density as a probability
        mass.

        But if alpha or beta is less than one, we have to be
        more careful because the PDF goes to infinity at x=0
        and x=1.  In that case we evaluate the CDF and compute
        differences.

        The result is a little funny, because the values at 0 and 1
        are not symmetric.  Nevertheless, it is a reasonable discrete
        model of the continuous distribution, and behaves well as
        the number of values increases.
        """
        if self.alpha < 1 or self.beta < 1:
            cdf = self.make_cdf()
            pmf = cdf.make_pmf()
            return pmf

        qs = np.linspace(0, 1, num)
        ps = self.eval_pdf(qs)
        return Pmf(ps, qs)

    def make_cdf(self, num=101):
        """Returns the CDF of this distribution."""
        qs = np.linspace(0, 1, num)
        ps = self.eval_cdf(qs)
        return Cdf(ps, qs)

``````
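To see why `make_pmf` takes the CDF-difference path, here is a sketch (assuming `scipy.stats`, not part of the book's code): for `Beta(0.5, 0.5)` the PDF diverges at 0 and 1, so evaluating it on a grid that includes the endpoints yields infinities, while differencing the CDF gives finite probability masses that sum to 1.

```python
import numpy as np
from scipy.stats import beta

qs = np.linspace(0, 1, 101)

# The PDF blows up at the endpoints when alpha, beta < 1
pdf = beta.pdf(qs, 0.5, 0.5)
print(np.isinf(pdf[0]), np.isinf(pdf[-1]))

# Differencing the CDF yields finite masses that sum to 1
cdf = beta.cdf(qs, 0.5, 0.5)
masses = np.diff(cdf)
print(np.isfinite(masses).all(), masses.sum())
```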

Here's how we use it.

``````

In [24]:

beta = Beta(1, 1)
beta.update((140, 110))
beta.mean()

``````
``````

Out[24]:

0.5595238095238095

``````
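As a sanity check (a sketch, not part of the book's code): the closed-form posterior mean $(\alpha+h)/(\alpha+\beta+h+t) = 141/252$ matches the output above, and a brute-force grid update with a uniform prior agrees.

```python
import numpy as np

alpha, beta_param = 1, 1
heads, tails = 140, 110

# Closed-form posterior mean after the conjugate update
mean_closed = (alpha + heads) / (alpha + beta_param + heads + tails)

# Grid approximation: uniform prior times binomial likelihood
qs = np.linspace(0, 1, 1001)
likelihood = qs**heads * (1 - qs)**tails
ps = likelihood / likelihood.sum()
mean_grid = np.sum(ps * qs)

print(mean_closed)   # 141/252
print(abs(mean_closed - mean_grid) < 1e-4)
```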

And here's the posterior.

``````

In [25]:

beta.make_pmf().plot(label='beta')
decorate_euro()

``````
``````

``````

Amazing, no?

Exercise: One way to construct priors is to make a Beta distribution and adjust the parameters until it has the shape you want. Then when you do an update, the data get added to the parameters of the prior. Since the parameters of the prior play the same mathematical role as the data, they are sometimes called "precounts".

Suppose you believe that most coins are fair or unlikely to deviate from 50% by more than a few percentage points. Construct a prior that captures this belief and update it with the Euro data. How much effect does it have on the posterior, compared to the uniform prior?

Hint: A Beta distribution with parameters `(1, 1)` is uniform from 0 to 1.
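One way to calibrate the precounts (a sketch using `scipy.stats`, not the book's solution): choose symmetric parameters $\alpha = \beta = a$ and increase $a$ until most of the prior mass falls within a few percentage points of 50%.

```python
from scipy.stats import beta

# Central 95% interval of symmetric Beta(a, a) priors:
# larger precounts concentrate the prior around 0.5
for a in [1, 10, 100, 1000]:
    low, high = beta.ppf([0.025, 0.975], a, a)
    print(a, round(low, 3), round(high, 3))
```

With `a = 100` the interval spans only a few percentage points around 0.5, which is roughly the belief the exercise describes.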

``````

In [26]:

# Solution

# Here's the uniform prior

uniform = Beta(1, 1)
uniform.make_pmf().plot(label='uniform')
decorate_euro()

``````
``````

``````
``````

In [27]:

# Solution

# And here's what it looks like after the update

data = 140, 110    # the Euro data: 140 heads, 110 tails
uniform.update(data)
uniform.make_pmf().plot(label='posterior')
decorate_euro()

``````
``````

``````
``````

In [28]:

# Solution

# Here's a beta prior with precounts chosen to represent
# a belief that the coin is probably close to fair

beta = Beta(100, 100)
beta.make_pmf().plot(label='precount')
decorate_euro()

``````
``````

``````
``````

In [29]:

# Solution

# And here's what it looks like after the update

beta.update(data)
beta.make_pmf().plot(label='precount')
decorate_euro()

``````
``````

``````
``````

In [30]:

# Solution

# Comparing the two, we see that the informative
# prior influences the location and spread of the
# posterior.

uniform.make_pmf().plot(label='uniform')
beta.make_pmf().plot(label='precount')
decorate_euro()

``````
``````

``````
``````

In [31]:

beta.mean()

``````
``````

Out[31]:

0.5333333333333333

``````
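A sketch of the arithmetic behind that number (not part of the book's code): updating Beta(100, 100) with 140 heads and 110 tails gives Beta(240, 210), whose mean is pulled toward 0.5 compared with the uniform-prior posterior Beta(141, 111).

```python
# Posterior after updating Beta(100, 100) with 140 heads, 110 tails
alpha_post = 100 + 140
beta_post = 100 + 110
mean_informative = alpha_post / (alpha_post + beta_post)

# For comparison, the uniform prior gives Beta(141, 111)
mean_uniform = 141 / 252

print(mean_informative)   # 240/450
print(mean_informative < mean_uniform)
```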