Think Bayes

This notebook presents example code and exercise solutions for Think Bayes.

Copyright 2018 Allen B. Downey

MIT License: https://opensource.org/licenses/MIT


In [1]:
# Configure Jupyter so figures appear in the notebook
%matplotlib inline

# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'

# import classes from thinkbayes2
from thinkbayes2 import Hist, Pmf, Suite

Here's the original statement of the cookie problem:

Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies. Bowl 2 contains 20 of each.

Now suppose you choose one of the bowls at random and, without looking, select a cookie at random. The cookie is vanilla. What is the probability that it came from Bowl 1?
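For a single draw, Bayes's theorem answers this directly. Here's the arithmetic as a minimal sketch in plain Python (no thinkbayes2 required; the variable names are just for illustration):

prior1 = prior2 = 0.5              # each bowl is equally likely a priori

like1 = 30 / 40                    # P(vanilla | Bowl 1)
like2 = 20 / 40                    # P(vanilla | Bowl 2)

# Bayes's theorem: the posterior is proportional to prior times likelihood
posterior1 = prior1 * like1 / (prior1 * like1 + prior2 * like2)
posterior1                         # 0.6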

If we only draw one cookie, the problem is simple, as the sketch above shows. But if we draw more than one cookie, there is a complication: do we replace the cookie after each draw, or not?

If we replace the cookie, the proportion of vanilla and chocolate cookies stays the same, and we can perform multiple updates with the same likelihood function.
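For reference, here's a minimal sketch of the with-replacement version, modeled on the book's Cookie class (the class name and the mixes dict follow the book; treat the details as a sketch, not a verbatim copy):

class Cookie(Suite):
    # With replacement, the mix of flavors in each bowl never changes,
    # so the likelihoods can be stored as fixed proportions.
    mixes = {
        'Bowl 1': dict(vanilla=0.75, chocolate=0.25),
        'Bowl 2': dict(vanilla=0.5, chocolate=0.5),
    }

    def Likelihood(self, data, hypo):
        return self.mixes[hypo][data]

suite = Cookie(['Bowl 1', 'Bowl 2'])
suite.Update('vanilla')    # every update reuses the same likelihood function
suite.Print()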

If we don't replace the cookie, the proportions change and we have to keep track of the number of cookies in each bowl.

Exercise:

Modify the solution from the book to handle selection without replacement.

Hint: Add instance variables to the Cookie class to represent the hypothetical state of the bowls, and modify the Likelihood function accordingly.

To represent the state of a Bowl, you might want to use the Hist class from thinkbayes2.


In [2]:
# Solution

# We'll need an object to keep track of the number of cookies in each bowl. 
# I use a Hist object, defined in thinkbayes2:

bowl1 = Hist(dict(vanilla=30, chocolate=10))
bowl2 = Hist(dict(vanilla=20, chocolate=20))

bowl1.Print()


chocolate 10
vanilla 30

In [3]:
# Solution

# Now I'll make a Pmf that contains the two bowls, giving them equal probability.

pmf = Pmf([bowl1, bowl2])
pmf.Print()


Hist({'vanilla': 30, 'chocolate': 10}) 0.5
Hist({'vanilla': 20, 'chocolate': 20}) 0.5

In [4]:
# Solution

# Here's a likelihood function that takes `hypo`, which is one of 
# the Hist objects that represents a bowl, and `data`, which is either 
# 'vanilla' or 'chocolate'.

# `likelihood` computes the likelihood of the data under the hypothesis,
# and as a side effect, it removes one cookie of the observed flavor
# from `hypo`.

def likelihood(hypo, data):
    like = hypo[data] / hypo.Total()
    if like:
        # only remove a cookie if the bowl still contains that flavor
        hypo[data] -= 1
    return like

In [5]:
# Solution

# Now for the update.  We loop through the hypotheses, multiply each
# prior probability by the likelihood of the data under that hypothesis,
# and then normalize.

def update(pmf, data):
    for hypo in pmf:
        pmf[hypo] *= likelihood(hypo, data)
    return pmf.Normalize()

In [6]:
# Solution

# Here's the first update.  The posterior probabilities are the
# same as in the with-replacement version, but notice that the number
# of cookies in each Hist has been updated.

update(pmf, 'vanilla')
pmf.Print()


Hist({'vanilla': 29, 'chocolate': 10}) 0.6000000000000001
Hist({'vanilla': 19, 'chocolate': 20}) 0.4

In [7]:
# Solution

# So when we update again with a chocolate cookie, we get different
# likelihoods, and different posteriors.

update(pmf, 'chocolate')
pmf.Print()


Hist({'vanilla': 29, 'chocolate': 9}) 0.4285714285714286
Hist({'vanilla': 19, 'chocolate': 19}) 0.5714285714285714

In [8]:
# Solution

# If we draw 10 more chocolate cookies, that eliminates Bowl 1 completely:
# only 9 chocolate cookies remain in it, so the 10th chocolate draw
# has likelihood 0 under that hypothesis.

for i in range(10):
    update(pmf, 'chocolate')
    print(pmf[bowl1])


0.2621359223300971
0.13636363636363635
0.06104651162790699
0.023800528900642243
0.008061420345489446
0.002316602316602318
0.0005355548943766738
8.929900282780178e-05
8.118750253710948e-06
0.0
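
As a sanity check (a hypothetical extra cell, not part of the original solution), we can inspect the hypothetical state of the bowls. Bowl 1's chocolate count should have reached zero, which is why its posterior probability is exactly 0, while Bowl 2 still has chocolate cookies left:

bowl1.Print()    # expect chocolate 0: another chocolate draw is impossible
bowl2.Print()    # Bowl 2 should still have chocolate cookies remaining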