This notebook demonstrates the use of Pmf and Suite objects, and explores a dice-rolling problem that turns out to be more complicated than it sounds.

Copyright 2016 Allen Downey

MIT License: http://opensource.org/licenses/MIT

In [1]:

```
from __future__ import print_function, division
from thinkbayes2 import Pmf, Suite
from thinkbayes2 import MakeMixture
from fractions import Fraction
```

Suppose I have a six-sided die that is red on 2 sides and blue on 4 sides, and another die that's the other way around, red on 4 sides and blue on 2.

I choose a die at random and roll it, and I tell you it came up red. What is the probability that I rolled the second die (red on 4 sides)?

To answer this question, I'll create `Pmf`

objects to represent the two dice:

In [2]:

```
d1 = Pmf({'Red':Fraction(2), 'Blue':Fraction(4)}, label='d1 (bluish)')
d1.Print()
```

In [3]:

```
d2 = Pmf({'Red':Fraction(4), 'Blue':Fraction(2)}, label='d2 (reddish)')
d2.Print()
```

And I'll make another `Pmf` to represent the random choice of one die or the other.

In [4]:

```
dice = Pmf({d1:Fraction(1), d2:Fraction(1)})
dice.Print()
```

Now I can use the `Random` method to choose a die and then roll it.

In [5]:

```
dice.Random().Random()
```

In Scenario A, I choose a new die before every roll:

In [6]:

```
def rollA(dice):
    while True:
        die = dice.Random()
        roll = die.Random()
        yield roll
```

We can use this generator to simulate rolls:

In [7]:

```
iterA = rollA(dice)
for i in range(5):
    print(next(iterA))
```

In the long run, the proportion of red and blue is 50-50.

In [8]:

```
Pmf(next(iterA) for i in range(1000))
```

We can see that by computing the weighted mixture of the two dice:

In [9]:

```
MakeMixture(dice).Print()
```

Next, here's a `Suite` whose likelihood function returns the probability of the observed color on the hypothetical die:

In [10]:

```
class Dice(Suite):
    def Likelihood(self, data, hypo):
        """
        data: 'Red' or 'Blue'
        hypo: a Pmf representing a die
        """
        return hypo[data]
```

Now I can create a suite that represents the prior distribution.

In [11]:

```
prior = Dice({d1:Fraction(1), d2:Fraction(1)})
prior.Print()
```

And update it with the data.

In [12]:

```
posterior = prior.Copy()
posterior.Update('Red')
posterior.Print()
```

The posterior probability for `d1` is `1/3` and the posterior probability for `d2` is `2/3`.

Intuitively, the posterior probability of `d2` is higher than the prior (which was `1/2`) because the outcome (red) is more likely on `d2` than on `d1`. If we had rolled blue, the probability of `d1` would be higher.
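We can confirm this result by applying Bayes's theorem directly; here's a quick sketch using only the standard library's `Fraction`, independent of the thinkbayes2 machinery:

```python
from fractions import Fraction

# Prior: each die is equally likely to be chosen.
prior_d1 = Fraction(1, 2)
prior_d2 = Fraction(1, 2)

# Likelihood of red: 2 of 6 sides on d1, 4 of 6 on d2.
like_d1 = Fraction(2, 6)
like_d2 = Fraction(4, 6)

# The posterior is proportional to prior times likelihood.
total = prior_d1 * like_d1 + prior_d2 * like_d2
post_d1 = prior_d1 * like_d1 / total
post_d2 = prior_d2 * like_d2 / total
print(post_d1, post_d2)  # 1/3 2/3
```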

Now suppose I ask you to predict the outcome of the next roll. Remember that in this scenario, I choose a new die each time, so what you learned on the first roll doesn't help you with the second. The predictive distribution is a weighted mixture of the dice, using the prior weights:

In [13]:

```
predictive = MakeMixture(prior)
predictive.Print()
```

In Scenario B, I choose one die and roll the same die every time:

In [14]:

```
def rollB(dice):
    die = dice.Random()
    while True:
        roll = die.Random()
        yield roll
```

In the long run, the proportion of red is either `1/3` or `2/3`, not `1/2`:

In [15]:

```
iterB = rollB(dice)
Pmf(next(iterB) for i in range(1000))
```

After the first roll, the posterior suite is the same as in the previous scenario:

In [16]:

```
posterior = prior.Copy()
posterior.Update('Red')
posterior.Print()
```

In this scenario, we know we are going to roll the same die each time, so the information we learned from the first roll informs our prediction for the second.

Specifically, now the predictive distribution is based on the posterior, not the prior:

In [17]:

```
predictive = MakeMixture(posterior)
predictive.Print()
```

Having seen one red, we are more inclined to believe that I am rolling `d2`, so we are more inclined to predict that I will roll red again.

If I do roll red again, we can update the posterior again, using the new data, and make a new prediction:

In [18]:

```
posterior.Update('Red')
posterior.Print()
```

In [19]:

```
predictive = MakeMixture(posterior)
predictive.Print()
```

If we continue this process, we will eventually be confident that we know which die is being rolled:

In [20]:

```
posterior = prior.Copy()
for i in range(10):
    posterior.Update(next(iterB))
posterior.Print()
```

And the predictive distribution will be close to `1/3` or `2/3`, depending on which die we think it is.

In [21]:

```
predictive = MakeMixture(posterior)
predictive.Print()
```

In Scenario C, I choose a new die each time and roll it, but I only report the outcome when it comes up red:

In [22]:

```
def rollC(dice):
    while True:
        die = dice.Random()
        roll = die.Random()
        if roll == 'Red':
            yield roll
```

In this scenario, obviously, the outcome is always red:

In [23]:

```
iterC = rollC(dice)
Pmf(next(iterC) for i in range(1000))
```

But now suppose I ask you about the last die I rolled. What is the probability that it is the reddish die, `d2`?

On each roll, there are four possible results, with these probabilities:

```
d1, red 1/2 * 1/3
d1, blue 1/2 * 2/3
d2, red 1/2 * 2/3
d2, blue 1/2 * 1/3
```

On the last roll, I tell you that the outcome is red, so we are left with two possibilities:

```
d1, red 1/2 * 1/3
d2, red 1/2 * 2/3
```

The likelihood ratio is `2` to `1`, so we can use that to update the prior:

In [24]:

```
posterior = prior.Copy()
posterior[d1] *= 1
posterior[d2] *= 2
posterior.Normalize()
posterior.Print()
```

That's the same posterior we saw in Scenarios A and B. So even though we knew the outcome would be red, we still learn something about the die.
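As a sanity check on Scenario C, here's a small simulation (a sketch using the standard library's `random` module, not part of the original notebook): choose a die, roll it, and record which die produced each red. The fraction of reds that came from `d2` should be close to `2/3`.

```python
import random

random.seed(17)

# Represent each die as a list of its face colors.
d1_faces = ['Red'] * 2 + ['Blue'] * 4   # bluish die
d2_faces = ['Red'] * 4 + ['Blue'] * 2   # reddish die

from_d2 = 0
num_reds = 0
while num_reds < 100_000:
    # Choose a die at random, then roll it once.
    name, faces = random.choice([('d1', d1_faces), ('d2', d2_faces)])
    if random.choice(faces) == 'Red':
        num_reds += 1
        if name == 'd2':
            from_d2 += 1

print(from_d2 / num_reds)  # close to 2/3
```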

Of course, now the predictive distribution is always red:

In [25]:

```
predictive = Pmf({'Red':1.0})
predictive.Print()
```

In Scenario D, I choose one die and roll it repeatedly, but again I only report the outcome when it comes up red:

In [26]:

```
def rollD(dice):
    die = dice.Random()
    while True:
        roll = die.Random()
        if roll == 'Red':
            yield roll
```

Again, obviously, the outcome is always red:

In [27]:

```
iterD = rollD(dice)
Pmf(next(iterD) for i in range(1000))
```

But now the probability of getting red is the same regardless of which die I chose. On average, it takes longer to get to red if I chose `d1`, but if I only tell you the outcome and don't tell you how many times I rolled, you have no way of knowing which die I chose.

So the posterior is the same as the prior.

In [28]:

```
posterior = prior.Copy()
posterior.Print()
```

If you are not sure about that argument (and after all this I don't blame you), see below for a more persuasive argument.
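A simulation can also make the point (again a sketch with the standard library's `random` module): in each trial, choose a die and roll until red, then record which die was chosen. Since every trial eventually produces a red, the reported outcome carries no information, and the fraction of trials that used `d1` stays near the prior, `1/2`.

```python
import random

random.seed(17)

d1_faces = ['Red'] * 2 + ['Blue'] * 4   # bluish die
d2_faces = ['Red'] * 4 + ['Blue'] * 2   # reddish die

d1_count = 0
trials = 100_000
for _ in range(trials):
    name, faces = random.choice([('d1', d1_faces), ('d2', d2_faces)])
    # Roll until red; this always terminates, whichever die was chosen.
    while random.choice(faces) != 'Red':
        pass
    if name == 'd1':
        d1_count += 1

print(d1_count / trials)  # close to 1/2
```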

And the predictive distribution is, again, always red.

In [29]:

```
predictive = Pmf({'Red':1.0})
```

Recall that in Scenario D I choose a die and then roll it until I get red. I claim:

If you know how many times I rolled before getting red, you get information about which die it was.

If you don't know how many times I rolled, you get no information. That is, the posterior probabilities for `d1` and `d2` are the same as the priors.

To demonstrate the first part, here's a `Suite` that takes as data the number of times I rolled (including the last one that came up red):

In [30]:

```
class ScenarioD(Suite):
    def Likelihood(self, data, hypo):
        """
        data: k, number of times I rolled to get a Red
        hypo: a Pmf representing a die
        """
        p = hypo['Red']
        k = data
        return (1-p)**(k-1) * p
```

The likelihood is the geometric PMF with probability `p`.

If you know I got red on the first try, that's evidence in favor of `d2`:

In [31]:

```
suite = ScenarioD([d1, d2])
suite.Update(1)
suite.Print()
```

If you know I got it on the second try, that's equally likely with `d1` or `d2`:

In [32]:

```
suite = ScenarioD([d1, d2])
suite.Update(2)
suite.Print()
```

If it takes three tries or more, that's evidence for `d1`.

In [33]:

```
suite = ScenarioD([d1, d2])
suite.Update(3)
suite.Print()
```

If you don't know how many times I rolled, you have to compute the weighted sum of these posterior probabilities.

$P(\mbox{d1} ~|~ \mbox{Red}) = \sum_{k=1}^\infty P(k)~P(\mbox{d1} ~|~ k)$

Suppose $q$ is the probability of choosing d1, $p_1$ is the probability of red on d1, and $p_2$ is the probability of getting red on d2.

The chance of getting the first red on the $k$th roll is

$P(k) = q (1-p_1)^{k-1} p_1 + (1-q) (1-p_2)^{k-1} p_2$

And the probability that we rolled d1 given $k$ is

$P(\mbox{d1} ~|~ k) = \frac{q (1-p_1)^{k-1} p_1}{q (1-p_1)^{k-1} p_1 + (1-q) (1-p_2)^{k-1} p_2}$

The denominator of that fraction is $P(k)$, so when we compute the product, $P(k)$ drops out:

$P(k)~P(\mbox{d1} ~|~ k) = q (1-p_1)^{k-1} p_1$

So we can write $P(\mbox{d1} ~|~ \mbox{Red})$ like this (moving the factor of $q$ out of the summation):

$P(\mbox{d1} ~|~ \mbox{Red}) = q \sum_{k=1}^\infty (1-p_1)^{k-1} p_1$

The quantity inside the summation is the PMF of a geometric distribution, which sums to 1. So:

$P(\mbox{d1} ~|~ \mbox{Red}) = q$
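We can also check this algebra numerically: the partial sums of $q (1-p_1)^{k-1} p_1$ converge to $q$. Here's a quick sketch with `Fraction`, using $q = 1/2$ and $p_1 = 1/3$:

```python
from fractions import Fraction

q = Fraction(1, 2)    # prior probability of choosing d1
p1 = Fraction(1, 3)   # probability of red on d1

# Partial sum of q * geometric PMF; the geometric terms sum to 1,
# so the total approaches q as more terms are included.
total = sum(q * (1 - p1)**(k - 1) * p1 for k in range(1, 101))
print(float(total))  # effectively 0.5
```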

The posterior probability of `d1` after rolling an eventual red is the same as the prior, which confirms our intuition that we learn nothing about the die from this data.
