A dice problem

This notebook demonstrates the use of Pmf and Suite objects, and explores a dice-rolling problem that turns out to be more complicated than it sounds.

Copyright 2016 Allen Downey

MIT License: http://opensource.org/licenses/MIT


In [1]:
from __future__ import print_function, division

from thinkbayes2 import Pmf, Suite
from thinkbayes2 import MakeMixture

from fractions import Fraction

A dice problem

Suppose I have a six-sided die that is red on 2 sides and blue on 4 sides, and another die that's the other way around, red on 4 sides and blue on 2.

I choose a die at random and roll it, and I tell you it came up red. What is the probability that I rolled the second die (red on 4 sides)?

To answer this question, I'll create Pmf objects to represent the two dice:


In [2]:
d1 = Pmf({'Red':Fraction(2), 'Blue':Fraction(4)}, label='d1 (bluish) ')
d1.Print()


Blue 2/3
Red 1/3

In [3]:
d2 = Pmf({'Red':Fraction(4), 'Blue':Fraction(2)}, label='d2 (reddish)')
d2.Print()


Blue 1/3
Red 2/3

And I'll make another Pmf to represent the random choice of one die or the other.


In [4]:
dice = Pmf({d1:Fraction(1), d2:Fraction(1)})
dice.Print()


d2 (reddish) 1/2
d1 (bluish)  1/2

Now I can use the Random method to choose a die and then roll it.


In [5]:
dice.Random().Random()


Out[5]:
'Blue'

Scenario A

The following generator simulates the process of repeatedly choosing a die and then rolling it.


In [6]:
def rollA(dice):
    while True:
        die = dice.Random()
        roll = die.Random()
        yield roll

We can use this generator to simulate rolls:


In [7]:
iterA = rollA(dice)
for i in range(5):
    print(next(iterA))


Red
Red
Blue
Blue
Blue

In the long run, the proportions of red and blue are 50-50.


In [8]:
Pmf(next(iterA) for i in range(1000))


Out[8]:
Pmf({'Red': 0.528, 'Blue': 0.47200000000000003})

We can see that by computing the weighted mixture of the two dice:


In [9]:
MakeMixture(dice).Print()


Blue 1/2
Red 1/2

To answer the original question, I'll create a suite of hypotheses where each hypothesis is represented by a die, and the likelihood of the data under each hypothesis is the probability that the given die yields the given outcome.


In [10]:
class Dice(Suite):
    def Likelihood(self, data, hypo):
        """
        data: 'Red' or 'Blue'
        hypo: a Die object
        """
        return hypo[data]

Now I can create a suite that represents the prior distribution.


In [11]:
prior = Dice({d1:Fraction(1), d2:Fraction(1)})
prior.Print()


d2 (reddish) 1/2
d1 (bluish)  1/2

And update it with the data.


In [12]:
posterior = prior.Copy()
posterior.Update('Red')
posterior.Print()


d2 (reddish) 2/3
d1 (bluish)  1/3

The posterior probability for d1 is 1/3 and the posterior probability for d2 is 2/3.

Intuitively, the posterior probability of d2 is higher than the prior (which was 1/2) because the outcome (red) is more likely on d2 than d1. If we had rolled blue, the probability of d1 would be higher.
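
Concretely, Bayes's theorem gives the same numbers: with equal priors, the posterior for d2 is the probability of red on d2 divided by the total probability of red,

$P(\mbox{d2} ~|~ \mbox{Red}) = \frac{\frac{1}{2} \cdot \frac{2}{3}}{\frac{1}{2} \cdot \frac{1}{3} + \frac{1}{2} \cdot \frac{2}{3}} = \frac{2}{3}$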

Now suppose I ask you to predict the outcome of the next roll. Remember that in this scenario, I choose a new die each time, so what you learned on the first roll doesn't help you with the second. The predictive distribution is a weighted mixture of the dice, using the prior weights:


In [13]:
predictive = MakeMixture(prior)
predictive.Print()


Blue 1/2
Red 1/2

Scenario B

Now consider a different scenario. Instead of choosing a new die every time, I choose a die once and roll it repeatedly. Here's a generator that simulates this scenario:


In [14]:
def rollB(dice):
    die = dice.Random()
    while True:
        roll = die.Random()
        yield roll

In the long run, the proportion of red is either 1/3 or 2/3, not 1/2:


In [15]:
iterB = rollB(dice)
Pmf(next(iterB) for i in range(1000))


Out[15]:
Pmf({'Red': 0.331, 'Blue': 0.669})

After the first roll, the posterior suite is the same as in the previous scenario:


In [16]:
posterior = prior.Copy()
posterior.Update('Red')
posterior.Print()


d2 (reddish) 2/3
d1 (bluish)  1/3

In this scenario, we know we are going to roll the same die each time, so the information we learned from the first roll informs our prediction for the second.

Specifically, now the predictive distribution is based on the posterior, not the prior:


In [17]:
predictive = MakeMixture(posterior)
predictive.Print()


Blue 4/9
Red 5/9
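
As an arithmetic check, the mixture weights each die's probability of red by its posterior probability:

$P(\mbox{Red}) = \frac{2}{3} \cdot \frac{2}{3} + \frac{1}{3} \cdot \frac{1}{3} = \frac{5}{9}$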

Having seen one red, we are more inclined to believe that I am rolling d2, so we are more inclined to predict that I will roll red again.

If I do roll red again, we can update the posterior again, using the new data, and make a new prediction:


In [18]:
posterior.Update('Red')
posterior.Print()


d2 (reddish) 4/5
d1 (bluish)  1/5

In [19]:
predictive = MakeMixture(posterior)
predictive.Print()


Blue 2/5
Red 3/5

If we continue this process, we will eventually be confident that we know which die is being rolled:


In [20]:
posterior = prior.Copy()
for i in range(10):
    posterior.Update(next(iterB))
posterior.Print()


d2 (reddish) 1/65
d1 (bluish)  64/65
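
(These weights make sense: each blue roll doubles the odds in favor of d1 and each red roll halves them, so 64-to-1 odds for d1 mean the ten rolls included six more blues than reds, that is, 8 blues and 2 reds.)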

And the predictive distribution will be close to 1/3 or 2/3, depending on which die we think it is.


In [21]:
predictive = MakeMixture(posterior)
predictive.Print()


Blue 43/65
Red 22/65

Scenario C

Now let's consider another scenario: Suppose I choose a die and roll it. If the outcome is red, I report the outcome. Otherwise I choose a die again and roll again, repeating until I get red.

Here's a generator that simulates this scenario.


In [22]:
def rollC(dice):
    while True:
        die = dice.Random()
        roll = die.Random()
        if roll == 'Red':
            yield roll

In this scenario, obviously, the outcome is always red:


In [23]:
iterC = rollC(dice)
Pmf(next(iterC) for i in range(1000))


Out[23]:
Pmf({'Red': 1.0})

But now suppose I ask you about the last die I rolled. What is the probability that it is the reddish die, d2?

On each roll, there are four possible results, with these probabilities:

d1, red       1/2 * 1/3
d1, blue      1/2 * 2/3
d2, red       1/2 * 2/3
d2, blue      1/2 * 1/3

On the last roll, I tell you that the outcome is red, so we are left with two possibilities:

d1, red       1/2 * 1/3
d2, red       1/2 * 2/3

The likelihood ratio is 2 to 1, so we can use that to update the prior:


In [24]:
posterior = prior.Copy()
posterior[d1] *= 1
posterior[d2] *= 2
posterior.Normalize()
posterior.Print()


d2 (reddish) 2/3
d1 (bluish)  1/3

That's the same posterior we saw in Scenarios A and B. So even though we knew the outcome would be red, we still learn something about the die.
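
Equivalently (a quick check, not in the original notebook), we could let the Dice suite do the update: its Likelihood returns hypo[data], so updating with 'Red' applies the same 2-to-1 ratio automatically.


In [ ]:
# Sketch: the suite update reproduces the manual likelihood-ratio update
check = prior.Copy()
check.Update('Red')
check.Print()  # expect d2 -> 2/3, d1 -> 1/3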

Of course, now the predictive distribution is always red:


In [25]:
predictive = Pmf({'Red':1.0})
predictive.Print()


Red 1.0

Scenario D

Finally, let's consider the scenario where I choose a die once and roll it repeatedly until the outcome is red.


In [26]:
def rollD(dice):
    die = dice.Random()
    while True:
        roll = die.Random()
        if roll == 'Red':
            yield roll

Again, obviously, the outcome is always red:


In [27]:
iterD = rollD(dice)
Pmf(next(iterD) for i in range(1000))


Out[27]:
Pmf({'Red': 1.0})

But now the probability of eventually getting red is the same regardless of which die I chose. On average, it takes longer to get red if I chose d1, but if I only tell you the outcome and don't tell you how many times I rolled, you have no way of knowing which die I chose.

So the posterior is the same as the prior.


In [28]:
posterior = prior.Copy()
posterior.Print()


d2 (reddish) 1/2
d1 (bluish)  1/2
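
As an empirical check (a sketch, not in the original notebook), we can simulate Scenario D many times, recording which die was chosen on each trial. The split should stay close to 50-50:


In [ ]:
from collections import Counter

def rollD_labeled(dice):
    """Like rollD, but also yields the label of the chosen die."""
    die = dice.Random()
    while True:
        roll = die.Random()
        if roll == 'Red':
            yield die.label, roll

# each call to rollD_labeled starts a fresh trial with a newly chosen die
counts = Counter(next(rollD_labeled(dice))[0] for i in range(1000))
counts  # expect roughly 500 for each die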

If you are not sure about that argument (and after all this, I don't blame you), see below for a more persuasive version.

And the predictive distribution is, again, always red.


In [29]:
predictive = Pmf({'Red':1.0})

Summary

In summary, each of the four scenarios yields a different pair of posterior and predictive distributions.

Scenario        Posterior probability of d2      Predictive probability of red
A               2/3                              1/2
B               2/3                              5/9
C               2/3                              1
D               1/2                              1

Scenario D, the more persuasive version

Recall that in Scenario D I choose a die and then roll it until I get red. I claim:

  1. If you know how many times I rolled before getting red, you get information about which die it was.

  2. If you don't know how many times I rolled, you get no information. That is, the posterior probabilities for d1 and d2 are the same as the priors.

To demonstrate the first part, here's a Suite that takes as data the number of times I rolled (including the last one that came up red):


In [30]:
class ScenarioD(Suite):
    def Likelihood(self, data, hypo):
        """
        data: k, number of times I rolled to get a Red
        hypo: a Die object
        """
        p = hypo['Red']
        k = data
        return (1-p)**(k-1) * p

The likelihood is the geometric PMF with probability p.

If you know I got red on the first try, that's evidence in favor of d2:


In [31]:
suite = ScenarioD([d1, d2])
suite.Update(1)
suite.Print()


d2 (reddish) 0.6666666666666666
d1 (bluish)  0.3333333333333333

If you know I got it on the second try, that's equally likely with d1 or d2:


In [32]:
suite = ScenarioD([d1, d2])
suite.Update(2)
suite.Print()


d2 (reddish) 0.5
d1 (bluish)  0.5

If it takes three tries or more, that's evidence in favor of d1:


In [33]:
suite = ScenarioD([d1, d2])
suite.Update(3)
suite.Print()


d2 (reddish) 0.3333333333333333
d1 (bluish)  0.6666666666666666

If you don't know how many times I rolled, you have to compute the weighted sum of these posterior probabilities.

$P(\mbox{d1} ~|~ \mbox{Red}) = \sum_{k=1}^\infty P(k)~P(\mbox{d1} ~|~ k)$

Suppose $q$ is the probability of choosing d1, $p_1$ is the probability of red on d1, and $p_2$ is the probability of getting red on d2.

The chance of getting the first red on the $k$th roll is

$P(k) = q (1-p_1)^{k-1} p_1 + (1-q) (1-p_2)^{k-1} p_2$

And the probability that we rolled d1 given $k$ is

$P(\mbox{d1} ~|~ k) = \frac{q (1-p_1)^{k-1} p_1}{q (1-p_1)^{k-1} p_1 + (1-q) (1-p_2)^{k-1} p_2}$

The denominator of that fraction is $P(k)$, so when we compute the product, $P(k)$ drops out:

$P(k)~P(\mbox{d1} ~|~ k) = q (1-p_1)^{k-1} p_1$

So we can write $P(\mbox{d1} ~|~ \mbox{Red})$ like this (moving the factor of $q$ out of the summation):

$P(\mbox{d1} ~|~ \mbox{Red}) = q \sum_{k=1}^\infty (1-p_1)^{k-1} p_1$

The quantity inside the summation is the PMF of a geometric distribution, which sums to 1. So:

$P(\mbox{d1} ~|~ \mbox{Red}) = q$

The posterior probability of d1 after rolling an eventual red is the same as the prior, which confirms our intuition that we learn nothing about the die from this data.
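
To check the algebra numerically (a sketch, truncating the infinite sum), we can add up the first 50 terms of $P(k)~P(\mbox{d1} ~|~ k)$ and confirm that the total approaches $q = 1/2$:


In [ ]:
# each term P(k) * P(d1 | k) simplifies to q * (1-p1)**(k-1) * p1
q = Fraction(1, 2)
p1 = Fraction(1, 3)

total = sum(q * (1 - p1)**(k - 1) * p1 for k in range(1, 51))
float(total)  # approaches 0.5 as more terms are included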

