This notebook demonstrates the use of Pmf and Suite objects, and explores a dice-rolling problem that turns out to be more complicated than it sounds.

Copyright 2016 Allen Downey

MIT License: http://opensource.org/licenses/MIT

In [1]:

```
from __future__ import print_function, division
from thinkbayes2 import Pmf, Suite
from thinkbayes2 import MakeMixture
from fractions import Fraction
```

Suppose I have a six-sided die that is red on 2 sides and blue on 4 sides, and another die that's the other way around, red on 4 sides and blue on 2.

I choose a die at random and roll it, and I tell you it came up red. What is the probability that I rolled the second die (red on 4 sides)?

To answer this question, I'll create `Pmf`

objects to represent the two dice:

In [2]:

```
d1 = Pmf({'Red':Fraction(2), 'Blue':Fraction(4)}, label='d1 (bluish)')
d1.Print()
```

In [3]:

```
d2 = Pmf({'Red':Fraction(4), 'Blue':Fraction(2)}, label='d2 (reddish)')
d2.Print()
```

And I'll make another `Pmf` to represent the random choice of one die or the other.

In [4]:

```
dice = Pmf({d1:Fraction(1), d2:Fraction(1)})
dice.Print()
```

Now I can use the `Random` method to choose a die and then roll it.

In [5]:

```
dice.Random().Random()
```

In Scenario A, I choose a new die before every roll:

In [6]:

```
def rollA(dice):
    while True:
        die = dice.Random()
        roll = die.Random()
        yield roll
```

We can use this generator to simulate rolls:

In [7]:

```
iterA = rollA(dice)
for i in range(5):
    print(next(iterA))
```

In the long run, the proportion of red and blue is 50-50.

In [8]:

```
Pmf(next(iterA) for i in range(1000))
```

We can see that by computing the weighted mixture of the two dice:

In [9]:

```
MakeMixture(dice).Print()
```

Next, here's a `Suite` whose likelihood function returns the probability of the observed color on the hypothetical die:

In [10]:

```
class Dice(Suite):
    def Likelihood(self, data, hypo):
        """
        data: 'Red' or 'Blue'
        hypo: a Pmf representing a die
        """
        return hypo[data]
```

Now I can create a suite that represents the prior distribution.

In [11]:

```
prior = Dice({d1:Fraction(1), d2:Fraction(1)})
prior.Print()
```

And update it with the data.

In [12]:

```
posterior = prior.Copy()
posterior.Update('Red')
posterior.Print()
```

The posterior probability for `d1` is `1/3` and the posterior probability for `d2` is `2/3`.

Intuitively, the posterior probability of `d2` is higher than the prior (which was `1/2`) because the outcome (red) is more likely on `d2` than on `d1`. If we had rolled blue, the probability of `d1` would be higher.
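We can confirm this result by applying Bayes's theorem directly; here's a quick sketch using only the standard library's `Fraction`, independent of the thinkbayes2 machinery:

```python
from fractions import Fraction

# Prior: each die is equally likely to be chosen.
prior_d1 = Fraction(1, 2)
prior_d2 = Fraction(1, 2)

# Likelihood of red: 2 of 6 sides on d1, 4 of 6 on d2.
like_d1 = Fraction(2, 6)
like_d2 = Fraction(4, 6)

# The posterior is proportional to prior times likelihood.
total = prior_d1 * like_d1 + prior_d2 * like_d2
post_d1 = prior_d1 * like_d1 / total
post_d2 = prior_d2 * like_d2 / total
print(post_d1, post_d2)  # 1/3 2/3
```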

Now suppose I ask you to predict the outcome of the next roll. Remember that in this scenario, I choose a new die each time, so what you learned on the first roll doesn't help you with the second. The predictive distribution is a weighted mixture of the dice, using the prior weights:

In [13]:

```
predictive = MakeMixture(prior)
predictive.Print()
```

In Scenario B, I choose one die and roll the same die every time:

In [14]:

```
def rollB(dice):
    die = dice.Random()
    while True:
        roll = die.Random()
        yield roll
```

In the long run, the proportion of red is either `1/3` or `2/3`, not `1/2`:

In [15]:

```
iterB = rollB(dice)
Pmf(next(iterB) for i in range(1000))
```

After the first roll, the posterior suite is the same as in the previous scenario:

In [16]:

```
posterior = prior.Copy()
posterior.Update('Red')
posterior.Print()
```

In this scenario, we know we are going to roll the same die each time, so the information we learned from the first roll informs our prediction for the second.

Specifically, now the predictive distribution is based on the posterior, not the prior:

In [17]:

```
predictive = MakeMixture(posterior)
predictive.Print()
```

Having seen one red, we are more inclined to believe that I am rolling `d2`, so we are more inclined to predict that I will roll red again.

If I do roll red again, we can update the posterior again, using the new data, and make a new prediction:

In [18]:

```
posterior.Update('Red')
posterior.Print()
```

In [19]:

```
predictive = MakeMixture(posterior)
predictive.Print()
```

If we continue this process, we will eventually be confident that we know which die is being rolled:

In [20]:

```
posterior = prior.Copy()
for i in range(10):
    posterior.Update(next(iterB))
posterior.Print()
```

And the predictive distribution will be close to `1/3` or `2/3`, depending on which die we think it is.

In [21]:

```
predictive = MakeMixture(posterior)
predictive.Print()
```

In Scenario C, I choose a new die each time and roll it, but I only report the outcome when it comes up red:

In [22]:

```
def rollC(dice):
    while True:
        die = dice.Random()
        roll = die.Random()
        if roll == 'Red':
            yield roll
```

In this scenario, obviously, the outcome is always red:

In [23]:

```
iterC = rollC(dice)
Pmf(next(iterC) for i in range(1000))
```

But now suppose I ask you about the last die I rolled. What is the probability that it is the reddish die, `d2`?

On each roll, there are four possible results, with these probabilities:

```
d1, red 1/2 * 1/3
d1, blue 1/2 * 2/3
d2, red 1/2 * 2/3
d2, blue 1/2 * 1/3
```

On the last roll, I tell you that the outcome is red, so we are left with two possibilities:

```
d1, red 1/2 * 1/3
d2, red 1/2 * 2/3
```

The likelihood ratio is `2` to `1`, so we can use that to update the prior:

In [24]:

```
posterior = prior.Copy()
posterior[d1] *= 1
posterior[d2] *= 2
posterior.Normalize()
posterior.Print()
```

That's the same posterior we saw in Scenarios A and B. So even though we knew the outcome would be red, we still learn something about the die.
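As a sanity check on Scenario C, here's a small simulation (a sketch using the standard library's `random` module, not part of the original notebook): choose a die, roll it, and record which die produced each red. The fraction of reds that came from `d2` should be close to `2/3`.

```python
import random

random.seed(17)

# Represent each die as a list of its face colors.
d1_faces = ['Red'] * 2 + ['Blue'] * 4   # bluish die
d2_faces = ['Red'] * 4 + ['Blue'] * 2   # reddish die

from_d2 = 0
num_reds = 0
while num_reds < 100_000:
    # Choose a die at random, then roll it once.
    name, faces = random.choice([('d1', d1_faces), ('d2', d2_faces)])
    if random.choice(faces) == 'Red':
        num_reds += 1
        if name == 'd2':
            from_d2 += 1

print(from_d2 / num_reds)  # close to 2/3
```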

Of course, now the predictive distribution is always red:

In [25]:

```
predictive = Pmf({'Red':1.0})
predictive.Print()
```

In Scenario D, I choose one die and roll it repeatedly, but again I only report the outcome when it comes up red:

In [26]:

```
def rollD(dice):
    die = dice.Random()
    while True:
        roll = die.Random()
        if roll == 'Red':
            yield roll
```

Again, obviously, the outcome is always red:

In [27]:

```
iterD = rollD(dice)
Pmf(next(iterD) for i in range(1000))
```

But now the probability of getting red is the same regardless of which die I chose. On average, it takes longer to get to red if I chose `d1`, but if I only tell you the outcome and don't tell you how many times I rolled, you have no way of knowing which die I chose.

So the posterior is the same as the prior.

In [28]:

```
posterior = prior.Copy()
posterior.Print()
```

If you are not sure about that argument (and after all this I don't blame you), see below for a more persuasive argument.
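A simulation can also make the point (again a sketch with the standard library's `random` module): in each trial, choose a die and roll until red, then record which die was chosen. Since every trial eventually produces a red, the reported outcome carries no information, and the fraction of trials that used `d1` stays near the prior, `1/2`.

```python
import random

random.seed(17)

d1_faces = ['Red'] * 2 + ['Blue'] * 4   # bluish die
d2_faces = ['Red'] * 4 + ['Blue'] * 2   # reddish die

d1_count = 0
trials = 100_000
for _ in range(trials):
    name, faces = random.choice([('d1', d1_faces), ('d2', d2_faces)])
    # Roll until red; this always terminates, whichever die was chosen.
    while random.choice(faces) != 'Red':
        pass
    if name == 'd1':
        d1_count += 1

print(d1_count / trials)  # close to 1/2
```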

And the predictive distribution is, again, always red.

In [29]:

```
predictive = Pmf({'Red':1.0})
```

Recall that in Scenario D I choose a die and then roll it until I get red. I claim:

If you know how many times I rolled before getting red, you get information about which die it was.

If you don't know how many times I rolled, you get no information. That is, the posterior probabilities for `d1` and `d2` are the same as the priors.

To demonstrate the first part, here's a `Suite` that takes as data the number of times I rolled (including the last one that came up red):

In [30]:

```
class ScenarioD(Suite):
    def Likelihood(self, data, hypo):
        """
        data: k, number of times I rolled to get a Red
        hypo: a Pmf representing a die
        """
        p = hypo['Red']
        k = data
        return (1-p)**(k-1) * p
```

The likelihood is the geometric PMF with probability `p`.

If you know I got red on the first try, that's evidence in favor of `d2`:

In [31]:

```
suite = ScenarioD([d1, d2])
suite.Update(1)
suite.Print()
```

If you know I got it on the second try, that's equally likely with `d1` or `d2`:

In [32]:

```
suite = ScenarioD([d1, d2])
suite.Update(2)
suite.Print()
```

If it takes three tries or more, that's evidence for `d1`.

In [33]:

```
suite = ScenarioD([d1, d2])
suite.Update(3)
suite.Print()
```

If you don't know how many times I rolled, you have to compute the weighted sum of these posterior probabilities.

$P(\mbox{d1} ~|~ \mbox{Red}) = \sum_{k=1}^\infty P(k)~P(\mbox{d1} ~|~ k)$

Suppose $q$ is the probability of choosing d1, $p_1$ is the probability of red on d1, and $p_2$ is the probability of getting red on d2.

The chance of getting the first red on the $k$th roll is

$P(k) = q (1-p_1)^{k-1} p_1 + (1-q) (1-p_2)^{k-1} p_2$

And the probability that we rolled d1 given $k$ is

$P(\mbox{d1} ~|~ k) = \frac{q (1-p_1)^{k-1} p_1}{q (1-p_1)^{k-1} p_1 + (1-q) (1-p_2)^{k-1} p_2}$

The denominator of that fraction is $P(k)$, so when we compute the product, $P(k)$ drops out:

$P(k)~P(\mbox{d1} ~|~ k) = q (1-p_1)^{k-1} p_1$

So we can write $P(\mbox{d1} ~|~ \mbox{Red})$ like this (moving the factor of $q$ out of the summation):

$P(\mbox{d1} ~|~ \mbox{Red}) = q \sum_{k=1}^\infty (1-p_1)^{k-1} p_1$

The quantity inside the summation is the PMF of a geometric distribution, which sums to 1. So:

$P(\mbox{d1} ~|~ \mbox{Red}) = q$
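We can also check this algebra numerically: the partial sums of $q (1-p_1)^{k-1} p_1$ converge to $q$. Here's a quick sketch with `Fraction`, using $q = 1/2$ and $p_1 = 1/3$:

```python
from fractions import Fraction

q = Fraction(1, 2)    # prior probability of choosing d1
p1 = Fraction(1, 3)   # probability of red on d1

# Partial sum of q * geometric PMF; the geometric terms sum to 1,
# so the total approaches q as more terms are included.
total = sum(q * (1 - p1)**(k - 1) * p1 for k in range(1, 101))
print(float(total))  # effectively 0.5
```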

The posterior probability of `d1` after rolling an eventual red is the same as the prior, which confirms our intuition that we learn nothing about the die from this data.
