Think Bayes: Chapter 5

This notebook presents code and exercises from Think Bayes, second edition.

Copyright 2016 Allen B. Downey

MIT License: https://opensource.org/licenses/MIT


In [1]:
from __future__ import print_function, division

%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

import numpy as np

from thinkbayes2 import Pmf, Cdf, Suite, Beta
import thinkplot

Odds

The following function converts from probabilities to odds.


In [2]:
def Odds(p):
    return p / (1-p)

And this function converts from odds to probabilities.


In [3]:
def Probability(o):
    return o / (o+1)

If 20% of bettors think my horse will win, that corresponds to odds of 1:4 in its favor, or 0.25.


In [4]:
p = 0.2
Odds(p)


Out[4]:
0.25

If the odds against my horse are 5:1, the odds in favor are 1:5, or 0.2, which corresponds to a probability of 1/6.


In [5]:
o = 1/5
Probability(o)


Out[5]:
0.16666666666666669

We can use the odds form of Bayes's theorem to solve the cookie problem. The two bowls are equally likely, so the prior odds are 1; the likelihood ratio for a vanilla cookie is the probability of vanilla from Bowl 1 (0.75) divided by the probability from Bowl 2 (0.5):


In [6]:
prior_odds = 1
likelihood_ratio = 0.75 / 0.5
post_odds = prior_odds * likelihood_ratio
post_odds


Out[6]:
1.5

And then we can compute the posterior probability, if desired.


In [7]:
post_prob = Probability(post_odds)
post_prob


Out[7]:
0.6

If we draw another cookie and it's chocolate, we can do another update:


In [8]:
likelihood_ratio = 0.25 / 0.5
post_odds *= likelihood_ratio
post_odds


Out[8]:
0.75

And convert back to probability.


In [9]:
post_prob = Probability(post_odds)
post_prob


Out[9]:
0.42857142857142855

Oliver's blood

The likelihood ratio is also useful for talking about the strength of evidence without getting bogged down talking about priors.

As an example, we'll solve this problem from MacKay's Information Theory, Inference, and Learning Algorithms:

Two people have left traces of their own blood at the scene of a crime. A suspect, Oliver, is tested and found to have type 'O' blood. The blood groups of the two traces are found to be of type 'O' (a common type in the local population, having frequency 60%) and of type 'AB' (a rare type, with frequency 1%). Do these data [the traces found at the scene] give evidence in favor of the proposition that Oliver was one of the people [who left blood at the scene]?

If Oliver is one of the people who left blood at the crime scene, then he accounts for the 'O' sample, so the probability of the data is just the probability that a random member of the population has type 'AB' blood, which is 1%.

If Oliver did not leave blood at the scene, then we have two samples to account for. If we choose two random people from the population, what is the chance of finding one with type 'O' and one with type 'AB'? Well, there are two ways it might happen: the first person we choose might have type 'O' and the second 'AB', or the other way around. So the total probability is $2 (0.6) (0.01) = 0.012$, or 1.2%.

So the likelihood ratio is:


In [10]:
like1 = 0.01
like2 = 2 * 0.6 * 0.01

likelihood_ratio = like1 / like2
likelihood_ratio


Out[10]:
0.8333333333333334

Since the ratio is less than 1, it is evidence against the hypothesis that Oliver left blood at the scene.

But it is weak evidence. For example, if the prior odds were 1 (that is, 50% probability), the posterior odds would be 0.83, which corresponds to a probability of:


In [11]:
post_odds = 1 * like1 / like2
Probability(post_odds)


Out[11]:
0.45454545454545453

So this evidence doesn't "move the needle" very much.

Exercise: Suppose other evidence had made you 90% confident of Oliver's guilt. How much would this exculpatory evidence change your beliefs? What if you initially thought there was only a 10% chance of his guilt?

Notice that evidence with the same strength has a different effect on probability, depending on where you started.


In [12]:
# Solution goes here

In [13]:
# Solution goes here
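
Here's one way to check (a sketch, not an official solution), reusing the likelihood ratio for Oliver's blood computed above:

# apply the same likelihood ratio to two different priors
for prior_prob in [0.9, 0.1]:
    post_odds = Odds(prior_prob) * likelihood_ratio
    print(prior_prob, '->', Probability(post_odds))

The same likelihood ratio shifts the probability by different amounts depending on the prior; for a ratio this close to 1, the effect is largest when the prior is near 50%.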

Comparing distributions

Let's get back to the Kim Rhode problem from Chapter 4:

At the 2016 Summer Olympics in the Women's Skeet event, Kim Rhode faced Wei Meng in the bronze medal match. They each hit 15 of 25 targets, sending the match into sudden death. In the first round, both hit 1 of 2 targets. In the next two rounds, they each hit 2 targets. Finally, in the fourth round, Rhode hit 2 and Wei hit 1, so Rhode won the bronze medal, making her the first Summer Olympian to win an individual medal at six consecutive summer games.

But after all that shooting, what is the probability that Rhode is actually a better shooter than Wei? If the same match were held again, what is the probability that Rhode would win?

I'll start with a uniform distribution for x, the probability of hitting a target, but we should check whether the results are sensitive to that choice.

First I create a Beta distribution for each of the competitors, and update it with the results.


In [14]:
rhode = Beta(1, 1, label='Rhode')
rhode.Update((22, 11))

In [15]:
wei = Beta(1, 1, label='Wei')
wei.Update((21, 12))

Based on the data, the distribution for Rhode is slightly farther right than the distribution for Wei, but there is a lot of overlap.


In [16]:
thinkplot.Pdf(rhode.MakePmf())
thinkplot.Pdf(wei.MakePmf())
thinkplot.Config(xlabel='x', ylabel='Probability')


To compute the probability that Rhode actually has a higher value of x, there are two options:

  1. Sampling: we could draw random samples from the posterior distributions and compare them.

  2. Enumeration: we could enumerate all possible pairs of values and add up the "probability of superiority".

I'll start with sampling. The Beta object provides Random, which draws a random value from a Beta distribution:


In [17]:
iters = 1000
count = 0
for _ in range(iters):
    x1 = rhode.Random()
    x2 = wei.Random()
    if x1 > x2:
        count += 1

count / iters


Out[17]:
0.605

Beta also provides Sample, which returns a NumPy array, so we can perform the comparisons using array operations:


In [18]:
rhode_sample = rhode.Sample(iters)
wei_sample = wei.Sample(iters)
np.mean(rhode_sample > wei_sample)


Out[18]:
0.60799999999999998

The other option is to make Pmf objects that approximate the Beta distributions, and enumerate pairs of values:


In [19]:
def ProbGreater(pmf1, pmf2):
    total = 0
    for x1, prob1 in pmf1.Items():
        for x2, prob2 in pmf2.Items():
            if x1 > x2:
                total += prob1 * prob2
    return total

In [20]:
pmf1 = rhode.MakePmf(1001)
pmf2 = wei.MakePmf(1001)
ProbGreater(pmf1, pmf2)


Out[20]:
0.5982856085975918

In [21]:
pmf1.ProbGreater(pmf2)


Out[21]:
0.5982856085975918

In [22]:
pmf1.ProbLess(pmf2)


Out[22]:
0.39831486726635074

Exercise: Run this analysis again with a different prior and see how much effect it has on the results.
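
For example, here's a quick sensitivity check (a sketch): rerun the updates with a Jeffreys Beta(0.5, 0.5) prior and recompute the probability of superiority.

rhode2 = Beta(0.5, 0.5, label='Rhode')
rhode2.Update((22, 11))
wei2 = Beta(0.5, 0.5, label='Wei')
wei2.Update((21, 12))

# probability of superiority under the new prior
rhode2.MakePmf(1001).ProbGreater(wei2.MakePmf(1001))

With this much data, the result should be close to what we got with the uniform prior.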

Simulation

To make predictions about a rematch, we have two options again:

  1. Sampling. For each simulated match, we draw a random value of x for each contestant, then simulate 25 shots and count hits.

  2. Computing a mixture. If we knew x exactly, the distribution of hits, k, would be binomial. Since we don't know x, the distribution of k is a mixture of binomials with different values of x.

I'll do it by sampling first.


In [23]:
import random

def flip(p):
    return random.random() < p

The flip function returns True with probability p and False with probability 1-p.

Now we can simulate 1000 rematches and count wins and losses.


In [24]:
iters = 1000
wins = 0
losses = 0

for _ in range(iters):
    x1 = rhode.Random()
    x2 = wei.Random()
    
    count1 = count2 = 0
    for _ in range(25):
        if flip(x1):
            count1 += 1
        if flip(x2):
            count2 += 1
            
    if count1 > count2:
        wins += 1
    if count1 < count2:
        losses += 1
        
wins/iters, losses/iters


Out[24]:
(0.516, 0.386)

Or, realizing that the distribution of k is binomial, we can simplify the code using NumPy:


In [25]:
rhode_rematch = np.random.binomial(25, rhode_sample)
thinkplot.Hist(Pmf(rhode_rematch))



In [26]:
wei_rematch = np.random.binomial(25, wei_sample)
np.mean(rhode_rematch > wei_rematch)


Out[26]:
0.52900000000000003

In [27]:
np.mean(rhode_rematch < wei_rematch)


Out[27]:
0.39400000000000002

Alternatively, we can make a mixture that represents the distribution of k, taking into account our uncertainty about x:


In [28]:
from thinkbayes2 import MakeBinomialPmf

def MakeBinomialMix(pmf, label=''):
    mix = Pmf(label=label)
    for x, prob in pmf.Items():
        binom = MakeBinomialPmf(n=25, p=x)
        for k, p in binom.Items():
            mix[k] += prob * p
    return mix

In [29]:
rhode_rematch = MakeBinomialMix(rhode.MakePmf(), label='Rhode')
wei_rematch = MakeBinomialMix(wei.MakePmf(), label='Wei')
thinkplot.Pdf(rhode_rematch)
thinkplot.Pdf(wei_rematch)
thinkplot.Config(xlabel='hits')



In [30]:
rhode_rematch.ProbGreater(wei_rematch), rhode_rematch.ProbLess(wei_rematch)


Out[30]:
(0.52023290455545113, 0.39052602488401061)

Alternatively, we could use MakeMixture:


In [31]:
from thinkbayes2 import MakeMixture

def MakeBinomialMix2(pmf):
    binomials = Pmf()
    for x, prob in pmf.Items():
        binom = MakeBinomialPmf(n=25, p=x)
        binomials[binom] = prob
    return MakeMixture(binomials)

Here's how we use it.


In [32]:
rhode_rematch = MakeBinomialMix2(rhode.MakePmf())
wei_rematch = MakeBinomialMix2(wei.MakePmf())
rhode_rematch.ProbGreater(wei_rematch), rhode_rematch.ProbLess(wei_rematch)


Out[32]:
(0.52023290455545113, 0.39052602488401061)

Exercise: Run this analysis again with a different prior and see how much effect it has on the results.
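
As a sketch, here is the same pipeline with a weakly informative Beta(2, 2) prior:

rhode3 = Beta(2, 2)
rhode3.Update((22, 11))
wei3 = Beta(2, 2)
wei3.Update((21, 12))

# rebuild the mixtures and compare, as above
rhode_rematch3 = MakeBinomialMix2(rhode3.MakePmf())
wei_rematch3 = MakeBinomialMix2(wei3.MakePmf())
rhode_rematch3.ProbGreater(wei_rematch3), rhode_rematch3.ProbLess(wei_rematch3)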

Distributions of sums and differences

Suppose we want to know the total number of targets the two contestants will hit in a rematch. There are two ways we might compute the distribution of this sum:

  1. Sampling: We can draw samples from the distributions and add them up.

  2. Enumeration: We can enumerate all possible pairs of values.

I'll start with sampling:


In [33]:
iters = 1000
pmf = Pmf()
for _ in range(iters):
    k = rhode_rematch.Random() + wei_rematch.Random()
    pmf[k] += 1
pmf.Normalize()
thinkplot.Hist(pmf)


Or we could use Sample and NumPy:


In [34]:
ks = rhode_rematch.Sample(iters) + wei_rematch.Sample(iters)
pmf = Pmf(ks)
thinkplot.Hist(pmf)


Alternatively, we could compute the distribution of the sum by enumeration:


In [35]:
def AddPmfs(pmf1, pmf2):
    pmf = Pmf()
    for v1, p1 in pmf1.Items():
        for v2, p2 in pmf2.Items():
            pmf[v1 + v2] += p1 * p2
    return pmf

Here's how it's used:


In [36]:
pmf = AddPmfs(rhode_rematch, wei_rematch)
thinkplot.Pdf(pmf)


The Pmf class provides a + operator that does the same thing.


In [37]:
pmf = rhode_rematch + wei_rematch
thinkplot.Pdf(pmf)


Exercise: The Pmf class also provides the - operator, which computes the distribution of the difference in values from two distributions. Use the distributions from the previous section to compute the distribution of the differential between Rhode and Wei in a rematch. On average, how many clays should we expect Rhode to win by? What is the probability that Rhode wins by 10 or more?


In [38]:
# Solution goes here

In [39]:
# Solution goes here

In [40]:
# Solution goes here
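
One possible solution (a sketch): the - operator gives the distribution of the difference directly, and we can read the answers off that Pmf.

diff = rhode_rematch - wei_rematch
thinkplot.Pdf(diff)

# expected margin, and probability that Rhode wins by 10 or more
diff.Mean(), sum(p for v, p in diff.Items() if v >= 10)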

Distribution of maximum

Suppose Kim Rhode continues to compete in six more Olympics. What should we expect her best result to be?

Once again, there are two ways we can compute the distribution of the maximum:

  1. Sampling.

  2. Analysis of the CDF.

Here's a simple version by sampling:


In [41]:
iters = 1000
pmf = Pmf()
for _ in range(iters):
    ks = rhode_rematch.Sample(6)
    pmf[max(ks)] += 1
pmf.Normalize()
thinkplot.Hist(pmf)


And here's a version using NumPy. I'll generate an array with 6 rows (one per Olympics) and 1000 columns (one per simulation):


In [42]:
iters = 1000
ks = rhode_rematch.Sample((6, iters))
ks


Out[42]:
array([[18, 17, 17, ..., 15, 12, 19],
       [18, 17, 17, ..., 16, 21, 14],
       [19, 15, 14, ..., 11, 18, 17],
       [18, 16, 14, ..., 11, 19, 19],
       [20, 20, 13, ..., 20, 16, 14],
       [16, 17, 16, ..., 13, 20, 23]])

Compute the maximum in each column:


In [43]:
maxes = np.max(ks, axis=0)
maxes[:10]


Out[43]:
array([20, 20, 17, 20, 22, 21, 21, 23, 19, 22])

And then plot the distribution of maximums:


In [44]:
pmf = Pmf(maxes)
thinkplot.Hist(pmf)


Or we can figure it out analytically. If the maximum is less than or equal to some value $k$, all 6 random selections must be less than or equal to $k$, so:

$CDF_{max}(k) = CDF(k)^6$

Pmf provides Max, which computes and returns this Cdf, so we can compute the distribution of the maximum like this:


In [45]:
pmf = rhode_rematch.Max(6).MakePmf()
thinkplot.Hist(pmf)


Exercise: Here's how Pmf.Max works:

def Max(self, k):
    """Computes the CDF of the maximum of k selections from this dist.

    k: int

    returns: new Cdf
    """
    cdf = self.MakeCdf()
    cdf.ps **= k
    return cdf

Write a function that takes a Pmf and an integer k and returns a Pmf that represents the distribution of the minimum of k values drawn from the given Pmf. Use your function to compute the distribution of the minimum score Kim Rhode would be expected to shoot in six competitions.


In [46]:
def Min(pmf, k):
    """Computes the CDF of the minimum of k selections from this dist.

    k: int

    returns: new Cdf
    """
    cdf = pmf.MakeCdf()
    # P(min <= x) = 1 - P(all k selections > x)
    cdf.ps = 1 - (1-cdf.ps)**k
    return cdf

In [47]:
pmf = Min(rhode_rematch, 6).MakePmf()
thinkplot.Hist(pmf)


Exercises

Exercise: Suppose you are having a dinner party with 10 guests and 4 of them are allergic to cats. Because you have cats, you expect 50% of the allergic guests to sneeze during dinner. At the same time, you expect 10% of the non-allergic guests to sneeze. What is the distribution of the total number of guests who sneeze?


In [48]:
# Solution goes here

In [49]:
# Solution goes here
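
One possible solution (a sketch), using MakeBinomialPmf and the + operator from earlier in the chapter: the allergic and non-allergic sneezers are independent, so the total is the sum of two binomials.

# sneezers among the 4 allergic and 6 non-allergic guests
n_allergic = MakeBinomialPmf(n=4, p=0.5)
n_non_allergic = MakeBinomialPmf(n=6, p=0.1)

total = n_allergic + n_non_allergic
thinkplot.Hist(total)
total.Mean()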

Exercise: This study from 2015 showed that many subjects diagnosed with non-celiac gluten sensitivity (NCGS) were not able to distinguish gluten flour from non-gluten flour in a blind challenge.

Here is a description of the study:

"We studied 35 non-CD subjects (31 females) that were on a gluten-free diet (GFD), in a double-blind challenge study. Participants were randomised to receive either gluten-containing flour or gluten-free flour for 10 days, followed by a 2-week washout period and were then crossed over. The main outcome measure was their ability to identify which flour contained gluten. "The gluten-containing flour was correctly identified by 12 participants (34%)..." Since 12 out of 35 participants were able to identify the gluten flour, the authors conclude "Double-blind gluten challenge induces symptom recurrence in just one-third of patients fulfilling the clinical diagnostic criteria for non-coeliac gluten sensitivity."

This conclusion seems odd to me, because if none of the patients were sensitive to gluten, we would expect some of them to identify the gluten flour by chance. So the results are consistent with the hypothesis that none of the subjects are actually gluten sensitive.

We can use a Bayesian approach to interpret the results more precisely. But first we have to make some modeling decisions.

  1. Of the 35 subjects, 12 identified the gluten flour based on resumption of symptoms while they were eating it. Another 17 subjects wrongly identified the gluten-free flour based on their symptoms, and 6 subjects were unable to distinguish. So each subject gave one of three responses. To keep things simple I follow the authors of the study and lump together the latter two groups; that is, I consider two groups: those who identified the gluten flour and those who did not.

  2. I assume (1) people who are actually gluten sensitive have a 95% chance of correctly identifying gluten flour under the challenge conditions, and (2) subjects who are not gluten sensitive have only a 40% chance of identifying the gluten flour by chance (and a 60% chance of either choosing the other flour or failing to distinguish).

Using this model, estimate the number of study participants who are sensitive to gluten. What is the most likely number? What is the 95% credible interval?


In [50]:
# Solution goes here

In [51]:
# Solution goes here

In [52]:
# Solution goes here

In [53]:
# Solution goes here

In [54]:
# Solution goes here
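
One possible approach (a sketch, not an official solution, assuming a uniform prior over the number of gluten-sensitive subjects): for each hypothetical number of sensitive subjects, the number of correct identifications is the sum of two binomials, and the likelihood of the data is that distribution evaluated at 12.

n_subjects = 35
observed = 12

posterior = Pmf()
for n_sensitive in range(n_subjects + 1):
    n_insensitive = n_subjects - n_sensitive
    # distribution of correct identifications given n_sensitive
    dist = (MakeBinomialPmf(n=n_sensitive, p=0.95) +
            MakeBinomialPmf(n=n_insensitive, p=0.4))
    posterior[n_sensitive] = dist[observed]

posterior.Normalize()
posterior.MaximumLikelihood(), posterior.CredibleInterval(95)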

Exercise: The space invaders problem (coming soon).


In [55]:
# Solution goes here

In [56]:
# Solution goes here

In [57]:
# Solution goes here

In [58]:
# Solution goes here

In [59]:
# Solution goes here

In [60]:
# Solution goes here

In [61]:
# Solution goes here

In [62]:
# Solution goes here

In [ ]: