In statistics a **distribution** is a set of values and their corresponding probabilities.
For example, if you roll a six-sided die, the set of possible values is the numbers 1 to 6, and the probability associated with each value is 1/6.
As another example, you might be interested in how many times each word appears in common English usage. You could build a distribution that includes each word and how many times it appears.
To represent a distribution in Python, you could use a dictionary that maps from each value to its probability. I have written a class called Pmf that uses a Python dictionary in exactly that way, and provides a number of useful methods. I called the class Pmf in reference to a **probability mass function**, which is a way to represent a distribution mathematically.
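For instance, a plain dictionary already captures the idea, and Pmf wraps this pattern with convenience methods. Here is a minimal sketch (just an illustration, not the Pmf implementation itself):

    # A fair die as a plain value -> probability dictionary
    die = {x: 1/6.0 for x in range(1, 7)}
    print(die[3])   # 0.1666...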
Pmf is defined in a Python module I wrote to accompany this book, thinkbayes.py. You can download it from http://thinkbayes.com/thinkbayes.py. For more information see Section [download].
To use Pmf you can import it like this:
In [2]:
import sys
sys.path.insert(0, './code')
# Add the ./code subdirectory to the import path
from thinkbayes import Pmf
# Import the Pmf class from the thinkbayes module
Before building one, let's see what the Pmf class provides:
In [8]:
help(Pmf)
# What is this object?
This is a Probability Mass Function object, which includes some pre-defined methods and attributes to help us work with PMFs. (A PMF gives the probability that a discrete quantity is equal to each possible value, and the probabilities over all values must sum to 1.)
The following code builds a Pmf to represent the distribution of outcomes for a fair six-sided die:
In [15]:
pmf = Pmf()
# initialize the object
Pmf() creates an empty Probability Mass Function with no values.
In [16]:
for x in [1, 2, 3, 4, 5, 6]:
    # For each possible outcome x
    pmf.Set(x, 1/6.0)
    # Set the probability for x
pmf.Print()
The Set method sets the probability associated with each value to $1/6$.
Here’s another example that counts the number of times each word appears in a sequence:
In [18]:
word_list = ['hi', 'the', 'bye', 'hi', 'football', 'sky']
pmf = Pmf()
for word in word_list:
    pmf.Incr(word, 1)
pmf.Print()
Incr increases the “probability” associated with each word by 1. If a word is not already in the Pmf, it is added.
“Probability” is in quotes because in this example, the probabilities are not normalized; that is, they do not add up to 1, so they are not true probabilities.
However, in this example the word counts are proportional to the probabilities. So after we count all the words, we can compute probabilities by dividing through by the total number of words.
Pmf provides a method, Normalize, that does exactly that:
In [20]:
pmf.Normalize()
pmf.Print()
# Now the probabilities sum to 1
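For comparison, here is roughly what Incr plus Normalize accomplish, done by hand with a plain dictionary (a sketch only, assuming the same word list as above):

    # Count words by hand, then divide by the total to get probabilities
    counts = {}
    for word in ['hi', 'the', 'bye', 'hi', 'football', 'sky']:
        counts[word] = counts.get(word, 0) + 1
    total = float(sum(counts.values()))
    probs = {word: n / total for word, n in counts.items()}
    print(probs['hi'])   # 2/6 = 0.333...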
Once you have a defined Pmf object, you can ask for the probability associated with any value:
In [21]:
print(pmf.Prob('the'))
This returns the probability of the word “the”: its frequency as a fraction of the words in the list.
Pmf uses a Python dictionary to store the values and their probabilities, so the values in the Pmf can be any hashable type.
The probabilities can be any numerical type, but they are usually floating-point numbers (type float).
Now let's return to the cookie example and use a Pmf to hold the prior probability of each bowl:
In [37]:
pmf = Pmf()
# Reinitialize the Pmf()
pmf.Set('Bowl 1', 0.5)
pmf.Set('Bowl 2', 0.5)
# Set up the prior distribution; 50/50 odds
pmf.Print()
# Show us what's in there so far
This distribution, which contains the priors for each hypothesis, is called (wait for it) the **prior distribution**.
To update the distribution based on new data (the vanilla cookie), we multiply each prior by the corresponding likelihood.
In the cookie example, the likelihood of drawing a vanilla cookie from Bowl 1 is 3/4; the likelihood for Bowl 2 is 1/2.
Let's use the Mult method to update these priors with the vanilla likelihoods. Mult does what you would expect: it gets the probability for the given hypothesis and multiplies it by the given likelihood.
In [38]:
pmf.Mult('Bowl 1', 0.75)
pmf.Mult('Bowl 2', 0.5)
# Update with the vanilla likelihoods
pmf.Print()
# Where are we at now?
Note that the probabilities no longer add up to 1: after this update, the distribution is not normalized. But because these hypotheses are mutually exclusive and collectively exhaustive, we can **renormalize**:
In [39]:
pmf.Normalize()
Out[39]:
The result is a distribution that contains the posterior probability for each hypothesis, which is called (wait for it) the **posterior distribution**. (The value Normalize returns is the normalizing constant: the total probability mass before renormalizing.)
Check if it's normalized now:
In [40]:
pmf.Normalize()
Out[40]:
Finally, we can get the posterior probability for Bowl 1: given that we drew a vanilla cookie, how probable is it that it came from Bowl 1?
In [41]:
pmf.Prob('Bowl 1')
# Posterior probability that the vanilla cookie came from Bowl 1
Out[41]:
And for fun, Bowl 2?
In [42]:
pmf.Prob('Bowl 2')
# And the posterior probability that it came from Bowl 2
Out[42]:
And the answer for Bowl 1 is 0.6 (and for Bowl 2, 0.4). You can download this example from http://thinkbayes.com/cookie.py. For more information see Section [download].
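As a check, the same posterior follows directly from Bayes's theorem:

$$\mathrm{p}(\mathrm{Bowl\ 1} \mid \mathrm{vanilla}) = \frac{(1/2)(3/4)}{(1/2)(3/4) + (1/2)(1/2)} = \frac{0.375}{0.625} = 0.6$$

Next, let's encapsulate this computation in a class, Cookie, that maps from each hypothesis to its probability: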
In [84]:
class Cookie(Pmf):
    """A map from string bowl ID to probability."""

    def __init__(self, hypos):
        """Initialize self.

        hypos: sequence of string bowl IDs
        """
        Pmf.__init__(self)
        # Initialize the underlying Pmf object from before
        for hypo in hypos:
            # self.Set(hypo, 1)
            # For each hypo in the sequence we could Set its value to 1;
            # for learning, let's see what happens with Pmf.Incr()
            # (it yields the same result here)
            self.Incr(hypo, 1)
        self.Normalize()
        # Renormalize after adding all the new hypotheses

    mixes = {
        'Bowl 1': dict(vanilla=0.75, chocolate=0.25),
        'Bowl 2': dict(vanilla=0.5, chocolate=0.5),
    }
    # Mix data as provided by the problem.
    # Refresher:
    # * Bowl 1 contains 30 vanilla cookies and 10 chocolate cookies.
    # * Bowl 2 contains 20 of each (20 vanilla, 20 chocolate).

    def Likelihood(self, data, hypo):
        """The likelihood of the data under the hypothesis.

        data: string cookie type
        hypo: string bowl ID
        """
        mix = self.mixes[hypo]
        # Look up the mix for the given hypo ('Bowl 1' or 'Bowl 2')
        like = mix[data]
        # Likelihood of the observed cookie type under this hypothesis
        return like

    def Update(self, data):
        """Updates the PMF with new data.

        data: string cookie type
        """
        for hypo in self.Values():
            # For every hypothesis in the current distribution
            like = self.Likelihood(data, hypo)
            # Get the likelihood value using the Likelihood() method above
            self.Mult(hypo, like)
            # Multiply the prior by the likelihood
        self.Normalize()
        # Renormalize after all the updates
A Cookie object is now a Pmf that maps from hypotheses to their probabilities.
The __init__ method gives each hypothesis the same prior probability. As in the previous section, there are two hypotheses:
In [9]:
hypos = ['Bowl 1', 'Bowl 2']
pmf = Cookie(hypos)
# Construct a Cookie object from our hypotheses; __init__ assigns the priors
pmf.Print()
# Show us the current distribution
Cookie provides an Update method that takes data as a parameter and updates the probabilities. Update loops through each hypothesis in the suite and multiplies its probability by the likelihood of the data under the hypothesis, which is computed by Likelihood.

Likelihood uses the mixes class attribute, which is a dictionary that maps from the name of a bowl to the mix of cookies in the bowl.
In [89]:
Cookie.mixes
Out[89]:
Here’s what the update looks like:
In [90]:
pmf.Update('vanilla')
# Update each hypothesis with the likelihood of drawing a vanilla cookie
pmf.Print()
# Show us the posterior distribution (post Update via Likelihood)
And then we can print the posterior probability of each hypothesis:
In [91]:
for hypo, prob in pmf.Items():
    print(hypo, prob)
This is the same as what we got before.
This code is more complicated than what we saw in the previous section, but the advantage is that it generalizes to the case where we draw more than one cookie from the same bowl (with replacement), using the data parameter in the methods above:
In [93]:
dataset = ['vanilla', 'chocolate', 'vanilla']
# 3 draws, here are the results
for data in dataset:
    pmf.Update(data)
    # Update our pmf using the results of our draws
pmf.Print()
# What's the new distribution?
# More refined with new information
The other advantage is that it provides a framework for solving many similar problems.
In the next section we’ll solve the Monty Hall problem computationally and then see what parts of the framework are the same.
The code in this section is available from http://thinkbayes.com/cookie2.py. For more information see Section [download].
In [27]:
class Monty(Pmf):
    """Map from string location of car to probability"""

    def __init__(self, hypos):
        """Initialize the prior distribution from the hypotheses.

        hypos: sequence of hypotheses
        """
        Pmf.__init__(self)
        for hypo in hypos:
            self.Set(hypo, 1)
        self.Normalize()

    def Update(self, data):
        """Updates each hypothesis based on the data.

        data: any representation of the data
        """
        for hypo in self.Values():
            like = self.Likelihood(data, hypo)
            self.Mult(hypo, like)
        self.Normalize()

    def Likelihood(self, data, hypo):
        """Compute the likelihood of the data under the hypothesis.

        hypo: string name of the door where the prize is
        data: string name of the door Monty opened
        """
        if hypo == data:
            return 0
        elif hypo == 'A':
            return 0.5
        else:
            return 1
So far Monty and Cookie are nearly the same (ignoring the Likelihood method for a second).
The code that creates the Pmf is the same, too, except for the names of the hypotheses:
In [28]:
hypos = 'ABC'
pmf = Monty(hypos)
pmf.Print()
# Current prior; all have the same odds
Calling Update is pretty much the same:
In [29]:
data = 'B'
# Opened Door B
pmf.Update(data)
# Update Prior with the Likelihoods
In [30]:
pmf.Print()
# Posterior Distribution
# The opened door B did not have the car, so the probability of the car being behind B is now 0
The implementation of Update is exactly the same; we are updating the prior distribution defined by the hypotheses, using Mult with the values returned by Likelihood.

Speaking of which, let's examine the primary change in the new object, the Likelihood method. Here it is again with some explanatory output added:
In [ ]:
def Likelihood(self, data, hypo):
    """Compute the likelihood of the data under the hypothesis.

    hypo: string name of the door where the prize is
    data: string name of the door Monty opened
    """
    print('Is our hypo {} the same as our data {}?'.format(hypo, data))
    if hypo == data:
        print('Yes, so the odds of the car behind {} are 0'.format(data))
        return 0
    elif hypo == 'A':
        print('Hypo A is our own door, so Monty could have opened either of the other two doors: likelihood 0.5')
        return 0.5
    else:
        # Monty had no choice but to open this door: likelihood 1
        return 1
Finally, printing the results is the same:
In [26]:
for hypo, prob in pmf.Items():
    print(hypo, prob)
The problem centers on the notion of switching: the car is behind one of three doors, we pick door A, and Monty opens one of the other doors, always one that does not hide the car.
Here that door is our data, door B. Once it is opened, we need to decide whether to stay or switch, and our Bayesian framework says it is in our interest to switch. The logic comes from the likelihoods: if the car is behind A, Monty could have opened either B or C, so the likelihood of him opening B is only 1/2; if the car is behind C, he was forced to open B, so the likelihood is 1.
Combined with the fact that there is definitely no car behind the opened door, this is what powers our switching behavior. Let us examine the other case (Monty opening door C instead of B).
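As a quick check of that symmetric case, we can reuse the Monty class defined above (the expected output in the comments assumes the same setup):

    pmf = Monty('ABC')    # fresh prior: 1/3 for each door
    pmf.Update('C')       # this time Monty opens door C
    pmf.Print()           # expect A: 1/3, B: 2/3 -- switching still wins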
In this example, writing Likelihood is a little complicated, but the framework of the Bayesian update is simple. The code in this section is available from http://thinkbayes.com/monty.py. For more information see Section [download].
The update framework we have been using is so common that thinkbayes.py encapsulates it in a class called Suite, a Pmf that provides __init__, Update, and Print:

class Suite(Pmf):
    """Represents a suite of hypotheses and their probabilities."""

    def __init__(self, hypo=tuple()):
        """Initializes the distribution."""

    def Update(self, data):
        """Updates each hypothesis based on the data."""

    def Print(self):
        """Prints the hypotheses and their probabilities."""
The implementation of Suite is in thinkbayes.py. To use Suite, you should write a class that inherits from it and provides Likelihood. For example, here is the solution to the Monty Hall problem rewritten to use Suite:
In [3]:
from thinkbayes import Suite
class Monty(Suite):

    def Likelihood(self, data, hypo):
        if hypo == data:
            return 0
        elif hypo == 'A':
            return 0.5
        else:
            return 1
And here’s the code that uses this class.
Once you've updated the Likelihood
function for the given problem, things get much easier to operationalize.
In [4]:
suite = Monty('ABC')
suite.Update('B')
suite.Print()
You can download this example from http://thinkbayes.com/monty2.py. For more information see Section [download].
We can use the Suite framework to solve the M&M problem. Writing the Likelihood function is tricky, but everything else is straightforward.
M&M’s are small candy-coated chocolates that come in a variety of colors. Mars, Inc., which makes M&M’s, changes the mixture of colors from time to time.
In 1995, they introduced blue M&M’s. Before then, the color mix in a bag of plain M&M’s was 30% Brown, 20% Yellow, 20% Red, 10% Green, 10% Orange, 10% Tan. Afterward it was 24% Blue, 20% Green, 16% Orange, 14% Yellow, 13% Red, 13% Brown.
Suppose a friend of mine has two bags of M&M’s, and he tells me that one is from 1994 and one from 1996. He won’t tell me which is which, but he gives me one M&M from each bag. One is yellow and one is green. What is the probability that the yellow one came from the 1994 bag?
This problem is similar to the cookie problem, with the twist that I draw one sample from each bowl/bag.
The first step is to enumerate the hypotheses. The bag the yellow M&M came from I’ll call Bag 1; I’ll call the other Bag 2. So the hypotheses are:
A: Bag 1 is from 1994, which implies that Bag 2 is from 1996.
B: Bag 1 is from 1996 and Bag 2 from 1994.
With the original table method, we can construct a table with a row for each hypothesis and a column for each term in Bayes’s theorem:
|   | Prior $\mathrm{p}(H)$ | Likelihood $\mathrm{p}(D\vert H)$ | $\mathrm{p}(H)\,\mathrm{p}(D\vert H)$ | Posterior $\mathrm{p}(H\vert D)$ |
|---|---|---|---|---|
| A | 1/2 | (20)(20) | 200 | 20/27 |
| B | 1/2 | (14)(10) | 70 | 7/27 |
The first column has the priors. Based on the statement of the problem, it is reasonable to choose ${{\mathrm{p}(A)}} = {{\mathrm{p}(B)}} = 1/2$.
The second column has the likelihoods, which follow from the information in the problem.
For example:
If $A$ is true, the yellow M&M came from the 1994 bag with probability 20%, and the green came from the 1996 bag with probability 20%.
If $B$ is true, the yellow M&M came from the 1996 bag with probability 14%, and the green came from the 1994 bag with probability 10%.
Because the selections are independent, we get the conjoint probability by multiplying.
The third column is just the product of the previous two. The sum of this column, 270, is the normalizing constant. To get the last column, which contains the posteriors, we divide the third column by the normalizing constant.
That’s it. Simple, right?
Well, you might be bothered by one detail. I write $\mathrm{p}(D|H)$ in terms of percentages, not probabilities, which means it is off by a factor of 10,000. But that cancels out when we divide through by the normalizing constant, so it doesn’t affect the result.
When the set of hypotheses is mutually exclusive and collectively exhaustive, you can multiply the likelihoods by any factor, if it is convenient, as long as you apply the same factor to the entire column.
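Here is a quick sanity check of the table arithmetic in plain Python (no thinkbayes needed; the variable names are just for illustration):

    priors = {'A': 0.5, 'B': 0.5}
    likes = {'A': 20 * 20, 'B': 14 * 10}   # in "percent squared"; the common factor cancels
    unnorm = {h: priors[h] * likes[h] for h in priors}    # 200 and 70
    total = sum(unnorm.values())                          # normalizing constant: 270
    posteriors = {h: unnorm[h] / total for h in unnorm}
    print(posteriors)    # A: 0.7407... (20/27), B: 0.2592... (7/27)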
In [32]:
from thinkbayes import Suite
class M_and_M(Suite):
    """Map from hypothesis (A or B) to probability."""

    # Mixes as defined by the problem
    mix94 = dict(brown=30,
                 yellow=20,
                 red=20,
                 green=10,
                 orange=10,
                 tan=10)

    mix96 = dict(blue=24,
                 green=20,
                 orange=16,
                 yellow=14,
                 red=13,
                 brown=13)

    # Hypotheses about which bag is which:
    # A says Bag 1 is the 1994 bag (and Bag 2 the 1996 bag); B is the reverse
    hypoA = dict(bag1=mix94, bag2=mix96)
    hypoB = dict(bag1=mix96, bag2=mix94)

    hypotheses = dict(A=hypoA, B=hypoB)

    def Likelihood(self, data, hypo):
        """Computes the likelihood of the data under the hypothesis.

        hypo: string hypothesis (A or B)
        data: tuple of string bag, string color
        """
        print('The data we observed is {}'.format(data))
        bag, color = data
        # Take the bag and color of the observed M&M
        mix = self.hypotheses[hypo][bag]
        print('The current hypo we are examining is {}'.format(hypo))
        # Pull the mix for the relevant bag under this hypothesis
        print('The current mix for {} is {}'.format(bag, mix))
        like = mix[color]
        # The likelihood is the percentage of that color in the mix
        print('The likelihood of drawing {} from {} is {}'.format(color, bag, like))
        return like
First I need to encode the color mixes from before and after 1995:
mix94 = dict(brown=30,
             yellow=20,
             ...

mix96 = dict(blue=24,
             green=20,
             ...
Then I have to encode the hypotheses:
hypoA = dict(bag1=mix94, bag2=mix96)
hypoB = dict(bag1=mix96, bag2=mix94)
hypoA represents the hypothesis that Bag 1 is from 1994 and Bag 2 from 1996. hypoB is the other way around.
Next I map from the name of the hypothesis to the representation:
hypotheses = dict(A=hypoA, B=hypoB)
And finally I can write Likelihood. In this case the hypothesis, hypo, is a string, either A or B. The data is a tuple that specifies a bag and a color.
def Likelihood(self, data, hypo):
    bag, color = data
    mix = self.hypotheses[hypo][bag]
    like = mix[color]
    return like
Here’s the code that creates the suite and updates it:
In [31]:
suite = M_and_M('AB')
print('\n The prior probabilities before any observations are:\n')
suite.Print()
print('\n Where \'A\' is the hypothesis that Bag 1 (the bag the yellow M&M came from) is the 1994 bag, and \'B\' that it is the 1996 bag\n')
print('\n Let us draw the first M&M')
suite.Update(('bag1', 'yellow'))
print('\n The posterior probabilities after this observation are now:')
suite.Print()
print('\n Let us draw another M&M')
suite.Update(('bag2', 'green'))
print('\n The posterior probabilities after pulling both M&Ms is now:')
suite.Print()
The posterior probability of A is approximately $20/27$, which is what we got before.
The code in this section is available from http://thinkbayes.com/m_and_m.py. For more information see Section [download].
This chapter presents the Suite class, which encapsulates the Bayesian update framework.
Suite is an **abstract type**, which means that it defines the interface a Suite is supposed to have, but does not provide a complete implementation. The Suite interface includes Update and Likelihood, but the Suite class only provides an implementation of Update, not Likelihood.
A **concrete type** is a class that extends an abstract parent class and provides an implementation of the missing methods. For example, Monty extends Suite, so it inherits Update and provides Likelihood.
If you are familiar with design patterns, you might recognize this as an example of the template method pattern. You can read about this pattern at http://en.wikipedia.org/wiki/Template_method_pattern.
Most of the examples in the following chapters follow the same pattern; for each problem we define a new class that extends Suite, inherits Update, and provides Likelihood. In a few cases we override Update, usually to improve performance.
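To make the pattern concrete, here is a toy, stripped-down illustration (the names TinySuite and TinyMonty are made up for this sketch; they are not part of thinkbayes):

    class TinySuite(object):
        """Abstract parent: defines the update skeleton (the template method)."""
        def __init__(self, hypos):
            self.probs = {h: 1.0 / len(hypos) for h in hypos}

        def Update(self, data):
            # Fixed skeleton: multiply each hypothesis by its likelihood, then renormalize
            for h in self.probs:
                self.probs[h] *= self.Likelihood(data, h)   # hook supplied by subclass
            total = sum(self.probs.values())
            for h in self.probs:
                self.probs[h] /= total

    class TinyMonty(TinySuite):
        """Concrete child: provides the problem-specific Likelihood."""
        def Likelihood(self, data, hypo):
            if hypo == data:
                return 0
            return 0.5 if hypo == 'A' else 1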
In Section [framework] I said that the solution to the cookie problem generalizes to the case where we draw multiple cookies with replacement.
But in the more likely scenario where we eat the cookies we draw, the likelihood of each draw depends on the previous draws.
Modify the solution in this chapter to handle selection without replacement. Hint: add instance variables to Cookie to represent the hypothetical state of the bowls, and modify Likelihood accordingly. You might want to define a Bowl object.
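Here is one possible sketch of a solution to this exercise (not the book's solution; the Bowl helper and the name Cookie2 are just illustrative):

    from thinkbayes import Pmf

    class Bowl(object):
        """Hypothetical cookie counts for one bowl."""
        def __init__(self, vanilla, chocolate):
            self.counts = dict(vanilla=vanilla, chocolate=chocolate)

        def Prob(self, flavor):
            # Probability of drawing this flavor from what is left in the bowl
            total = sum(self.counts.values())
            return self.counts[flavor] / float(total) if total else 0

        def Remove(self, flavor):
            # Eat one cookie of this flavor, if any remain
            if self.counts[flavor] > 0:
                self.counts[flavor] -= 1

    class Cookie2(Pmf):
        """Cookie problem without replacement."""
        def __init__(self, hypos):
            Pmf.__init__(self)
            for hypo in hypos:
                self.Set(hypo, 1)
            self.Normalize()
            # Each hypothesis tracks its own hypothetical bowl state
            self.bowls = {
                'Bowl 1': Bowl(vanilla=30, chocolate=10),
                'Bowl 2': Bowl(vanilla=20, chocolate=20),
            }

        def Likelihood(self, data, hypo):
            # The likelihood now depends on the depleted contents of the bowl
            return self.bowls[hypo].Prob(data)

        def Update(self, data):
            for hypo in self.Values():
                self.Mult(hypo, self.Likelihood(data, hypo))
            self.Normalize()
            # Eat the cookie: remove it from every hypothetical bowl
            for bowl in self.bowls.values():
                bowl.Remove(data)

    pmf = Cookie2(['Bowl 1', 'Bowl 2'])
    for data in ['vanilla', 'chocolate', 'vanilla']:
        pmf.Update(data)
    pmf.Print()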