Copyright 2016 Allen Downey

MIT License: http://opensource.org/licenses/MIT

In [1]:
```
from __future__ import print_function, division
%matplotlib inline
%precision 6
import matplotlib.pyplot as plt
import numpy as np
```
In [2]:
```
from inspect import getsourcelines

def show_code(func):
    lines, _ = getsourcelines(func)
    for line in lines:
        print(line, end='')
```
In [3]:
```
from distribution import Pmf
show_code(Pmf)
```

I'll create a Pmf that represents a biased coin, which has a 60% chance of landing heads and a 40% chance of landing tails.

If you are bothered by the fact that it is physically very difficult to bias a coin toss in this way, imagine a 10-sided die with `H` on 6 sides and `T` on 4 sides.

In [4]:
```
coin = Pmf(dict(H=0.6, T=0.4))
coin.display()
```

`Pmf` supports the `+` operator, which we can use to compute the possible outcomes of two coin tosses and their probabilities.

In [5]:
```
twice = coin + coin
twice.display()
```

And similarly, the possible outcomes of three coin tosses.

In [6]:
```
thrice = sum([coin]*3)
thrice.display()
```

Notice that the outcomes take the order of the tosses into account, so `HT` is considered a different outcome from `TH`.

If we don't care about the order and we only care about the number of heads and tails, we can loop through the outcomes and count the number of heads.

In [7]:
```
from collections import Counter

for val, prob in sorted(thrice.items()):
    heads = val.count('H')
    print(heads, prob)
```

And we can make a new Pmf that maps from the total number of heads to the probability of that total.

In [8]:
```
def make_pmf_heads(coin, n):
    coins = sum([coin]*n)
    pmf = Pmf()
    for val, prob in coins.items():
        heads = val.count('H')
        pmf[heads] += prob
    return pmf
```

Here's what it looks like:
In [9]:
```
pmf_heads = make_pmf_heads(coin, 3)
pmf_heads.display()
```

**Exercise:** Create `pmf_heads` for a few different values of `n` and plot them using the `plot_pmf` method.

In [10]:
```
# Solution
for n in [5, 10, 15]:
    make_pmf_heads(coin, n).plot_pmf()
```

**Exercise:** Run the following example and see how long it takes. Try it out with a few values of `n` and see how the run time depends on `n`.

In [11]:
```
n = 15
%time make_pmf_heads(coin, n).plot_pmf()
```

This way of computing `pmf_heads` is not very efficient. For `n` tosses, there are $2^n$ possible outcomes, and for large values of `n`, that is not tractable.

In the next section we will figure out a better way.

In [12]:
```
from sympy import symbols

p = symbols('p')
sym_coin = Pmf(dict(H=p, T=1-p))
sym_coin.display()
```

Now we can see the distribution of the number of heads, which I'll call `k`, after a few tosses:

In [13]:
```
make_pmf_heads(sym_coin, 2).display()
```

In [14]:
```
make_pmf_heads(sym_coin, 3).display()
```

In [15]:
```
make_pmf_heads(sym_coin, 4).display()
```

The general pattern is that the probability of `k` heads after `n` tosses is the product of three terms:

- The probability of `k` heads, which is `p**k`.
- The probability of `n-k` tails, which is `(1-p)**(n-k)`.
- An integer coefficient.

You might already know that the coefficient is the "binomial coefficient", which is written $n \choose k$ and pronounced "n choose k". But pretend for a moment that you don't know that and let's figure it out.

To make the pattern easier to see, I'll create a fair coin where `p = 1-p = 1/2`:

In [16]:
```
fair_coin = Pmf(dict(H=p, T=p))
fair_coin.display()
```

Now every outcome has probability `p**n`:

In [17]:
```
thrice = sum([fair_coin]*3)
thrice.display()
```

So when we count the number of heads, it is easier to see the coefficients.

In [18]:
```
pmf_heads = make_pmf_heads(fair_coin, 3)
pmf_heads.display()
```

And even easier if we divide through by `p**n`:

In [19]:
```
for val, prob in pmf_heads.items():
    print(val, prob / p**3)
```

The following function computes and prints the coefficients for a given value of `n`:

In [20]:
```
def coefficients(n):
    fair_coin = Pmf(dict(H=p, T=p))
    pmf_heads = make_pmf_heads(fair_coin, n)
    for val, prob in pmf_heads.items():
        print(prob / p**n, end=' ')
    print()
```

Here are the coefficients for `n=3`:
In [21]:
```
coefficients(3)
```

And here they are for `n` in the range from `1` to `9`:

In [22]:
```
for n in range(1, 10):
    coefficients(n)
```

Now we can look for patterns.

- If we flip the coin `n` times, the coefficient associated with getting `k` heads is `1` if `k` is `0` or `n`.
- Otherwise, the `(n, k)` coefficient is the sum of two coefficients from the previous row: `(n-1, k)` and `(n-1, k-1)`.

We can use these observations to compute the binomial coefficient recursively:

In [23]:
```
def binomial_coefficient(n, k):
    if k == 0 or k == n:
        return 1
    return binomial_coefficient(n-1, k) + binomial_coefficient(n-1, k-1)
```

And it yields the same results.
In [24]:
```
binomial_coefficient(9, 5)
```
Out[24]:
```
126
```

Here are the results for `n` from 1 to 9 again.

In [25]:
```
for n in range(1, 10):
    for k in range(n+1):
        print(binomial_coefficient(n, k), end=' ')
    print()
```

So far so good.

**Exercise:** SciPy provides a "special" function called `binom` that you can import from `scipy.special`. Test it to confirm that it is consistent with our function.

Note that `scipy.special.binom` returns a float, so for large values of `n` it is only approximate.
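As an aside (my addition, not part of the original notebook): if you are on Python 3.8 or later, the standard library's `math.comb` computes binomial coefficients with exact integer arithmetic, so it avoids the float approximation entirely:

```python
from math import comb

# Exact integer result, with no floating-point rounding
print(comb(9, 5))  # 126

# For large n the exact value has more significant digits than a float can hold
exact = comb(100, 50)
print(int(float(exact)) == exact)  # False: the float is only an approximation
```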

In [26]:
```
# Solution
from scipy.special import binom
binom(9, 5)
```
Out[26]:
```
126.0
```

**Exercise:** The recursive implementation of `binomial_coefficient` is inefficient for large values of `n` because it computes the same intermediate results many times. You can speed it up (a lot!) by memoizing previously computed results.

Write a version called `fast_binom` that caches results in a dictionary.

In [27]:
```
# Solution
def fast_binom(n, k, cache={}):
    if k == 0 or k == n:
        return 1
    try:
        return cache[n, k]
    except KeyError:
        res = fast_binom(n-1, k) + fast_binom(n-1, k-1)
        cache[n, k] = res
        return res

for n in range(1, 10):
    for k in range(n+1):
        print(fast_binom(n, k), end=' ')
    print()
```
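As an alternative sketch (my addition): instead of managing the cache dictionary by hand, you can memoize with `functools.lru_cache`, which wraps the function in exactly this kind of cache:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # remember every (n, k) result ever computed
def fast_binom2(n, k):
    if k == 0 or k == n:
        return 1
    return fast_binom2(n-1, k) + fast_binom2(n-1, k-1)

print(fast_binom2(9, 5))  # 126
```

The recursion is unchanged; the decorator intercepts repeated calls with the same arguments and returns the cached result.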

Bringing it all together, we have

$PMF(k; n,p) = {n \choose k} p^k (1-p)^{n-k}$

This equation can be interpreted at two levels:

- If you want to know the probability of `k` heads in `n` tosses, where the probability of heads in each toss is `p`, you can use the formula on the right to compute it.
- More abstractly, this equation states that the formula on the right is the PMF of `k` with the parameters `n` and `p`.

To make the difference between these interpretations clear, let's look at two functions:

1. `eval_binomial_pmf` evaluates this formula for given values of `k`, `n`, and `p`.
2. `make_binomial_pmf` evaluates the formula for a range of values of `k`, and returns the resulting `Pmf` object.

First, here's `eval_binomial_pmf`:

In [28]:
```
from scipy.special import binom

def eval_binomial_pmf(k, n, p):
    return binom(n, k) * p**k * (1-p)**(n-k)
```

We can use it to compute probabilities for each value of `k` directly:

In [29]:
```
n = 3
p = 0.6
for k in range(n+1):
    print(k, eval_binomial_pmf(k, n, p))
```

And here's the corresponding `pmf_heads` for comparison:

In [30]:
```
coin = Pmf(dict(H=p, T=1-p))
make_pmf_heads(coin, n).display()
```

They are the same, at least within floating point error.

Now here's `make_binomial_pmf`:

In [31]:
```
def make_binomial_pmf(n, p):
    pmf = Pmf()
    for k in range(n+1):
        pmf[k] = eval_binomial_pmf(k, n, p)
    return pmf
```

We can use it to make a Pmf that contains the possible values of `k` and their probabilities:

In [32]:
```
pmf = make_binomial_pmf(n, p)
pmf.display()
```
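As an aside (my addition, not from the original): the same probabilities can be computed in one vectorized step with NumPy and `scipy.special.binom`, which is convenient for large `n`:

```python
import numpy as np
from scipy.special import binom

def binomial_probs(n, p):
    """Return the probabilities of k = 0..n heads as a NumPy array."""
    ks = np.arange(n + 1)
    return binom(n, ks) * p**ks * (1 - p)**(n - ks)

probs = binomial_probs(3, 0.6)
print(probs)        # same values as pmf.display() above
print(probs.sum())  # the probabilities sum to 1
```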

Here's what the distribution of `k` looks like for given values of `n` and `p`:

In [33]:
```
pmf.plot_pmf()
plt.xlabel('k');
```

If we hold `p` constant, we can see how the distribution of `k` changes as `n` increases:

In [34]:
```
p = 0.6
for n in [5, 10, 15]:
    make_binomial_pmf(n, p).plot_pmf(label=n)
plt.legend();
```

**Exercise:** Keeping `n=10`, plot the distribution of `k` for a few different values of `p`.

In [35]:
```
# Solution
n = 10
for p in [0.2, 0.5, 0.8]:
    make_binomial_pmf(n, p).plot_pmf(label=p)
plt.legend();
```

The Pmf objects we just created represent different distributions of `k` based on different values of the parameters `n` and `p`.

As `make_binomial_pmf` demonstrates, if you give me `n` and `p`, I can compute the distribution of `k`.

Speaking casually, people sometimes refer to this equation as the PMF of "the" binomial distribution:

$PMF(k; n,p) = {n \choose k} p^k (1-p)^{n-k}$

More precisely, it is a "family" of distributions, where the parameters `n` and `p` specify a particular member of the family.

**Exercise:** Suppose you toss a fair coin 10 times. What is the probability of getting 5 heads?

If you run this experiment many times, what is the mean number of heads you expect? What is the variance in the number of heads?

In [36]:
```
# Solution
n = 10
p = 0.5
pmf = make_binomial_pmf(n, p)
print(pmf[5])
print(pmf.mean())
print(pmf.var())
```
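For the binomial distribution these summaries have closed forms, mean $np$ and variance $np(1-p)$; as a cross-check (my addition, not part of the original notebook), `scipy.stats.binom` gives the same answers:

```python
from scipy.stats import binom as binom_dist

n, p = 10, 0.5
rv = binom_dist(n, p)  # a "frozen" binomial distribution object
print(rv.pmf(5))       # probability of exactly 5 heads
print(rv.mean())       # n*p = 5.0
print(rv.var())        # n*p*(1-p) = 2.5
```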

**Exercise:** Suppose you toss a fair coin 10 times. What is the probability of getting fewer than 5 heads?

If you run this experiment many times, what is the median number of heads you expect? What is the interquartile range (IQR) in the number of heads?

Hint: You might want to make a CDF.

In [37]:
```
# Solution
from distribution import compute_cumprobs, Cdf

xs, ps = compute_cumprobs(pmf.d)
cdf = Cdf(xs, ps)
print(cdf[4])
low, median, high = cdf.values([0.25, 0.5, 0.75])
print(median, high - low)
```
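As a cross-check (my addition): `scipy.stats.binom` provides `cdf` and its inverse `ppf`, which give the same cumulative probability and quantiles:

```python
from scipy.stats import binom as binom_dist

n, p = 10, 0.5
rv = binom_dist(n, p)
print(rv.cdf(4))  # P(k <= 4), i.e. fewer than 5 heads
low, median, high = rv.ppf([0.25, 0.5, 0.75])
print(median, high - low)  # median and interquartile range
```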
