Second Edition

Copyright 2020 Allen B. Downey

License: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

```
In [1]:
```# If we're running on Colab, install empiricaldist
# https://pypi.org/project/empiricaldist/
import sys
IN_COLAB = 'google.colab' in sys.modules
if IN_COLAB:
!pip install empiricaldist

```
In [2]:
```# Get utils.py and create directories
import os
if not os.path.exists('utils.py'):
!wget https://github.com/AllenDowney/ThinkBayes2/raw/master/code/soln/utils.py
if not os.path.exists('figs'):
!mkdir figs
if not os.path.exists('tables'):
!mkdir tables

```
In [3]:
```import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from empiricaldist import Pmf
from utils import decorate, savefig

On November 1, 2007, contestants named Letia and Nathaniel appeared on *The Price is Right*, an American game show. They competed in a game called "The Showcase", where the objective is to guess the price of a collection of prizes. The contestant who comes closest to the actual price, without going over, wins the prizes.

Nathaniel went first. His showcase included a dishwasher, a wine cabinet, a laptop computer, and a car. He bid $26,000.

Letia’s showcase included a pinball machine, a video arcade game, a pool table, and a cruise of the Bahamas. She bid $21,500.

The actual price of Nathaniel’s showcase was $25,347. His bid was too high, so he lost.

The actual price of Letia’s showcase was $21,578.

She was only off by $78, so she won her showcase and, because her bid was off by less than 250, she also won Nathaniel’s showcase.

For a Bayesian thinker, this scenario suggests several questions:

Before seeing the prizes, what prior beliefs should the contestant have about the price of the showcase?

After seeing the prizes, how should the contestant update those beliefs?

Based on the posterior distribution, what should the contestant bid?

The third question demonstrates a common use of Bayesian methods: decision analysis.

This problem is inspired by this example in Cameron Davidson-Pilon’s book, Probablistic Programming and Bayesian Methods for Hackers.

To choose a prior distribution of prices, we can take advantage of data from previous episodes. Fortunately, fans of the show keep detailed records.

For this example, I downloaded files containing the price of each showcase from the 2011 and 2012 seasons and the bids offered by the contestants.

The following cells load the data files.

```
In [4]:
```# Load the data files
import os
if not os.path.exists('showcases.2011.csv'):
!wget https://github.com/AllenDowney/ThinkBayes2/raw/master/showcases.2011.csv
if not os.path.exists('showcases.2012.csv'):
!wget http://github.com/AllenDowney/ThinkBayes2/raw/master/showcases.2012.csv

The following function reads the data and cleans it up a little.

```
In [5]:
```def read_data(filename):
"""Read the showcase price data.
filename: string
returns: DataFrame
"""
df = pd.read_csv(filename, index_col=0, skiprows=[1])
return df.dropna().transpose()

I'll read both files and concatenate them.

```
In [6]:
```df2011 = read_data('showcases.2011.csv')
df2011.shape

```
Out[6]:
```

```
In [7]:
```df2012 = read_data('showcases.2012.csv')
df2012.shape

```
Out[7]:
```

```
In [8]:
```df = pd.concat([df2011, df2012], ignore_index=True)
df.shape

```
Out[8]:
```

Here's what the dataset looks like:

```
In [9]:
```df.head()

```
Out[9]:
```

This dataset contains the prices for 313 previous showcases, which we can think of as a sample from the population of possible prices.

We can use this sample to estimate the prior distribution of showcase prices. One way to do that is kernel density estimation (KDE), which uses the sample to estimate a smooth distribution.

SciPy provides `gaussian_kde`

, which takes a sample and returns an object that represents the estimated distribution.

The following function takes a sample, makes a KDE, evaluates it at a given sequence of quantities, and returns the result as a normalized PMF.

```
In [10]:
```from scipy.stats import gaussian_kde
def make_kde(qs, sample):
"""Make a PMF based on KDE:
qs: quantities where we should evaluate the KDE
sample: sequence of values
returns: Series that represents a normalized PMF
"""
kde = gaussian_kde(sample)
ps = kde(qs)
pmf = Pmf(ps, qs)
pmf.normalize()
return pmf

We can use it to estimate the distribution of values for Showcase 1:

```
In [11]:
```qs = np.linspace(0, 80000, 81)
prior1 = make_kde(qs, df['Showcase 1'])

Here's what it looks like:

```
In [12]:
```def decorate_value(title=''):
decorate(xlabel='Showcase value ($)',
ylabel='PMF',
title=title)

```
In [13]:
```prior1.plot(label='Prior 1')
decorate_value('Prior distribution of showcase value')
savefig('fig08-01')

```
```

**Exercise:** Use this function to make a Pmf that represents the prior distribution for Showcase 2, and plot it.

```
In [14]:
```# Solution
qs = np.linspace(0, 80000, 81)
prior2 = make_kde(qs, df['Showcase 2'])

```
In [15]:
```# Solution
prior1.plot(label='Prior 1')
prior2.plot(label='Prior 2')
decorate_value('Prior distributions of showcase value')

```
```

To update these priors, we have to answer these questions:

What data should we consider and how should we quantify it?

Can we compute a likelihood function; that is, for each hypothetical price, can we compute the conditional likelihood of the data?

To answer these questions, I will model the contestant as a price-guessing instrument with known error characteristics. In other words, when the contestant sees the prizes, they guess the price of each prize --- ideally without taking into consideration the fact that the prize is part of a showcase --- and add up the prices. Let’s call this total `guess`

.

Under this model, the question we have to answer is, “If the actual price is `price`

, what is the likelihood that the contestant’s estimate would be `guess`

?”

Equivalently, if we define `error = guess - price`

, we can ask, “What is the likelihood that the contestant’s estimate is off by `error`

?”

To answer this question, I'll use the historical data again. For each showcase in the dataset, let's look at the difference between the contestant's bid and the actual price:

```
In [16]:
```sample_diff1 = df['Bid 1'] - df['Showcase 1']
sample_diff2 = df['Bid 2'] - df['Showcase 2']

To visualize the distribution of these differences, we can use KDE again.

```
In [17]:
```qs = np.linspace(-40000, 20000, 61)
kde_diff1 = make_kde(qs, sample_diff1)
kde_diff2 = make_kde(qs, sample_diff2)

```
In [18]:
```kde_diff1.plot(label='Diff 1')
kde_diff2.plot(label='Diff 2')
decorate(xlabel='Difference in value ($)',
ylabel='PMF',
title='Difference between bid and actual value')
savefig('fig08-02')

```
```

It looks like the bids are too low more often than too high, which makes sense. Remember that under the rules of the game, you lose if you overbid, so contestants probably underbid to some degree deliberately.

Here is the mean and standard deviation of `Diff`

for Player 1.

```
In [19]:
```mean_diff1 = sample_diff1.mean()
std_diff1 = sample_diff1.std()
mean_diff1, std_diff1

```
Out[19]:
```

We can use the observed distribution of differences to model the contestant's distribution of errors.

This step is a little tricky because we don’t actually know the contestant’s guesses; we only know what they bid.

So we have to make some assumptions:

I'll assume that contestants underbid because they are being strategic, and that on average their guesses are accurate. In other words, the mean of their errors is 0.

But I'll assume that the spread of the differences reflects the actual spread of their errors. So, I'll use the standard deviation of the differences as the standard deviation of their errors.

Based on these assumptions, I'll make a normal distribution with parameters 0 and `std_diff1`

.

```
In [20]:
```from scipy.stats import norm
error_dist1 = norm(0, std_diff1)
error_dist1

```
Out[20]:
```

We'll use this distribution to do the update.

```
In [21]:
```guess1 = 23000
qs = prior1.index
error1 = guess1 - qs

Now suppose you know based on past performance that your estimation error is well modeled by `error_dist1`

.

Under that assumption we can compute the likelihood of your estimate under each hypothesis.

```
In [22]:
```likelihood1 = error_dist1.pdf(error1)

And we can use that likelihood to update the prior.

```
In [23]:
```posterior1 = prior1 * likelihood1
posterior1.normalize()

```
Out[23]:
```

Here's what the posterior distribution looks like:

```
In [24]:
```prior1.plot(color='gray', label='Prior 1')
posterior1.plot(color='C0', label='Posterior 1')
decorate_value('Prior and posterior distribution of showcase value')
savefig('fig08-03')

```
```

```
In [25]:
```prior1.mean(), posterior1.mean()

```
Out[25]:
```

Before you saw the prizes, you expected to see a showcase with a value close to $30,000.

After making an estimate of $23,000, you updated the prior distribution.

Based on the combination of the prior and your estimate, you now expect the actual price to be about $26,000.

**Exercise:** Now suppose you are Player 2. When you see your showcase, you estimte that the total price is $38,000.

Use `diff2`

to construct a normal distribution that represents the distribution of your estimation errors.

Compute the likelihood of your estimate for each actual price and use it to update `prior2`

.

Plot the posterior distribution and compute the posterior mean. Based on your estimate, what do you expect the actual price of the showcase to be?

```
In [26]:
```# Solution
mean_diff2 = sample_diff2.mean()
std_diff2 = sample_diff2.std()
mean_diff2, std_diff2

```
Out[26]:
```

```
In [27]:
```# Solution
error_dist2 = norm(0, std_diff2)

```
In [28]:
```# Solution
guess2 = 38000
qs = prior2.index
error2 = guess2 - qs
likelihood2 = error_dist2.pdf(error2)

```
In [29]:
```# Solution
posterior2 = prior2 * likelihood2
posterior2.normalize()

```
Out[29]:
```

```
In [30]:
```# Solution
prior2.plot(color='gray', label='Prior 2')
posterior2.plot(color='C1', label='Posterior 2')
decorate_value('Prior and posterior distribution of showcase value')

```
```

```
In [31]:
```# Solution
prior2.mean(), posterior2.mean()

```
Out[31]:
```

Now that we have a posterior distribution for each player, let's think about strategy.

First, from the point of view of Player 1, let's compute the probability that Player 2 overbids. To keep it simple, I'll use only the performance of past players, ignoring the estimated value of the showcase.

The following function takes a sequence of past bids and returns the fraction that overbid.

```
In [32]:
```def prob_overbid(sample_diff):
"""Returns the probability this player overbids.
sample_diff: sequence of differences
"""
return np.mean(sample_diff > 0)

Here's an estimate for the probability that Player 2 overbids.

```
In [33]:
```prob_overbid(sample_diff2)

```
Out[33]:
```

Now suppose Player 1 underbids by $5000. What is the probability that Player 2 underbids by more?

The following function uses past performance to estimate the probability that a player underbids by more than a given amount, `diff`

:

```
In [34]:
```def prob_worse_than(diff, sample_diff):
"""Probability the opponents's diff is worse than the given diff.
diff: how much the oppenent is off by (always negative)
sample_diff: sequence of differences for the opponent
"""
return np.mean(sample_diff < diff)

Here's the probability that Player 2 underbids by more than $5000.

```
In [35]:
```prob_worse_than(-5000, sample_diff2)

```
Out[35]:
```

And here's the probability they are off by more than $10,000.

```
In [36]:
```prob_worse_than(-10000, sample_diff2)

```
Out[36]:
```

```
In [37]:
```def compute_prob_win(diff, sample_diff):
"""Computes the probability of winning for a given diff.
diff: how much your bid was off by
sample_diff: sequence of differences for the opponent
"""
# if you overbid you lose
if diff > 0:
return 0
# if the opponent over bids, you win
p1 = prob_overbid(sample_diff)
# or of their bid is worse than yours, you win
p2 = prob_worse_than(diff, sample_diff)
return p1 + p2

Here's the probability that you win, given that you underbid by $5000.

```
In [38]:
```compute_prob_win(-5000, sample_diff2)

```
Out[38]:
```

Now let's look at the probability of winning for a range of possible differences.

```
In [39]:
```xs = np.linspace(-30000, 5000, 121)
ys = [compute_prob_win(x, sample_diff2) for x in xs]
plt.plot(xs, ys)
decorate(xlabel='Difference between bid and actual price ($)',
ylabel='Probability of winning',
title='Player 1')
savefig('fig08-04')

```
```

If you underbid by $30,000, the chance of winning is about 30%, which is mostly the chance your opponent overbids.

As your bids gets closer to the actual price, your chance of winning approaches 1.

And, of course, if you overbid, you lose (even if your opponent also overbids).

**Exercise:** Run the same analysis from the point of view of Player 2. Using the sample of differences from Player 1, compute:

The probability that Player 1 overbids.

The probability that Player 1 underbids by more than $5000.

The probability that Player 2 wins, given that they underbid by $5000.

Then plot the probability that Player 2 wins for a range of possible differences between their bid and the actual price.

```
In [40]:
```prob_overbid(sample_diff1)

```
Out[40]:
```

```
In [41]:
```prob_worse_than(-5000, sample_diff1)

```
Out[41]:
```

```
In [42]:
```compute_prob_win(-5000, sample_diff1)

```
Out[42]:
```

```
In [43]:
```xs = np.linspace(-30000, 5000, 121)
ys = [compute_prob_win(x, sample_diff1) for x in xs]
plt.plot(xs, ys)
decorate(xlabel='Difference between bid and actual price ($)',
ylabel='Probability of winning',
title='Player 2')

```
```

In the previous section we computed the probability of winning, given that we have underbid by a particular amount.

In reality the contestants don't know how much they have underbid by, because they don't know the actual price.

But they do have a posterior distribution that represents their beliefs about the actual price, and they can use that to estimate their probability of winning with a given bid.

The following function take a possible bid, a posterior distribution of actual prices, and a sample of differences for the opponent.

It loops through the hypothetical prices in the posterior distribution and for each price:

Computes the difference between the bid and the hypothetical price.

Computes the probability that the player wins, given that difference.

Adds up the weighted sum of the probabilities, where the weights are the probabilities in the posterior distribution.

```
In [44]:
```def total_prob_win(bid, posterior, sample_diff):
"""Computes the total probability of winning with a given bid.
bid: your bid
posterior: Pmf of showcase value
sample_diff: sequence of differences for the opponent
returns: probability of winning
"""
total = 0
for price, prob in posterior.items():
diff = bid - price
total += prob * compute_prob_win(diff, sample_diff)
return total

This loop implements the law of total probability:

$P(win) = \sum_{price} P(price) ~ P(win ~|~ price)$

```
In [45]:
```total_prob_win(25000, posterior1, sample_diff2)

```
Out[45]:
```

```
In [46]:
```bids = posterior1.index
probs = [total_prob_win(bid, posterior1, sample_diff2)
for bid in bids]
prob_win_series = pd.Series(probs, index=bids)

```
In [47]:
```prob_win_series.plot(color='C1')
decorate(xlabel='Bid ($)',
ylabel='Probability of winning',
title='Player 1')
savefig('fig08-05')

```
```

And here's the bid that maximizes your chance of winning.

```
In [48]:
```prob_win_series.idxmax()

```
Out[48]:
```

```
In [49]:
```prob_win_series.max()

```
Out[49]:
```

Recall that your estimate was $23,000.

After using your estimate to compute the posterior distribution, the posterior mean is about $26,000.

But the bid that maximizes your chance of winning is $21,000.

**Exercise:** Do the same analysis for Player 2.

```
In [50]:
```# Solution
bids = posterior2.index
probs = [total_prob_win(bid, posterior2, sample_diff1)
for bid in bids]
prob_win_series = pd.Series(probs, index=bids)

```
In [51]:
```# Solution
prob_win_series.plot(color='C1')
decorate(xlabel='Bid ($)',
ylabel='Probability of winning',
title='Player 2')

```
```

```
In [52]:
```# Solution
prob_win_series.idxmax()

```
Out[52]:
```

In the previous section we computed the bid that maximizes your chance of winning. And if that's your goal, the bid we computed is optimal.

But winning isn't everything. Remember that if your bid is off by $250 or less, you win both showcases. So it might be a good idea to increase your bid a little: it increases the chance you overbid and lose, but it also increases the chance of winning both showcases.

Let's see how that works out. The following function computes how much you will win, on average, given your bid, the actual price, and a sample of errors for your opponent.

```
In [53]:
```def compute_gain(bid, price, sample_diff):
"""Computes expected gain given a bid and actual price.
bid: number
price: actual price
sample_diff: sequence of differences for the opponent
"""
diff = bid - price
prob = compute_prob_win(diff, sample_diff)
# if you are within 250 dollars, you win both showcases
if -250 <= diff <= 0:
return 2 * price * prob
else:
return price * prob

For example, if the actual price is $35000

and you bid $30000,

you will win about $23,600 worth of prizes on average.

```
In [54]:
```compute_gain(30000, 35000, sample_diff2)

```
Out[54]:
```

```
In [55]:
```def expected_gain(bid, posterior, sample_diff):
"""Computes the expected return of a given bid.
bid: your bid
posterior: distribution of showcase values
sample_diff: distribution of differences for the opponent
"""
total = 0
for price, prob in posterior.items():
total += prob * compute_gain(bid, price, sample_diff)
return total

For the posterior we computed earlier, based on an estimate of $23,000,

the expected gain for a bid of $21,000

is about $16,900.

```
In [56]:
```expected_gain(21000, posterior1, sample_diff2)

```
Out[56]:
```

But can we do any better?

To find out, we can loop through a range of bids and find the one that maximizes expected gain.

```
In [57]:
```bids = posterior1.index
gains = [expected_gain(bid, posterior1, sample_diff2) for bid in bids]
expected_gain_series = pd.Series(gains, index=bids)

Here are the results.

```
In [58]:
```expected_gain_series.plot(color='C2')
decorate(xlabel='Bid ($)',
ylabel='Expected gain ($)',
title='Player 1')
savefig('fig08-06')

```
```

And here is the optimal bid.

```
In [59]:
```expected_gain_series.idxmax()

```
Out[59]:
```

With that bid, the expected gain is about $17,400.

```
In [60]:
```expected_gain_series.max()

```
Out[60]:
```

Recall that the estimated value of the prizes was $23,000.

The bid that maximizes the chance of winning is $21,000.

And the bid that maximizes your expected gain is $22,000.

**Exercise:** Do the same analysis for Player 2.

```
In [61]:
```bids = posterior2.index
gains = [expected_gain(bid, posterior2, sample_diff1) for bid in bids]
expected_gain_series = pd.Series(gains, index=bids)

Here are the results.

```
In [62]:
```expected_gain_series.plot(color='C2')
decorate(xlabel='Bid ($)',
ylabel='Expected gain ($)',
title='Player 2')

```
```

And here is the optimal bid.

```
In [63]:
```expected_gain_series.idxmax()

```
Out[63]:
```

**Exercise:** This exercise is inspired by a true story. In 2001 I created Green Tea Press to publish my books, starting with *Think Python*. I ordered 100 copies from a short run printer and made the book available for through a distributor. After the first week, the distributor reported that 12 copies were sold. Based that report, I thought I would run out of copies in about 8 weeks, so I got ready to order more. My printer offered me a discount if I ordered more than 1000 copies, so I went a little crazy and ordered 2000 copies. A few days later, my mother called to tell me that her *copies* of the book had arrived. Surprised, I asked how many *copies*. She said ten.

It turned out I had sold only two copies to non-relatives. And it took a lot longer than I expected to sell 2000 copies.

The details of this story are unique, but the general problem is something almost every retailer has to figure out. Based on past sales, how do you predict future sales? And based on those predictions, how do you decide how much to order and when?

Often the cost of a bad decision is complicated. If you place a lot of small orders rather than one big one, your costs are likely to be higher. If you run out of inventory, you might lose customers. And if you order too much, you have to pay the various costs of holding inventory.

So, let's solve a version of the problem I faced. Suppose you start selling books online. During the first week you sell 12 copies (and let's assume that none of the customers are your mother). During the second week you sell 8 copies.

Assuming that the arrival of orders is a Poisson process, we can think of the weekly orders as samples from a Poisson distribution with an unknown rate. Choose a prior you think is appropriate and use the data to compute the posterior distribution of the order rate. Then generate a posterior predictive distribution for the number of copies you expect during the next 8 weeks.

Suppose the cost of printing the book is $5 per copy,

But if you order 100 or more, it's $4.50 per copy.

For every book you sell, you get $10.

But if you run out of books before the end of 8 weeks, you lose $50 in future sales for every week you are out of stock.

If you have books left over at the end of 8 weeks, you lose $2 in inventory costs per extra book.

For example, suppose you get orders for 10 books per week, every week.

If you order 60 books,

The total cost is $300.

You sell all 30 books, so you make $600.

But the book is out of stock for two weeks, so you lose $100 in future sales.

In total, your profit is $200.

If you order 100 books,

The total cost is $450.

You sell 80 books, again, so you make $800.

But you have 20 books left over at the end, so you lose $40.

In total, your profit is $310.

Combining these costs with your predictive distribution, how many books should you order to maximize your expected profit?

```
In [64]:
```def print_cost(printed):
if printed < 100:
return printed * 5
else:
return printed * 4.5

```
In [65]:
```def total_income(printed, orders):
sold = min(printed, np.sum(orders))
return sold * 10

```
In [66]:
```def inventory_cost(printed, orders):
excess = printed - np.sum(orders)
if excess > 0:
return excess * 2
else:
return 0

```
In [67]:
```def out_of_stock_cost(printed, orders):
weeks = len(orders)
total_orders = np.cumsum(orders)
for i, total in enumerate(total_orders):
if total > printed:
return (weeks-i) * 50
return 0

```
In [68]:
```def compute_profit(printed, orders):
return (total_income(printed, orders) -
print_cost(printed)-
out_of_stock_cost(printed, orders) -
inventory_cost(printed, orders))

To test these function, suppose we get exactly 10 orders per week for eight weeks:

```
In [69]:
```always_10 = [10] * 8
always_10

```
Out[69]:
```

If you print 60 books, your net profit is $200, as in the example.

```
In [70]:
```compute_profit(60, always_10)

```
Out[70]:
```

If you print 100 books, your net profit is $310.

```
In [71]:
```compute_profit(100, always_10)

```
Out[71]:
```

Of course, in the context of the problem you don't know how many books will be ordered in any given week. You don't even know the average rate of orders. However, given the data and some assumptions about the prior, you can compute the distribution of the rate of orders.

You'll have a chance to do that, but to demonstrate the decision analysis part of the problem, I'll start with the arbitrary assumption that order rates comes from a gamma distribution with mean 10.

Here's a `Pmf`

that represents this distribution.

```
In [72]:
```from scipy.stats import gamma
α = 10
qs = np.linspace(0, 25, 101)
ps = gamma.pdf(qs, α)
pmf = Pmf(ps, qs)
pmf.normalize()
pmf.mean()

```
Out[72]:
```

```
In [73]:
```pmf.plot(color='C1')
decorate(xlabel='Book ordering rate (λ)',
ylabel='PMF')

```
```

Now, we *could* generate a predictive distribution for the number of books ordered in a given week, but in this example we have to deal with a complicated cost function. In particular, `out_of_stock_cost`

depends on a sequence of orders.

So, rather than generate a predictive distribution, I suggest we run simulations. I'll demonstrate the steps.

First, from our hypothetical distribution of rates, we can draw a random sample of 1000 values.

```
In [74]:
```rates = pmf.choice(1000)
np.mean(rates)

```
Out[74]:
```

For each possible rate, we can generate a sequence of 8 orders.

```
In [75]:
```rng = np.random.default_rng()
order_array = rng.poisson(rates, size=(8, 1000)).transpose()
order_array[:5, :]

```
Out[75]:
```

Each row of this array is a hypothetical sequence of orders based on a different hypothetical order rate.

Now, if you tell me how many books you printed, I can compute your expected profits, averaged over these 1000 possible sequences.

```
In [76]:
```def compute_expected_profits(printed, order_array):
"""
"""
profits = [compute_profit(printed, orders)
for orders in order_array]
return np.mean(profits)

For example, here are the expected profits if you order 70, 80, or 90 books.

```
In [77]:
```compute_expected_profits(70, order_array)

```
Out[77]:
```

```
In [78]:
```compute_expected_profits(80, order_array)

```
Out[78]:
```

```
In [79]:
```compute_expected_profits(90, order_array)

```
Out[79]:
```

```
In [80]:
```printed_array = np.arange(70, 110)
t = [compute_expected_profits(printed, order_array)
for printed in printed_array]
expected_profits = pd.Series(t, printed_array)

```
In [81]:
```expected_profits.plot(label='')
decorate(xlabel='Number of books printed',
ylabel='Expected profit ($)')

```
```

The peak is at 80 books, where your expected profit is about $250.

```
In [82]:
```expected_profits.idxmax(), expected_profits.max()

```
Out[82]:
```

In this example, it turns out to be optimal to order the number of books you expect to sell on averagewhich is not too surprising. But notice that you do almost as well if you order 100 books. So it's not obvious that ordering the mean is always optimal.

Now it's your turn. Choose a prior that you think is reasonable, update it with the data you are given, and then use the posterior distribution to do the analysis I just demonstrated.

```
In [83]:
```# Solution
# For a prior I chose a log-uniform distribution;
# that is, a distribution that is uniform in log-space
# from 1 to 100 books per week.
qs = np.logspace(0, 2, 101)
prior = Pmf(1, qs)
prior.normalize()

```
Out[83]:
```

```
In [84]:
```# Solution
prior.make_cdf().plot(color='C1')
decorate(xlabel='Book ordering rate (λ)',
ylabel='CDF')

```
```

```
In [85]:
```# Solution
from scipy.stats import poisson
def update_book(pmf, data):
"""Update book ordering rate.
pmf: Pmf of book ordering rates
data: observed number of orders in one week
"""
k = data
lams = pmf.index
likelihood = poisson.pmf(k, lams)
pmf *= likelihood
pmf.normalize()

```
In [86]:
```# Solution
posterior1 = prior.copy()
update_book(posterior1, 12)

```
In [87]:
```# Solution
posterior2 = posterior1.copy()
update_book(posterior2, 8)

```
In [88]:
```# Solution
posterior1.plot()
posterior2.plot()

```
```

```
In [89]:
```# Solution
prior.mean(), posterior1.mean(), posterior2.mean()

```
Out[89]:
```

```
In [90]:
```# Solution
rates = posterior2.choice(1000)
np.mean(rates)

```
Out[90]:
```

```
In [91]:
```# Solution
order_array = rng.poisson(rates, size=(8, 1000)).transpose()
order_array[:5, :]

```
Out[91]:
```

```
In [92]:
```# Solution
printed_array = np.arange(70, 110)
t = [compute_expected_profits(printed, order_array)
for printed in printed_array]
expected_profits = pd.Series(t, printed_array)

```
In [93]:
```# Solution
expected_profits.plot(label='')
decorate(xlabel='Number of books printed',
ylabel='Expected profit ($)')

```
```

```
In [ ]:
```