The Riddler

https://fivethirtyeight.com/features/who-steals-the-most-in-a-town-full-of-thieves/

A town of 1,000 households has a strange law intended to prevent wealth-hoarding. On January 1 of every year, each household robs one other household, selected at random, moving all of that house’s money into their own house. The order in which the robberies take place is also random and is determined by a lottery. (Note that if House A robs House B first, and then C robs A, the houses of A and B would each be empty and C would have acquired the resources of both A and B.)

Two questions about this fateful day:

  1. What is the probability that a house is not robbed over the course of the day?
  2. Suppose that every house has the same amount of cash to begin with — say \$100. Which position in the lottery has the most expected cash at the end of the day, and what is that amount?

Part one

What is the probability that a house is not robbed over the course of the day?

We will assume that a house cannot rob itself, so if there are 1000 houses, there are 999 times a house can be robbed. The probability of being robbed each time is independent from round to round. The probability of not being robbed in a single round is $\frac{999}{1000}$ and the probability of not being robbed in any given two rounds is $\frac{999}{1000}^2$. The probability of not being robbed in 999 rounds is $\frac{999}{1000}^{999}$ which works out to 0.368063.

We assume a house cannot rob itself, so for 1000 rounds there are only 999 chances of being robbed.


In [53]:
(999/1000)**999


Out[53]:
0.3680634882592229

Let's plot the probability of any one house not getting robbed if there are N homes.


In [7]:
%matplotlib inline

In [57]:
import numpy as np
import pandas as pd
import seaborn as sns

idx = range(1, 1001)
s = pd.Series([((i-1)/i)**(i-1) for i in idx], index=idx)
print(s.tail(1))
s.plot()


1000    0.368063
dtype: float64
Out[57]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fd6891d2fd0>

In [58]:
1/np.e


Out[58]:
0.36787944117144233

Part two

Suppose that every house has the same amount of cash to begin with — say $100. Which position in the lottery has the most expected cash at the end of the day, and what is that amount?

Assumption: people do not choose themselves to rob.

Some notation:

  • $N$ is the number of houses. It is 1000 in this example, but we will consider it a fixed variable.
  • $X$ is the value that each house starts with. It is \$100 in this example.
  • The current round is $r$ and starts at 1 and goes to N.
    • There are $r-1$ houses before and $N-r$ houses after.
  • The probability of not being robbed in $n$ rounds is $P(n)$ where
    $P(n) = (\frac{N-1}{N})^n$
  • $E_{r,i}$ is the expected value of the house at position $r$ at the conclusion of turn $i$.
  • $G_r$ is the expected value of the gain for the turn of the house at position $r$.

And some assumptions:

  • People cannot randomly pick themselves to rob.

Solution

This problem can be broken down into three parts, where we consider a house at position $r$.

  1. What is the expected value of house $r$ at the beginning of the turn?
  2. What is the expected value of the gain of house $r$?
  3. What is the expected value for house $r$ at the end of all rounds?

For part one, the expected value of house $r$ at the beginning of the turn is just the starting value times the probability of not being robbed in $r-1$ turns.

$E_{r,r-1} = P(r-1)X$

For part two, the expected gain is the total amount of money in other houses divided by the number of houses. In the case where the house at position $r$ has not been robbed, the amount of money in other houses is $(N-1)X$, while if the house has been robbed the amount is $NX$. The probability of not being robbed at the start of this turn is $P(r-1)$ and the probability of being robbed is $1-P(r-1)$. So the expected value of the gain of this turn is given by

$$\begin{eqnarray} G_r &=& P(r-1)\frac{(N-1)X}{(N-1)} + (1-P(r-1))\frac{NX}{(N-1)} \\ &=& P(r-1)X + (1-P(r-1))\frac{NX}{(N-1)} \end{eqnarray}$$

So the expected value of the house at position $r$ at the conclusion of their turn is given by:

$$\begin{eqnarray} E_{r,r} &=& E_{r,r-1} + G_r \\ &=& P(r-1)X + P(r-1)X + (1-P(r-1))\frac{NX}{(N-1)} \end{eqnarray}$$

And for part three, the expected value at the end of all rounds is the expected value at the end of turn $r$ times the probability of not being robbed in the subsequent $N-r$ rounds.

$$\begin{eqnarray} E_{r,N} &=& E_{r,r} * P(N-r) \\ &=& (P(r-1)X + P(r-1)X + (1-P(r-1))\frac{NX}{(N-1)}) * P(N-r) \end{eqnarray}$$

In [42]:
import matplotlib.pyplot as plt
import numpy as np

X = 100
N = 1000

def p(n, N):
    """The probability of not being robbed in `n` turns,given each turn picks one house
    at random from `N` houses."""
    return np.power((N-1)/N, n)

r = np.arange(1, N+1)
# expected value at the start of turn
E_pre = p(r-1, N) * X

# expected gain
G = p(r-1, N)*X + (1-p(r-1, N))*N*X/(N-1)

# exected value at the end of the turn
E_post = E_pre + G

# expected value at the end of all rounds
E = E_post * p(N-r, N)

df = pd.DataFrame(data={'E_pre': E_pre, 'gain': G, 'E_post': E_post, 'E': E}, index=r, 
                  columns=['E_pre', 'gain', 'E_post', 'E'])
df


Out[42]:
E_pre gain E_post E
1 100.000000 100.000000 200.000000 73.612698
2 99.900000 100.000100 199.900100 73.649578
3 99.800100 100.000200 199.800300 73.686495
4 99.700300 100.000300 199.700600 73.723449
5 99.600600 100.000400 199.600999 73.760440
6 99.500999 100.000500 199.501499 73.797468
7 99.401498 100.000599 199.402097 73.834533
8 99.302097 100.000699 199.302795 73.871635
9 99.202794 100.000798 199.203592 73.908774
10 99.103592 100.000897 199.104489 73.945950
11 99.004488 100.000997 199.005485 73.983164
12 98.905484 100.001096 198.906579 74.020415
13 98.806578 100.001195 198.807773 74.057703
14 98.707771 100.001294 198.709065 74.095029
15 98.609064 100.001392 198.610456 74.132392
16 98.510455 100.001491 198.511946 74.169792
17 98.411944 100.001590 198.413534 74.207230
18 98.313532 100.001688 198.315220 74.244705
19 98.215219 100.001787 198.217005 74.282218
20 98.117003 100.001885 198.118888 74.319768
21 98.018886 100.001983 198.020870 74.357356
22 97.920868 100.002081 197.922949 74.394981
23 97.822947 100.002179 197.825126 74.432644
24 97.725124 100.002277 197.727401 74.470345
25 97.627399 100.002375 197.629774 74.508084
26 97.529771 100.002473 197.532244 74.545860
27 97.432241 100.002570 197.434812 74.583674
28 97.334809 100.002668 197.337477 74.621526
29 97.237474 100.002765 197.240240 74.659416
30 97.140237 100.002863 197.143100 74.697344
... ... ... ... ...
971 37.889910 100.062172 137.952083 134.006980
972 37.852021 100.062210 137.914231 134.104315
973 37.814169 100.062248 137.876417 134.201747
974 37.776354 100.062286 137.838640 134.299277
975 37.738578 100.062324 137.800902 134.396904
976 37.700839 100.062362 137.763201 134.494629
977 37.663139 100.062399 137.725538 134.592452
978 37.625475 100.062437 137.687912 134.690373
979 37.587850 100.062475 137.650325 134.788392
980 37.550262 100.062512 137.612774 134.886509
981 37.512712 100.062550 137.575262 134.984724
982 37.475199 100.062587 137.537787 135.083038
983 37.437724 100.062625 137.500349 135.181450
984 37.400286 100.062662 137.462949 135.279960
985 37.362886 100.062700 137.425586 135.378569
986 37.325523 100.062737 137.388260 135.477277
987 37.288198 100.062775 137.350972 135.576084
988 37.250909 100.062812 137.313721 135.674989
989 37.213658 100.062849 137.276508 135.773994
990 37.176445 100.062886 137.239331 135.873097
991 37.139268 100.062924 137.202192 135.972300
992 37.102129 100.062961 137.165090 136.071602
993 37.065027 100.062998 137.128025 136.171004
994 37.027962 100.063035 137.090997 136.270505
995 36.990934 100.063072 137.054006 136.370105
996 36.953943 100.063109 137.017052 136.469806
997 36.916989 100.063146 136.980135 136.569606
998 36.880072 100.063183 136.943255 136.669506
999 36.843192 100.063220 136.906412 136.769506
1000 36.806349 100.063257 136.869606 136.869606

1000 rows × 4 columns


In [43]:
df.plot.line()


Out[43]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fd6897d1710>

So we see that the later in the lottery the better the house fares. And the house at position #1000 has an expected value of \$136.869606.


In [52]:
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets

def plot_expected_values(N=1000):
    X = 100
    def p(n, N):
        """The probability of not being robbed in `n` turns,given each turn picks one house
        at random from `N` houses."""
        return np.power((N-1)/N, n)

    r = np.arange(1, N+1)
    # expected value at the start of turn
    E_pre = p(r-1, N) * X

    # expected gain
    G = p(r-1, N)*X + (1-p(r-1, N))*N*X/(N-1)

    # exected value at the end of the turn
    E_post = E_pre + G

    # expected value at the end of all rounds
    E = E_post * p(N-r, N)
    
    df = pd.DataFrame(data={'E_pre': E_pre, 'gain': G, 'E_post': E_post, 'E': E}, index=r, 
                  columns=['E_pre', 'gain', 'E_post', 'E'])
    ax = df.plot.line()
    ax.set_title('N = {}'.format(N), fontsize=18)
    ax.set_ylim(bottom=0)
    plt.show()


interact(plot_expected_values, N=widgets.IntSlider(min=2,max=1000,step=1,value=100));