The question

On a sold out flight, one hundred people line up to board the plane. The first person in line has lost his boarding pass, so he randomly chooses a seat. After that, each person boarding sits in his or her assigned seat, if it is available. If the assigned seat is not available, the passenger chooses an unoccupied seat randomly. You are the last passenger to board. What is the probability that you find your assigned seat unoccupied?

What's your intuition? Does it seem likely that you get your assigned seat? Or does it seem like a long shot? My initial impression has always been that it is unlikely, which is maybe why the answer always surprises me.



In [1]:

    
# load numpy (as np) and matplotlib

%pylab inline

# The fraction type makes it easier to 
# calculate exact results, and to see
# patterns in our results.

import fractions; from fractions import Fraction
def prettyprint(M) :
    print(np.array2string(M,formatter={'all':'{!s:^10}'.format}))









    



Populating the interactive namespace from numpy and matplotlib

Arrange information in a matrix

Can we arrange what we know about this problem in the form of a matrix? Maybe this is just a helpful way of recording what we know about the problem, who knows? Suppose then that we set up a matrix where the ith row and the jth column indicates the probability that passenger i sits in passenger j's seat.

What happens in a 2 passenger plane?

Sometimes we can understand what happens in a seemingly complicated problem by looking at a smaller versions of the original. And there is most likely nothing very special about the number of passengers on the plane (or that they are passengers on a plane) - we could just as well be talking about a thousand people at a football game. Whatever method of solution we come up with in all likelihood will solve all of these problems.

In the 2 passenger plane, the first person has a 1/2 chance of taking his own seat, and a 1/2 chance of taking your seat. You therefore have a 1/2 chance of getting your own seat, and 1/2 not. The matrix in question then looks like



In [2]:

    
M2=np.ones((2,2),Fraction)*Fraction(1,2)
prettyprint(M2)









    



[[   1/2        1/2    ]
 [   1/2        1/2    ]]

What happens with a three passenger plane?

Here passenger 1 is equally likely to sit in any of the three seats, so with probability 1/3 each. If passenger 2 finds his seat empty, which happens with probability 2/3, he or she takes it. If it is unoccupied, which happens with probability 1/3, then passenger 2 takes either seat 1 or seat 3 with equal (1/6) probability.

And what about you, the last passenger? There is a 1/2 chance that your seat is left open, and a 1/2 chance that seat 1 is open. There is no chance that seat 2 is open. So our matrix looks like this:



In [3]:

    
M3=np.zeros((3,3),Fraction)
M3[0,:] = Fraction(1,3)
M3[1,0] = M3[1,2] = Fraction(1,6)
M3[1,1] = Fraction(2,3)
M3[2,0] = M3[2,2] = Fraction(1,2)
prettyprint(M3)









    



[[   1/3        1/3        1/3    ]
 [   1/6        2/3        1/6    ]
 [   1/2         0         1/2    ]]

What do we see in these cases? In both cases the probability that your seat is unoccupied when you board is 1/2, which may be coincidental, or maybe not. Also, while it's obvious that the rows add up to one, notice that the columns add up to one in both cases. A moment's thought should convince you that this is always the case. Each seat is going to be occupied by exactly one passenger, no more and no less, so the events where passenger 1, 2, 3 sits in a given seat are mutually exclusive. That makes the matrix doubly stochastic, which may or may not be helpful for the problem at hand.

Another interesting observation in the three passenger example is that only passengers 1 and 2 have a chance to sit in passenger 2's seat. If one doesn't take it, then two will definitely sit there. For n passengers, the obvious generalization is that the only passengers that can sit in k's seat where k is two or greater are passengers 1,2,3,...,k. Taking this to the extreme, the last person to board is certain to either sit in his own seat or the seat of the first passenger.

Let's write a loop in python to fill in the entire matrix, and avoid having to do it by hand. numpy multicasting makes this easy to write without a lot of code.



In [4]:

    
def calculateProbabilities(n) :
    """
    Return an n x n array where the
    entry in the ith row and jth column
    is the probsbility of passenger i
    sitting in seat j.
    """
    M = np.zeros((n,n),Fraction)
    #
    # Passenger 1 lost his boarding pass, so
    # he picks a seat uniformly at random
    #
    M[0,:] = Fraction(1,n)
    for i in range(1,n) :
        #
        # here's the probability that passenger
        # i's seat is empty.  If it is empty,
        # he'll take it.
        #
        b = 1-sum(M[:i,i])
        M[i,i] = b
        #
        # Inductively, the probability that the
        # a remaining seat {0,i+1,...,n} is occupied
        # is the same for all of these n-i seats.
        #
        M[i,0]    = Fraction(1-b,n-i)
        M[i,i+1:] = Fraction(1-b,n-i)
    return M

Here's the n=100 case, which requires almost no effort to calculate.



In [5]:

    
M = calculateProbabilities(100)
prettyprint(M)









    



[[  1/100      1/100      1/100    ...,   1/100      1/100      1/100   ]
 [  1/9900     99/100     1/9900   ...,   1/9900     1/9900     1/9900  ]
 [  1/9702       0        98/99    ...,   1/9702     1/9702     1/9702  ]
 ..., 
 [   1/12        0          0      ...,    3/4        1/12       1/12   ]
 [   1/6         0          0      ...,     0         2/3        1/6    ]
 [   1/2         0          0      ...,     0          0         1/2    ]]

So now the 1/2 answer seems like way more than a coincidence. It also seems like, in general, that we might have that the probability that passenger i is in his own seat is

$$(n-i+1)/(n-i+2)$$

There seems like strong computational evidence here, but can we give a proof? Well, there seems to be some symmetry here between seats 1 and 100. Note that the entries in the first and last columns are identical



In [6]:

    
np.all(M[:,0]==M[:,-1])









    Out[6]:





True

If you prefer a picture, here is a plot of the seat probabilities, where each seat is depicted as a pie chart, indicating the probabilities that each passenger sits in that seat, here for the 9 passenger plane case.



In [7]:

    
pctfmt = lambda xx : '' if xx < 8 else '%.f%%' % (xx,)

n=9
M = calculateProbabilities(n)
fig=figure(figsize=(9,9))
for j in range(n) :
    subplot(3,3,j+1)
    #
    # Don't plot the zero size wedges
    maxj = n if (j==0 or j==n-1) else j+1
    colors = [str(1-0.5*(1-i/n)) for i in range(maxj)]
    labels = [str(i+1) for i in range(maxj)]
    patches = pie(M[:maxj,j],\
                     colors=colors,\
                     labels=labels,\
                     autopct=pctfmt)[0]
    ax=patches[0].axes
    ax.set_title('P(i occupies seat {})'.format(j+1),fontsize=12)
figure_title='Probabilities for each seat in the 9-passenger case'
fig.text(0.25,0.07,figure_title,fontsize=14);

This suggests that were we to switch seats 1 and n, no one would be the wiser. Of course, even easier than moving physical seats, we could secretly swap the boarding passes of the first and last passenger. How would that change the boarding process?

Not at all! Passenger one loses his boarding pass, so it doesn't matter which boarding pass he held. We didn't mess with the boarding passes of passengers 2 through n-1 so they do what they would normally do, and you, the last passenger, had no choice in choosing a seat anyway (there's only one seat left). So what boarding pass you hold is irrelevant.

So, as we saw earlier, you are going to end up in either seat 1 or n. And since the answer doesn't change when we swap your and the first passenger's boarding passes, your probability of sitting in seat 1 is the same as your probability of sitting in seat n, so both must be 1/2.

A slight variant on this also proves the formula $(n-i+1)/(n-i+2)$ mentioned above. Just randomly permute the boarding passes of passengers $1,i,i+1,...,n-1,$ and $n$. This won't change in any way what passengers $1$ through $i-1$ do, so passenger $i$ is equally likely to find their seat occupied regardless of which of the remaining $n-i+1$ shuffled boarding passes he or she holds.

Conclusion

So why are we (or at least I) led astray by our intuition on this problem? I think that my intuition is driven here by the assumption that there is a rather serious shuffling of passengers going on, when in reality, it is a very, very weak shuffle indeed. Think of it - if passenger 1 sits in seat k, then every one of passengers 2 through k-1 are going to get their own seats. So roughly half of the time, more than half of the passengers will get their own seats. In hindsight that doesn't seem very random at all, so maybe it is not surprising that you have such a high probability of getting your own seat.

Follow-up question

For those who want another challenge, here is one - On average, for $n$ large, how many passengers will be mis-seated?



In [ ]: