In the World Series, one team hosts Games 1, 2, 6 and 7, while the other team hosts Games 3, 4 and 5. When the Nationals beat the Astros last month, it marked the first time in World Series history that the home team lost all seven games. On average, the home team actually wins about 54 percent of the time in baseball. Running the numbers, you’ll quickly see that seven home losses is a once-in-a-lifetime event.
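Running the numbers for the all-seven case is a one-liner, assuming the seven games are independent and the home team loses each with probability 0.46 (a throwaway check, not part of the original write-up):

```python
# probability the home team loses all seven games of a seven-game series
p_all_seven = 0.46 ** 7
print(round(p_all_seven, 4))  # 0.0044 -> roughly 1 in 229 seven-game series
```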

But putting seven aside for a moment, what’s the probability that the home team will lose at least six consecutive games?

Solution

Let $\alpha$ represent the probability that the home team loses a game. Let $n$ represent the total number of games, and suppose we are interested in losing $k$ games in a row (consecutively). Consider the case where we have already seen $m$ consecutive losses and have $t$ more games to play, and let $f(m,t)$ represent the probability of eventually losing $k$ consecutive games. Then the probability of this happening is given by: \begin{align*} f(m,t) &= \underbrace{\alpha f(m+1,t-1)}_{\text{next game is a loss: $m$ increases to $m+1$, one less game left}} + \overbrace{(1-\alpha) f(0, t-1)}^{\text{next game is a win, so the streak resets to $0$}}\\ \end{align*} Boundary conditions: \begin{align*} f(k, t) &= 1 \quad \forall t\geq 0 \quad (\because \text{we have already achieved $k$ consecutive losses})\\ f(m, t) &= 0 \quad \forall t < k-m \quad (\because \text{too few games remain to complete the streak}) \\ \end{align*}

This gives us a dynamic programming problem. For the problem at hand, $n=7$, $k=6$, $\alpha=0.46$, and we are seeking $f(0, 7)$. But do we want to write this down?
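Actually, the recursion is only a few lines with memoization (a minimal sketch; `streak_prob` is a name introduced here, not from the write-up):

```python
from functools import lru_cache

def streak_prob(n, k, alpha):
    """Probability of at least k consecutive losses in n games,
    where each game is a loss with probability alpha."""
    @lru_cache(maxsize=None)
    def f(m, t):
        if m == k:        # streak of k already achieved
            return 1.0
        if t < k - m:     # too few games left to finish the streak
            return 0.0
        # next game: a loss extends the streak, a win resets it
        return alpha * f(m + 1, t - 1) + (1 - alpha) * f(0, t - 1)
    return f(0, n)

print(streak_prob(7, 6, 0.46))  # ≈ 0.0146
```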

Alternate approach:

We re-imagine this in a state space as follows: denote a loss by $L$ and a win by $W$. Since we are interested in a streak of consecutive losses, each state is anchored on the most recent game being a $W$. There are then $k+1$ possible states to enumerate (conditional on the fact that we still haven't seen $k$ consecutive $L$s at a stretch):

\begin{align*} S_1 &: \text{Last game $W$}\\ S_2 &: \text{Last 2 games $WL$}\\ S_3 &: \text{Last 3 games $WLL$}\\ & \vdots\\ S_k &: \text{Last $k$ games $W\underbrace{{LL\dots L}}_{k-1\text{ times}}$}\\ S_{k+1} &: \text{Last $k+1$ games $W\underbrace{{LL\dots L}}_{k \text{ times}}$}\\ \end{align*}

This can be imagined as a $(k+1)$-state Markov chain with the transition matrix given by:

$$ P = \begin{bmatrix} 1-\alpha & 1-\alpha & 1-\alpha & \cdots & 1-\alpha & 0\\ \alpha & 0 & 0 & \cdots & 0 & 0\\ 0 & \alpha & 0 & \cdots & 0 & 0\\ 0 & 0 & \alpha & \cdots & 0 & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & 0 & \cdots & \alpha & 1\\ \end{bmatrix} $$

Notice that the columns sum to one, making $P$ a left stochastic matrix. The way this happens: from $S_1$, we stay in $S_1$ by getting another $W$ (probability $1-\alpha$) and switch to $S_2$ with probability $\alpha$. Similarly, once in $S_2$, we switch back to $S_1$ on a $W$ (probability $1-\alpha$) or else move on to $S_3$ with probability $\alpha$, and this is reflected in the second column. The last column says that $S_{k+1}$ is absorbing: once we have seen $k$ consecutive losses, we stay there.
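For concreteness, here is the $k=2$ case built numerically, with $\alpha=0.46$ as above; the column sums confirm the matrix is left stochastic (a minimal sketch):

```python
import numpy as np

alpha = 0.46
k = 2
P = np.zeros((k + 1, k + 1))
P[0, 0:k] = 1 - alpha        # a win from S_1 or S_2 returns to S_1
P[k, k] = 1                  # S_3 (streak of 2) is absorbing
for i in range(1, k + 1):
    P[i, i - 1] = alpha      # a loss advances the streak by one

print(P)
print(P.sum(axis=0))  # [1. 1. 1.]
```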

We are interested in what happens over a series of $n$ games, which corresponds to $n$ jumps of this Markov chain and is described by $P^n$. We, however, are interested in $k$ or more consecutive losses, which is the probability of reaching state $S_{k+1}$ from $S_1$, given by the entry $P_{k+1,1}^n$.

Because $S_{k+1}$ is absorbing, the entry $P_{k+1,1}^n$ already covers streaks of $k$ or more losses, so no further summation over $k$ is needed. For this case, $\alpha = 1-0.54 = 0.46$, and for losing at least $k=6$ of $n=7$ games the probability is $\approx 0.0146$. (Sanity check: a run of six losses must occupy games 1–6 or games 2–7, giving $\alpha^6 + (1-\alpha)\alpha^6 = \alpha^6(2-\alpha) \approx 0.0146$.)
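Since $2^7$ is tiny, the answer can also be verified by brute-force enumeration of every win/loss sequence (a throwaway check; `brute_force` is a name introduced here):

```python
from itertools import product

def brute_force(n, k, alpha):
    """Sum the probabilities of all n-game sequences containing
    a run of at least k losses (L = home loss, W = home win)."""
    total = 0.0
    for seq in product("WL", repeat=n):
        if "L" * k in "".join(seq):       # sequence has a streak of >= k losses
            p = 1.0
            for game in seq:
                p *= alpha if game == "L" else 1 - alpha
            total += p
    return total

print(round(brute_force(7, 6, 0.46), 6))  # 0.01459
```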


In [1]:
%pylab inline
import sys
np.set_printoptions(threshold=sys.maxsize)


Populating the interactive namespace from numpy and matplotlib

In [2]:
n = 7         # number of games in the series
k = 6         # streak length we care about
alpha = 0.46  # probability the home team loses a game

P = np.zeros((k+1, k+1))
P[0, 0:k] = 1 - alpha   # a win resets the streak (back to S_1)
P[k, k] = 1             # S_{k+1} is absorbing
for i in range(1, k+1):
    P[i, i-1] = alpha   # a loss extends the streak by one

In [3]:
P
# np.sum(P, axis=0)


Out[3]:
array([[0.54, 0.54, 0.54, 0.54, 0.54, 0.54, 0.  ],
       [0.46, 0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
       [0.  , 0.46, 0.  , 0.  , 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.46, 0.  , 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.  , 0.46, 0.  , 0.  , 0.  ],
       [0.  , 0.  , 0.  , 0.  , 0.46, 0.  , 0.  ],
       [0.  , 0.  , 0.  , 0.  , 0.  , 0.46, 1.  ]])

In [4]:
Pn = np.linalg.matrix_power(P, n)
round(Pn[k, 0], 6)


Out[4]:
0.01459

$k$ or more losses

Wrapping the construction into a function: since $S_{k+1}$ is absorbing, `get_pn` already returns the probability of $k$ or more consecutive losses, so a single call suffices.


In [5]:
def get_pn(n, k, alpha):
    """Probability of at least k consecutive losses in n games."""
    P = np.zeros((k+1, k+1))
    P[0, 0:k] = 1 - alpha   # a win resets the streak
    P[k, k] = 1             # absorbing state: streak of k reached
    for i in range(1, k+1):
        P[i, i-1] = alpha   # a loss extends the streak
    Pn = np.linalg.matrix_power(P, n)
    return Pn[k, 0]

In [6]:
n = 7
alpha = 0.46
print(round(get_pn(n, 6, alpha), 6))


0.01459
