Lecture 16: Exponential Distribution, Memoryless Property

Stat 110, Prof. Joe Blitzstein, Harvard University


Exponential Distribution

Description

Continuous distribution on the nonnegative reals describing wait times and survival times. Rate parameter $\lambda \gt 0$.

Continuous analog of the geometric distribution.

It is the only continuous distribution with the memoryless property.


In [1]:
import numpy as np
import matplotlib.pyplot as plt

from matplotlib.ticker import MultipleLocator, FormatStrFormatter
from scipy.stats import expon

%matplotlib inline

plt.xkcd()

_, ax = plt.subplots(figsize=(12,8))


# some Exponential rate parameters
lambdas = [2.0, 1.5, 1.0, 0.5]

# qualitative color scheme
colors = ['#66c2a5', '#fc8d62', '#8da0cb', '#e78ac3']

x = np.linspace(0, 4, 500)
for i,l in enumerate(lambdas):
    pdf = expon.pdf(x, scale=1/l)
    ax.plot(x, pdf, color=colors[i], lw=3.2, label=r'$\lambda = {}$'.format(l))

# legend styling
legend = ax.legend()
for label in legend.get_texts():
    label.set_fontsize('large')
for label in legend.get_lines():
    label.set_linewidth(1.5)

# y-axis
ax.set_ylim([-0.01, 2.0])
ax.set_ylabel(r'$f(x)$')
ax.set_yticks(np.arange(0,2.1,.2))

# x-axis
ax.set_xlim([0.0, 3.0])
ax.set_xlabel(r'$x$')

# x-axis tick formatting
majorLocator = MultipleLocator(1)
majorFormatter = FormatStrFormatter('%d')
minorLocator = MultipleLocator(1)
ax.xaxis.set_major_locator(majorLocator)
ax.xaxis.set_major_formatter(majorFormatter)
ax.xaxis.set_minor_locator(minorLocator)

ax.grid(color='grey', linestyle='-', linewidth=0.3)

plt.suptitle(r'Exponential PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \geq 0$')

plt.show()


Notation

$X \sim \operatorname{Expo}(\lambda)$

Parameters

$\lambda$ - rate parameter where $\lambda \gt 0$

Probability density function

\begin{align} f(x) &= \begin{cases} \lambda e^{-\lambda x}, &\text{ if } x \ge 0 \\ 0, &\text{ otherwise } \end{cases} \end{align}

Cumulative distribution function

\begin{align} F(x) &= \int_{0}^{x} \lambda e^{-\lambda t} \, dt \\ &= \lambda \int_{0}^{x} e^{-\lambda t} \, dt &\text{ let } u = -\lambda t \text{, } du = -\lambda \, dt \\ &= \int - e^{u} \, du \\ &= - e^{u} \\ &= \left. - e^{-\lambda t} \right|_{0}^{x} \\ &= \boxed{ 1 - e^{-\lambda x} } \end{align}
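
As a quick sanity check on the boxed result, the sketch below compares $1 - e^{-\lambda x}$ against `scipy.stats.expon.cdf`, which takes `scale = 1/λ` rather than the rate itself. The rate `lam` and the grid `xs` are arbitrary illustrative choices, not values from the lecture.

```python
import numpy as np
from scipy.stats import expon

lam = 1.5                        # illustrative rate (assumption)
xs = np.linspace(0.0, 5.0, 11)   # arbitrary evaluation points, all >= 0

# closed-form CDF derived above
closed_form = 1.0 - np.exp(-lam * xs)

# SciPy parameterizes Expo(lambda) through scale = 1 / lambda
scipy_cdf = expon.cdf(xs, scale=1.0 / lam)

max_abs_diff = np.max(np.abs(closed_form - scipy_cdf))
print(max_abs_diff)
```

The two should agree to within floating-point rounding.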

Standardized Exponential Distribution

If we let $Y = \lambda X$, then $Y \sim \operatorname{Expo}(1)$.

You can compare this with the standardized Normal.

Proof

\begin{align} P(Y \le y) &= P\left(X \le \frac{y}{\lambda} \right) \\ &= 1 - e^{-\lambda \frac{y}{\lambda}} &\text{ just plugging } \frac{y}{\lambda} \text{ into the CDF above} \\ &= 1 - e^{-y} &\text{ which is the CDF of } \operatorname{Expo}(1) ~~~~ \blacksquare \\ \end{align}
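
The scaling result can also be checked by simulation. This is only a rough sketch: the rate `lam`, the seed, and the sample size are arbitrary choices, and the Kolmogorov-Smirnov statistic is used here simply as a convenient distance between the empirical distribution of $Y = \lambda X$ and the $\operatorname{Expo}(1)$ CDF.

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(110)   # arbitrary seed for reproducibility
lam = 2.5                          # illustrative rate (assumption)

# draw X ~ Expo(lam); NumPy also uses scale = 1 / lambda
x = rng.exponential(scale=1.0 / lam, size=100_000)

# rescale: Y = lam * X should behave like Expo(1)
y = lam * x

# distance between the empirical CDF of Y and the Expo(1) CDF
ks_stat = kstest(y, "expon").statistic
print(y.mean(), ks_stat)
```

A sample mean near 1 and a small KS statistic are consistent with $Y \sim \operatorname{Expo}(1)$.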

We will next find the mean and variance of $\operatorname{Expo}(1)$, and then derive the general case mean and variance afterwards.

Mean and variance of $\operatorname{Expo}(1)$

Let $Y \sim \operatorname{Expo}(1)$, find $\mathbb{E}(Y)$ and $\operatorname{Var}(Y)$.

\begin{align} \mathbb{E}(Y) &= \int_{0}^{\infty} y\,e^{-y}\,dy & &\text{ let } u = y \text{, } du = dy \\ & & &\text{ and let } dv = e^{-y}\,dy \text{, } v = -e^{-y} \\ &= \underbrace{ \left. -y\,e^{-y} \right\vert_{0}^{\infty}}_{\text{evaluates to }0} + \underbrace{\int_{0}^{\infty} e^{-y}\,dy}_{\text{integral of the PDF of }\operatorname{Expo}(1)} \\ &= \boxed{1} \\ \\\\ \operatorname{Var}(Y) &= \mathbb{E}(Y^2) - (\mathbb{E}Y)^2 \\ &= \int_{0}^{\infty} y^{2}\,e^{-y}\,dy \,- 1^2 & &\text{ let } u = y^{2} \text{, } du = 2y\,dy \\ & & &\text{ and let } dv = e^{-y}\,dy \text{, } v = -e^{-y} \\ &= \left. -y^{2}\,e^{-y} \right\vert_{0}^{\infty} + \int_{0}^{\infty} 2y\,e^{-y}\,dy \,-1 \\ &= 0 + 2\,\mathbb{E}(Y) - 1 \\ &= 0 + 2 - 1 \\ &= \boxed{1} \end{align}
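
The two integrals above can be double-checked numerically. The sketch below uses `scipy.integrate.quad` as a stand-in for the integration by parts; it is a verification aid, not part of the derivation.

```python
import numpy as np
from scipy.integrate import quad

# E(Y) = integral of y * e^{-y} over [0, inf) for Y ~ Expo(1)
mean_y, _ = quad(lambda y: y * np.exp(-y), 0, np.inf)

# E(Y^2) = integral of y^2 * e^{-y} over [0, inf)
second_moment, _ = quad(lambda y: y**2 * np.exp(-y), 0, np.inf)

# Var(Y) = E(Y^2) - (E Y)^2
var_y = second_moment - mean_y**2
print(mean_y, second_moment, var_y)
```

Up to numerical integration error, this should give a mean of 1, a second moment of 2, and a variance of 1.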

Mean and variance of $\operatorname{Expo}(\lambda)$

We can derive the mean and variance of $\operatorname{Expo}(\lambda)$ from that of $\operatorname{Expo}(1)$.

From $Y = \lambda X$ we have $X = \frac{Y}{\lambda}$.

\begin{align} \mathbb{E}(X) &= \mathbb{E}\left(\frac{Y}{\lambda}\right) \\ &= \frac{1}{\lambda} \, \mathbb{E}(Y) \\ &= \boxed{ \frac{1}{\lambda} } \\ \\\\ \operatorname{Var}(X) &= \operatorname{Var}\left(\frac{Y}{\lambda}\right) \\ &= \frac{1}{\lambda^2} \, \operatorname{Var}(Y) \\ &= \boxed{ \frac{1}{\lambda^2} } \\ \end{align}
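
For comparison, SciPy reports the same mean and variance. The sketch below reuses the rates from the plot at the top of these notes and checks `expon.mean` and `expon.var` against $1/\lambda$ and $1/\lambda^2$; remember that SciPy takes `scale = 1/λ`.

```python
from scipy.stats import expon

lambdas = [2.0, 1.5, 1.0, 0.5]   # same rates as in the PDF plot

# (rate, mean, variance) as computed by SciPy
results = [
    (lam, expon.mean(scale=1.0 / lam), expon.var(scale=1.0 / lam))
    for lam in lambdas
]

for lam, mean, var in results:
    # compare SciPy's values with 1/lambda and 1/lambda^2
    print(lam, mean, 1.0 / lam, var, 1.0 / lam**2)
```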

Memorylessness Property

If you have a random variable representing a wait-time (continuous), the memorylessness property means that no matter how long you have already waited, the probability that you will have to wait an additional time $t$ is the same as if you were starting fresh from 0 (irrespective of the time you already spent waiting).

Fact: $\operatorname{Expo}(\lambda)$ is the only continuous distribution with the memorylessness property.

\begin{align} P(X \ge s+t | X \ge s) &= P(X \ge t) \\ \end{align}

The survival function of $X$ is $P(X \ge s)$, viewed as a function of $s$: the probability that some object of interest lasts longer than a continuous time $s$. The name comes from reading $X$ as a lifetime (how long something might live or exist) rather than as a waiting time.

\begin{align} P(X \ge s) &= 1 - P(X \le s) \\ &= 1 - (1 - e^{-\lambda s}) \\ &= e^{-\lambda s} & \quad \text{ the survival function} \end{align}

Plugging this survival function into the definition of conditional probability, we have:

\begin{align} P(X \ge s+t | X \ge s) &= \frac{P(X \ge s+t \text{, }X \ge s)}{P(X \ge s)} \\ &= \frac{P(X \ge s+t)}{P(X \ge s)} & \quad \text{ since } X \ge s+t \text{ implies } X \ge s \text{, so } X \ge s \text{ is redundant} \\ &= \frac{e^{-\lambda (s+t)}}{e^{-\lambda s}} & \quad \text{ ratio of survival functions} \\ &= e^{-\lambda t} \\ &= P(X \ge t) & \quad \blacksquare \end{align}
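
The same ratio of survival functions can be evaluated numerically with `scipy.stats.expon.sf`. The values of `lam`, `s`, and `t` below are arbitrary illustrative choices.

```python
import numpy as np
from scipy.stats import expon

lam, s, t = 0.5, 3.0, 2.0          # illustrative values (assumptions)
dist = expon(scale=1.0 / lam)      # SciPy uses scale = 1 / lambda

# P(X >= s + t | X >= s) as a ratio of survival functions
conditional = dist.sf(s + t) / dist.sf(s)

# P(X >= t), starting fresh from 0
unconditional = dist.sf(t)

print(conditional, unconditional, np.exp(-lam * t))
```

All three printed values should coincide: having already waited `s` does not change the probability of waiting an additional `t`.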

Useful corollary of the Memorylessness Property

This is a brief introduction to conditional expectation, which is just like conditional probability.

Given $X \sim \operatorname{Expo}(\lambda)$, what is the expected wait-time if we have already waited for some time $a$?

\begin{align} \mathbb{E}(X | X \gt a) &= a + \mathbb{E}(X - a|X \gt a) & \quad \text{ by linearity} \\ &= a + \frac{1}{\lambda} & \quad \text{ by the memorylessness property} \end{align}
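
A rough Monte Carlo check of this corollary: draw many $\operatorname{Expo}(\lambda)$ samples, keep only those exceeding $a$, and average them. The rate `lam`, the threshold `a`, the seed, and the sample size are all arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(110)   # arbitrary seed for reproducibility
lam, a = 0.5, 3.0                  # illustrative rate and elapsed time

# draw X ~ Expo(lam); NumPy uses scale = 1 / lambda
x = rng.exponential(scale=1.0 / lam, size=1_000_000)

# condition on the event X > a by keeping only those samples
survivors = x[x > a]

cond_mean = survivors.mean()       # estimate of E(X | X > a)
expected = a + 1.0 / lam           # a + 1/lambda from the corollary
print(cond_mean, expected)
```

The estimate should land close to $a + 1/\lambda$, with the gap shrinking as the sample size grows.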