Probability and Statistics


In [13]:
import numpy as np
import matplotlib.pyplot as plt

from matplotlib.patches import Polygon

%matplotlib inline

Continuous Random Variables

A continuous random variable, often denoted $X$, is a random quantity that can take any value in an interval of real numbers.

Probability Density Functions

Any continuous random variable is associated with a probability density function (PDF), denoted $f(X)$, such that:

\begin{equation*} P(a \leq X \leq b) = \int_{a}^{b} f(X) \, dX \end{equation*}

Since a probability is between 0 and 1, the PDF must satisfy the following two restrictions:

1) The PDF must be greater than or equal to zero

\begin{equation*} f(X) \geq 0 \text{ for all } X \end{equation*}

2) The total probability must be 1

\begin{equation*} \int_{-\infty}^{\infty} f(X) \, dX = 1 \end{equation*}

Assume that we have a random variable $X$ with a probability density function:

\begin{equation*} f(X) = \frac{1}{9} e^{-X/9} \text{ for } X \geq 0, \text{ and } f(X) = 0 \text{ otherwise} \end{equation*}
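
Before using this density, it may help to confirm numerically that it meets the two restrictions listed above. The sketch below is only illustrative: the names `example_pdf` and `trapezoid` are hypothetical helpers, the density is taken to be zero for negative values, and the upper limit of 200 is an assumed cutoff standing in for infinity.

```python
import numpy as np


def example_pdf(x):
    """Density from the text: (1/9) * exp(-x/9), evaluated for x >= 0."""
    x = np.asarray(x, dtype=float)
    return np.exp(-x / 9.0) / 9.0


def trapezoid(y, x):
    """Composite trapezoidal rule, written out by hand."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x) / 2.0))


# Assumption: the tail beyond 200 is negligible, since exp(-200/9) is tiny.
grid = np.linspace(0.0, 200.0, 200001)
values = example_pdf(grid)

# Restriction 1: the density is never negative.
is_nonnegative = bool(np.all(values >= 0))

# Restriction 2: the total area under the density is (approximately) 1.
total_probability = trapezoid(values, grid)

print("non-negative:", is_nonnegative)
print("total probability: %.6f" % total_probability)
```

The trapezoidal rule is used here only because it needs nothing beyond NumPy; any numerical integrator would serve for this check.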

Q: How do we compute $P(10 \leq X \leq 20)$, the probability that $X$ is between 10 and 20?

A: We integrate the PDF from 10 to 20: $P(10 \leq X \leq 20) = \int_{10}^{20} f(X) \, dX$.
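
For this particular density the integral has a closed form, because an antiderivative of $\frac{1}{9} e^{-X/9}$ is $-e^{-X/9}$, so the probability is $e^{-10/9} - e^{-20/9}$. The sketch below evaluates that expression and compares it with a hand-written trapezoidal approximation. The names `example_pdf` and `example_cdf` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def example_pdf(x):
    """Density from the text: (1/9) * exp(-x/9), evaluated for x >= 0."""
    return np.exp(-np.asarray(x, dtype=float) / 9.0) / 9.0


def example_cdf(x):
    """Area under the density from 0 to x (for x >= 0): 1 - exp(-x/9)."""
    return 1.0 - np.exp(-x / 9.0)


a, b = 10.0, 20.0

# Closed form: difference of the antiderivative at the two limits.
exact = example_cdf(b) - example_cdf(a)

# Numerical check: composite trapezoidal rule on a fine grid over [a, b].
grid = np.linspace(a, b, 10001)
vals = example_pdf(grid)
numeric = float(np.sum((vals[1:] + vals[:-1]) * np.diff(grid) / 2.0))

print("closed form: %.4f" % exact)
print("trapezoid:   %.4f" % numeric)
```

Both values come out at about 0.22, so $X$ falls between 10 and 20 roughly 22% of the time.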


In [28]:
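def pdf(X):
    # Example density (1/9) * exp(-X/9), intended for X >= 0.
    # Float literals avoid integer division on older Python versions.
    return np.exp(-X / 9.0) / 9.0

# Evaluate the PDF on a grid for plotting
X = np.linspace(0, 60)
y = pdf(X)

fig, ax = plt.subplots(figsize=(10, 5))

ax.plot(X, y)
ax.set_ylim(0)
ax.set_xlim(0)

# Integration limits
a = 10
b = 20

# Build a polygon covering the area under the curve between a and b
ix = np.linspace(a, b)
iy = pdf(ix)
verts = [(a, 0)] + list(zip(ix, iy)) + [(b, 0)]
poly = Polygon(verts, facecolor='0.9', edgecolor='0.5')
ax.add_patch(poly)

# Label the shaded region with the integral it represents
ax.text(a+(b-a)/2, pdf(b)/2, r"$\int_{%s}^{%s} f(x)\mathrm{d}x$" % (a, b),
         horizontalalignment='center', fontsize=14)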

