Lecture 12: Discrete vs Continuous Distributions

Stat 110, Prof. Joe Blitzstein, Harvard University


Discrete vs Continuous Random Variables

So, we've completed our whirlwind introduction to discrete random variables, covering the following:

  1. Bernoulli
  2. Binomial
  3. Hypergeometric
  4. Geometric
  5. Negative Binomial
  6. Poisson

As we now move into continuous random variables, let's create a table to compare/contrast important random variable properties and concepts.

|  | Discrete | Continuous |
| --- | --- | --- |
| r.v. | $X$ | $X$ |
| PMF / PDF | PMF $P(X=x)$ | PDF $f_X(x) = F^\prime(x)$ (note $P(X=x)=0$) |
| CDF | $F_X(x)=P(X \le x)$ | $F_X(x) = P(X \le x)$ |
| expected value | $\mathbb{E}(X) = \sum_{x} x P(X=x)$ | $\mathbb{E}(X) = \int_{-\infty}^{\infty} x f(x)dx$ |
| variance | $\operatorname{Var}(X) = \mathbb{E}X^2 - \mathbb{E}(X)^2$ | $\operatorname{Var}(X) = \mathbb{E}X^2 - \mathbb{E}(X)^2$ |
| standard deviation | $SD(X) = \sqrt{\operatorname{Var}(X)}$ | $SD(X) = \sqrt{\operatorname{Var}(X)}$ |
| LOTUS | $\mathbb{E}(g(X)) = \sum_{x} g(x) P(X=x)$ | $\mathbb{E}(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)dx$ |

Probability Density Function

In the discrete case, we can calculate probability by summing over the discrete values, since each value carries a bit of probability mass: the probability of an event is the total mass of the values concerned.

But there are no discrete elements to count in the continuous case. Instead, we integrate the density function over a range; the area under the density curve over that range is the probability mass.

Definition: probability density function

A random variable $X$ has PDF $f(x)$ if for all $a \le b$

\begin{align} P(a \le X \le b) = \int_a^b f(x) dx \end{align}
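
For instance (a hypothetical example, not from the lecture), take $f(x) = 2x$ for $0 \le x \le 1$ and $f(x) = 0$ otherwise. Then

\begin{align} P\left(\tfrac{1}{4} \le X \le \tfrac{3}{4}\right) = \int_{1/4}^{3/4} 2x \, dx = \left. x^2 ~ \right\vert_{1/4}^{3/4} = \frac{9}{16} - \frac{1}{16} = \frac{1}{2} \end{align}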

Test for validity

Note that to be a valid PDF, $f$ must satisfy the following two criteria (a quick numerical check follows the list):

  1. $f(x) \ge 0$
  2. $\int_{-\infty}^{\infty} f(x) dx = 1$
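
Both criteria are easy to check numerically. Here is a minimal sketch for the hypothetical $f(x) = 2x$ example above; numpy and scipy are assumptions of the sketch, not part of the lecture.

```python
import numpy as np
from scipy import integrate

# Hypothetical PDF: f(x) = 2x on [0, 1], zero elsewhere.
def f(x):
    return 2 * x if 0 <= x <= 1 else 0.0

# Criterion 1: f(x) >= 0, spot-checked on a grid of points.
assert all(f(x) >= 0 for x in np.linspace(-1, 2, 1001))

# Criterion 2: total area under f is 1. The density is zero
# outside [0, 1], so integrating over [0, 1] suffices.
total, _ = integrate.quad(f, 0, 1)
print(total)  # 1.0
```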

Probability at a point is 0

  • $a = b \Rightarrow P(X = a) = \int_{a}^{a} f(x)dx = 0$, so a continuous r.v. has zero probability at any single point

Density $\times$ Length

But for some point $x_0$ and some very small value $\epsilon$, we can approximate the probability near $x_0$ as density $\times$ length: $f(x_0) \, \epsilon \approx P\left(X \in \left(x_0-\frac{\epsilon}{2}, x_0+\frac{\epsilon}{2}\right)\right)$
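
A quick numerical illustration of density $\times$ length (assuming scipy; the standard normal here is just a convenient stand-in for $f$):

```python
from scipy.stats import norm

x0, eps = 1.0, 1e-4
approx = norm.pdf(x0) * eps                               # density times length
exact = norm.cdf(x0 + eps / 2) - norm.cdf(x0 - eps / 2)   # exact probability mass
print(approx, exact)  # agree to many decimal places
```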


Cumulative Distribution Function

Deriving CDF from PDF

If continuous r.v. $X$ has PDF $f$, the CDF is

\begin{align} F(x) &= P(X \le x) \\ &= \int_{-\infty}^{x} f(t) dt \\ \\ \Rightarrow P(a \le X \le b) &= \int_{a}^{b} f(x)dx \\ &= F(b) - F(a) \end{align}
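
Continuing the hypothetical $f(x) = 2x$ example on $[0, 1]$:

\begin{align} F(x) = \int_{0}^{x} 2t \, dt = x^2 \quad \text{for } 0 \le x \le 1 \end{align}

so, e.g., $P\left(\tfrac{1}{4} \le X \le \tfrac{3}{4}\right) = F\left(\tfrac{3}{4}\right) - F\left(\tfrac{1}{4}\right) = \frac{9}{16} - \frac{1}{16} = \frac{1}{2}$, matching the direct integration earlier.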

Deriving PDF from CDF

If continuous r.v. $X$ has CDF $F$, the PDF is

\begin{align} f(x) &= F^\prime(x) & &\text{ by the Fundamental Theorem of Calculus} \\ \end{align}
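
Differentiating the CDF recovers the density. A sketch with sympy (an assumption of this example, not part of the lecture), using the same hypothetical CDF $F(x) = x^2$:

```python
import sympy as sp

x = sp.symbols('x')
F = x**2  # CDF of the hypothetical f(x) = 2x example, valid on [0, 1]
print(sp.diff(F, x))  # 2*x, recovering the PDF
```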

Variance

The mean only tells you where a distribution is centered. Another useful statistic is the variance of a random variable, which tells you how the random variable is spread out around its mean.

In other words, variance answers the question: how far is $X$ from its mean, on average?

Definition: variance

Variance is a measure of how a random variable is spread about its mean.

\begin{align} \operatorname{Var}(X) &= \mathbb{E}(X - \mathbb{E}X)^2 & \quad \text{or alternatively} \\ \\ &= \mathbb{E}\left(X^2 - 2X\mathbb{E}X + (\mathbb{E}X)^2\right) \\ &= \mathbb{E}X^2 - 2(\mathbb{E}X)^2 + (\mathbb{E}X)^2 & \quad \text{by Linearity}\\ &= \boxed{\mathbb{E}X^2 - \mathbb{E}(X)^2} \end{align}

Sometimes the second form of variance is easier to use.

Note that the formula for variance is the same for both discrete and continuous r.v.s.

But you might be wondering right now how to calculate $\mathbb{E}X^2$. We will get to that in a bit...
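
In the discrete case, though, both forms are already computable straight from the PMF. A minimal sketch with a fair die (a hypothetical example, assuming numpy):

```python
import numpy as np

faces = np.arange(1, 7)  # fair six-sided die
p = np.full(6, 1 / 6)    # each face has probability 1/6

mean = np.sum(faces * p)                   # E[X] = 3.5
var_def = np.sum((faces - mean) ** 2 * p)  # E[(X - E[X])^2]
var_alt = np.sum(faces**2 * p) - mean**2   # E[X^2] - (E[X])^2
print(var_def, var_alt)  # both 35/12, approximately 2.9167
```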


Standard Deviation

But note that variance is expressed in squared units. Standard deviation is sometimes easier to interpret than variance, as it is given in the original units.

Definition: standard deviation

The standard deviation is the square root of the variance.

\begin{align} SD(X) &= \sqrt{\operatorname{Var}(X)} \end{align}

Note that like variance, the formula for standard deviation is the same for both discrete and continuous r.v.


Uniform Distribution

Description

The simplest and perhaps the most famous continuous distribution. Given starting point $a$ and ending point $b$, probability is proportional to length.

Notation

$X \sim \operatorname{Unif}(a,b)$

Parameters

  • $a$, start of the segment
  • $b$, end of the segment, with $a < b$

Probability density function

\begin{align} f(x) &= \begin{cases} c & \quad \text{ if } a \le x \le b \\ 0 & \quad \text{ otherwise } \end{cases} \\ \\ \\ 1 &= \int_{a}^{b} c dx \\ \Rightarrow c &= \boxed{\frac{1}{b-a}} \end{align}

Cumulative distribution function

\begin{align} F(x) &= \int_{-\infty}^{x} f(t)dt \\ &= \int_{a}^{x} \frac{1}{b-a} dt \quad \text{for } a \le x \le b \\ &= \begin{cases} 0 & \quad \text{if } x \lt a \\ \frac{x-a}{b-a} & \quad \text{if } a \le x \le b \\ 1 & \quad \text{if } x \gt b \end{cases} \\ \end{align}

So this means that as $x$ increases from $a$ to $b$, the probability $P(X \le x)$ increases linearly.
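
A quick numerical check (assuming scipy, which parameterizes the uniform by loc $= a$ and scale $= b - a$):

```python
from scipy.stats import uniform

a, b = 2.0, 5.0  # hypothetical endpoints
# At the midpoint of the segment, the CDF should be exactly 1/2.
print(uniform.cdf(3.5, loc=a, scale=b - a))  # (3.5 - 2) / (5 - 2) = 0.5
```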

Expected value

For continuous r.v.

\begin{align} \mathbb{E}(X) &= \int_{a}^{b} x \frac{1}{b-a} dx \\ &=\left. \frac{x^2}{2(b-a)} ~~ \right\vert_{a}^{b} \\ &= \frac{(b^2-a^2)}{2(b-a)} \\ &= \boxed{\frac{b+a}{2}} \end{align}
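
A simulation sanity check of the midpoint formula (assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 2.0, 5.0  # hypothetical endpoints
x = rng.uniform(a, b, 1_000_000)
print(x.mean(), (a + b) / 2)  # ~3.5 and exactly 3.5
```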

Variance

Remember that lingering doubt about $\mathbb{E}X^2$?

Let random variable $Y = X^2$.

\begin{align} \mathbb{E}X^2 &= \mathbb{E}(Y) \\ &\stackrel{?}{=} \int_{-\infty}^{\infty} x^2 f(x) dx & &\text{wishful thinking: this uses the PDF of } X \text{, when strictly we would need the PDF of } Y \end{align}

Law of the Unconscious Statistician (LOTUS)

Actually, that last bit of wishful thinking is correct and will work in both the discrete and continuous cases.

In general for continuous r.v.

\begin{align} \mathbb{E}( g(X) ) = \int_{-\infty}^{\infty} g(x) f(x)dx \end{align}

And likewise for discrete r.v.

\begin{align} \mathbb{E}(g(X)) = \sum_{x} g(x) P(X=x) \end{align}
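
A quick check of LOTUS for $U \sim \operatorname{Unif}(0,1)$ with the hypothetical choice $g = \cos$ (assuming numpy and scipy):

```python
import numpy as np
from scipy import integrate

rng = np.random.default_rng(0)

# LOTUS: E[cos(U)] = integral of cos(x) * f(x) over [0, 1], where f(x) = 1.
lotus, _ = integrate.quad(np.cos, 0, 1)

# Monte Carlo estimate of the same expectation.
sim = np.cos(rng.uniform(0, 1, 1_000_000)).mean()
print(lotus, sim)  # both ~sin(1), approximately 0.8415
```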

Variance of $U \sim \operatorname{Unif}(0,1)$

\begin{align} \mathbb{E}(U) &= \frac{a+b}{2} \\ &= \frac{1}{2} \\ \\ \\ \mathbb{E}U^2 &= \int_{0}^{1} u^2 \underbrace{f(u)}_{1} du \\ &= \left.\frac{u^3}{3} ~~ \right\vert_{0}^{1} \\ &= \frac{1}{3} \\ \\ \\ \Rightarrow \operatorname{Var}(U) &= \mathbb{E}U^2 - \mathbb{E}(U)^2 \\ &= \frac{1}{3} - \left(\frac{1}{2}\right)^2 \\ &= \frac{1}{12} \end{align}
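
The same $\frac{1}{12}$ falls out of a quick simulation (assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(0, 1, 1_000_000)
print(np.mean(u**2) - u.mean() ** 2)  # ~1/12, approximately 0.0833
```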

Universality of the Uniform

Given an arbitrary CDF $F$ and $U \sim \operatorname{Unif}(0,1)$, it is possible to simulate a draw from the distribution with CDF $F$.

Assume:

  1. $F$ is strictly increasing
  2. $F$ is continuous as a function

If we define $X = F^{-1}(U)$, then $X \sim F$:

\begin{align} P(X \le x) &= P(F^{-1}(U) \le x) \\ &= P(U \le F(x)) & \quad \text{since } F \text{ is strictly increasing} \\ &= F(x) & \quad \text{since } P(U \le u) = u \text{ for } 0 \le u \le 1 ~~ \blacksquare \end{align}
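
A minimal simulation sketch of universality (assuming numpy), using the Expo(1) CDF $F(x) = 1 - e^{-x}$, whose inverse is $F^{-1}(u) = -\ln(1 - u)$:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(0, 1, 1_000_000)

# Inverse-CDF (universality) transform: X = F^{-1}(U) ~ Expo(1).
x = -np.log(1 - u)

# Sanity checks: Expo(1) has mean 1, and P(X <= 1) = 1 - e^{-1}, about 0.632.
print(x.mean(), np.mean(x <= 1.0))
```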