Special Continuous Distributions

"Distribution" is a generic word that, for a continuous random variable, may refer to its density or to its CDF.

We have seen some examples already:

  • $Expn(\lambda): \text{density }~\rightarrow ~ f(x) = \left\{\begin{array}{lrr} \lambda e^{-\lambda x} & if & x>0 \\ 0 && otherwise\end{array}\right.$
  • $Unif(a,b): \text{density }~\rightarrow ~ f(x) = \left\{\begin{array}{lrr} \frac{1}{b-a} & if & x \in [a,b] \\ 0 && otherwise\end{array}\right.$

Normal Distribution

  • The standard normal distribution

    $$f(x) = \frac{1}{\sqrt{2\pi}} e^{\displaystyle -\frac{x^2}{2}}$$

Notation: for $Z$ with this density we write $Z \sim \mathcal{N}(0,1)$.
The 0 stands for the fact that $\mathbf{E}[Z] = 0$.
The 1 stands for the fact that $\mathbf{Var}[Z] = \mathbf{E}[Z^2] - (\mathbf{E}[Z])^2 = 1$.

indeed:

$$\mathbf{E}[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty z\, e^{-z^2/2}\, dz = 0, \text{ because } z\, e^{-z^2/2} \text{ is an odd function integrated over an interval symmetric about } 0$$

Another way to say this mathematically:

$$\int_{-\infty}^\infty z\, e^{-z^2/2}\, dz = \int_{-\infty}^0 z\, e^{-z^2/2}\, dz + \int_{0}^\infty z\, e^{-z^2/2}\, dz = 0,$$

because the change of variables $z'=-z$ shows that the first term on the right is the opposite of the second, so the two cancel and hence $\mathbf{E}[Z]=0$.

$$\mathbf{Var}[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty z^2\,e^{-z^2/2}\, dz. \quad \text{Use integration by parts with } u=z \rightarrow du=dz \text{ and } dv = ze^{-z^2/2}\, dz \longrightarrow v = -e^{-z^2/2}.$$

Therefore, we get $$\mathbf{Var}[Z] = \frac{1}{\sqrt{2\pi}} \left( \Big[uv\Big]_{-\infty}^{\infty} - \int_{-\infty}^\infty v\,du \right) = \frac{1}{\sqrt{2\pi}} \left(\Big[-ze^{-z^2/2}\Big]_{-\infty}^\infty + \int_{-\infty}^\infty e^{-z^2/2}\, dz \right) \\ = 0 + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-z^2/2}\, dz = \frac{1}{\sqrt{2\pi}}\,\sqrt{2\pi} = 1$$

The final integral has to equal $\sqrt{2\pi}$: dividing $e^{-z^2/2}$ by $\sqrt{2\pi}$ gives exactly the standard normal density, and integrating a density over the whole line gives 1.
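
As a quick numerical sanity check (a minimal Python sketch assuming scipy is available; not needed for the derivation above), we can integrate the standard normal density and its first two moments with `scipy.integrate.quad`:

```python
# Numerical check of the facts above: the density integrates to 1,
# E[Z] = 0 and Var[Z] = E[Z^2] = 1.
import numpy as np
from scipy.integrate import quad

def phi(z):
    """Standard normal density."""
    return np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)

total, _ = quad(phi, -np.inf, np.inf)                       # ~ 1.0
mean, _  = quad(lambda z: z * phi(z), -np.inf, np.inf)      # ~ 0.0
var, _   = quad(lambda z: z**2 * phi(z), -np.inf, np.inf)   # ~ 1.0

print(total, mean, var)
```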

Definition

Let $\mu$ and $\sigma$ be constants (typically we assume $\sigma > 0$).

Let $Z\sim \mathcal{N}(0,1)$

Let $X = \mu + \sigma Z$

We say that $X$ is Normal (not standard) with parameters $\mu$ and $\sigma^2$.

Notation: $X \sim \mathcal{N}(\mu, \sigma^2)$

Moreover, $$\mathbf{E}[X] = \mu ~~~~~~ \mathbf{Var}[X] = \sigma^2$$

To prove the above, use the linearity of expectation for the mean, and $\mathbf{Var}[\mu + \sigma Z] = \sigma^2\,\mathbf{Var}[Z]$ for the variance. (Very easy!)

Moreover, we can get the density of $X$ (using the affine change of variables):

$$f_X(x) = \frac{1}{\sqrt{2\pi \sigma^2}} e^{\displaystyle -\frac{(x-\mu)^2}{2\sigma^2}}$$
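
A minimal Monte Carlo sketch of this definition (assuming numpy; the values of $\mu$ and $\sigma$ below are arbitrary illustration choices): sample $Z$, form $X = \mu + \sigma Z$, and check the mean and variance.

```python
# Check empirically that X = mu + sigma*Z has E[X] ~ mu and Var[X] ~ sigma^2.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.5, 2.0                      # arbitrary illustration values

Z = rng.standard_normal(1_000_000)        # Z ~ N(0, 1)
X = mu + sigma * Z                        # X ~ N(mu, sigma^2)

print(X.mean(), X.var())                  # ~ 1.5 and ~ 4.0 = sigma^2
```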

Theorem:

Let $Z_1,Z_2$ be two i.i.d. standard normal random variables. ($Z_1,Z_2~~i.i.d~~\mathcal{N}(0,1)$)

Let $\sigma > 0$ be a positive constant. Let $X = Z_1 + \sigma Z_2$. Then, $$X \sim \mathcal{N}(0, 1+\sigma^2)$$

Corollaries of the previous definitions and theorems

  • Let $X\sim\mathcal{N}(\mu,\sigma^2)$. Then we can standardize: $Z = \displaystyle \frac{X-\mu}{\sigma}$ is the standardized version of $X$, and $Z$ is standard normal: $Z\sim \mathcal{N}(0,1)$.

  • Let $X_1,X_2,\dots,X_n$ be independent with distributions $\mathcal{N}(\mu_1,\sigma_1^2),\dots,\mathcal{N}(\mu_n,\sigma_n^2)$. Then $X_1+X_2+\dots+X_n$ is still normal and $$X_1+X_2+\dots+X_n \sim \mathcal{N}\left(\sum_k\mu_k,\ \sum_k \sigma_k^2\right)$$

    • The proof is rather straightforward: iterate the result about $Z_1 + \sigma Z_2$ $n$ times, and use the representation $X_i = \mu_i + \sigma_i Z_i$ (a quick numerical check follows below).
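
Here is the promised numerical check (a sketch assuming numpy/scipy; the means and standard deviations are arbitrary): the sum of independent normals has the predicted mean and variance, and a Kolmogorov–Smirnov test compares it to the predicted normal law.

```python
# Monte Carlo check of the corollary: X_1 + ... + X_n is normal with
# mean sum(mu_k) and variance sum(sigma_k^2).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mus    = np.array([0.0, 2.0, -1.0])       # arbitrary means
sigmas = np.array([1.0, 0.5,  2.0])       # arbitrary standard deviations

S = sum(rng.normal(m, s, size=500_000) for m, s in zip(mus, sigmas))

print(S.mean(), S.var())                  # ~ 1.0 and ~ 5.25
target = stats.norm(mus.sum(), np.sqrt((sigmas**2).sum()))
print(stats.kstest(S, target.cdf).pvalue) # KS test against the predicted normal law
```
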
Definition (Covariance)

Let $(X_1,X_2,...X_n)$ be an n-dimensional random vector. Assume $\mathbf{Var}[X_i] < \infty$ for all $i=1,...n$.

$$\mathbf{Cov}(X_i,X_j) = \mathbf{E}[(X_i - \mathbf{E}[X_i])(X_j - \mathbf{E}[X_j])]$$

Let $C_{ij} = \mathbf{Cov}(X_i,X_j)$. This defines a matrix $C$ with size $n\times n$.

This $C$ is the covariance matrix of the random vector $(X_1,X_2,...X_n)$.

Definition/Theorem

We say that the vector $(X_1,X_2,\dots,X_n)$ is normal with mean vector $\mathbf{\mu}$ and covariance matrix $C$ if its joint density looks like the following, for $\mathbf{x} = (x_1,x_2,\dots,x_n) \in \mathbb{R}^n$:

(Sometimes $\Sigma$ is used for the covariance matrix.)

$$f(\mathbf{x}) = f(x_1,x_2,...,x_n) = \frac{1}{\sqrt{(2\pi)^n \det C}} \exp\left(-\frac{1}{2} (\mathbf{x}-\mathbf{\mu})^T C^{-1}(\mathbf{x}-\mathbf{\mu})\right)$$
  • In this expression $(\mathbf{x}-\mathbf{\mu})^TC^{-1}(\mathbf{x}-\mathbf{\mu})$, the first factor is a row vector and the last factor is a column vector, so the product is a scalar (a numerical check of the density formula follows after this list).
  • It turns out that if we fix values $a_1,a_2,...,a_n$ and we look at $X = \sum_{k=1}^n a_k X_k$ this $X$ is normal.

    Note: it also turns out that $X_1,...X_n$ are independent if and only if $C$ is diagonal.
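
A small numerical check of the density formula (a sketch assuming scipy; $\mu$, $C$, and the evaluation point $\mathbf{x}$ below are arbitrary): we compare the formula above with `scipy.stats.multivariate_normal`.

```python
# Evaluate the joint normal density both by the explicit formula and via scipy.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0])                      # arbitrary mean vector
C  = np.array([[2.0, 0.6],
               [0.6, 1.0]])                     # arbitrary covariance matrix
x  = np.array([0.5, -1.0])                      # arbitrary evaluation point

n = len(mu)
d = x - mu
by_formula = np.exp(-0.5 * d @ np.linalg.inv(C) @ d) / np.sqrt((2 * np.pi)**n * np.linalg.det(C))

print(by_formula, multivariate_normal(mean=mu, cov=C).pdf(x))   # the two agree
```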

Remark:

When $X_1,X_2$ are independent, then $\mathbf{Cov}(X_1,X_2) = 0$. The converse is false in general.

The converse is true if $(X_1,X_2)$ is jointly normal.

Remark

The previous definition/theorem also says that any linear combination of jointly normal random variables is a normal random variable.

Here "jointly normal" means that the random variables form a normal vector.

Example

Let $(X_1,X_2)$ be $\mathcal{N}(0,C)$. Here, $C = \left(\begin{array}{cc}\sigma_1^2 & \rho \\ \rho & \sigma_2^2\end{array}\right)$ is the covariance matrix.

Of course, by definition, $\rho = \mathbf{Cov}(X_1,X_2)$.

We can do the linear regression of $X_2$ with respect to $X_1$: there ought to exist a standard normal random variable $X_3$, independent of $X_1$, such that $X_2 = c\, X_1 + d\, X_3$. Let's find $c,d$.

Note: since $X_1$ and $X_3$ are independent, we are guaranteed that $X_2$ is normal. We must have $\mathbf{E}[X_2] = 0$ for all $c,d$.

We must also have $\mathbf{Var}[X_2] = \sigma_2^2 $ and also $=c^2 \sigma_1^2 + d^2$.

We must also have $\mathbf{Cov}(X_1,X_2) = \rho$, and also $\mathbf{Cov}(X_1,X_2) = \mathbf{E}[X_1 X_2] = \mathbf{E}[(c X_1 + dX_3)X_1] = c\,\mathbf{E}[X_1^2] + d\,\mathbf{E}[X_1 X_3] = c \sigma_1^2 + d\cdot 0$ (by independence).

$$\left\{\begin{array}{l} \sigma_2^2 = c^2\sigma_1^2 + d^2\\ \rho = c\sigma_1^2 \end{array}\right. \rightarrow \text{a triangular system of equations}$$

$$c = \frac{\rho}{\sigma_1^2} ~~\text{ and }~~ d = \sqrt{\sigma_2^2 - \frac{\rho^2}{\sigma_1^2}}$$

This example shows how to represent the non-independent normal pair $(X_1,X_2)$ using a normal pair $(X_1,X_3)$ which are independent.
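
A short simulation sketch of this construction (assuming numpy; $\sigma_1$, $\sigma_2$, $\rho$ are arbitrary values with $\rho^2 \le \sigma_1^2\sigma_2^2$): build $X_2 = c X_1 + d X_3$ from independent normals and check the covariance matrix.

```python
# Construct a correlated normal pair (X1, X2) from the independent pair (X1, X3).
import numpy as np

rng = np.random.default_rng(2)
sigma1, sigma2, rho = 1.0, 2.0, 0.8        # arbitrary illustration values

c = rho / sigma1**2
d = np.sqrt(sigma2**2 - rho**2 / sigma1**2)

X1 = sigma1 * rng.standard_normal(500_000) # X1 ~ N(0, sigma1^2)
X3 = rng.standard_normal(500_000)          # X3 ~ N(0, 1), independent of X1
X2 = c * X1 + d * X3

print(np.cov(X1, X2))                      # ~ [[sigma1^2, rho], [rho, sigma2^2]]
```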

Finally, we can apply the theorem.

Theorem

Let $(X_1,X_2,...X_n)\sim \mathcal{N}(\mu, C)$.

Consider $Z=(Z_1, Z_2, \dots, Z_n) = (X - \mu) \sqrt{C}^{-1}$ (with $X$ viewed as a row vector).

Then, $Z$ is a normal vector, and $Z\sim \mathcal{N}(0, I)$ $(\star)$, where $I$ is the identity matrix.

This says that we can multiply a normal vector by a specific matrix to turn it into a standard normal vector. In the notation above, $\sqrt{C}$ is a square root matrix of $C$. We mean that $\sqrt{C} \sqrt{C}^T = \sqrt{C}^T \sqrt{C}= C$

$\sqrt{C}$ exists because $C$ is positive definite. Indeed, $C$ can be diagonalized as $C = P^{-1} D P$, where $P$ is an orthogonal matrix ($P^T = P^{-1}$) and $D$ is the diagonal matrix of eigenvalues of $C$.

We can check that $(\star)$ has the correct covariance structure: $\mathbf{Cov}(Z)$ is the matrix with entries $\mathbf{Cov}(Z_i,Z_j)$, and we just want to show that $\mathbf{Cov}(Z_i,Z_j)=0$ if $i \ne j$ and $\mathbf{Cov}(Z_i,Z_j)=1$ if $i=j$.

$$\mathbf{Cov}(Z_i,Z_j) = \mathbf{Cov}\!\left(\sum_k X_k \left(\sqrt{C}^{-1}\right)_{ki},\ \sum_{k'} X_{k'} \left(\sqrt{C}^{-1}\right)_{k'j}\right) = \sum_k \sum_{k'} \left(\sqrt{C}^{-1}\right)_{ki} C_{kk'} \left(\sqrt{C}^{-1}\right)_{k'j}$$

This is the $(i,j)$ entry of the following product:

$$(\sqrt{C}^{-1})^T C (\sqrt{C}^{-1})$$

We replace $C$ by $\sqrt{C} \sqrt{C}^T = \sqrt{C}^T \sqrt{C}= C$, and we get

$$(\sqrt{C}^{-1})^T \sqrt{C} \sqrt{C}^T (\sqrt{C}^{-1}) = I$$

We also have $$\sqrt{C} = P^{-1} \sqrt{D}\, P \qquad \text{and therefore} \qquad \sqrt{C}^{-1} = P^{-1} \sqrt{D}^{-1} P$$

Conclusion:

We start with $X \sim \mathcal{N}(0,C)$. We compute $\sqrt{C}^{-1}$, for example using $\sqrt{C}^{-1} = P^{-1} \sqrt{D}^{-1} P$. We write $Z = X \sqrt{C}^{-1}$. The coordinates of this $Z$ are iid $\sim \mathcal{N}(0,1)$.

Conversely, to construct an $X\sim \mathcal{N}(0,C)$: start with $Z$ whose coordinates are iid $\mathcal{N}(0,1)$ and write $X = Z \sqrt{C}$.
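
A sketch of both directions in Python (assuming numpy; the matrix $C$ below is an arbitrary positive definite example, and samples are stored as rows, matching the row-vector convention $Z = X\sqrt{C}^{-1}$, $X = Z\sqrt{C}$):

```python
# Whitening and coloring with the symmetric square root of C built from its
# eigendecomposition C = P diag(eigvals) P^T.
import numpy as np

rng = np.random.default_rng(3)
C = np.array([[2.0, 0.8],
              [0.8, 1.0]])                          # arbitrary positive definite C

eigvals, P = np.linalg.eigh(C)                      # columns of P: orthonormal eigenvectors
sqrtC     = P @ np.diag(np.sqrt(eigvals)) @ P.T     # sqrt(C), symmetric
sqrtC_inv = P @ np.diag(1 / np.sqrt(eigvals)) @ P.T # sqrt(C)^{-1}

Z = rng.standard_normal((500_000, 2))               # rows: iid N(0, I) vectors
X = Z @ sqrtC                                       # coloring:  X ~ N(0, C)
Z_back = X @ sqrtC_inv                              # whitening: back to ~ N(0, I)

print(np.cov(X, rowvar=False))                      # ~ C
print(np.cov(Z_back, rowvar=False))                 # ~ identity
```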

Gamma Distribution

Definition

Let $X$ have density $f(x) = \left\{\begin{array}{lrr}constant ~ \frac{\displaystyle x^{\alpha - 1}}{\displaystyle \theta ^{\alpha}} \displaystyle e^{\displaystyle -\frac{x}{\theta}} &if&x>0 \\ 0 &&otherwise\end{array}\right.$

Interpretation:
  • A model for the distribution of lifetimes
  • The waiting time until the $k$th occurrence of an event in a Poisson process has a Gamma distribution

This is a Gamma random variable with parameters $\alpha>0, \theta>0$: $\theta$ is a scale parameter, and $\alpha$ is a shape parameter.

The constant can be computed by looking at the case $\theta=1$:

$$\text{constant} = \frac{1}{\displaystyle \int_0^\infty x^{\alpha-1} e^{-x}\, dx} = \frac{1}{\Gamma(\alpha)}, \qquad \text{where } \Gamma(\alpha) := \int_0^\infty x^{\alpha-1} e^{-x}\, dx \text{ is the Gamma function.}$$

Notation: $X \sim \Gamma(\alpha, \theta)$

Super important facts

  • Let $\alpha=1$: $f(x) = \frac{1}{\theta} e^{-\frac{x}{\theta}}$ for $x>0$.

    • We recognize this as $\sim Expn(\lambda = \frac{1}{\theta})$.
      What is the constant? $\text{constant} = \frac{1}{\Gamma(1)} = \frac{1}{\int_0^\infty e^{-x}\,dx} = 1$.
  • Let $\alpha=2$: Note that $\Gamma(2) = 1$, and for integer $n$: $\Gamma(n) = (n-1)!$.
    We can represent $X \sim \Gamma(\alpha=2, \theta)$ as $X=X_1 + X_2$ where $X_1,X_2$ are iid $Expn(\frac{1}{\theta})$.

More generally, if $X_1\sim \Gamma(\alpha_1,\theta)$ and $X_2\sim \Gamma(\alpha_2,\theta)$, even when $\alpha_1,\alpha_2$ are not integers, and if $X_1$ and $X_2$ are independent, then $X=X_1 + X_2 \sim \Gamma(\alpha_1+\alpha_2, \theta)$.

  • Combining the previous two points, we get the following corollary:
Corollary

Let $X_i:i=1,2,..n$ be iid $Expn(\frac{1}{\theta})$. Then, $$X=X_1+X_2+..+X_n \sim \Gamma(n, \theta)$$

This fact relates to the Poisson process as follows:

Recall that the inter-arrival times $X_1,X_2,\dots$ of a Poisson process with parameter $\lambda$ are iid random variables $\sim Expn(\lambda)$.

Therefore, the actual time $T_n$ at which the Poisson process jumps for the $n$th time (the actual $n$th arrival time) is $T_n = X_1 + X_2 + \dots + X_n$.

In particular, $T_n \sim \Gamma (n, \frac{1}{\lambda})$

A quick way to avoid confusion: $\theta$ is a time, $\lambda$ is a rate.

Algebraic relation between the $\Gamma$ and Poisson distribution:

  • $\mathbf{P}[T_n > t] = \mathbf{P}[\text{there are fewer than } n \text{ arrivals in the time interval } [0,t]] = \displaystyle\sum_{k=0}^{n-1} e^{-\lambda t} \frac{(\lambda t)^k}{k!}$ (checked numerically below)
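
A quick scipy check of this identity (a sketch; $n$, $\lambda$, and $t$ are arbitrary values):

```python
# P[T_n > t] with T_n ~ Gamma(n, 1/lambda) equals P[Poisson(lambda*t) <= n-1].
from scipy import stats

n, lam, t = 4, 2.5, 1.3                                # arbitrary illustration values

gamma_tail  = stats.gamma(a=n, scale=1 / lam).sf(t)    # P[T_n > t]
poisson_cdf = stats.poisson(mu=lam * t).cdf(n - 1)     # P[fewer than n arrivals in [0, t]]

print(gamma_tail, poisson_cdf)                         # the two numbers agree
```
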
Question: how to simulate $X\sim \Gamma (n, \theta)$?

We already know the answer. We draw $n$ independent samples $U_1,U_2,\dots,U_n$ from the $Unif[0,1]$ distribution, and let $X = -\theta \ln U_1 -\theta \ln U_2 - \dots -\theta \ln U_n$.
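
A minimal simulation sketch of this recipe (assuming numpy; $n$ and $\theta$ are arbitrary illustration values):

```python
# Simulate Gamma(n, theta) as a sum of n exponentials -theta*ln(U_i).
import numpy as np

rng = np.random.default_rng(4)
n, theta, reps = 5, 2.0, 500_000                 # arbitrary illustration values

U = rng.uniform(size=(reps, n))                  # reps x n iid Unif[0,1] draws
X = -theta * np.log(U).sum(axis=1)               # each row -> one Gamma(n, theta) sample

print(X.mean(), X.var())                         # ~ n*theta = 10 and ~ n*theta^2 = 20
```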

Let's check that at least the expectations match.

  • $\mathbf{E}[X] = ?$ We know that $\mathbf{E}[-\ln U_1] = \mathbf{E}[Expn(\lambda=1)] = 1$, so $\mathbf{E}[-\theta \ln U_i] = \theta$. Therefore, $\mathbf{E}[X] = n\theta$.

    We used the following facts for $Y\sim \Gamma(\alpha, \theta)$: $$\mathbf{E}[Y] = \alpha \theta \ \ \ \ \text{and} \ \ \ \mathbf{Var}[Y] = \alpha \theta^2$$

    So, for this $X$: $$\mathbf{E}[X] = n\theta \ \ \ \ \text{and} \ \ \ \mathbf{Var}[X] = n \theta^2$$

What about when $\alpha$ is not an integer?

There is an easy trick. Let $n=\lfloor \alpha \rfloor$ be the integer part of $\alpha$. Simulate $X \sim \Gamma(n,\theta)$ as before. Then, consider two more independent uniform random variables on $[0,1]$: $U_{n+1}, U_{n+2}$, and let $Y=X -\theta \ln (U_{n+1})\, Z$ where $Z=\left\{\begin{array}{lrr}1 &if & U_{n+2}<\alpha - n\\ 0 &if& U_{n+2} > \alpha - n\end{array}\right.$ (so the extra exponential term is added with probability $\alpha - n$).

It turns out that this $Y$ has (approximately) the Gamma distribution $\Gamma(\alpha, \theta)$; in particular, its mean is $\alpha\theta$.
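
A sketch of this recipe in Python (assuming numpy; $\alpha$ and $\theta$ are arbitrary). Here we only verify that the sample mean comes out close to $\alpha\theta$:

```python
# Non-integer alpha: Gamma(floor(alpha), theta) plus one extra exponential term
# added with probability alpha - floor(alpha), as described above.
import numpy as np

rng = np.random.default_rng(5)
alpha, theta, reps = 2.7, 1.5, 500_000                            # arbitrary values
n = int(np.floor(alpha))

X     = -theta * np.log(rng.uniform(size=(reps, n))).sum(axis=1)  # Gamma(n, theta) part
extra = -theta * np.log(rng.uniform(size=reps))                   # one more Expn(1/theta)
Z     = (rng.uniform(size=reps) < alpha - n).astype(float)        # Bernoulli(alpha - n)
Y     = X + extra * Z

print(Y.mean(), alpha * theta)                                    # the two should be close
```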


Example

Let $Z\sim \mathcal{N}(0,1)$.

Let $X = Z^2$.

What is the distribution of $X$?

Strategy: use CDF

$F_X(x) = \mathbf{P}[X \le x] = \left\{\begin{array}{lrr} 0 & if &x < 0 \\ \mathbf{P}[-\sqrt{x} \le Z \le \sqrt{x}] & if & x \ge 0\end{array}\right.$

Let $\Phi$ be the CDF of Z.

$$\mathbf{P}[-\sqrt{x} \le Z \le \sqrt{x}] = \Phi(\sqrt{x}) - \Phi( - \sqrt{x}) = 2\Phi(\sqrt{x}) - 1 \qquad (\text{since } \Phi(-t) = 1 - \Phi(t) \text{ by symmetry})$$

Now, to find the density of $X$, we take the derivative of its CDF. Note that we do not need an explicit expression for $\Phi$; we only need that it is differentiable with derivative $\displaystyle \phi(z) = \frac{d \Phi(z)}{dz} = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$.

$$f_X(x) = \frac{d}{dx}F_X(x) = 2\,\Phi'(\sqrt{x})\,\frac{d \sqrt{x}}{dx} = 2\,\frac{1}{\sqrt{2\pi}} e^{-x/2} \times \frac{1}{2\sqrt{x}} = \frac{1}{\sqrt{2\pi x}}\, e^{-x/2} \qquad (x>0, \text{ by the chain rule})$$

This is the density of the Gamma distribution with $\alpha = \frac{1}{2}$ and $\theta = 2$: plugging these into the general form $$f_X(x) = \frac{(x/\theta)^{\alpha-1}}{\Gamma(\alpha)\,\theta}\, e^{-\frac{x}{\theta}}$$ gives exactly $\frac{1}{\sqrt{2\pi x}}\, e^{-x/2}$.

We know that $\displaystyle \Gamma(\frac{1}{2}) = \sqrt{\pi}$
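
A quick scipy check of this identification (a sketch; the evaluation points are arbitrary): the derived density $\frac{1}{\sqrt{2\pi x}}e^{-x/2}$ coincides with the $\Gamma(\alpha=\tfrac12,\theta=2)$ density, and the CDF of $Z^2$ matches a Monte Carlo estimate.

```python
# Compare the density derived above with the Gamma(1/2, 2) density from scipy,
# and check the CDF of Z^2 against a Monte Carlo estimate at x = 1.
import numpy as np
from scipy import stats

x = np.linspace(0.1, 5.0, 5)                                        # arbitrary points
derived = np.exp(-x / 2) / np.sqrt(2 * np.pi * x)
print(np.allclose(derived, stats.gamma(a=0.5, scale=2).pdf(x)))     # True

rng = np.random.default_rng(6)
Z = rng.standard_normal(500_000)
print(np.mean(Z**2 <= 1.0), stats.gamma(a=0.5, scale=2).cdf(1.0))   # empirical vs exact
```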

Definition

Note that $\Gamma(\alpha=1/2, \theta=2)$ is also known as the $\chi^2$ (chi-squared) distribution with one degree of freedom. More generally, the $\chi^2$ distribution with $n$ degrees of freedom can be represented as $X = (Z_1)^2 + (Z_2)^2 + \dots + (Z_n)^2$ where the $Z_i$'s are iid $\sim \mathcal{N}(0,1)$.

We know from the fact above that the sum of independent Gamma random variables with a common $\theta$ is again Gamma; we just add the shape parameters. Therefore, the $\chi^2$ distribution with $n$ degrees of freedom is $$\Gamma\!\left(\alpha=\frac{n}{2},\ \theta=2\right)$$

One last thing: in the case $n=2$, this proves that if $Z_1$ and $Z_2$ are iid $\sim \mathcal{N}(0,1)$, then $Z_1^2 + Z_2^2 \sim \Gamma(\alpha=1, \theta=2)$.

Its density is $\text{constant} \cdot e^{-x/2}$ for $x>0$, with $\text{constant}=\frac{1}{2}$, which we recognize as the exponential distribution $\sim Expn(\lambda = \frac{1}{2})$.
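
A short scipy check of these two facts (a sketch; the evaluation points and the values of $n$ are arbitrary):

```python
# chi-squared with n degrees of freedom equals Gamma(n/2, theta=2);
# for n = 2 this is the exponential distribution with lambda = 1/2 (scale = 2).
import numpy as np
from scipy import stats

x = np.linspace(0.5, 6.0, 4)                                      # arbitrary points
for n in (1, 2, 5):
    print(n, np.allclose(stats.chi2(df=n).pdf(x),
                         stats.gamma(a=n / 2, scale=2).pdf(x)))   # True for each n

print(np.allclose(stats.chi2(df=2).pdf(x), stats.expon(scale=2).pdf(x)))  # True
```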


Example:

Let $X \sim \Gamma(\alpha, 1)$. Let $Y = \frac{1}{X}$. Find $\mathbf{E}[Y]$.

We just need to compute $\mathbf{E}[Y] = \mathbf{E}\!\left[\frac{1}{X}\right] \displaystyle = \int_{0}^\infty \frac{1}{x}\, \frac{1}{\Gamma(\alpha)}x^{\alpha-1} e^{-x}\, dx = \frac{1}{\Gamma(\alpha)} \int_0^\infty x^{\alpha-2} e^{-x}\, dx$ (the density is $0$ for $x \le 0$).

Let's name the integral as $I(\alpha)$.

If $\alpha - 2 < 0$, then $x^{\alpha-2} \to \infty$ as $x\to 0$. Therefore, let's concentrate on $x\in [0,1]$ (on $[1,\infty)$ the factor $e^{-x}$ makes the integral converge, so it creates no problem). So $I(\alpha)$ exists if and only if $\int_0^1 x^{\alpha-2}\, dx$ exists. Let's try to compute it:

$$\int_0^1 x^{\alpha-2}\, dx = \left[\frac{x^{\alpha -1}}{\alpha - 1}\right]_0^1 = \frac{1}{\alpha - 1} \quad \text{when } \alpha > 1$$

Therefore, $I(\alpha)$ exists when $\alpha > 1$.

We have proved that if $\alpha>1$, then $\mathbf{E}[Y]$ is finite (exists). We now want to prove that when $\alpha \le 1$, $\mathbf{E}[Y]$ does not exist. We just need to show $\int_0^1 x^{\alpha-2}\,dx = +\infty$ for $\alpha \le 1$.

Indeed, for $\alpha < 1$: $\displaystyle \int_\epsilon^1 x^{\alpha-2}\,dx = \frac{1 - \epsilon^{\alpha-1}}{\alpha-1} = \frac{\epsilon^{\alpha-1} - 1}{1-\alpha} \to \infty$ as $\epsilon \to 0$, since $\epsilon^{\alpha-1} \to \infty$. For $\alpha = 1$, $\displaystyle \int_\epsilon^1 \frac{dx}{x} = -\ln\epsilon \to \infty$ as well.
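
A Monte Carlo sketch of the conclusion (assuming numpy; the values of $\alpha$ are arbitrary but chosen $>1$). In fact $I(\alpha) = \Gamma(\alpha-1)$, so for $\alpha>1$ one gets the closed form $\mathbf{E}[1/X] = \Gamma(\alpha-1)/\Gamma(\alpha) = \frac{1}{\alpha-1}$; this value is not derived above and is used here only as the reference for the check.

```python
# Estimate E[1/X] for X ~ Gamma(alpha, 1) and compare with 1/(alpha - 1).
import numpy as np

rng = np.random.default_rng(7)
for alpha in (4.0, 3.0, 2.5):                         # arbitrary shape values > 1
    X = rng.gamma(shape=alpha, scale=1.0, size=1_000_000)
    print(alpha, np.mean(1 / X), 1 / (alpha - 1))     # estimate vs reference value
```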

