"Distribution" is a generic word: for a continuous random variable it may refer to the density or to the CDF.
We have seen some examples already:
Notation: for $Z$ with this density we write $Z \sim \mathcal{N}(0,1)$
the 0 stands for the fact that $\mathbf{E}[Z] = 0$
the 1 stands for the fact that $\mathbf{Var}[Z] = \mathbf{E}[Z^2] - (\mathbf{E}[Z])^2 = 1$
indeed:
$$\mathbf{E}[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty z e^{-z^2/2} dz = 0\text{ because } \\ z e^{-z^2/2} \text{ is an odd function which we integrate over an interval symmetric about } 0$$
Another way to say this mathematically:
$$\mathbf{E}[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty z~e^{-z^2/2} dz = \frac{1}{\sqrt{2\pi}}\left( \int_{-\infty}^0 z e^{-z^2/2} dz + \int_{0}^\infty z e^{-z^2/2} dz \right)$$ and the change of variables $z'=-z$ shows that the first term is the opposite of the second, so they cancel.
$$\mathbf{Var}[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty z^2~e^{-z^2/2} dz: \text{ use integration by parts with } \\ u=z \rightarrow du=dz \text{ and } dv = ze^{-z^2/2} dz \longrightarrow v = -e^{-z^2/2}$$
therefore, we get $$\mathbf{Var}[Z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty z^2~e^{-z^2/2} dz = \frac{1}{\sqrt{2\pi}} \left( \left. uv \right|_{-\infty}^\infty - \int_{-\infty}^\infty v\,du \right) \\ = \frac{1}{\sqrt{2\pi}} \left.\left(-ze^{-z^2/2}\right)\right|_{-\infty}^\infty + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-z^2/2}\, dz \\ = 0 + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{-z^2/2}\, dz = \frac{1}{\sqrt{2\pi}}\,\sqrt{2\pi} = 1 $$
The final term equals $1$ because $\frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ is exactly the density of a standard normal random variable, and a density integrates to $1$.
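As a quick numerical sanity check of these two computations (a sketch assuming numpy and scipy are available; not part of the proof):

```python
# Numerically integrate z*phi(z) and z^2*phi(z) to confirm E[Z] = 0 and Var[Z] = 1.
import numpy as np
from scipy.integrate import quad

phi = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)  # standard normal density

mean, _ = quad(lambda z: z * phi(z), -np.inf, np.inf)
second_moment, _ = quad(lambda z: z**2 * phi(z), -np.inf, np.inf)

print(mean)                      # ~ 0
print(second_moment - mean**2)   # ~ 1  (Var[Z] = E[Z^2] - (E[Z])^2)
```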
Let $\mu$ and $\sigma$ be constants (typically we assume $\sigma > 0$).
Let $Z\sim \mathcal{N}(0,1)$
Let $X = \mu + \sigma Z$
We say that $X$ is Normal (not standard) with parameters $\mu$ and $\sigma^2$.
Notation: $X \sim \mathcal{N}(\mu, \sigma^2)$
Moreover, $$\mathbf{E}[X] = \mu ~~~~~~ \mathbf{Var}[X] = \sigma^2$$
To prove the above, use linearity of expectation for the mean and the fact that $\mathbf{Var}[\mu + \sigma Z] = \sigma^2\,\mathbf{Var}[Z]$ for the variance. (very easy!)
Moreover, we can get the density of $X$ (using the affine change of variables $x = \mu + \sigma z$):
$$f_X(x) = \frac{1}{\sqrt{2\pi \sigma^2}} e^{\displaystyle -\frac{(x-\mu)^2}{2\sigma^2}}$$
Let $X\sim\mathcal{N}(\mu,\sigma^2)$. Then we can standardize: $Z = \displaystyle \frac{X-\mu}{\sigma}$ is the standardized version of $X$, and $Z$ is standard normal: $Z\sim \mathcal{N}(0,1)$.
Let $X_1,X_2,\ldots,X_n$ be independent with distributions $\mathcal{N}(\mu_1,\sigma_1^2),\ldots,\mathcal{N}(\mu_n,\sigma_n^2)$. Then $X_1+X_2+\cdots+X_n$ is still normal and $$X_1+X_2+\cdots+X_n \sim \mathcal{N}\left(\sum_k\mu_k,\sum_k \sigma_k^2\right)$$
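A quick Monte Carlo check of this fact (a sketch assuming numpy; the means and variances below are arbitrary choices):

```python
# The sum of independent normals should have mean sum(mu_k) and variance sum(sigma_k^2).
import numpy as np

rng = np.random.default_rng(0)
mus = np.array([1.0, -2.0, 0.5])
sigmas = np.array([1.0, 0.5, 2.0])

S = sum(rng.normal(m, s, size=1_000_000) for m, s in zip(mus, sigmas))

print(S.mean(), mus.sum())          # both ~ -0.5
print(S.var(), (sigmas**2).sum())   # both ~ 5.25
```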
Let $(X_1,X_2,...X_n)$ be an n-dimensional random vector. Assume $\mathbf{Var}[X_i] < \infty$ for all $i=1,...n$.
$$\mathbf{Cov}(X_i,X_j) = \mathbf{E}[(X_i - \mathbf{E}[X_i])(X_j - \mathbf{E}[X_j])]$$Let $C_{ij} = \mathbf{Cov}(X_i,X_j)$. This defines a matrix $C$ with size $n\times n$.
This $C$ is the covariance matrix of the random vector $(X_1,X_2,...X_n)$.
We say that the vector $(X_1,X_2,\ldots,X_n)$ is normal with mean vector $\mathbf{\mu}$ and covariance matrix $C$ if its joint density has the following form, for $\mathbf{x} = (x_1,x_2,\ldots,x_n) \in \mathbb{R}^n$:
(sometimes $\Sigma$ is used for the covariance matrix)
$$f(\mathbf{x}) = f(x_1,x_2,\ldots,x_n) = \frac{1}{\sqrt{(2\pi)^n \det C}} \exp\left(-\frac{1}{2} (\mathbf{x}-\mathbf{\mu})^T C^{-1}(\mathbf{x}-\mathbf{\mu})\right)$$It turns out that if we fix values $a_1,a_2,\ldots,a_n$ and look at $X = \sum_{k=1}^n a_k X_k$, then this $X$ is normal.
Note: it also turns out that $X_1,...X_n$ are independent if and only if $C$ is diagonal.
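As a sanity check of the joint density formula, here is a sketch comparing it with scipy's multivariate normal at one point (the matrix $C$, mean $\mathbf{\mu}$, and point $\mathbf{x}$ below are arbitrary choices):

```python
# Evaluate the hand-written density and scipy's multivariate normal pdf at the same point.
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -1.0])
C = np.array([[2.0, 0.6],
              [0.6, 1.0]])
x = np.array([0.3, 0.5])

n = len(mu)
diff = x - mu
hand = np.exp(-0.5 * diff @ np.linalg.inv(C) @ diff) / np.sqrt((2 * np.pi)**n * np.linalg.det(C))

print(hand)
print(multivariate_normal(mean=mu, cov=C).pdf(x))  # should agree with `hand`
```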
Let $(X_1,X_2)$ be $\mathcal{N}(0,C)$. Here, the covariance matrix is $C = \left(\begin{array}{cc}\sigma_1^2 & \rho \\ \rho & \sigma_2^2\end{array}\right)$.
Of course, by definition, $\rho = \mathbf{Cov}(X_1,X_2)$.
We can do the linear regression of $X_2$ with respect to $X_1$: there ought to exist a standard normal random variable $X_3$, independent of $X_1$, such that $X_2 = c ~ X_1 + d~X_3$. Let's find $c$ and $d$.
Note: since $X_1$ and $X_3$ are independent normals, we are guaranteed that $X_2$ is normal. We automatically have $\mathbf{E}[X_2] = 0$ for all $c,d$.
We must also have $\mathbf{Var}[X_2] = \sigma_2^2 $ and also $=c^2 \sigma_1^2 + d^2$.
We must also have $\mathbf{Cov}(X_1,X_2) = \rho$, and since the means are $0$, $\mathbf{Cov}(X_1,X_2) = \mathbf{E}[X_1 X_2] = \mathbf{E}[(c X_1 + dX_3)X_1] = \mathbf{E}[cX_1^2 + dX_1 X_3] = c \sigma_1^2 + d\times 0$ (by independence).
$$\left\{\begin{array}{c} \sigma_2^2 = c^2\sigma_1^2 + d^2\\\rho = c\sigma_1^2 \end{array}\right. \rightarrow \text{a triangular system of equations}$$$$c = \frac{\rho}{\sigma_1^2} ~~\text{ and }~~ d = \sqrt{\sigma_2^2 - \frac{\rho^2}{\sigma_1^2}}$$This example shows how to represent the non-independent normal pair $(X_1,X_2)$ using a normal pair $(X_1,X_3)$ whose components are independent.
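A quick simulation check of these formulas for $c$ and $d$ (a sketch assuming numpy; $\sigma_1, \sigma_2, \rho$ are arbitrary values chosen so that $C$ is positive definite):

```python
# Build X2 = c*X1 + d*X3 from independent X1, X3 and check Var[X2] and Cov(X1, X2).
import numpy as np

rng = np.random.default_rng(1)
sigma1, sigma2, rho = 1.5, 2.0, 1.2

c = rho / sigma1**2
d = np.sqrt(sigma2**2 - rho**2 / sigma1**2)

X1 = rng.normal(0.0, sigma1, size=1_000_000)
X3 = rng.normal(0.0, 1.0, size=1_000_000)
X2 = c * X1 + d * X3

print(X2.var(), sigma2**2)         # both ~ 4.0
print(np.cov(X1, X2)[0, 1], rho)   # both ~ 1.2
```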
Finally, we can apply the theorem.
Let $(X_1,X_2,...X_n)\sim \mathcal{N}(\mu, C)$.
Consider $Z=(Z_1, Z_2, \ldots, Z_n) = (X - \mathbf{\mu}) \sqrt{C}^{-1}$
Then, $Z$ is a vector which is normal, and $Z\sim \mathcal{N}(0, I)$ ($I$ represents the identity matrix)
This says that we can multiply a normal vector by a specific matrix to turn it into a standard normal vector. In the notation above, $\sqrt{C}$ is a square root matrix of $C$. We mean that $\sqrt{C} \sqrt{C}^T = \sqrt{C}^T \sqrt{C}= C$
$\sqrt{C}$ exists because $C$ is positive definite. Indeed, $C$ can be diagonalized as $C = P^{-1} D P$ for some orthogonal matrix $P$ ($P^T = P^{-1}$), where $D$ is the diagonal matrix of eigenvalues of $C$.
We can check that this $Z$ has the correct covariance structure. $\mathbf{Cov}(Z)$ is a matrix with elements $\mathbf{Cov}(Z_i,Z_j)$ and we just want to show that $\mathbf{Cov}(Z_i,Z_j)=0$ if $i \ne j$ and $\mathbf{Cov}(Z_i,Z_j)=1$ if $i=j$.
Now, $\mathbf{Cov}(Z_i,Z_j)$ is the $ij$-th component of the following product
$$(\sqrt{C}^{-1})^T C (\sqrt{C}^{-1})$$We replace $C$ by $\sqrt{C} \sqrt{C}^T = \sqrt{C}^T \sqrt{C}= C$, and we get
$$(\sqrt{C}^{-1})^T \sqrt{C} \sqrt{C}^T (\sqrt{C}^{-1}) = I$$We also have $$\sqrt{C} = P^{-1} \sqrt{D}\, P \qquad\text{and hence}\qquad \sqrt{C}^{-1} = P^{-1} \sqrt{D}^{-1} P$$
We start with $X \sim \mathcal{N}(0,C)$. We compute $\sqrt{C}^{-1}$, for example using $\sqrt{C}^{-1} = P^{-1} \sqrt{D}^{-1} P$. We write $Z = X \sqrt{C}^{-1}$. This $Z$ has iid $\mathcal{N}(0,1)$ components.
Conversely, to construct an $X\sim \mathcal{N}(0,C)$: start with a vector $Z$ of iid $\mathcal{N}(0,1)$ components and write $X = Z \sqrt{C}$.
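A sketch of this construction (assuming numpy; the matrix $C$ below is an arbitrary positive definite example). It builds $\sqrt{C}$ from the eigendecomposition, checks $\sqrt{C}\sqrt{C}^T = C$, and then generates samples $X = Z\sqrt{C}$ whose sample covariance should be close to $C$. Note that numpy's `eigh` returns the diagonalization in the form $C = P D P^T$ with orthonormal columns, which is the same decomposition written with the transpose convention.

```python
import numpy as np

rng = np.random.default_rng(2)
C = np.array([[2.0, 0.8, 0.3],
              [0.8, 1.0, 0.2],
              [0.3, 0.2, 1.5]])

# Symmetric square root of C from its eigendecomposition.
eigvals, P = np.linalg.eigh(C)                 # C = P @ diag(eigvals) @ P.T
sqrtC = P @ np.diag(np.sqrt(eigvals)) @ P.T

print(np.allclose(sqrtC @ sqrtC.T, C))         # True: sqrt(C) sqrt(C)^T = C

# Generate X ~ N(0, C) from iid standard normal row vectors Z.
Z = rng.standard_normal((1_000_000, 3))
X = Z @ sqrtC

print(np.cov(X, rowvar=False))                 # ~ C
```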
Let $X$ have density $f(x) = \left\{\begin{array}{ll}\text{constant} \cdot \frac{\displaystyle x^{\alpha - 1}}{\displaystyle \theta ^{\alpha}} \displaystyle e^{\displaystyle -\frac{x}{\theta}} & \text{if } x>0 \\ 0 & \text{otherwise}\end{array}\right.$
This is a Gamma random variable with parameters $\alpha>0, \theta>0$. $\theta$ is a scale parameter, and $\alpha$ is a shape parameter.
The constant can be computed by looking at the case $\theta=1$:
$$\text{constant} = \frac{1}{\displaystyle \int_0^\infty x^{\alpha-1} e^{-x} dx} = \frac{1}{\Gamma(\alpha)}$$where $\Gamma(\alpha) = \displaystyle\int_0^\infty x^{\alpha-1} e^{-x} dx$ is the Gamma function. Notation: $X \sim \Gamma(\alpha, \theta)$
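A quick check that with this constant the density integrates to 1 (a sketch assuming numpy and scipy; $\alpha$ and $\theta$ are arbitrary values):

```python
# The Gamma(alpha, theta) density with constant 1/Gamma(alpha) should integrate to 1.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as gamma_fn

alpha, theta = 2.7, 1.3
density = lambda x: x**(alpha - 1) / (gamma_fn(alpha) * theta**alpha) * np.exp(-x / theta)

total, _ = quad(density, 0, np.inf)
print(total)  # ~ 1.0
```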
Let $\alpha=1$: $f(x) = \frac{1}{\theta} e^{-\frac{x}{\theta}}$ for $x>0$. This is the exponential distribution $Expn(\frac{1}{\theta})$, with mean $\theta$.
Let $\alpha=2$: Note that $\Gamma(2) = 1$ and for integer $n$: $\Gamma(n) = (n-1)!$
But we can represent $X \sim \Gamma(\alpha=2, \theta)$ as $X=X_1 + X_2$ where $X_1,X_2$ are iid $Expn(\frac{1}{\theta})$.
More generally, if $X_1\sim \Gamma(\alpha_1,\theta)$ and $X_2\sim \Gamma(\alpha_2,\theta)$, even when $\alpha_1,\alpha_2$ are not integers, if $X_1$ and $X_2$ are independent then $X=X_1 + X_2 \sim \Gamma(\alpha_1+\alpha_2, \theta)$.
Let $X_i:i=1,2,..n$ be iid $Expn(\frac{1}{\theta})$. Then, $$X=X_1+X_2+..+X_n \sim \Gamma(n, \theta)$$
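Before relating this to the Poisson process, a quick simulation check of this fact (a sketch assuming numpy and scipy; $n$ and $\theta$ are arbitrary values). scipy parameterizes the Gamma distribution as `gamma(a=alpha, scale=theta)`.

```python
# The sum of n iid exponentials with mean theta should match Gamma(n, theta);
# a Kolmogorov-Smirnov test against that distribution should not reject.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, theta = 5, 2.0

X = rng.exponential(scale=theta, size=(100_000, n)).sum(axis=1)
print(stats.kstest(X, stats.gamma(a=n, scale=theta).cdf))  # p-value should not be tiny
```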
This fact relates to the Poisson process as follows:
Recall that the inter-arrival times $X_1,X_2,\ldots$ of a Poisson process with parameter $\lambda$ are iid random variables with distribution $Exp(\lambda)$.
Therefore, the actual time $T_n$ at which the Poisson process jumps for the $n$th time (the actual $n$th arrival time) is $T_n = X_1 + X_2 + \cdots + X_n$.
In particular, $T_n \sim \Gamma (n, \frac{1}{\lambda})$
A quick way to avoid confusion: $\theta$ is a time (a scale), $\lambda$ is a rate.
Algebraic relation between the $\Gamma$ and Poisson distribution:
Let's check that at least the expectations match. Here $X$ is the simulated value $X = -\theta(\ln U_1 + \ln U_2 + \cdots + \ln U_n)$, where $U_1,\ldots,U_n$ are iid uniform on $[0,1]$.
$\mathbf{E}[X] = ?$ We know that $-\ln U_1 \sim Expn(\lambda=1)$, so $\mathbf{E}[-\ln U_1] = 1$. Therefore, $\mathbf{E}[X] = n\theta$.
We used the following for $Y\sim \Gamma(\alpha, \theta) $: $$\mathbf{E}[Y] = \alpha \theta \ \ \ \ \text{and} \ \ \ \mathbf{Var}[Y] = \alpha \theta^2$$
So, for this $X$: $$\mathbf{E}[X] = n\theta \ \ \ \ \text{and} \ \ \ \mathbf{Var}[X] = n \theta^2$$
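A sketch of this simulation and the moment check (assuming numpy; $n$ and $\theta$ are arbitrary values):

```python
# Simulate X = -theta * (ln U_1 + ... + ln U_n) and check E[X] = n*theta, Var[X] = n*theta^2.
import numpy as np

rng = np.random.default_rng(4)
n, theta = 4, 1.5

U = rng.uniform(size=(1_000_000, n))
X = -theta * np.log(U).sum(axis=1)

print(X.mean(), n * theta)     # both ~ 6.0
print(X.var(), n * theta**2)   # both ~ 9.0
```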
What about when $\alpha$ is not an integer?
There is an easy trick. Let $n=\lfloor \alpha \rfloor$ be the integer part of $\alpha$. Simulate $X \sim \Gamma(n,\theta)$ as before. Then, consider two more independent uniform random variables on $[0,1]$: $U_{n+1}, U_{n+2}$. Then, let $Y=X -\theta (\ln U_{n+1})\, Z$ where $Z=\left\{\begin{array}{ll}1 & \text{if } U_{n+2}<\alpha - n\\ 0 & \text{if } U_{n+2} > \alpha - n\end{array}\right.$
It turns out that this $Y$ has the Gamma distribution: $Y \sim \Gamma (\alpha, \theta)$.
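A minimal sketch of this trick (assuming numpy; $\alpha$ and $\theta$ are arbitrary values), checking at least that the mean comes out close to $\alpha\theta$:

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, theta = 2.6, 1.0
n = int(np.floor(alpha))
size = 1_000_000

X = -theta * np.log(rng.uniform(size=(size, n))).sum(axis=1)   # Gamma(n, theta) as before
extra = -theta * np.log(rng.uniform(size=size))                # one more exponential, from U_{n+1}
Z = (rng.uniform(size=size) < alpha - n).astype(float)         # Z = 1 if U_{n+2} < alpha - n

Y = X + extra * Z
print(Y.mean(), alpha * theta)  # both ~ 2.6
```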
Let $Z\sim \mathcal{N}(0,1)$.
Let $X = Z^2$.
What is the distribution of $X$?
Strategy: use CDF
$F_X(x) = \mathbf{P}[X \le x] = \left\{\begin{array}{ll} 0 & \text{if } x < 0 \\ \mathbf{P}[-\sqrt{x} \le Z \le \sqrt{x}] & \text{if } x \ge 0\end{array}\right.$
Let $\Phi$ be the CDF of Z.
$$\mathbf{P}[-\sqrt{x} \le Z \le \sqrt{x}] = \Phi(\sqrt{x}) - \Phi( - \sqrt{x}) = 2\Phi(\sqrt{x}) - 1$$Now, to find the density of $X$, we take the derivative of its CDF. Note that we do not need $\Phi$ explicitly; we only need that it is differentiable with derivative $\displaystyle \phi(z) = \frac{d \Phi(z)}{dz} = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$.
$$\frac{d}{dx}(F_X(x)) = \text{(using the chain rule)} = 2\frac{d \Phi}{dz}\Big(z=\sqrt{x}\Big) \cdot \frac{d \sqrt{x}}{dx} \\ = 2\frac{1}{\sqrt{2\pi}} e^{-x/2} \times \frac{1}{2\sqrt{x}} = \frac{1}{\sqrt{2\pi x}} e^{-x/2}$$This is the density of the Gamma distribution with $\alpha = \frac{1}{2}$ and $\theta = 2$. Indeed, it matches $$f_X(x) = \frac{(x/\theta)^{\alpha-1}}{\theta\,\Gamma(\alpha)} e^{-\frac{x}{\theta}}$$
We know that $\displaystyle \Gamma(\frac{1}{2}) = \sqrt{\pi}$
Note that $\Gamma(\alpha=1/2, \theta=2)$ is also known as the $\chi^2$ (chi-squared) distribution with one degree of freedom. More generally, the $\chi^2$ distribution with $n$ degrees of freedom can be represented as $X = (Z_1)^2 + (Z_2)^2 + ... + (Z_n)^2$ where the $Z_i$'s are iid $\sim \mathcal{N}(0,1)$.
We know from property 3 that the sum of independent Gamma random variables with a common $\theta$ is again Gamma; we just add the shape parameters. Therefore, the $\chi^2$ distribution with $n$ degrees of freedom is $$\Gamma\left(\alpha=\frac{n}{2}, \theta=2\right)$$
One last thing: in the case $n=2$, this proves that if $Z_1$ and $Z_2$ are iid $\sim \mathcal{N}(0,1)$, then $Z_1^2 + Z_2^2 \sim \Gamma(\alpha=1, \theta=2)$.
Its density is $\text{constant} \cdot e^{-x/2}$ with $\text{constant}=\frac{1}{2}$. We recognize this as the exponential distribution: $Expn(\lambda = \frac{1}{2})$.
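A simulation check of both statements (a sketch assuming numpy and scipy): $Z^2$ against $\Gamma(1/2, \theta=2)$, and $Z_1^2+Z_2^2$ against $Expn(\lambda=1/2)$, i.e. mean 2.

```python
# Kolmogorov-Smirnov tests: neither should reject.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
Z = rng.standard_normal((100_000, 2))

print(stats.kstest(Z[:, 0]**2, stats.gamma(a=0.5, scale=2).cdf))    # Z^2 vs Gamma(1/2, 2)
print(stats.kstest((Z**2).sum(axis=1), stats.expon(scale=2).cdf))   # Z1^2 + Z2^2 vs Exp(mean 2)
```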
Let $X \sim \Gamma(\alpha, 1)$. Let $Y = \frac{1}{X}$. Find $\mathbf{E}[Y]$.
We just need to compute $\mathbf{E}[Y] = \mathbf{E}\left[\frac{1}{X}\right] \displaystyle = \int_{0}^\infty \frac{1}{x}\, \frac{1}{\Gamma(\alpha)}x^{\alpha-1} e^{-x} dx = \frac{1}{\Gamma(\alpha)} \int_0^\infty x^{\alpha-2} e^{-x} dx$
Let's name the integral as $I(\alpha)$.
If $\alpha - 2 < 0$, then $x^{\alpha-2}$ blows up as $x\to 0$. Therefore, let's concentrate on $x\in (0,1]$; on this interval the exponential factor $e^{-x}$ does not create any problem, and the integral over $[1,\infty)$ always converges. So, $I(\alpha)$ exists if and only if $\int_0^1 x^{\alpha-2} dx$ exists. Let's try to compute it:
$$\int_0^1 x^{\alpha-2} dx = \left.\frac{x^{\alpha -1}}{\alpha - 1}\right|_0^1 = \frac{1}{\alpha - 1} \quad \text{when } \alpha > 1$$Therefore, $I(\alpha)$ exists when $\alpha > 1$; in fact $I(\alpha) = \Gamma(\alpha-1)$, so for $\alpha>1$ we get $\mathbf{E}[Y] = \frac{\Gamma(\alpha-1)}{\Gamma(\alpha)} = \frac{1}{\alpha-1}$.
We have proved that if $\alpha>1$, then $\mathbf{E}[Y]$ is finite (exists). We want to prove that when $\alpha \le 1$, $\mathbf{E}[Y]$ does not exist. We just need to show that $\int_0^1 x^{\alpha-2}dx = +\infty$ for $\alpha \le 1$.
Indeed, for $\alpha < 1$: $\displaystyle \int_\epsilon^1 x^{\alpha-2}dx = \frac{1}{\alpha-1} \left(1 - \epsilon^{\alpha-1}\right) = \frac{\epsilon^{\alpha-1} - 1}{1 - \alpha}$, and when $\epsilon \to 0$ we have $\epsilon^{\alpha-1} \to \infty$, so the integral diverges. (For $\alpha = 1$ the integral is $-\ln \epsilon$, which also diverges.)
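A numerical check of the value $\mathbf{E}[1/X] = \frac{1}{\alpha-1}$ for $\alpha>1$ (a sketch assuming numpy and scipy; $\alpha$ is an arbitrary value):

```python
# Integrate (1/x) times the Gamma(alpha, 1) density and compare with 1/(alpha - 1).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as gamma_fn

alpha = 2.5
val, _ = quad(lambda x: (1 / x) * x**(alpha - 1) * np.exp(-x) / gamma_fn(alpha), 0, np.inf)
print(val, 1 / (alpha - 1))  # both ~ 0.6667
```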