... and with the standard normal distribution under our belts, we can now turn to the more general form.
But first let's revisit variance once more and extend what we know.
As a case in point for (4), consider
\begin{align} \operatorname{Var}(X + X) &= \operatorname{Var}(2X) \\ &= 4 ~~ \operatorname{Var}(X) \\ &\neq 2 ~~ \operatorname{Var}(X) & \quad \blacksquare \\ \end{align}
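As a quick numerical sanity check (not part of the original notes), here is a minimal NumPy sketch; the standard normal choice for $X$ and the sample size are arbitrary:

```python
import numpy as np

# Minimal sanity check: Var(X + X) = Var(2X) = 4 Var(X), not 2 Var(X).
# The distribution of X and the sample size are arbitrary choices.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1_000_000)

print(np.var(x + x))   # ~4: X + X is the same draw added to itself, i.e. 2X
print(np.var(2 * x))   # ~4
print(2 * np.var(x))   # ~2: what we would get for X plus an *independent* copy
```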
Now we know enough about variance to return to the general form of the normal distribution.
Let $X = \mu + \sigma \mathcal{Z}$, where $\mathcal{Z} \sim \mathcal{N}(0,1)$, $\mu \in \mathbb{R}$ is the location parameter, and $\sigma > 0$ is the scale parameter.
Then $X \sim \mathcal{N}(\mu, \sigma^2)$.
By linearity, and since $\mathbb{E}(\mathcal{Z}) = 0$, it follows immediately that
\begin{align} \mathbb{E}(X) &= \mu \end{align}From what we know about variance,
\begin{align} \operatorname{Var}(\mu + \sigma \mathcal{Z}) &= \sigma^2 ~~ \operatorname{Var}(\mathcal{Z}) \\ &= \sigma^2 \end{align}Solving for $\mathcal{Z}$, we have
\begin{align} \mathcal{Z} &= \frac{X - \mu}{\sigma} \end{align}Standardizing in this way lets us express both the cdf and the pdf of a general normal distribution in terms of $\Phi$.
Given $X \sim \mathcal{N}(\mu, \sigma^2)$, we have
\begin{align} \text{cdf} ~~ P(X \le x) &= P\left(\frac{X-\mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) \\ &= \Phi \left(\frac{x-\mu}{\sigma} \right) \\ \\ \Rightarrow \text{pdf} ~~ f_X(x) = \frac{d}{dx} \Phi \left(\frac{x-\mu}{\sigma} \right) &= \frac{1}{\sigma} ~~ \frac{1}{\sqrt{2\pi}} ~~ e^{-\frac{\left(\frac{x-\mu}{\sigma}\right)^2}{2}} & \quad \text{by the chain rule} \end{align}We can also handle $-X$ by applying what we've just covered.
\begin{align} -X &= -\mu + \sigma (-\mathcal{Z}) \sim \mathcal{N}(-\mu, \sigma^2) & \quad \text{since } -\mathcal{Z} \sim \mathcal{N}(0,1) \text{ by symmetry} \end{align}Later we will show that if $X_j \sim \mathcal{N}(\mu_j, \sigma_j^2)$ are independent (consider $j \in \{1,2\}$), then $X_1 + X_2 \sim \mathcal{N}(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.
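To make the standardization concrete, here is a small sketch using `scipy.stats.norm`; the values of $\mu$, $\sigma$, and $x$ are arbitrary choices, not from the notes:

```python
from scipy.stats import norm

mu, sigma, x = 2.0, 3.0, 4.5          # arbitrary illustrative values
z = (x - mu) / sigma

# cdf: P(X <= x) should equal Phi((x - mu) / sigma)
print(norm.cdf(x, loc=mu, scale=sigma), norm.cdf(z))

# pdf: f_X(x) should equal (1/sigma) * Phi'((x - mu) / sigma)
print(norm.pdf(x, loc=mu, scale=sigma), norm.pdf(z) / sigma)
```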
Since $\Phi$ cannot be expressed in closed form in terms of elementary functions, it is worth memorizing the 68-95-99.7% Rule.
If $X \sim \mathcal{N}(\mu, \sigma^2)$, then as a rule of thumb the probability that $X$ falls within $1$, $2$, or $3$ standard deviations of $\mu$ is approximately:
\begin{align} P(\lvert X-\mu \rvert &\le \sigma) \approx 0.68 \\ P(\lvert X-\mu \rvert &\le 2 \sigma) \approx 0.95 \\ P(\lvert X-\mu \rvert &\le 3 \sigma) \approx 0.997 \end{align}
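These constants are easy to recover numerically; a minimal sketch, assuming SciPy is available:

```python
from scipy.stats import norm

# By standardization, P(|X - mu| <= k*sigma) = Phi(k) - Phi(-k) for any mu, sigma.
for k in (1, 2, 3):
    print(k, norm.cdf(k) - norm.cdf(-k))   # ~0.6827, ~0.9545, ~0.9973
```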
Now suppose $X$ takes the values $0, 1, 2, 3, \dots$ with probabilities $P_0, P_1, P_2, P_3, \dots$: $\newcommand\T{\Rule{0pt}{1em}{.3em}}$ \begin{array}{|c|c|} \hline Prob \T & P_0 & P_1 & P_2 & P_3 & \dots \\\hline X \T & 0 & 1 & 2 & 3 & \dots \\\hline X^2 \T & 0^2 & 1^2 & 2^2 & 3^2 & \dots \\\hline \end{array}
Since $X^2$ is a function of $X$, the event $X = x$ is the same as the event $X^2 = x^2$, so the probabilities in the first row serve for both $X$ and $X^2$. That means we should be able to do this:
\begin{align} \mathbb{E}(X) &= \sum_x x ~ P(X=x) \\ \mathbb{E}(X^2) &= \sum_x x^2 ~ P(X=x) \\ \end{align}The second equation is an instance of LOTUS (the law of the unconscious statistician), which we prove at the end of this section. As a first application, let $X \sim \operatorname{Pois}(\lambda)$.
Recall that $\operatorname{Var}(X) = \mathbb{E}X^2 - (\mathbb{E}X)^2$. We know that $\mathbb{E}(X) = \lambda$, so all we need to do is figure out what $\mathbb{E}(X^2)$ is.
\begin{align} \mathbb{E}(X^2) &= \sum_{k=0}^{\infty} k^2 ~ \frac{e^{-\lambda} \lambda^k}{k!} \\ \\ \text{recall that } \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} &= e^{\lambda} & \quad \text{Taylor series for } e^x \\ \\ \sum_{k=1}^{\infty} \frac{k ~ \lambda^{k-1}}{k!} &= e^{\lambda} & \quad \text{applying the derivative operator} \\ \sum_{k=1}^{\infty} \frac{k ~ \lambda^{k}}{k!} &= \lambda ~ e^{\lambda} & \quad \text{multiply by } \lambda \text{, replenishing it} \\ \sum_{k=1}^{\infty} \frac{k^2 ~ \lambda^{k-1}}{k!} &= \lambda ~ e^{\lambda} + e^{\lambda} = e^{\lambda} (\lambda + 1) & \quad \text{applying the derivative operator again} \\ \sum_{k=1}^{\infty} \frac{k^2 ~ \lambda^{k}}{k!} &= \lambda e^{\lambda} (\lambda + 1) & \quad \text{replenish } \lambda \text{ one last time} \\ \\ \therefore \mathbb{E}(X^2) &= \sum_{k=0}^{\infty} k^2 ~ \frac{e^{-\lambda} \lambda^k}{k!} \\ &= e^{-\lambda} \lambda e^{\lambda} (\lambda + 1) \\ &= \lambda^2 + \lambda \\ \\ \operatorname{Var}(X) &= \mathbb{E}(X^2) - (\mathbb{E}X)^2 \\ &= \lambda^2 + \lambda - \lambda^2 \\ &= \lambda & \quad \blacksquare \end{align}
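As a sanity check on the algebra (not part of the original notes), we can truncate the series for $\mathbb{E}(X^2)$ numerically; $\lambda = 3.7$ is an arbitrary choice:

```python
from math import exp

# Truncate E(X^2) = sum_k k^2 * e^(-lam) * lam^k / k! and compare with lam^2 + lam.
lam = 3.7                    # arbitrary choice of rate
pmf = exp(-lam)              # P(X = 0)
e_x2 = 0.0
for k in range(1, 100):
    pmf *= lam / k           # P(X = k) = P(X = k - 1) * lam / k
    e_x2 += k**2 * pmf

print(e_x2, lam**2 + lam)    # should agree to many decimal places
print(e_x2 - lam**2)         # Var(X) = E(X^2) - (E X)^2 = lam
```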
Let $X \sim \operatorname{Binom}(n,p)$; recall that $\mathbb{E}(X) = np$.
Find $\operatorname{Var}(X)$ using all the tricks you have at your disposal.
Let's try applying (4) from the above Rules of Variance.
We can do so because $X \sim \operatorname{Binom}(n,p)$ means that the $n$ trials are independent Bernoulli.
\begin{align} X &= I_1 + I_2 + \dots + I_n & \quad \text{where } I_j \text{ are i.i.d. } \operatorname{Bern}(p) \\ \\ \Rightarrow X^2 &= I_1^2 + I_2^2 + \dots + I_n^2 + 2I_1I_2 + 2I_1I_3 + \dots + 2I_{n-1}I_n & \quad \text{don't worry, this is not as bad as it looks} \\ \\ \therefore \mathbb{E}(X^2) &= n \mathbb{E}(I_1^2) + 2 \binom{n}{2} \mathbb{E}(I_1I_2) & \quad \text{by symmetry} \\ &= n p + 2 \binom{n}{2} \mathbb{E}(I_1I_2) & \quad \text{since } \mathbb{E}(I_j^2) = \mathbb{E}(I_j) \\ &= n p + n (n-1) p^2 & \quad \text{since } I_1I_2 \text{ indicates that trials 1 and 2 both succeed, so } \mathbb{E}(I_1I_2) = p^2 \\ &= np + n^2 p^2 - np^2 \\ \\ \operatorname{Var}(X) &= \mathbb{E}(X^2) - (\mathbb{E}X)^2 \\ &= np + n^2 p^2 - np^2 - (np)^2 \\ &= np - np^2 \\ &= np(1-p) \\ &= npq & \quad \blacksquare \end{align}
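Here is a small simulation sketch (not from the notes; $n$, $p$, and the sample size are arbitrary choices) that mirrors the indicator decomposition and checks the result against $npq$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 10, 0.3                                   # arbitrary choices
indicators = rng.random((1_000_000, n)) < p      # each row is I_1, ..., I_n ~ Bern(p)
x = indicators.sum(axis=1)                       # X = I_1 + ... + I_n ~ Binom(n, p)

print(x.var(), n * p * (1 - p))                  # sample variance vs. npq = 2.1
```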
Let $X \sim \operatorname{Geom}(p)$. It has PMF $P(X = k) = q^{k-1}p$ for $k = 1, 2, \dots$
Find $\operatorname{Var}(X)$.
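The notes leave this one as an exercise; as a way to check a candidate answer numerically, here is a sketch that truncates the series for $\mathbb{E}(X)$ and $\mathbb{E}(X^2)$ under the PMF above ($p = 0.3$ is an arbitrary choice):

```python
# Truncated-series check for Var(X) when P(X = k) = q^(k-1) * p, k = 1, 2, ...
p = 0.3                       # arbitrary choice
q = 1 - p

e_x  = sum(k    * q**(k - 1) * p for k in range(1, 2000))
e_x2 = sum(k**2 * q**(k - 1) * p for k in range(1, 2000))

print(e_x2 - e_x**2)          # compare against whatever closed form you derive
```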
Proving LOTUS for the discrete case, we will show $\mathbb{E}(g(X)) = \sum_{x} g(x) \, P(X=x)$.
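Before the formal argument, a quick numerical illustration may help (the fair-die example and $g(x) = x^2$ are arbitrary choices, not from the notes): LOTUS applied to the PMF should match simply averaging $g$ over simulated draws of $X$.

```python
import numpy as np

rng = np.random.default_rng(0)
values = np.arange(1, 7)                 # a fair die: X takes values 1, ..., 6
probs = np.full(6, 1 / 6)
g = lambda x: x**2

lotus = np.sum(g(values) * probs)        # sum_x g(x) P(X = x)
simulated = g(rng.choice(values, size=1_000_000, p=probs)).mean()

print(lotus, simulated)                  # both ~ 91/6 = 15.17
```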
Building on what we did when we proved linearity, we have
\begin{align} \mathbb{E}(g(X)) &= \underbrace{\sum_{s \in S} g(X(s)) \, P(\{s\})}_{\text{individual pebbles}} \\ &= \sum_{x} \sum_{s: X(s)=x} g(X(s)) \, P(\{s\}) & \quad \text{group the pebbles by the value of } X \\ &= \sum_{x} g(x) \sum_{s: X(s)=x} P(\{s\}) & \quad g(X(s)) = g(x) \text{ within each group} \\ &= \underbrace{\sum_{x} g(x) \, P(X=x)}_{\text{grouped "super-pebbles"}} & \quad \blacksquare \end{align}View Lecture 14: Location, Scale, and LOTUS | Statistics 110 on YouTube.