(Also known as the continuity theorem)
Let $X_n$ be a sequence of random variables, and let $M_{X_n}(t) = \mathbf{E}\left[e^{tX_n}\right]$ be the corresponding sequence of moment generating functions.
If $M_{X_n}(t) \to M_X(t) \text{ as } n\to \infty$ for all $t$ close to $0$, where $M_X$ is the MGF of a random variable $X$, then $X_n \to X$ in distribution.
Note: convergence in distribution is also called convergence in law.
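A quick numerical sketch of the theorem (an arbitrary illustrative choice, not part of the proof): take $X_n$ uniform on $\{\frac{1}{n}, \frac{2}{n}, \dots, \frac{n}{n}\}$, which converges in distribution to $\mathrm{Unif}(0,1)$; its MGF should then converge pointwise to $\frac{e^t - 1}{t}$, the MGF of $\mathrm{Unif}(0,1)$.

In [ ]:
import numpy as np

def mgf_discrete(t, n):
    # MGF of X_n uniform on {1/n, 2/n, ..., n/n}
    k = np.arange(1, n + 1)
    return np.mean(np.exp(t * k / n))

def mgf_unif(t):
    # MGF of Unif(0,1), valid for t != 0
    return (np.exp(t) - 1) / t

for n in [10, 100, 1000]:
    print(n, mgf_discrete(0.5, n), mgf_unif(0.5))
# The discrete MGF approaches (e^0.5 - 1)/0.5 ≈ 1.29744 as n grows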
Let $X_i$, $i=1,\dots,n$, be i.i.d. random variables.
Assume $\mu = \mathbf{E}[X_i]$ and $\sigma^2 = \mathbf{Var}[X_i] < \infty$.
Let $Z_i = \displaystyle \frac{X_i - \mu}{\sigma}$ ($Z_i$ is standardized).
Let $S_n = \displaystyle \frac{\sum_{i=1}^n Z_i}{\sqrt{n}}$
Then, $S_n \to \mathcal{N}(0,1) \text{ as }n\to \infty$ (convergence in distribution)
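Before the proof, a minimal simulation sketch of this statement, taking $X_i \sim \mathrm{Exp}(1)$ as an arbitrary non-normal choice (so $\mu = \sigma = 1$):

In [ ]:
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)
n, reps = 500, 10_000

# Arbitrary non-normal choice for the X_i: Exp(1), so mu = sigma = 1
X = rng.exponential(scale=1.0, size=(reps, n))
Z = (X - 1.0) / 1.0                 # standardize: Z_i = (X_i - mu) / sigma
S = Z.sum(axis=1) / np.sqrt(n)      # S_n = (Z_1 + ... + Z_n) / sqrt(n)

# The Kolmogorov-Smirnov distance to N(0,1) should be small for large n
print(scipy.stats.kstest(S, 'norm'))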
Recall Taylor's formula: let $M$ be a function defined for $t$ near $0$, and assume $M$ has $m$ continuous derivatives. Then $M(t) = \displaystyle \sum_{k=0}^m M^{(k)}(0) \frac{1}{k!} t^k ~ + ~ \epsilon(t)\, t^{m}$, where $M^{(k)}(0)$ is the $k$th derivative of $M$ at $t=0$ and $\epsilon(u) \to 0 \text{ as } u\to 0$.
This theorem is a consequence of the mean-value theorem, which says that in the case $m=1$, for $b>0$, $M(b) - M(0) = (b-0) M'(\xi)$ for some $\xi \in (0,b)$.
Now, apply Taylor's formula with $m=2$ to prove the CLT:
$M_{Z_i}(t) = \sum_{k=0}^2 M^{(k)}_{Z_i} (0) \frac{1}{k!} t^k ~+~ \epsilon(t) t^2$
Now, $$M_{Z_i}(t) = M_{Z_i}(0) ~+~ M'_{Z_i}(0)\, t + \frac{1}{2} M''_{Z_i}(0)\, t^2 ~+~ \epsilon(t)\, t^2$$ where:
$M_{Z_i} (0) = 1$
$M'_{Z_i} (0) = \text{first moment} = \mathbf{E}[Z_i] = 0$
$M''_{Z_i} (0) = \text{second moment} = \mathbf{E}[Z_i^2] = \mathbf{Var}[Z_i] + 0^2 = 1$
Therefore, $M_{Z_i}(t) = 1 + \frac{1}{2} t^2 ~+~ \epsilon(t)\, t^{2}$.
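A quick numerical check of this expansion, again taking $X \sim \mathrm{Exp}(1)$ as an arbitrary illustrative choice: then $Z = X - 1$ has the closed-form MGF $M_Z(t) = e^{-t}/(1-t)$ for $t < 1$, so we can watch $\epsilon(t) = \left(M_Z(t) - 1 - \frac{1}{2}t^2\right)/t^2$ vanish as $t \to 0$.

In [ ]:
import numpy as np

# MGF of Z = X - 1 with X ~ Exp(1), valid for t < 1
def M_Z(t):
    return np.exp(-t) / (1 - t)

# epsilon(t) in the expansion M_Z(t) = 1 + t^2/2 + epsilon(t) t^2
for t in [0.1, 0.01, 0.001]:
    eps = (M_Z(t) - 1 - 0.5 * t**2) / t**2
    print(t, eps)   # shrinks roughly like t/3 as t -> 0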
Therefore, by independence, $\displaystyle M_{Z_1 + Z_2 + \dots + Z_n}(t) = \prod_{i=1}^n M_{Z_i}(t) = \left(1 + \frac{1}{2}t^2 + \epsilon(t)\, t^{2}\right)^n$
And for $S_n = \displaystyle\frac{Z_1 + \dots + Z_n}{\sqrt{n}}$:
$$\displaystyle M_{S_n} (t) = M_{Z_1 + ... + Z_n}\left(\frac{t}{\sqrt{n}} \right) =\\ \left(1 + \frac{1}{2}\frac{t^2}{n} + \epsilon\!\left(\frac{t}{\sqrt{n}}\right) \frac{t^2}{n}\right)^n = \\ \left(1 + \frac{\frac{1}{2} t^2 + \epsilon\!\left(\frac{t}{\sqrt{n}}\right) t^2}{n}\right)^n$$

We just need to prove that this expression converges to $e^{t^2/2}$, because that is the MGF of $\mathcal{N}(0,1)$.
Just use the following lemma:
$$\lim_{n\to \infty} \left(1 + \frac{x}{n}\right)^n = e^{x}$$

or better:

$$\lim_{n\to \infty} \left(1 + \frac{x + g_n(x)}{n}\right)^n = e^x ~~ \text{ if } ~~ g_n(x) \to 0 \text{ as } n\to \infty \text{ uniformly in } x$$
Here, $g_n(x) = \epsilon(\frac{t}{\sqrt{n}}) t^2$ and $x=\frac{1}{2} t^2$
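A small numerical sketch of this last step, dropping the vanishing $\epsilon$ term as the lemma allows: for fixed $t$, $\left(1 + \frac{t^2/2}{n}\right)^n$ approaches $e^{t^2/2}$, the MGF of $\mathcal{N}(0,1)$.

In [ ]:
import numpy as np

t = 1.0
target = np.exp(t**2 / 2)    # MGF of N(0,1) at t

# (1 + x/n)^n -> e^x with x = t^2/2 (the epsilon term is dropped,
# which is exactly what the lemma allows as n -> infinity)
for n in [10, 100, 1000, 10_000]:
    print(n, (1 + (t**2 / 2) / n) ** n, target)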
Some points to consider:
If, by mistake, we write $S_n = \frac{Z_1 + Z_2 + \dots + Z_n}{n}$ instead of $S_n = \frac{Z_1 + Z_2 + \dots + Z_n}{\sqrt{n}}$, then $S_n$ is the empirical mean of the $Z_i$'s, and by the law of large numbers it converges to zero.
If $S_n = \frac{Z_1 + Z_2 + \dots + Z_n}{n^{1/4}}$, then $\mathbf{Var}[S_n] = \sqrt{n} \to \infty$, so $S_n$ does not converge to a normal distribution; its fluctuations blow up (see the simulation sketch below).
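A simulation sketch comparing the three scalings (division by $n$, $\sqrt{n}$, and $n^{1/4}$), with standardized $\mathrm{Unif}(0,1)$ variables as an arbitrary illustrative choice:

In [ ]:
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10_000, 2_000

# Standardized Unif(0,1): mu = 1/2, sigma^2 = 1/12
Z = (rng.random((reps, n)) - 0.5) / np.sqrt(1 / 12)
T = Z.sum(axis=1)

for label, scale in [('n', n), ('sqrt(n)', np.sqrt(n)), ('n^(1/4)', n ** 0.25)]:
    S = T / scale
    print(f"{label:8s}  std of S_n = {S.std():.3f}")
# Division by n collapses to 0 (LLN), by sqrt(n) gives std ≈ 1 (CLT),
# and by n^(1/4) gives std ≈ n^(1/4), which blows up with n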
In other words:
The fluctuations of the number of successes in $n$ independent trials with success probability $p$ are of order $\sqrt{n}$; within that scale, the fluctuation approximately follows a normal (Gaussian) law with variance $p(1-p)$.
In other words: $$\mathrm{Binomial}(n,p) \approx np + \sqrt{n}\, \mathcal{N}\left(0, p(1-p)\right) $$
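A minimal simulation check of this approximation, with illustrative values $n = 1000$ and $p = 0.3$:

In [ ]:
import numpy as np

rng = np.random.default_rng(2)
n, p, reps = 1000, 0.3, 100_000   # illustrative values

binom = rng.binomial(n, p, size=reps)
approx = n * p + np.sqrt(n) * rng.normal(0, np.sqrt(p * (1 - p)), size=reps)

# Means and standard deviations should nearly coincide
print(binom.mean(), approx.mean())   # both ≈ np = 300
print(binom.std(), approx.std())     # both ≈ sqrt(n p (1-p)) ≈ 14.49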
If instead $X_n \sim \mathrm{Binomial}(n, \lambda/n)$, so that $np = \lambda$ stays fixed while $p \to 0$, we find (not using the CLT, but doing the work by hand) that $X_n \to \mathrm{Poisson}(\lambda) \text{ as } n\to \infty$.
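A quick simulation sketch of this Poisson limit, with an illustrative $\lambda = 4$:

In [ ]:
import numpy as np
import scipy.stats

rng = np.random.default_rng(3)
lam, reps = 4.0, 100_000   # illustrative lambda

# Binomial(n, lam/n) for growing n versus Poisson(lam): compare P[X = 2]
for n in [10, 100, 10_000]:
    X = rng.binomial(n, lam / n, size=reps)
    print(n, np.mean(X == 2), scipy.stats.poisson.pmf(2, lam))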
Let $U_i$, for $i=1,\dots,n$, be i.i.d. $\mathrm{Unif}(0,1)$ random variables,
and $\bar{U}_n = \displaystyle \frac{U_1 + ... + U_n}{n}$
Find $n$ such that $\mathbf{P}\left[|\bar{U}_n - \frac{1}{2}| < 0.02\right] \ge 0.95$.
We make the leap of faith that $n$ will be large enough that the CLT applies.
We start by standardizing $\bar{U}_n$: since $\mathbf{E}[U_i] = \frac{1}{2}$ and $\mathbf{Var}[U_i] = \frac{1}{12}$, we have $\displaystyle \left\{\begin{array}{l}\mathbf{E}[\bar{U}_n] = \frac{1}{2} \\ \mathbf{Var}[\bar{U}_n] = \frac{1/12}{n}\end{array}\right.$
$$Z_n = \frac{\bar{U}_n - \frac{1}{2}}{\sqrt{\frac{1/12}{n}}} = \sqrt{12n} \left(\bar{U}_n - \frac{1}{2}\right)$$
The CLT says $Z_n \approx \mathcal{N}(0,1)$ for large $n$ (convergence in distribution).
We want $\displaystyle \mathbf{P}\left[|\bar{U}_n - \frac{1}{2}| < 0.02\right] = \mathbf{P}\left[-0.02 < \bar{U}_n - \frac{1}{2} < 0.02\right]$
In [28]:
import scipy.stats
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

# Standard normal pdf on a fine grid
x = np.arange(-4, 4, 0.01)
pdf = scipy.stats.norm.pdf(x)

# Grid indices covering the interval (-1.96, 1.96), which holds 95% of the mass
inx1 = np.where(x > -1.96)[0][0]
inx2 = np.where(x < 1.96)[0][-1]

plt.figure(figsize=(13, 4))
plt.fill_between(x[inx1:inx2+1], pdf[inx1:inx2+1], alpha=0.5)  # shade the 95% region
plt.plot(x, pdf, lw=2)
plt.plot([0, 0], [0, np.max(pdf)], ls='dotted')                # vertical line at the mean
plt.text(1.5, 0.3, '$0.95$', size=34)
plt.arrow(1.4, 0.3, -0.5, -0.1, head_width=0.05, fc='k')       # arrow from label to region
plt.xlim(-3, 3)
plt.xticks([-1.96, 0, 1.96], size=30)
plt.yticks([])
plt.show()
To solve the inequality $\displaystyle \mathbf{P}\left[\sqrt{12n} \left|\bar{U}_n - \frac{1}{2}\right| < 0.02 \sqrt{12n} \right] = \mathbf{P}\left[|Z_n| < 0.02 \sqrt{12n}\right] \ge 0.95$, we can use the figure above.
$$0.02 \sqrt{12n} \ge 1.96 \Longrightarrow n \ge \frac{1.96^2}{0.02^2 \times 12} \approx 800.33 \Longrightarrow n\ge 801$$
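The same computation in code, followed by a simulation sketch that checks the answer at $n = 801$:

In [ ]:
import numpy as np
import scipy.stats

# Solve 0.02 * sqrt(12 n) >= z for the 97.5% quantile z of N(0,1)
z = scipy.stats.norm.ppf(0.975)                # ≈ 1.959964
n_min = int(np.ceil(z**2 / (0.02**2 * 12)))
print(n_min)                                   # 801

# Simulation check of P[|U_bar - 1/2| < 0.02] at n = n_min
rng = np.random.default_rng(4)
U_bar = rng.random((10_000, n_min)).mean(axis=1)
print(np.mean(np.abs(U_bar - 0.5) < 0.02))     # ≈ 0.95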