Central-Limit-Theorem



In [1]:
import numpy as np
from scipy.stats import norm
from matplotlib import pyplot as plt
%matplotlib inline

Distribution: triangular distribution. See: https://en.wikipedia.org/wiki/Triangular_distribution


In [2]:
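def build_plot(n, subsets_num):
    # Draw subsets_num independent samples, each of size n, from Triangular(0, 0.5, 1)
    values = np.random.triangular(0, 0.5, 1, size=(subsets_num, n))

    # Sample mean of each subset, sorted so the fitted curve is drawn left to right
    means = np.sort(np.mean(values, axis=1))

    # Theoretical distribution of the sample mean: N(0.5, 1/(24 * n))
    fit = norm.pdf(means, 0.5, np.sqrt(1. / (24 * n)))
    plt.xlabel('x')
    plt.ylabel('f(x)')
    plt.plot(means, fit, '-')
    # density=True replaces normed=True, which newer Matplotlib releases no longer accept
    plt.hist(means, bins=7, density=True)
    plt.xlim((0.3, 0.7))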

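For the triangular distribution with parameters a = 0, b = 1, c = 0.5, the mean is (a + b + c) / 3 = 0.5 and the variance is (a^2 + b^2 + c^2 - ab - ac - bc) / 18 = 1/24. By the central limit theorem, the sample mean is therefore approximately distributed as N(0.5, 1/(24 * n)), where n is the size of each sample, i.e. the number of values averaged.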
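As a quick sanity check on the values 0.5 and 1/24, the closed-form moments can be compared with a large simulated sample. This is a minimal sketch: the helper name `triangular_moments`, the seed and the sample size are illustrative choices, not part of the original notebook.

```python
import numpy as np

def triangular_moments(a, c, b):
    """Closed-form (mean, variance) of a triangular distribution.

    a is the lower limit, c the mode, b the upper limit.
    """
    mean = (a + b + c) / 3.0
    variance = (a**2 + b**2 + c**2 - a*b - a*c - b*c) / 18.0
    return mean, variance

# Parameters used throughout this notebook: a = 0, c = 0.5, b = 1
mean, variance = triangular_moments(0.0, 0.5, 1.0)

# Monte Carlo estimate of the same quantities (seed chosen arbitrarily)
rng = np.random.default_rng(0)
sample = rng.triangular(0.0, 0.5, 1.0, size=200_000)
sample_mean = sample.mean()
sample_var = sample.var()

print("closed form:", mean, variance)
print("simulated:  ", sample_mean, sample_var)
```

The simulated values should agree with the closed-form ones to within sampling noise, which is small at this sample size.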


In [3]:
build_plot(n = 100, subsets_num = 1000)



In [4]:
build_plot(n = 500, subsets_num = 1000)



In [5]:
build_plot(n = 1000, subsets_num = 1000)


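Conclusions: By the central limit theorem, the mean of n independent draws of a random variable is approximately normally distributed even when the distribution of the variable itself is unknown, provided its variance is finite. The approximating distribution is N(E(X), Var(X) / n), where n is the number of draws averaged, so the spread of the sample mean shrinks as n grows.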
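The Var(X) / n scaling can also be checked numerically rather than visually. The sketch below compares the empirical standard deviation of the sample means with the theoretical value sqrt(1/(24 * n)) for the same three values of n used in this notebook. The helper `sample_mean_std`, the seed and the choice of 2000 subsets are assumptions made for illustration.

```python
import numpy as np

def sample_mean_std(n, subsets_num, rng):
    """Empirical standard deviation of the means of subsets_num samples of size n."""
    values = rng.triangular(0.0, 0.5, 1.0, size=(subsets_num, n))
    return values.mean(axis=1).std(ddof=1)

rng = np.random.default_rng(42)  # arbitrary seed for reproducibility
results = {}
for n in (100, 500, 1000):
    empirical = sample_mean_std(n, 2000, rng)
    theoretical = np.sqrt(1.0 / (24 * n))  # sqrt(Var(X) / n) with Var(X) = 1/24
    results[n] = (empirical, theoretical)
    print(f"n={n}: empirical={empirical:.5f}, theoretical={theoretical:.5f}")
```

The two columns should agree to within a few percent, and both should shrink roughly in proportion to 1/sqrt(n).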

