Fit real histogram



In [1]:

    
%pylab inline









    



Populating the interactive namespace from numpy and matplotlib



In [2]:

    
from scipy.stats import norm

Create normally distributed data



In [3]:

    
# Generate 150 synthetic samples from a standard normal distribution
data = norm.rvs(loc=0.0, scale=1.0, size=150)
plt.hist(data, rwidth=0.85, facecolor='black');
plt.ylabel('Number of events');
plt.xlabel('Value');

Obtain the fitting to a normal distribution

For a normal distribution, the fit is simply the mean and the standard deviation of the sample data (the maximum-likelihood estimates).



In [4]:

    
mean, stdev = norm.fit(data)
print('Mean =%f, Stdev=%f'%(mean,stdev))









    



Mean =0.050569, Stdev=1.022829
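As a quick check of the statement above, the following sketch compares norm.fit with the plain NumPy estimates. The seed and the variable names (sample, mu_fit, sigma_fit) are illustrative assumptions, not part of the original notebook, so the numbers will differ from the output shown here.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical sample; the seed is fixed only to make the check reproducible
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=150)

# Maximum-likelihood fit of a normal distribution
mu_fit, sigma_fit = norm.fit(sample)

# Plain sample statistics; np.std uses ddof=0 by default
mu_np = np.mean(sample)
sigma_np = np.std(sample)

print(np.isclose(mu_fit, mu_np), np.isclose(sigma_fit, sigma_np))
```

Note that the fitted scale matches the ddof=0 standard deviation, which is slightly smaller than the ddof=1 (unbiased variance) estimate.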

To overlay the normalized PDF of the normal distribution on a histogram of counts, we simply have to multiply every PDF value by the area of the histogram, that is, the number of samples times the bin width.
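A minimal sketch of why the histogram area is the right scale factor: for equal-width bins, the sum of count times width over all bins equals the number of samples times the bin width. The sample and names below (sample, counts, edges, widths) are assumptions made for illustration.

```python
import numpy as np

# Hypothetical sample of 150 standard-normal values
rng = np.random.default_rng(1)
sample = rng.normal(size=150)

# 10 equal-width bins spanning the range of the sample
counts, edges = np.histogram(sample, bins=10)
widths = np.diff(edges)

# Area under the histogram bars
area = np.sum(counts * widths)

# Same number, computed as (number of samples) * (bin width)
print(area, len(sample) * widths[0])
```

A PDF integrates to 1, so multiplying it by this area gives a curve that encloses the same area as the bars.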

Get the histogram data from NumPy



In [5]:

    
histdata = plt.hist(data, bins=10, color='black', rwidth=.85) # we set 10 bins



In [8]:

    
counts, binedge = np.histogram(data, bins=10);
print(binedge)









    



[-2.58148003 -2.0579089  -1.53433777 -1.01076664 -0.48719551  0.03637562
  0.55994674  1.08351787  1.607089    2.13066013  2.65423126]



In [9]:

    
# Get bin centers from bin edges
bincenter = [0.5 * (binedge[i] + binedge[i+1]) for i in range(len(binedge)-1)]



In [10]:

    
bincenter









    Out[10]:





[-2.3196944640956341,
 -1.7961233353834976,
 -1.272552206671361,
 -0.7489810779592242,
 -0.22540994924708757,
 0.29816117946504916,
 0.82173230817718612,
 1.3453034368893229,
 1.8688745656014594,
 2.3924456943135963]
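The same bin centers can also be obtained without an explicit loop by slicing the edge array. This is a sketch using hypothetical, evenly spaced edges rather than the edges printed above.

```python
import numpy as np

# Hypothetical edges for 10 equal-width bins (11 edges)
edges = np.linspace(-2.5, 2.5, 11)

# Midpoint of each pair of consecutive edges
centers = 0.5 * (edges[:-1] + edges[1:])

# For equal-width bins, neighbouring centers are one bin width apart
width = edges[1] - edges[0]
spacing = np.diff(centers)

print(centers)
print(width)
```

Because 10 centers span only 9 bin widths, the distance between the first and last center is 9 times the bin width, not 10.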



In [11]:

    
binwidth = binedge[1] - binedge[0]  # all 10 bins have the same width
print(binwidth)



0.523571128712

Scale the normal PDF to the area of the histogram



In [12]:

    
x = np.linspace( start = -4 , stop = 4, num = 100)
mynorm = norm(loc = mean, scale = stdev)



In [13]:

    
# Scale the normal PDF by the histogram area: bin width * number of samples
myfit = mynorm.pdf(x)*binwidth*len(data)
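Multiplying the PDF by the bin width and the number of samples is a midpoint approximation to the expected count per bin. As a hedged illustration, the sketch below compares it with the exact expected count obtained from CDF differences. The edges, sample size, and distribution parameters are assumptions chosen for the example.

```python
import numpy as np
from scipy.stats import norm

# Assumed setup: 150 samples, 10 equal-width bins on [-3, 3], standard normal
n_samples = 150
edges = np.linspace(-3.0, 3.0, 11)
centers = 0.5 * (edges[:-1] + edges[1:])
width = edges[1] - edges[0]
dist = norm(loc=0.0, scale=1.0)

# Approximation used for plotting: PDF at the bin center * width * N
approx = dist.pdf(centers) * width * n_samples

# Exact expected count per bin: N * (CDF(right edge) - CDF(left edge))
exact = n_samples * np.diff(dist.cdf(edges))

print(np.round(approx, 2))
print(np.round(exact, 2))
```

For bins that are narrow compared with the standard deviation, the two agree closely, which is why the scaled PDF can be drawn directly over the histogram.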



In [14]:

    
# Plot everything together
plt.hist(data, bins=10, facecolor='white', histtype='stepfilled');
plt.fill(x, myfit, 'r', alpha=.5);
plt.ylabel('Number of observations');
plt.xlabel('Value');
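An alternative to scaling the PDF, not used in this notebook, is to normalize the histogram instead so that its area is 1 and the unscaled PDF can be compared with it directly. The sketch below uses np.histogram with density=True; the sample and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical sample of 150 standard-normal values
rng = np.random.default_rng(2)
sample = rng.normal(size=150)

# density=True rescales the bar heights so the histogram has unit area
density, edges = np.histogram(sample, bins=10, density=True)
total_area = np.sum(density * np.diff(edges))

# The fitted PDF can now be compared with the bar heights without scaling
mu, sigma = norm.fit(sample)
centers = 0.5 * (edges[:-1] + edges[1:])
pdf_at_centers = norm.pdf(centers, loc=mu, scale=sigma)

print(total_area)
```

With this choice the vertical axis shows probability density rather than the number of observations, so the approach in the notebook is preferable when raw counts should stay visible.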