Histograms

We're going to be using histograms throughout the course, starting with the Planck function, which is really just a histogram of photon energies. As a demo, let's plot a grade distribution. The cell below generates 1000 random samples of a normal distribution with mean 75 and standard deviation 10



In [1]:

    
%matplotlib inline
import numpy.random as nr
import numpy as np
from matplotlib import pyplot as plt

the_mean=75
the_sigma=20.
numpoints=1000

Remove all grades below 0 or above 100



In [2]:

    
outRandom=nr.normal(the_mean,the_sigma,[numpoints,])
outRandom=outRandom[outRandom <= 100.]
outRandom=outRandom[outRandom >= 0.]



In [3]:

    
#
# histogram these 1000 grades into 1000 bins with uniform width
#
fig,ax = plt.subplots(1,1)
bin_edges=np.linspace(0,100,21,endpoint=True)
ax.hist(outRandom,bins=bin_edges)
_=ax.set(title='Grade distribution',xlabel='mark (%)',ylabel='Number in bin')

Note that UBC has grade boundaries that narrow for higher marks. Just counting the number in each bin and plotting it is seriously misleading if you expect the area of the bin to be proportional to the number in the bin.



In [4]:

    
#
# make a dictionary to hold the grade boundaries
#
bounds={'a+':90,'a':85,'a-':80,'b+':76,'b':72,'b-':68,'c+':64,'c':60,'c-':55,'d':50}
bounds=list(bounds.values())
bounds.sort()
#
# add the high and low edges
#
bounds.insert(0,0)
bounds.append(100)

fig,ax = plt.subplots(1,1)
out=ax.hist(outRandom,bins=bounds)
_=ax.set(title='Grade distribution UBC',xlabel='mark (%)',ylabel='Number in bin')

matplotlib accepts a "normed=True" flag that divides by the total number and the bin width, so that the histogram area integrates to 1.



In [5]:

    
fig,ax = plt.subplots(1,1)
out=ax.hist(outRandom,bins=bounds,normed=True)

If we want to put numbers/(bin width) on the y axis, we need to first do the histogram then divide the counts in each bin by the bin width before plotting



In [6]:

    
fig,ax = plt.subplots(1,1)
counts,edges=np.histogram(outRandom,bins=bounds)
widths=np.diff(edges)
counts_dens=counts/widths
left_edge = edges[:-1]
out=ax.bar(left_edge,counts_dens,width=widths)
_=ax.set(title='Grade density histogram UBC',xlabel='mark (%)',ylabel='Number in bin/(%)')

An idealized satellite sensor

Suppose I had 1 $m^2$ sensor that could count photons for 1 second, in a series of wavelength bins spaced 0.1 $\mu m$ apart. Each time a bin received 0.01 Joules, the sensor reports a count by writing out the wavelength of the bin that accumulated that energy. The data set consists of 46,000 wavelength values over 1 second, which means that the sensor received a flux of 46,000*0.01 = 460 $Joules/s/m^2 = 460\ W\,m^{-2}$.

1) dowload the 46,000 using a301utils.a301_readfile module:



In [7]:

    
from a301utils.a301_readfile import download
download('photon_data.csv')









    



downloaded photon_data.csv

2) read the file in using np.loadtxt



In [8]:

    
bin_wavelengths = np.loadtxt('photon_data.csv')



In [9]:

    
bin_wavelengths[:10]









    Out[9]:





array([ 14.6,   8.1,  25. ,  11.7,   5.7,  14.7,  43.7,   6.7,   9.7,  19.6])

The total $W/m^2$ is just the number measurements multiplied by 0.01 Joules for each measurement:



In [10]:

    
total_counts = len(bin_wavelengths)
total_flux = total_counts*0.01



In [11]:

    
total_flux









    Out[11]:





460.0

Now histogram the photon counts

Histogram this so the area sums to 460 $W/m^2$



In [12]:

    
fig,ax = plt.subplots(1,1)
#
# 51 edges from 0.1 to 60 microns
#
edges = np.linspace(0.1,60,51)
counts,edges=np.histogram(bin_wavelengths,bins=edges)
widths=np.diff(edges)
counts_dens=counts/widths/total_counts*total_flux
left_edge = edges[:-1]
out=ax.bar(left_edge,counts_dens,width=widths)
_=ax.set(title='Flux density ($W/m^2/\mu m$)',xlabel='wavelength ($\mu m$)',ylabel='$E_\lambda\ (W/m^2/\mu m)$')

#
# put the Planck curve on top of this
#
from a301lib.radiation import planckwavelen
Temp=300 #Kelvin
Elambda = planckwavelen(edges*1.e-6,Temp)*1.e-6  #convert from W/m^2/m to W/m^2/micron
ax.plot(edges,Elambda,linewidth=4,label='Planck curve')
_=ax.legend()

Summary: the Planck function describes the histogram of radiant flux emitted by a black body of temperature T



In [ ]: