We're going to be using histograms throughout the course, starting with the Planck function, which is really just a histogram of photon energies. As a demo, let's plot a grade distribution. The cell below generates 1000 random samples of a normal distribution with mean 75 and standard deviation 10
In [1]:
%matplotlib inline
import numpy.random as nr
import numpy as np
from matplotlib import pyplot as plt
the_mean=75
the_sigma=20.
numpoints=1000
Remove all grades below 0 or above 100
In [2]:
outRandom=nr.normal(the_mean,the_sigma,[numpoints,])
outRandom=outRandom[outRandom <= 100.]
outRandom=outRandom[outRandom >= 0.]
In [3]:
#
# histogram these 1000 grades into 1000 bins with uniform width
#
fig,ax = plt.subplots(1,1)
bin_edges=np.linspace(0,100,21,endpoint=True)
ax.hist(outRandom,bins=bin_edges)
_=ax.set(title='Grade distribution',xlabel='mark (%)',ylabel='Number in bin')
Note that UBC has grade boundaries that narrow for higher marks. Just counting the number in each bin and plotting it is seriously misleading if you expect the area of the bin to be proportional to the number in the bin.
In [4]:
#
# make a dictionary to hold the grade boundaries
#
bounds={'a+':90,'a':85,'a-':80,'b+':76,'b':72,'b-':68,'c+':64,'c':60,'c-':55,'d':50}
bounds=list(bounds.values())
bounds.sort()
#
# add the high and low edges
#
bounds.insert(0,0)
bounds.append(100)
fig,ax = plt.subplots(1,1)
out=ax.hist(outRandom,bins=bounds)
_=ax.set(title='Grade distribution UBC',xlabel='mark (%)',ylabel='Number in bin')
matplotlib accepts a "normed=True" flag that divides by the total number and the bin width, so that the histogram area integrates to 1.
In [5]:
fig,ax = plt.subplots(1,1)
out=ax.hist(outRandom,bins=bounds,normed=True)
If we want to put numbers/(bin width) on the y axis, we need to first do the histogram then divide the counts in each bin by the bin width before plotting
In [6]:
fig,ax = plt.subplots(1,1)
counts,edges=np.histogram(outRandom,bins=bounds)
widths=np.diff(edges)
counts_dens=counts/widths
left_edge = edges[:-1]
out=ax.bar(left_edge,counts_dens,width=widths)
_=ax.set(title='Grade density histogram UBC',xlabel='mark (%)',ylabel='Number in bin/(%)')
Suppose I had 1 $m^2$ sensor that could count photons for 1 second, in a series of wavelength bins spaced 0.1 $\mu m$ apart. Each time a bin received 0.01 Joules, the sensor reports a count by writing out the wavelength of the bin that accumulated that energy. The data set consists of 46,000 wavelength values over 1 second, which means that the sensor received a flux of 46,000*0.01 = 460 $Joules/s/m^2 = 460\ W\,m^{-2}$.
1) dowload the 46,000 using a301utils.a301_readfile module:
In [7]:
from a301utils.a301_readfile import download
download('photon_data.csv')
2) read the file in using np.loadtxt
In [8]:
bin_wavelengths = np.loadtxt('photon_data.csv')
In [9]:
bin_wavelengths[:10]
Out[9]:
The total $W/m^2$ is just the number measurements multiplied by 0.01 Joules for each measurement:
In [10]:
total_counts = len(bin_wavelengths)
total_flux = total_counts*0.01
In [11]:
total_flux
Out[11]:
Histogram this so the area sums to 460 $W/m^2$
In [12]:
fig,ax = plt.subplots(1,1)
#
# 51 edges from 0.1 to 60 microns
#
edges = np.linspace(0.1,60,51)
counts,edges=np.histogram(bin_wavelengths,bins=edges)
widths=np.diff(edges)
counts_dens=counts/widths/total_counts*total_flux
left_edge = edges[:-1]
out=ax.bar(left_edge,counts_dens,width=widths)
_=ax.set(title='Flux density ($W/m^2/\mu m$)',xlabel='wavelength ($\mu m$)',ylabel='$E_\lambda\ (W/m^2/\mu m)$')
#
# put the Planck curve on top of this
#
from a301lib.radiation import planckwavelen
Temp=300 #Kelvin
Elambda = planckwavelen(edges*1.e-6,Temp)*1.e-6 #convert from W/m^2/m to W/m^2/micron
ax.plot(edges,Elambda,linewidth=4,label='Planck curve')
_=ax.legend()
In [ ]: