Least Squares Fit

1) Without uncertainties

This code uses the expressions we derived in lecture to fit data to a straight line and determine the slope and intercept.

The expressions are:

1) Slope: $B=\frac{\sum (x_i- <x>)y_i}{\sum(x_i- <x>)x_i}$

2) Intercept: $A=<y>-B<x>$

Where $<x>$ denotes the average of all the x values and $<y>$ denotes the average of all the y values.

These are easily computed in Python: $<x>$=x.mean(x) and $<y>$=y.mean()

First we plot the data (taken from the ideal gas law experiment)


In [2]:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

def LineFit(x, y):
    '''Returns slope and y-intercept of linear fit to (x,y) data set'''
    xavg = x.mean()
    B = (y*(x-xavg)).sum()/(x*(x-xavg)).sum() % slope
    A = y.mean()-slope*xavg    % intercept
    return B, A

x = np.array([100., 75., 50., 20., 0., -80., -210.])
y = np.array([19.5, 18.0, 16.1, 14.8, 12.5, 10., 3.9])

plt.plot(x,y,'o')
plt.xlabel("T (C)", fontsize=18)
plt.ylabel("P (lb/in$^2$)", fontsize=18)
plt.axis([-300,100,0,20])
plt.show()


Now we plot the data along with the linear fit.


In [1]:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

def LineFit(x, y):
    '''Returns slope and y-intercept of linear fit to (x,y) data set'''
    xavg = x.mean()
    B = (y*(x-xavg)).sum()/(x*(x-xavg)).sum() # slope
    A = y.mean()-B*xavg    # intercept
    return B, A

x = np.array([100., 75., 50., 20., 0., -80., -210.])
y = np.array([19.5, 18.0, 16.1, 14.8, 12.5, 10., 3.9])

B,A = LineFit(x,y)
xfit = np.arange(-300.0,100.0,1)
yfit = A + B*xfit

plt.plot(x,y,'o')
plt.plot(xfit, yfit,'b-',label="Linear fit")
plt.xlabel("T (C)", fontsize=18)
plt.ylabel("P (lb/in$^2$)", fontsize=18)
plt.axis([-300,100,0,20])

plt.show()


With uncertainties

The following code shows how to include uncertainties in the data first in determining the slope and intercept of the fit.

The expressions for the intercept and slope are:

1) Slope: $B=\frac{\sum (x_i- <x>)y_i/\sigma_i^2}{\sum(x_i- <x>)x_i/\sigma_i^2}$ Where $\sigma_i$ is the uncertainty in the data point $y_i$.

2) Intercept: $A=<y>-B<x>$

The LineFit function also shows how uncertainties in the data propagate into uncertainties in the slope and intercept, denoted sb and sa.

The expressions for the uncertainties in the slope and intercept are given by the expressions for the propagation of errors:

3) The uncertainty in the slope is given by:

$\sigma_B^2=\sum (\frac{\partial B}{\partial y_i})^2 \sigma_i^2$

4) The uncertainty in the intercept is given by:

$\sigma_A^2=\sum (\frac{\partial A}{\partial y_i})^2 \sigma_i^2$

Convince yourselves that this code implements functions!

Finally, the code shows how errors in the slope and intercept translate into uncertainties in Boltzmann's constant from the pressure versus temperature data. Note a translation is required because there are uncertainties in the pressure (not the temperature). The initial uncertainty is thus in the slope of temperature vs. pressure (not the pressure versus temperature). These needs to be translated into an uncertainty in the slope of pressure vs. temperature, since Boltzmann's constant is related directly to this slope (not its inverse),


In [26]:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

T = np.array([100., 75., 50., 20., 0., -80., -210.])
P = np.array([19.5, 18.0, 16.1, 14.8, 12.5, 10., 3.9])
PE = np.array([1.2, .3, .3, .5, 1., .6, .5])

def LineFit(x,y,err):
    err2 = err**2
    norm = (1./err2).sum()
    def hat(x):
        return (x/err2).sum()/norm
    xhat = hat(x)
    yhat = hat(y)
    diff = (x-xhat)
    b = (diff*y/err2).sum()/(diff*x/err2).sum()
    a = yhat - b*xhat
    sb = np.sqrt(1./((diff)*x/err2).sum())
    sa = np.sqrt(sb**2*(x**2/err2).sum()/norm)
    return a,b,sa,sb

b,m,berr,merr = LineFit(T, P, PE)

x = np.arange(-300.0,100.0,1)
y = m*x+b

plt.plot(x, y,'b-',label="Linear fit")
plt.plot(T,P,'o')
plt.errorbar(T,P,yerr=PE, fmt='ro',label="data", ecolor='black')
plt.xlabel("T (C)", fontsize=18)
plt.ylabel("P (lb/in$^2$)", fontsize=18)
plt.axis([-300,100,0,20])

plt.show()

print('Slope = {0:4.4f} +/- {1:4.4f} lb/(in^2 K)'.format(m,merr))
print('Intercept {0:4.4f} +/- {1:4.4f} lb/in^2'.format(b,berr))
print('=================================')
print('Absolute zero = {0:4.4f} +/- {1:4.4f} C'.format(-b/m, np.sqrt((berr/b)**2+(merr/m)**2)*np.abs(-b/m)))
print('=================================')     

def f(x):
    return m*x+b
chi2 = (((P-f(T))/PE)**2).sum()
print('Chi= {0:4.4f}'.format(chi2/5))
print('=================================')  
k_b = m*6895*4.65173*10**(-26)/(1.3)
k_b_err = (merr/m)*(k_b)
print('k_b = {0:4.4e} +/- {1:4.4e} J/K'.format(k_b,k_b_err))
print('=================================')


Slope = 0.0489 +/- 0.0019 lb/(in^2 K)
Intercept 13.9563 +/- 0.1717 lb/in^2
=================================
Absolute zero = -285.1682 +/- 11.4380 C
=================================
Chi= 1.0514
=================================
k_b = 1.2075e-23 +/- 4.6097e-25 J/K
=================================