Line Fitting

Learning goals

  • Standard Deviation.
  • Chi Square Fitting.

Background

When you have a set of data, the first thing you usually calculate is its mean. The mean alone does not always describe the data well, because it says nothing about how widely the individual values are spread around it. The standard deviation measures that spread: the typical amount by which the data points differ from the mean.

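To find the standard deviation you need the mean value and the value of each of the $n$ data points.

\begin{equation*} stddev = \sqrt{\frac{(x_1 - meanvalue)^2 + (x_2 - meanvalue)^2 + \dots + (x_n - meanvalue)^2}{n}} \end{equation*}

Dividing by $n$ gives the population standard deviation, which is also NumPy's default. Dividing by $n - 1$ instead gives the sample standard deviation.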
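As a minimal sketch, the standard deviation can be computed step by step and then checked against NumPy's built-in `np.std`. The values in `data` are made up for illustration and are not from the experiment below; the sum of squared differences is divided by the number of points, which matches NumPy's default behaviour.

```python
import numpy as np

# Illustrative values only (not from the penny experiment).
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Step 1: the mean value.
mean_value = data.sum() / len(data)

# Step 2: squared difference of every point from the mean.
squared_diffs = (data - mean_value) ** 2

# Step 3: average the squared differences and take the square root.
stddev = np.sqrt(squared_diffs.sum() / len(data))

print(mean_value, stddev)           # computed by hand
print(np.mean(data), np.std(data))  # same quantities from NumPy
```

Both print statements should show the same pair of numbers, which is a quick way to confirm the hand-written version.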

Let's code!

Here is some sample code with data collected from a decay experiment. In the experiment, 100 pennies were rolled in a box. After each roll, every penny showing tails was removed (the tails were the elements that decayed). The remaining pennies were collected and rolled again. This was repeated until all the pennies were gone (had decayed). The experiment was performed 5 times. The code below contains arrays of the data from each trial: the x arrays hold the roll number and the y arrays hold the number of pennies remaining.


In [4]:
import numpy as np


x1 = np.array([0,1,2,3,4,5,6,7,8])
y1 = np.array([100,46,20,11,7,2,1,1,0])

x2 = np.array([0,1,2,3,4,5,6,7,8])
y2 = np.array([100,48,25,13,7,2,1,1,0])

x3 = np.array([0,1,2,3,4,5,6,7,8])
y3 = np.array([100,51,27,14,7,2,1,0,0])

x4 = np.array([0,1,2,3,4,5,6,7,8])
y4 = np.array([100,52,30,10,5,2,1,1,0])

x5 = np.array([0,1,2,3,4,5,6,7,8])
y5 = np.array([100,59,28,12,4,1,1,1,0])

Challenge!

Copy this sample code and use it to calculate the mean and standard deviation of the data. Plot the mean with standard deviation as the error bars.
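One possible approach, sketched here with made-up numbers rather than the penny data, is to stack the trials into a 2D array and take the mean and standard deviation down each column, so that every time step gets its own mean and error bar. The names `trial_a`, `trial_b`, `trial_c`, `mean_per_step`, and `std_per_step` are placeholders chosen for this illustration.

```python
import numpy as np

# Made-up measurements: three trials, each with the same four time steps.
trial_a = np.array([10.0, 6.0, 3.0, 1.0])
trial_b = np.array([10.0, 4.0, 3.0, 2.0])
trial_c = np.array([10.0, 5.0, 0.0, 0.0])

# One row per trial, one column per time step: shape (3, 4).
trials = np.vstack([trial_a, trial_b, trial_c])

# axis=0 collapses the rows, giving one value per time step.
mean_per_step = trials.mean(axis=0)
std_per_step = trials.std(axis=0)

print(mean_per_step)
print(std_per_step)
```

The two resulting arrays could then be passed to Matplotlib, for example `plt.errorbar(x, mean_per_step, yerr=std_per_step, fmt='o')`, to draw the mean points with their error bars.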

Your graph should look something like the following sketch.


In [5]:
from IPython.display import Image
Image(filename='sketch_mean_points.jpeg')


Out[5]:

In [2]:
#your code here

Background

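$\chi^2$, or chi square, is a statistic that compares observed data with the values expected under a specific hypothesis. The smaller the chi square, the better the hypothesis matches the data, so it can be used to find the best-fit line through a set of data points.

\begin{equation*} \chi^2 = \sum_{i} \frac{(y(x_i) - y_i)^2}{\sigma_{y_i}^2} \end{equation*}

Here $y(x_i)$ is the value predicted by the fit at $x_i$, $y_i$ is the measured value, and $\sigma_{y_i}$ is the uncertainty (standard deviation) of that measurement.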
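The sum above translates almost directly into NumPy. The following is a small sketch with made-up numbers: `y_model` stands for the predicted values $y(x_i)$, `y_observed` for the measured values $y_i$, and `sigma` for the uncertainty in the denominator. The function name `chi_square` is a choice made for this example.

```python
import numpy as np

def chi_square(y_model, y_observed, sigma):
    """Sum of squared differences between model and data, weighted by uncertainty."""
    return np.sum((y_model - y_observed) ** 2 / sigma ** 2)

# Made-up values for illustration only.
y_observed = np.array([10.0, 5.0, 2.0])
sigma = np.array([1.0, 0.5, 0.5])
y_model = np.array([9.0, 5.5, 2.0])

value = chi_square(y_model, y_observed, sigma)
print(value)  # 2.0
```

A model that matches the data exactly gives a chi square of zero, and the value grows as the model moves away from the points, with precisely measured points (small uncertainty) counting more heavily.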

Challenge!

From the data points and the plot you just created, you can see that the best-fit curve is a decaying exponential function, $N(t) = N_0 e^{-kt}$. Starting with the same sample code, use chi square to find the curve of best fit.

Hint!

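You need to find the values of $N_0$ and $k$ that make the chi square as small as possible. Note that some time steps have zero standard deviation across the five trials (for example $t = 0$, where every trial starts with 100 pennies), so those points need to be left out or given a small nonzero uncertainty before you divide by it.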
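One simple way to search for those values is a brute-force grid: try many $(N_0, k)$ pairs and keep the pair with the smallest chi square. The sketch below uses synthetic data generated from known parameters (`true_n0`, `true_k`) and an assumed constant uncertainty, so it only illustrates the idea; the grid ranges and all names are assumptions made for this example, not values taken from the experiment.

```python
import numpy as np

def model(t, n0, k):
    """Exponential decay N(t) = N0 * exp(-k * t)."""
    return n0 * np.exp(-k * t)

def chi_square(n0, k, t, y, sigma):
    """Chi square between the decay model and the data."""
    return np.sum((model(t, n0, k) - y) ** 2 / sigma ** 2)

# Synthetic data generated from known parameters (illustration only).
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
true_n0, true_k = 50.0, 0.5
y = model(t, true_n0, true_k)
sigma = np.ones_like(t)  # assumed constant uncertainty of 1

# Candidate parameter values to try (assumed ranges).
n0_grid = np.linspace(40.0, 60.0, 41)  # steps of 0.5
k_grid = np.linspace(0.1, 1.0, 91)     # steps of 0.01

best_n0, best_k, best_chi2 = None, None, np.inf
for n0 in n0_grid:
    for k in k_grid:
        c = chi_square(n0, k, t, y, sigma)
        if c < best_chi2:
            best_n0, best_k, best_chi2 = n0, k, c

print(best_n0, best_k, best_chi2)
```

Because the synthetic data contain no noise, the search should recover the parameters used to generate them. A grid search is easy to follow but slow for fine grids; an optimizer such as `scipy.optimize.minimize` could replace the two loops.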


In [ ]:
import numpy as np


x1 = np.array([0,1,2,3,4,5,6,7,8])
y1 = np.array([100,46,20,11,7,2,1,1,0])

x2 = np.array([0,1,2,3,4,5,6,7,8])
y2 = np.array([100,48,25,13,7,2,1,1,0])

x3 = np.array([0,1,2,3,4,5,6,7,8])
y3 = np.array([100,51,27,14,7,2,1,0,0])

x4 = np.array([0,1,2,3,4,5,6,7,8])
y4 = np.array([100,52,30,10,5,2,1,1,0])

x5 = np.array([0,1,2,3,4,5,6,7,8])
y5 = np.array([100,59,28,12,4,1,1,1,0])

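Your best-fit line should look something like the following sketch. A best-fit line does not have to pass through every point; it is the curve that makes the chi square as small as possible, so it should pass close to most points, within their error bars.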


In [6]:
from IPython.display import Image
Image(filename='chi_square_sketch.jpeg')


Out[6]:

In [3]:
#your code here