Fitting Models Exercise 1

Imports


In [37]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize as opt

Fitting a quadratic curve

For this problem we are going to work with the following model:

$$ y_{model}(x) = a x^2 + b x + c $$

The true values of the model parameters are as follows:


In [38]:
a_true = 0.5
b_true = 2.0
c_true = -4.0

First, generate a dataset using this model using these parameters and the following characteristics:

  • For your $x$ data use 30 uniformly spaced points between $[-5,5]$.
  • Add a noise term to the $y$ value at each point that is drawn from a normal distribution with zero mean and standard deviation 2.0. Make sure you add a different random number to each point (see the size argument of np.random.normal).

After you generate the data, make a plot of the raw data (use points).


In [3]:
np.random.normal?

In [39]:
xdata=np.linspace(-5,5,30)
dy=2
sigma=np.random.normal(0,dy,30)
ydata=a_true*xdata**2+b_true*xdata+c_true+sigma

In [ ]:
assert True # leave this cell for grading the raw data generation and plot

Now fit the model to the dataset to recover estimates for the model's parameters:

  • Print out the estimates and uncertainties of each parameter.
  • Plot the raw data and best fit of the model.

In [52]:
def model(x,a,b,c):
    y=a*x**2+b*x+c
    return y
def deviation(theta,x,y,dy):
    a=theta[0]
    b=theta[1]
    c=theta[2]
    return (y-a*x**2-b*x-c)/dy

In [48]:
xdata,ydata,sigma


Out[48]:
(array([-5.        , -4.65517241, -4.31034483, -3.96551724, -3.62068966,
        -3.27586207, -2.93103448, -2.5862069 , -2.24137931, -1.89655172,
        -1.55172414, -1.20689655, -0.86206897, -0.51724138, -0.17241379,
         0.17241379,  0.51724138,  0.86206897,  1.20689655,  1.55172414,
         1.89655172,  2.24137931,  2.5862069 ,  2.93103448,  3.27586207,
         3.62068966,  3.96551724,  4.31034483,  4.65517241,  5.        ]),
 array([ -2.14336593,  -1.31178469,  -0.79132589,  -6.77189696,
         -4.52277629,  -4.68647524,  -6.22112988,  -6.02823842,
         -3.5746268 ,  -7.70511673,  -6.91553005,  -3.65052848,
         -3.21869809,  -3.97090019,  -4.54031875,  -2.42029369,
         -4.44080753,  -1.86658074,   0.48509948,  -1.72095123,
          2.41718169,   4.73774042,   6.09712795,   4.15597987,
          8.87575675,  12.00999813,   9.95256261,  11.60798692,
         19.91751348,  16.75319222]),
 array([-0.64336593,  1.16324503,  2.5398275 , -2.70352597,  0.16390623,
         0.49961275, -0.65454248, -0.20005768,  2.39624121, -1.7104675 ,
        -1.01600567,  2.03496498,  2.13385839,  0.92981325, -0.21035442,
         1.22001547, -1.60905961,  0.03769988,  1.34300673, -2.02832341,
         0.82562402,  1.74309119,  1.5804811 , -2.00157067,  0.95839647,
         2.21392203, -1.84113537, -2.302239  ,  3.77185355, -1.74680778]))

In [49]:
opt.leastsq?

In [50]:
model_best,error_best=opt.curve_fit(model,xdata,ydata,dy)


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-50-318e8c308f4e> in <module>()
----> 1 model_best,error_best=opt.curve_fit(model,xdata,ydata,dy)

/usr/local/lib/python3.4/dist-packages/scipy/optimize/minpack.py in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, **kw)
    579     # Remove full_output from kw, otherwise we're passing it in twice.
    580     return_full = kw.pop('full_output', False)
--> 581     res = leastsq(func, p0, args=args, full_output=1, **kw)
    582     (popt, pcov, infodict, errmsg, ier) = res
    583 

/usr/local/lib/python3.4/dist-packages/scipy/optimize/minpack.py in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
    369     if not isinstance(args, tuple):
    370         args = (args,)
--> 371     shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
    372     m = shape[0]
    373     if n > m:

/usr/local/lib/python3.4/dist-packages/scipy/optimize/minpack.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
     18 def _check_func(checker, argname, thefunc, x0, args, numinputs,
     19                 output_shape=None):
---> 20     res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
     21     if (output_shape is not None) and (shape(res) != output_shape):
     22         if (output_shape[0] != 1):

/usr/local/lib/python3.4/dist-packages/scipy/optimize/minpack.py in _general_function(params, xdata, ydata, function)
    445 
    446 def _general_function(params, xdata, ydata, function):
--> 447     return function(xdata, *params) - ydata
    448 
    449 

TypeError: model() missing 2 required positional arguments: 'b' and 'c'

In [62]:
best_fit=opt.leastsq(deviation,np.array((1,2,-5)), args=(xdata, ydata, dy), full_output=True)
theta_best=best_fit[0]
theta_cov=best_fit[1]
print('a=',theta_best[0],'+/-',np.sqrt(theta_cov[0,0]))
print('b=',theta_best[1],'+/-',np.sqrt(theta_cov[1,1]))
print('c=',theta_best[2],'+/-',np.sqrt(theta_cov[2,2]))


a= 0.483498332995 +/- 0.0459058220552
b= 1.98811368567 +/- 0.122342723697
c= -3.62339891607 +/- 0.548231733411

In [60]:
plt.errorbar(xdata,ydata,dy,fmt='k.')
xfit=np.linspace(-5,5,100)
yfit=theta_best[0]*xfit**2+theta_best[1]*xfit+theta_best[2]
plt.plot(xfit,yfit)
plt.ylabel('y')
plt.xlabel('x')
plt.title('Quadratic Fit')


Out[60]:
<matplotlib.text.Text at 0x7f22863881d0>

In [ ]:
assert True # leave this cell for grading the fit; should include a plot and printout of the parameters+errors