Linear Regression, Optimization and Curve Fitting Exercises

Some of the data and examples are from the SciPy Lecture Notes and materials developed by Matt Moelter and Jodi Christiansen for PHYS 202 at Cal Poly.

Instructions: Create a new notebook called OptimizationExercises in your Optimization directory and solve the following problems inside it. Be sure to include the problem statements in a markdown cell above your solution. You don't need to put the "helper" code in the markdown cell, just implement the helper code in your code cell with your solution.

Preliminaries: At the top of your notebook, include a "Heading 1" cell with the title Linear Regression, Optimization and Curve Fitting Exercises. Then include the inline functions and libraries by adding a code cell that invokes the %pylab inline magic and imports the needed packages.

Question 1

(a) Write a function that can compute the weighted linear least squares (WLSQ) best fit line to a set of data. Here is the function template to use:

def weighted_llsq_fit(x,y,w):
    """Take in arrays representing (x,y) values for a set of linearly varying data and an array of weights w.
    Perform a weighted linear least squares regression.  Return the resulting slope and intercept
    parameters of the best fit line with their uncertainties.

    If the weights are all equal to one, the uncertainties on the parameters are calculated using the 
    non-weighted least squares equations."""

    #your code goes here...

    return slope,slerr,intercept,interr

Your function should test whether there are unequal weights on the data points and return the correctly calculated uncertainties for whichever case is requested.

Use the features of numpy arrays rather than loops to take advantage of their superior computational speed.



In [1]:

    
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell

(b) For the data below, use the uncertainties on the position values as weights to perform a WLSQ best fit. Print the results of the fit, both the fitted parameter values and their uncertainties.

time = np.array([1.,2.,3.,4.,5.,6.,7.,8.,9.,10.])
   pos = np.array([7.75, 7.33, 6.89, 6.45, 5.96, 5.55, 5.10, 4.49, 3.93, 3.58])
   sigp = np.array([0.02, 0.03, 0.03, 0.04, 0.05, 0.06, 0.08, 0.11, 0.14, 0.17])

If you did it right you should get $m$ = -0.4510 $\pm$ 0.0066 and $b$ = 8.2201 $\pm$ 0.0214.



In [2]:

    
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell

(c) Now make an errorbar plot that shows the (time,position) data with markers and error bars $\sigma_p$. Also plot lines representing the results from the LSQ and WSLQ fits to these data. Include a legend that shows the fitted line equations with the fit parameters as you did in the tour, with labels indicating which one is the weighted result and which the unweighted result.



In [3]:

    
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell

Question 2

(a) Download the linked file waveform_2.npy by right-clicking on the link and selecting "Save Link As". Read in the data and plot it.



In [4]:

    
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell

(b) This data is more complicated than that in waveform_1.npy from the tour, but not considerably so. It looks like it could be modeled as the sum of a noise floor and three Gaussian peaks with different amplitude, mean, and sigma. Write a function to model this data.



In [5]:

    
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell

(c) By inspecting the graph, come up with a set of initial guesses for the parameters of the fit and fit the data using curve_fit. Print the resulting parameters and their uncertainties.



In [6]:

    
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell

(d) Plot the data and the fit to see how well the model matches the data.



In [6]:

    
#Copy the exercise statement to a markdown cell in your notebook and then implement a solution in a code cell

All content is under a modified MIT License, and can be freely used and adapted. See the full license text here.