Interpolation

This lesson is based on materials developed by Matt Moelter and Jodi Christiansen for PHYS 202 at Cal Poly.


Instructions: Create a new directory called Interpolation with a notebook called InterpolationTour. Give it a heading 1 cell title Interpolation. Read this page, typing in the code in the code cells and executing them as you go.

Do not copy/paste.

Type the commands yourself to get the practice doing it. This will also slow you down so you can think about the commands and what they are doing as you type them.</font>

Save your notebook when you are done, then try the accompanying exercises.



In [ ]:
%pylab inline
import numpy as np
import matplotlib.pyplot as plt

Introduction

In this tour you will explore the concept of interpolation and learn how to use SciPy's built-in interpolation routines to approximate functions. A common use for interpolation is to smooth out a dataset that has relatively few datapoints.

Interpolation is a method of curve fitting a set of data to obtain values at unknown locations between two known points. There are different methods for interpolation, including the simplest - linear - which assumes that the function that connects the two known points is a straight line. Any values of the function for points between can be calculated from the interpolating function. Linear interpolation doesn't always give the best results, so higher order polynomial fits can be employed. Cubic is a common choice.

For more on this topic, read up on it via wikipedia.

Before we use interpolation, let's review using boolean masks. We'll set up an example for finding data with a mask and then see how interpolation can help us when the data are sparse.

Finding values with Boolean Masks

Sometimes we want to find a subset of the data points we’ve computed. Here’s an example. Lets look at the function $\sin(\theta)$ for $0\lt \theta\lt2\pi$.


In [ ]:
theta = np.arange(0.,2*np.pi,np.pi/100)
plt.plot(theta, np.sin(theta))
plt.show()

Now suppose that we want to plot just the data with $0\lt \sin{(\theta)}\lt 0.5$. We’re going to have to "find” the data with $0\lt \sin{(\theta)}$ and $\sin{(\theta)}\lt 0.5$. We can use a mask to accomplish this. Set the mask to be the subset of the array that satisfies the criteria:


In [ ]:
ii = (np.sin(theta)>0) & (np.sin(theta)<0.5)
plt.plot(theta,np.sin(theta),theta[ii],np.sin(theta[ii]),"r-")
plt.show()

Oops, this has a horizontal line between the two sections of the sin function that pass our criteria. ii is now just a boolean mask where theta meets the logical criteria. Print the mask to see what it contains. It should be True in the array element locations passing our criteria and False everywhere else.

Referring to theta[ii] will mean that we get just those elements of theta where the logical criteria was satisfied (where ii is True).

Now we can get rid of the ugly line. plot() has decided to connect the 2 parts of the selected values with a straight line. To get two separate line segments we will need to set one value along the ugly line to NaN (“Not a Number”). For this case we will find the offending element where “y equals max(y)” and then set that element to NaN.


In [ ]:
y = np.sin(theta[ii])
jj = (y == max(y))
y[jj] = NaN
plt.plot(theta, np.sin(theta))
plt.plot(theta[ii],y,linewidth=4)
plt.show()

Interpolation

What if our range is so small, that our boolean mask doesn’t find anything?

Create a new theta array that doesn't have a value of 0.5. When you try to find it with the mask, all entries in the mask will be False:


In [ ]:
theta = np.arange(0.,2*np.pi+0.1,np.pi/10)
ii = (theta == 0.5)
print ii

How can we numerically find a new value that is between the existing values? We’ll use interpolation.

We have to import the interpolation routines from scipy - the scientific python computing package.


In [ ]:
from scipy.interpolate import interp1d 
#that's a "one" as in "one-dimensional", not an "ell"

#Create the "interpolating function":
sinInterp = interp1d(theta, sin(theta))

#Set the x value where we want to know the function and 
#evaluate the interpolating function at that point:
newTheta = 0.5
newSin = sinInterp(newTheta)

#Alternatively, here's a way to do it in one line:
#newSin = interp1d(theta, np.sin(theta))(newTheta)

#Now plot the interpolated point as a big plus sign:
plt.plot(theta, np.sin(theta),newTheta, newSin, 'r+',markersize=20,markeredgewidth=3)
plt.show()

In [ ]:
#Learn more about interp1d:
help(interp1d)

We could also find a whole bunch of new values if we use an array:


In [ ]:
thetaNew = np.arange(0,2*np.pi+0.1,np.pi/30)
sinNew = interp1d(theta, np.sin(theta))(thetaNew)
plt.plot(theta, np.sin(theta), thetaNew, sinNew, 'r.')
plt.show()

Let’s look at how accurate our interpolated values are. For this we will compare the interpolated value to the exact value:


In [ ]:
#sinNew is the interpolated value, np.sin(thetaNew) is the exact value
plt.plot(thetaNew, sinNew - np.sin(thetaNew))
plt.ylabel('difference between interp and exact value')
plt.xlabel(r'$\theta$ (rad)',fontsize=20)
plt.show()

Good to about 1/100. But it’s easy to do better by using the curvature of the function instead of just straight line estimates between points. We’ll ask for cubic interpolation. Review the documentation to learn more about how this works.


In [ ]:
#cubic interpolation
sinNewC = interp1d(theta, np.sin(theta), kind='cubic')(thetaNew)

#Plot them both together
plt.subplot(2,1,1)
plt.plot(thetaNew, sinNew - np.sin(thetaNew),thetaNew, sinNewC - np.sin(thetaNew), 'r')

#Can't really see the variation in the red curve.  Let's zoom in:
plt.subplot(2,1,2)
plt.plot(thetaNew, sinNewC - np.sin(thetaNew),'r')

plt.show()

Yowza, that is quite a lot better.

Another example


In [ ]:
x = np.arange(0, 10) 
y = np.array([3.0, -4.0, -2.0, -1.0, 3.0, 6.0, 10.0, 8.0, 12.0, 20.0]) 
f = interp1d(x, y, kind = 'cubic') 
 
xint = 3.5 #location to evaluate
yint = f(xint) #interpolated point at that location
 
plt.plot(x, y, 'o', c = 'b') 
plt.plot(xint, yint, 's', c = 'r') 
 
plt.show()

In [ ]:
x = np.arange(0, 10) 
y = np.array([3.0, -4.0, -2.0, -1.0, 3.0, 6.0, 10.0, 8.0, 12.0, 20.0]) 
flin = interp1d(x, y) #linear is the default
fcub = interp1d(x, y, kind = 'cubic')

#Create a full interpolated curve
xint = np.arange(0, 9.01, 0.01) 
ylin = flin(xint) 
ycub = fcub(xint)

plt.subplot(211)
plt.plot(x, y, 'o', c = 'b') 
plt.plot(xint, ycub, '-r',label="cubic") 
plt.legend(loc="best")
plt.subplot(212)
plt.plot(x, y, 'o', c = 'b') 
plt.plot(xint, ylin, '-g',label="linear") 
plt.legend(loc="best")

plt.show()

Higher dimensions

Boolean masks work in 2D too. Recall the point potential function from the Graphics tour. Let's put it into a function that we can re-use. We'll save it to a file called Electrostatics.py and import it as a module:


In [ ]:
%%file Electrostatics.py

import numpy as np

def point_potential(x,y,q,Xc,Yc):
    """
    Return the electric potential at (x,y)
    for a point charge q at (Xc,Yc)
    
    Units returned are [Volts] if input 
    units are [meters] and [Coulombs]                                   
    """
    k = 8.987551787997912e9 #(Nm^2/C^2)                                                                        
    Vxy = k*q/np.sqrt(((x-Xc)**2 + (y-Yc)**2))
    return Vxy

def dipole_potential(x,y,q,d):
    """
    Return the electric potential at (x,y) for a 
    pair of point charges +/- q                                         
    separated by distance d along the x-axis 
    with their midpoint at (0,0).                                  
                                                                                                               
    Units returned are [Volts] if input
    units are [meters] and [Coulombs]                                   
    """
    Vxy=point_potential(x,y,q,-d/2.,0.) + point_potential(x,y,-q,+d/2.,0.)
    return Vxy

In [ ]:
from Electrostatics import *

In [ ]:
#Check out the help we created with those docstrings:
help(point_potential)
help(dipole_potential)

In [ ]:
#Create a couple of point potentials
x = np.arange(-5.,5.01,0.51)
y = np.arange(-5.,5.01,0.51)
xg,yg = np.meshgrid(x,y)
V1 = point_potential(xg,yg,1e-9,2.01,-2.01);
V2 = point_potential(xg,yg,-1e-9,-2.01,2.01);

In [ ]:
#Now plot in 3D
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_surface(xg,yg,V1+V2,rstride = 1, cstride = 1,cmap=cm.coolwarm);

To find where V1+V2 = 0, we can find where (V1 == -V2):


In [ ]:
#Make the mask
ii = (V1 == -V2)

#Plot the x,y values using the mask:
plt.plot(xg[ii], yg[ii])  
plt.show()

These are the $(x,y)$ values where the potentials exactly cancel.

Of course, we could also make a contour plot and see the contour at V1+V2 = 0, but this method with the mask gives us access to the array elements for use in further calculations.

Interp2d

2D interpolation is also similar to the 1D analog. A simple way to increase the sampling is to make a new mesh of points and then use interp2d to get the 2D interpolation:


In [ ]:
from scipy.interpolate import interp2d
xnew = np.arange(-5.,5.01,0.1)
ynew = np.arange(-5.,5.01,0.1)
xng,yng = np.meshgrid(xnew,ynew)
#V1+V2 is the old grid using x,y
Vnew = interp2d(x,y,V1+V2,kind='linear')(xnew, ynew)

fig = figure()
ax = fig.gca(projection='3d')
ax.plot_surface(xng,yng,Vnew,cmap=cm.coolwarm);

You’ll notice that this procedure doesn’t reproduce peaks very well, but in between the peaks it's not too bad. Let’s check the accuracy of the calculation.


In [ ]:
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_surface(xng,yng,Vnew-point_potential(xng,yng,1e-9,2.01,-2.01) \
                            -point_potential(xng,yng,-1e-9,-2.01,2.01));

It is pretty flat except right near the poles.


All content is under a modified MIT License, and can be freely used and adapted. See the full license text here.