This lesson is based on materials developed by Matt Moelter and Jodi Christiansen for PHYS 202 at Cal Poly.
Instructions: Create a new directory called Interpolation
with a notebook called InterpolationTour
. Give it a heading 1 cell title Interpolation. Read this page, typing in the code in the code cells and executing them as you go.
Do not copy/paste.
Type the commands yourself to get the practice doing it. This will also slow you down so you can think about the commands and what they are doing as you type them.</font>
Save your notebook when you are done, then try the accompanying exercises.
In [ ]:
%pylab inline
import numpy as np
import matplotlib.pyplot as plt
In this tour you will explore the concept of interpolation and learn how to use SciPy's built-in interpolation routines to approximate functions. A common use for interpolation is to smooth out a dataset that has relatively few datapoints.
Interpolation is a method of curve fitting a set of data to obtain values at unknown locations between two known points. There are different methods for interpolation, including the simplest - linear - which assumes that the function that connects the two known points is a straight line. Any values of the function for points between can be calculated from the interpolating function. Linear interpolation doesn't always give the best results, so higher order polynomial fits can be employed. Cubic is a common choice.
For more on this topic, read up on it via wikipedia.
Before we use interpolation, let's review using boolean masks. We'll set up an example for finding data with a mask and then see how interpolation can help us when the data are sparse.
Sometimes we want to find a subset of the data points we’ve computed. Here’s an example. Lets look at the function $\sin(\theta)$ for $0\lt \theta\lt2\pi$.
In [ ]:
theta = np.arange(0.,2*np.pi,np.pi/100)
plt.plot(theta, np.sin(theta))
plt.show()
Now suppose that we want to plot just the data with $0\lt \sin{(\theta)}\lt 0.5$. We’re going to have to "find” the data with $0\lt \sin{(\theta)}$ and $\sin{(\theta)}\lt 0.5$. We can use a mask to accomplish this. Set the mask to be the subset of the array that satisfies the criteria:
In [ ]:
ii = (np.sin(theta)>0) & (np.sin(theta)<0.5)
plt.plot(theta,np.sin(theta),theta[ii],np.sin(theta[ii]),"r-")
plt.show()
Oops, this has a horizontal line between the two sections of the sin
function that pass our criteria. ii
is now just a boolean mask where theta
meets the logical criteria. Print the mask to see what it contains. It should be True
in the array element locations passing our criteria and False
everywhere else.
Referring to theta[ii]
will mean that we get just those elements of theta
where the logical criteria was satisfied (where ii
is True
).
Now we can get rid of the ugly line. plot()
has decided to connect the 2 parts of the selected values with a straight line. To get two separate line segments we will need to set one value along the ugly line to NaN
(“Not a Number”). For this case we will find the offending element where “y equals max(y)
” and then set that element to NaN
.
In [ ]:
y = np.sin(theta[ii])
jj = (y == max(y))
y[jj] = NaN
plt.plot(theta, np.sin(theta))
plt.plot(theta[ii],y,linewidth=4)
plt.show()
What if our range is so small, that our boolean mask doesn’t find anything?
Create a new theta
array that doesn't have a value of 0.5. When you try to find it with the mask, all entries in the mask will be False
:
In [ ]:
theta = np.arange(0.,2*np.pi+0.1,np.pi/10)
ii = (theta == 0.5)
print ii
How can we numerically find a new value that is between the existing values? We’ll use interpolation.
We have to import the interpolation routines from scipy
- the scientific python computing package.
In [ ]:
from scipy.interpolate import interp1d
#that's a "one" as in "one-dimensional", not an "ell"
#Create the "interpolating function":
sinInterp = interp1d(theta, sin(theta))
#Set the x value where we want to know the function and
#evaluate the interpolating function at that point:
newTheta = 0.5
newSin = sinInterp(newTheta)
#Alternatively, here's a way to do it in one line:
#newSin = interp1d(theta, np.sin(theta))(newTheta)
#Now plot the interpolated point as a big plus sign:
plt.plot(theta, np.sin(theta),newTheta, newSin, 'r+',markersize=20,markeredgewidth=3)
plt.show()
In [ ]:
#Learn more about interp1d:
help(interp1d)
We could also find a whole bunch of new values if we use an array:
In [ ]:
thetaNew = np.arange(0,2*np.pi+0.1,np.pi/30)
sinNew = interp1d(theta, np.sin(theta))(thetaNew)
plt.plot(theta, np.sin(theta), thetaNew, sinNew, 'r.')
plt.show()
Let’s look at how accurate our interpolated values are. For this we will compare the interpolated value to the exact value:
In [ ]:
#sinNew is the interpolated value, np.sin(thetaNew) is the exact value
plt.plot(thetaNew, sinNew - np.sin(thetaNew))
plt.ylabel('difference between interp and exact value')
plt.xlabel(r'$\theta$ (rad)',fontsize=20)
plt.show()
Good to about 1/100. But it’s easy to do better by using the curvature of the function instead of just straight line estimates between points. We’ll ask for cubic
interpolation. Review the documentation to learn more about how this works.
In [ ]:
#cubic interpolation
sinNewC = interp1d(theta, np.sin(theta), kind='cubic')(thetaNew)
#Plot them both together
plt.subplot(2,1,1)
plt.plot(thetaNew, sinNew - np.sin(thetaNew),thetaNew, sinNewC - np.sin(thetaNew), 'r')
#Can't really see the variation in the red curve. Let's zoom in:
plt.subplot(2,1,2)
plt.plot(thetaNew, sinNewC - np.sin(thetaNew),'r')
plt.show()
Yowza, that is quite a lot better.
In [ ]:
x = np.arange(0, 10)
y = np.array([3.0, -4.0, -2.0, -1.0, 3.0, 6.0, 10.0, 8.0, 12.0, 20.0])
f = interp1d(x, y, kind = 'cubic')
xint = 3.5 #location to evaluate
yint = f(xint) #interpolated point at that location
plt.plot(x, y, 'o', c = 'b')
plt.plot(xint, yint, 's', c = 'r')
plt.show()
In [ ]:
x = np.arange(0, 10)
y = np.array([3.0, -4.0, -2.0, -1.0, 3.0, 6.0, 10.0, 8.0, 12.0, 20.0])
flin = interp1d(x, y) #linear is the default
fcub = interp1d(x, y, kind = 'cubic')
#Create a full interpolated curve
xint = np.arange(0, 9.01, 0.01)
ylin = flin(xint)
ycub = fcub(xint)
plt.subplot(211)
plt.plot(x, y, 'o', c = 'b')
plt.plot(xint, ycub, '-r',label="cubic")
plt.legend(loc="best")
plt.subplot(212)
plt.plot(x, y, 'o', c = 'b')
plt.plot(xint, ylin, '-g',label="linear")
plt.legend(loc="best")
plt.show()
Boolean masks work in 2D too. Recall the point potential function from the Graphics tour. Let's put it into a function that we can re-use. We'll save it to a file called Electrostatics.py
and import it as a module:
In [ ]:
%%file Electrostatics.py
import numpy as np
def point_potential(x,y,q,Xc,Yc):
"""
Return the electric potential at (x,y)
for a point charge q at (Xc,Yc)
Units returned are [Volts] if input
units are [meters] and [Coulombs]
"""
k = 8.987551787997912e9 #(Nm^2/C^2)
Vxy = k*q/np.sqrt(((x-Xc)**2 + (y-Yc)**2))
return Vxy
def dipole_potential(x,y,q,d):
"""
Return the electric potential at (x,y) for a
pair of point charges +/- q
separated by distance d along the x-axis
with their midpoint at (0,0).
Units returned are [Volts] if input
units are [meters] and [Coulombs]
"""
Vxy=point_potential(x,y,q,-d/2.,0.) + point_potential(x,y,-q,+d/2.,0.)
return Vxy
In [ ]:
from Electrostatics import *
In [ ]:
#Check out the help we created with those docstrings:
help(point_potential)
help(dipole_potential)
In [ ]:
#Create a couple of point potentials
x = np.arange(-5.,5.01,0.51)
y = np.arange(-5.,5.01,0.51)
xg,yg = np.meshgrid(x,y)
V1 = point_potential(xg,yg,1e-9,2.01,-2.01);
V2 = point_potential(xg,yg,-1e-9,-2.01,2.01);
In [ ]:
#Now plot in 3D
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_surface(xg,yg,V1+V2,rstride = 1, cstride = 1,cmap=cm.coolwarm);
To find where V1+V2 = 0
, we can find where (V1 == -V2)
:
In [ ]:
#Make the mask
ii = (V1 == -V2)
#Plot the x,y values using the mask:
plt.plot(xg[ii], yg[ii])
plt.show()
These are the $(x,y)$ values where the potentials exactly cancel.
Of course, we could also make a contour plot and see the contour at V1+V2 = 0
, but this method with the mask gives us access to the array elements for use in further calculations.
2D interpolation is also similar to the 1D analog. A simple way to increase the sampling is to make a new mesh of points and then use interp2d
to get the 2D interpolation:
In [ ]:
from scipy.interpolate import interp2d
xnew = np.arange(-5.,5.01,0.1)
ynew = np.arange(-5.,5.01,0.1)
xng,yng = np.meshgrid(xnew,ynew)
#V1+V2 is the old grid using x,y
Vnew = interp2d(x,y,V1+V2,kind='linear')(xnew, ynew)
fig = figure()
ax = fig.gca(projection='3d')
ax.plot_surface(xng,yng,Vnew,cmap=cm.coolwarm);
You’ll notice that this procedure doesn’t reproduce peaks very well, but in between the peaks it's not too bad. Let’s check the accuracy of the calculation.
In [ ]:
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_surface(xng,yng,Vnew-point_potential(xng,yng,1e-9,2.01,-2.01) \
-point_potential(xng,yng,-1e-9,-2.01,2.01));
It is pretty flat except right near the poles.