Very often we have measured a series of physical quantities $y_{i}$ that have a dependendence on some variables $x_{i}$ and want to know the expected values of $y$ for values of $x$ that we didn't measure.
If the values of $x$ are inside the range of $x_{i}$ we call this procedure interpolation. If they are beyond we call it extrapolation.
The process of interpolation is usually done in two steps:
A proper interpolation procedure in physics should actually create a model that provides the analytical function over which we expect that all the points will lie, including the desired points we haven't measured. However, such physical insight is not always available. That's why we will define a mathematical framework for the interpolation, forgetting (gasp!) that we want to apply this techniques in a physical setting.
In [2]:
from scipy.interpolate import interp1d
%pylab inline
In [3]:
#generate data for a known funcion
x = linspace(0.0,10.0,21)
y = sin(x**1.5 + 2.0)
x_true = linspace(0.0,10.0,1000)
y_true = sin(x_true**1.5 + 2.0)
In [4]:
#first visualization
plot(x_true, y_true)
scatter(x,y)
legend(['true','data'])
#make a nice plot
ax = axes()
ax.set_xlabel("$x$",fontsize=20)
ax.set_ylabel("$f(x)$",fontsize=20)
ax.set_title("$\mathrm{data\ and\ function}$", fontsize=20)
filename = 'figure1'
savefig(filename + '.pdf',format = 'pdf', transparent=True)
In the linear interpolation a line is drawn between two adjacednt points $(x_0, y_0)$ and $(x_1,y_1)$.
In that case the interpolating function can be defined from
$$ \frac{y-y_0}{x-x_0} = \frac{y_1-y_0}{x_1-x_0} $$solving for the value of $y$
$$ y = y_0 + (x-x_0)\frac{y_1-y_0}{x_1-x_0} $$Generalizing this logic, one could suggest that for $N$ points in the the plane $(x_i,y_i)$ there will be a unique polinomial of degree less $N-1$ that passes through all the points, $P(x_i)=y_i$, $i=1,\ldots,N$.
The polynomial $P$ can be written as:
$$ P(x) = \sum_{i=1}^N\left(\prod_{j=1,j\neq i}^N \frac{x - x_{j}}{x_{i}-x_{j}}\right) y_{i} $$or in a equivalent way
$$ P(x) = \sum_{i=1}^N P_{i} (x) $$where $$ P_i (x) = y_i \prod_{j=1, j\neq i}^{N} \frac{x-x_{j}}{x_i - x_j} $$
Which is known as the lagrangian form of the polynomial.
In the case of $N=2$
$$ y = y_0 \frac{x-x_1}{x_0-x_1} + y_1\frac{x-x_0}{x_1-x_0} $$
In [5]:
x_in = array([0,1])
y_in = array([1,4])
def poli(x, x_in, y_in):
value = y_in[0]*((x-x_in[1])/(x_in[0]-x_in[1])) + y_in[1]*((x-x_in[0])/(x_in[1]-x_in[0]))
return value
In [6]:
x_l = linspace(0.0,3.0,100)
y_l = poli(x_l, x_in, y_in)
plot(x_l,y_l)
scatter(x_in, y_in)
Out[6]:
In the case of $N=3$
In [7]:
x_in = array([0,1,5])
y_in = array([1,4,-2])
def poli(x, x_in, y_in):
p0 = y_in[0]*((x-x_in[1])/(x_in[0]-x_in[1]))*((x-x_in[2])/(x_in[0]-x_in[2]))
p1 = y_in[1]*((x-x_in[0])/(x_in[1]-x_in[0]))*((x-x_in[2])/(x_in[1]-x_in[2]))
p2 = y_in[2]*((x-x_in[0])/(x_in[2]-x_in[0]))*((x-x_in[1])/(x_in[2]-x_in[1]))
value = p0 + p1 + p2
return value
In [9]:
x_l = linspace(0.0,5.0,100)
y_l = poli(x_l, x_in, y_in)
print len(y_l)
plot(x_l,y_l)
scatter(x_in, y_in)
Out[9]:
In [8]:
#defines a linear interpolation
f = interp1d(x, y)
In [9]:
#other kinds of interpolation are 'linear','nearest', 'zero', 'slinear', 'quadratic'
scatter(x,y)
plot(x_true, y_true)
plot(x_true, f(x_true))
legend(['true', 'linear', 'data'])
Out[9]:
The problem with the linear interpolations is that the derivatives clearly jump from one side to the other of the data points. In the case of using all the points to define a polynomial the problem is that the values of the function can be too off from the true function.
A way to solve this problem is the method of splines. A spline is a piecewise polynomial, each piece joining to points. Depending on the kind of spline one is doing, one is looking for different characteristics for this set of polinomial.
For instance, in the case of a cubic spline one is looking for segments of polynomials that will be smooth in the first derivative and continuous in the second derivative.
A Practical Guide to Splines (revised edition) by Carl de Boor is one of the best references if you really want to dig into the subject.
Wikipedia gives us a nice example of a spline in three pieces that has a bell shape.
In [10]:
x1 = linspace(-2.0, -1.0,100)
x2 = linspace(-1.0,1.0,100)
x3 = linspace(1.0,2.0,100)
y1 = 0.25*(x1+ 2.0)**3.0
y2 = 0.25*(3*abs(x2)**3.0 -6.0*x2**2 + 4.0)
y3 = 0.25*(2.0-x3)**3.0
plot(x1,y1, color='red')
plot(x2,y2, color='blue')
plot(x3,y3, color='green')
Out[10]:
In [11]:
# this is the result of a cubic spline
f2 = interp1d(x, y, kind='cubic')
scatter(x,y)
plot(x_true, y_true)
plot(x_true, f2(x_true))
legend(['true', 'cubic', 'data'])
Out[11]:
In [12]:
# I create a 2D function
def my_func(x, y):
return exp(-((x-0.5)**2)/0.1) * sin(20.0*(y**2))
In [13]:
#that I only now on 500 points
points = np.random.rand(500, 2)
values = my_func(points[:,0], points[:,1])
In [14]:
#here i define a dense grid in x-y with ranges [0,1], [0,1] to see the complete funcion
grid_x, grid_y = mgrid[0:1:200j, 0:1:200j]
In [15]:
imshow(my_func(grid_x, grid_y).T, extent=(0,1,0,1), origin='lower')
plot(points[:,0], points[:,1], 'k.', ms=2, alpha = 0.8)
ax = axes()
ax.set_xlabel("$x$",fontsize=20)
ax.set_ylabel("$y$",fontsize=20)
ax.set_title("$\mathrm{data\ and\ function}$", fontsize=20)
Out[15]:
In [16]:
from scipy.interpolate import griddata
grid_zA = griddata(points, values, (grid_x, grid_y), method='nearest')
grid_zB = griddata(points, values, (grid_x, grid_y), method='linear')
grid_zC = griddata(points, values, (grid_x, grid_y), method='cubic')
In [17]:
imshow(grid_zA.T, extent=(0,1,0,1), origin='lower')
plot(points[:,0], points[:,1], 'k.', ms=2, alpha = 0.8)
ax = axes()
ax.set_xlabel("$x$",fontsize=20)
ax.set_ylabel("$y$",fontsize=20)
ax.set_title("$\mathrm{data\ and\ nearest\ interpolation}$", fontsize=20)
Out[17]:
In [18]:
imshow(grid_zB.T, extent=(0,1,0,1), origin='lower')
plot(points[:,0], points[:,1], 'k.', ms=2, alpha = 0.8)
ax = axes()
ax.set_xlabel("$x$",fontsize=20)
ax.set_ylabel("$y$",fontsize=20)
ax.set_title("$\mathrm{data\ and\ linear\ interpolation}$", fontsize=20)
Out[18]:
In [19]:
imshow(grid_zC.T, extent=(0,1,0,1), origin='lower')
plot(points[:,0], points[:,1], 'k.', ms=2, alpha = 0.8)
ax = axes()
ax.set_xlabel("$x$",fontsize=20)
ax.set_ylabel("$y$",fontsize=20)
ax.set_title("$\mathrm{data\ and\ cubic\ interpolation}$", fontsize=20)
Out[19]:
This brief outline is based on the the scipy documentation webpage: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html
Write a program in Python that given a set of N points in $x$ and $y$ returns the interpolating polinomial. Test it with the following arrays
In [20]:
x_in = array([1.0,2.0,4.0,5.0,6.0,7.0,8.0])
y_in = array([3.0,-2.0,2.0,4.0,1.0,2.0,3.0])
In [21]:
scatter(x_in, y_in)
Out[21]:
Find the interpolating polinomials for the three set of points taken from the following function
In [37]:
def p(x):
return 1.0/(1.0+25.0*x**2)
In [38]:
a = linspace(-1.0,1.0,3)
print p(a)
b = linspace(-1.0,1.0,5)
print p(b)
c = linspace(-1.0,1.0,9)
print p(c)
In [49]:
scatter(a,p(a), color="black", alpha=1.0, s=30)
scatter(b,p(b), color="red", alpha=0.7,s=120)
scatter(c,p(c), color="green", alpha=0.4,s=500)
Out[49]: