**Learning Objective:** Learn to interpolate 1d and 2d datasets of structured and unstructured points using SciPy.

```
In [1]:
```%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

We have already seen how to evaluate a Python function at a set of numerical points:

$$ f(x) \rightarrow f_i = f(x_i) $$Here is an array of points:

```
In [2]:
```x = np.linspace(0,4*np.pi,10)
x

```
Out[2]:
```

This creates a new array of points that are the values of $\sin(x_i)$ at each point $x_i$:

```
In [3]:
```f = np.sin(x)
f

```
Out[3]:
```

```
In [4]:
```plt.plot(x, f, marker='o')
plt.xlabel('x')
plt.ylabel('f(x)');

```
```

This plot shows that the points in this numerical array are an approximation to the actual function as they don't have the function's value at all possible points. In this case we know the actual function ($\sin(x)$). What if we only know the value of the function at a limited set of points, and don't know the analytical form of the function itself? This is common when the data points come from a set of measurements.

Interpolation is a numerical technique that enables you to construct an approximation of the actual function from a set of points:

$$ \{x_i,f_i\} \rightarrow f(x) $$It is important to note that unlike curve fitting or regression, interpolation doesn't not allow you to incorporate a *statistical model* into the approximation. Because of this, interpolation has limitations:

- It cannot accurately construct the function's approximation outside the limits of the original points.
- It cannot tell you the analytical form of the underlying function.

Once you have performed interpolation you can:

- Evaluate the function at other points not in the original dataset.
- Use the function in other calculations that require an actual function.
- Compute numerical derivatives or integrals.
- Plot the approximate function on a finer grid that the original dataset.

**Warning:**

The different functions in SciPy work with a range of different 1d and 2d arrays. To help you keep all of that straight, I will use lowercase variables for 1d arrays (`x`

, `y`

) and uppercase variables (`X`

,`Y`

) for 2d arrays.

`interp1d`

:

```
In [7]:
```from scipy.interpolate import interp1d

Let's create the numerical data we will use to build our interpolation.

```
In [8]:
```x = np.linspace(0,4*np.pi,10) # only use 10 points to emphasize this is an approx
f = np.sin(x)

To create our approximate function, we call `interp1d`

as follows, with the numerical data. Options for the `kind`

argument includes:

`linear`

: draw a straight line between initial points.`nearest`

: return the value of the function of the nearest point.`slinear`

,`quadratic`

,`cubic`

: use a spline (particular kinds of piecewise polynomial of a given order.

The most common case you will want to use is `cubic`

spline (try other options):

```
In [9]:
```sin_approx = interp1d(x, f, kind='cubic')

`sin_approx`

variabl that `interp1d`

returns is a callable object that can be used to compute the approximate function at other points. Compute the approximate function on a fine grid:

```
In [10]:
```newx = np.linspace(0,4*np.pi,100)
newf = sin_approx(newx)

```
In [12]:
```plt.plot(x, f, marker='o', linestyle='', label='original data')
plt.plot(newx, newf, marker='.', label='interpolated');
plt.legend();
plt.xlabel('x')
plt.ylabel('f(x)');

```
```

```
In [13]:
```plt.plot(newx, np.abs(np.sin(newx)-sin_approx(newx)))
plt.xlabel('x')
plt.ylabel('Absolute error');

```
```

`interp1d`

when the x data is not regularly spaced. To show this, let's repeat the above analysis with randomly distributed data in the range $[0,4\pi]$. Everything else is the same.

```
In [14]:
```x = 4*np.pi*np.random.rand(15)
f = np.sin(x)

```
In [15]:
```sin_approx = interp1d(x, f, kind='cubic')

```
In [16]:
```# We have to be careful about not interpolating outside the range
newx = np.linspace(np.min(x), np.max(x),100)
newf = sin_approx(newx)

```
In [17]:
```plt.plot(x, f, marker='o', linestyle='', label='original data')
plt.plot(newx, newf, marker='.', label='interpolated');
plt.legend();
plt.xlabel('x')
plt.ylabel('f(x)');

```
```

```
In [18]:
```plt.plot(newx, np.abs(np.sin(newx)-sin_approx(newx)))
plt.xlabel('x')
plt.ylabel('Absolute error');

```
```

Notice how the absolute error is larger in the intervals where there are no points.

For the 2d case we want to construct a scalar function of two variables, given

$$ {x_i, y_i, f_i} \rightarrow f(x,y) $$For now, we will assume that the points $\{x_i,y_i\}$ are on a structured grid of points. This case is covered by the `interp2d`

function:

```
In [19]:
```from scipy.interpolate import interp2d

Here is the actual function we will use the generate our original dataset:

```
In [20]:
```def wave2d(x, y):
return np.sin(2*np.pi*x)*np.sin(3*np.pi*y)

Build 1d arrays to use as the structured grid:

```
In [21]:
```x = np.linspace(0.0, 1.0, 10)
y = np.linspace(0.0, 1.0, 10)

Build 2d arrays to use in computing the function on the grid points:

```
In [22]:
```X, Y = np.meshgrid(x, y)
Z = wave2d(X, Y)

Here is a scatter plot of the points overlayed with the value of the function at those points:

```
In [23]:
```plt.pcolor(X, Y, Z)
plt.colorbar();
plt.scatter(X, Y);
plt.xlim(0,1)
plt.ylim(0,1)
plt.xlabel('x')
plt.ylabel('y');

```
```

You can see in this plot that the function is not smooth as we don't have its value on a fine grid.

Now let's compute the interpolated function using `interp2d`

. Notice how we are passing 2d arrays to this function:

```
In [24]:
```wave2d_approx = interp2d(X, Y, Z, kind='cubic')

Compute the interpolated function on a fine grid:

```
In [27]:
```xnew = np.linspace(0.0, 1.0, 40)
ynew = np.linspace(0.0, 1.0, 40)
Xnew, Ynew = np.meshgrid(xnew, ynew) # We will use these in the scatter plot below
Fnew = wave2d_approx(xnew, ynew) # The interpolating function automatically creates the meshgrid!

```
In [28]:
```Fnew.shape

```
Out[28]:
```

Plot the original course grid of points, along with the interpolated function values on a fine grid:

```
In [29]:
```plt.pcolor(xnew, ynew, Fnew);
plt.colorbar();
plt.scatter(X, Y, label='original points')
plt.scatter(Xnew, Ynew, marker='.', color='green', label='interpolated points')
plt.xlim(0,1)
plt.ylim(0,1)
plt.xlabel('x')
plt.ylabel('y');
plt.legend(bbox_to_anchor=(1.2, 1), loc=2, borderaxespad=0.);

```
```

`griddata`

function:

```
In [30]:
```from scipy.interpolate import griddata

There is an important difference between `griddata`

and the `interp1d`

/`interp2d`

:

`interp1d`

and`interp2d`

return callable Python objects (functions).`griddata`

returns the interpolated function evaluated on a finer grid.

This means that you have to pass `griddata`

an array that has the finer grid points to be used. Here is the course unstructured grid we will use:

```
In [41]:
```x = np.random.rand(100)
y = np.random.rand(100)

Notice how we pass these 1d arrays to our function and don't use `meshgrid`

:

```
In [42]:
```f = wave2d(x, y)

It is clear that our grid is very unstructured:

```
In [43]:
```plt.scatter(x, y);
plt.xlim(0,1)
plt.ylim(0,1)
plt.xlabel('x')
plt.ylabel('y');

```
```

`griddata`

we need to compute the final (strcutured) grid we want to compute the interpolated function on:

```
In [44]:
```xnew = np.linspace(x.min(), x.max(), 40)
ynew = np.linspace(y.min(), y.max(), 40)
Xnew, Ynew = np.meshgrid(xnew, ynew)

```
In [45]:
```Xnew.shape, Ynew.shape

```
Out[45]:
```

```
In [46]:
```Fnew = griddata((x,y), f, (Xnew, Ynew), method='cubic', fill_value=0.0)

```
In [47]:
```Fnew.shape

```
Out[47]:
```

```
In [48]:
```plt.pcolor(Xnew, Ynew, Fnew, label="points")
plt.colorbar()
plt.scatter(x, y, label='original points')
plt.scatter(Xnew, Ynew, marker='.', color='green', label='interpolated points')
plt.xlim(0,1)
plt.ylim(0,1)
plt.xlabel('x')
plt.ylabel('y');
plt.legend(bbox_to_anchor=(1.2, 1), loc=2, borderaxespad=0.);

```
```

`nan`

).