We sometimes wish to solve problems of the form

$$ \boldsymbol{A} \boldsymbol{x} = \boldsymbol{b} $$

where $\boldsymbol{A}$ is an $m \times n$ matrix with $m > n$. Clearly $\boldsymbol{A}$ is not square, and in general no solution to the problem exists. This is typical of an over-determined problem - we have more equations than unknowns. A classical example arises when trying to fit a $k$th-order polynomial to $p > k + 1$ data points - the degree of the polynomial is not high enough to construct an interpolating polynomial.

In this notebook we assume that $\boldsymbol{A}$ has full rank, i.e. the columns of $\boldsymbol{A}$ are linearly independent. We will look at the case when $\boldsymbol{A}$ is not full rank later.
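As a quick illustration of what full rank means in this setting (the small matrices below are made up for this sketch), NumPy's `matrix_rank` can be used to check whether the columns of a tall matrix are linearly independent:

```python
import numpy as np

# A tall 4x2 matrix whose columns are linearly independent: full column rank
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
print(np.linalg.matrix_rank(A))  # 2: full column rank

# Making the second column a multiple of the first destroys full rank
B = np.array([[1.0, 2.0],
              [1.0, 2.0],
              [1.0, 2.0],
              [1.0, 2.0]])
print(np.linalg.matrix_rank(B))  # 1: rank deficient
```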

Before computing least-squares problems, we start with examples of polynomial interpolation.

In [1]:

```python
import numpy as np

N = 20
x_p = np.linspace(-np.pi, np.pi, N)
y_p = np.sin(x_p)
```

We use the variable `N` to hold the number of points so that we can easily change it when experimenting.

We can plot the points:

In [2]:

```python
%matplotlib inline
import matplotlib.pyplot as plt

plt.xlabel('$x$')
plt.ylabel('$y$')
plt.title('Points on a sine graph')
plt.plot(x_p, y_p, 'ro');
```


With 20 data points, we can interpolate the points with a polynomial $f(x)$ of degree 19:

$$ f = c_{19} x^{19} + c_{18} x^{18} + \ldots + c_{1} x + c_{0}. $$

We can find the polynomial coefficients $c_{i}$ by solving $\boldsymbol{A} \boldsymbol{c} = \boldsymbol{y}$, where $\boldsymbol{A}$ is the Vandermonde matrix:

$$ \boldsymbol{A} = \begin{bmatrix} x_{1}^{19} & x_{1}^{18} & \ldots & x_{1}^{2} & x_{1} & 1 \\ x_{2}^{19} & x_{2}^{18} & \ldots & x_{2}^{2} & x_{2} & 1 \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ x_{20}^{19} & x_{20}^{18} & \ldots & x_{20}^{2} & x_{20} & 1 \end{bmatrix}, $$

the vector $\boldsymbol{c}$ contains the polynomial coefficients

$$ \boldsymbol{c} = \begin{bmatrix} c_{19} & c_{18} & \ldots & c_{0} \end{bmatrix}^{T}, $$

and the vector $\boldsymbol{y}$ contains the points $y_{i}$ that we wish to fit, which in this example are the points on the sine graph.

*Note: the ordering in each row of the Vandermonde matrix above is reversed with respect to what you will find in most books. We do this because earlier versions of the NumPy built-in function for generating the Vandermonde matrix use the above ordering. Later versions provide an option to generate the more conventional ordering.*
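A small sketch of the two orderings (the `increasing` keyword is the option mentioned above, available in later NumPy versions):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# Default NumPy ordering: highest power first (the ordering used above)
print(np.vander(x, 3))
# [[1. 1. 1.]
#  [4. 2. 1.]
#  [9. 3. 1.]]

# increasing=True gives the conventional textbook ordering (lowest power first)
print(np.vander(x, 3, increasing=True))
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]
```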

Using the NumPy built-in function to generate the Vandermonde matrix:

In [3]:

```python
A = np.vander(x_p, N)
```

We can solve the system to find the coefficients:

In [4]:

```python
c = np.linalg.solve(A, y_p)
```

We can use `poly1d` to turn the coefficients into a polynomial object, which can display a representation of the polynomial:

In [5]:

```python
p = np.poly1d(c)
print(p)
```


To plot the fitted polynomial, we evaluate the polynomial at 200 points:

In [6]:

```python
# Create an array of 200 equally spaced points on [-pi, pi]
x_fit = np.linspace(-np.pi, np.pi, 200)

# Evaluate the polynomial at the points
y_fit = p(x_fit)

# Plot the interpolating polynomial and the sample points
plt.xlabel('$x$')
plt.ylabel('$f$')
plt.title('Points on a sine graph interpolated by a polynomial')
plot = plt.plot(x_p, y_p, 'ro', x_fit, y_fit, 'b-');
```


We can see from the graph that the polynomial closely resembles the sine function.
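To quantify "closely resembles", here is a self-contained check (re-creating the interpolant from scratch) of the maximum deviation from $\sin(x)$ on a fine grid:

```python
import numpy as np

# Re-create the degree-19 interpolant of the sine points
N = 20
x_p = np.linspace(-np.pi, np.pi, N)
A = np.vander(x_p, N)
c = np.linalg.solve(A, np.sin(x_p))
p = np.poly1d(c)

# Maximum deviation from sin(x) on a fine grid
x_fine = np.linspace(-np.pi, np.pi, 200)
err = np.max(np.abs(p(x_fine) - np.sin(x_fine)))
print(err)
```

The printed error is far below plotting resolution, confirming the visual impression.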

We now define a function that adds uniform random noise of a given amplitude to the sine points:

In [7]:

```python
def y_p_noise(noise):
    return y_p + np.random.uniform(-noise/2.0, noise/2.0, len(y_p))
```

We can plot the points with noise of amplitude $0.02$:

In [8]:

```python
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.ylim(-1.1, 1.1)
plt.title('Points on a noisy sine')
y_noise = y_p_noise(0.02)
plt.plot(x_p, y_noise, 'ro');
```


To the eye, the noise cannot be detected.

We can solve the system to find the coefficients:

In [9]:

```python
c_noise = np.linalg.solve(A, y_noise)
```

In [10]:

```python
p_noise = np.poly1d(c_noise)
```

In [11]:

```python
y_fit = p_noise(x_fit)
plt.xlabel('$x$')
plt.ylabel('$f$')
plt.title('Points on a sine graph with noise interpolated by a polynomial')
plt.plot(x_p, y_p, 'ro', x_fit, y_fit, 'b-');
```


The points are clearly interpolated, but the result is now terrible near the ends of the interval, with large spikes. The spikes are known as Runge's phenomenon. A similar effect with a Fourier basis at discontinuities is known as the Gibbs phenomenon.

This is a common problem with polynomial fitting. With the exact sine points, we were lucky. With well-chosen, non-uniform interpolation points it is possible to improve the interpolation.
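As a sketch of what well-chosen points can do (the node formula below is the standard Chebyshev points of the second kind, scaled to $[-\pi, \pi]$ - it is not used elsewhere in this notebook), interpolating the noisy data at Chebyshev rather than equispaced points keeps the interpolant bounded near the interval ends:

```python
import numpy as np

N = 20

# Chebyshev points of the second kind, scaled from [-1, 1] to [-pi, pi]
k = np.arange(N)
x_cheb = np.pi * np.cos(np.pi * k / (N - 1))

# Noisy sine data at the Chebyshev nodes (fixed seed for reproducibility)
rng = np.random.default_rng(42)
y_cheb = np.sin(x_cheb) + rng.uniform(-0.01, 0.01, N)

# Interpolate with a degree-19 polynomial, as before
c = np.linalg.solve(np.vander(x_cheb, N), y_cheb)
p = np.poly1d(c)

# The interpolant stays close to [-1, 1] over the whole interval:
# no large spikes near the boundaries
x_fine = np.linspace(-np.pi, np.pi, 400)
print(np.max(np.abs(p(x_fine))))
```

With equispaced points the same noise level produced spikes orders of magnitude larger than the data.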

In [12]:

```python
from ipywidgets import widgets
from ipywidgets import interact

@interact(noise=(0.0, 0.25, 0.005))
def plot_interp_sine(noise=0.005):
    y_noise = y_p_noise(noise)
    c_noise = np.linalg.solve(A, y_noise)
    p_noise = np.poly1d(c_noise)
    y_noise_fit = p_noise(x_fit)
    plt.xlabel('$x$')
    plt.ylabel('$y$')
    plt.title('Points on a noisy sine (noise amplitude: {})'.format(noise))
    plt.plot(x_p, y_noise, 'ro', x_fit, y_noise_fit, 'b-');
```


The Vandermonde matrix is notoriously ill-conditioned. Computing the condition number:

In [13]:

```python
print("Condition number of the Vandermonde matrix: {}".format(np.linalg.cond(A, 2)))
```


We will now look at fitting a polynomial of degree $k$, with $k + 1 < N$, to the $N$ points on the sine graph. The degree of the polynomial is not high enough to interpolate all points, so we will compute a best fit in the least-squares sense.

We have seen in lectures that finding the least-squares solution involves solving the normal equations

$$ \boldsymbol{A}^{T}\boldsymbol{A} \boldsymbol{c} = \boldsymbol{A}^{T} \boldsymbol{y}. $$

If we want to fit a $5$th-order polynomial to 20 data points, $\boldsymbol{A}$ is the $20 \times 6$ matrix:

$$ \boldsymbol{A} = \begin{bmatrix} x_{1}^{5} & x_{1}^{4} & \ldots & x_{1}^{2} & x_{1} & 1 \\ x_{2}^{5} & x_{2}^{4} & \ldots & x_{2}^{2} & x_{2} & 1 \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ \vdots & \vdots & & \vdots & \vdots & \vdots \\ x_{20}^{5} & x_{20}^{4} & \ldots & x_{20}^{2} & x_{20} & 1 \end{bmatrix}, $$

and $\boldsymbol{c}$ contains the $6$ polynomial coefficients

$$ \boldsymbol{c} = \begin{bmatrix} c_{5} & c_{4} & c_{3} & c_{2} & c_{1} & c_{0} \end{bmatrix}^{T}, $$

and $\boldsymbol{y}$ contains the 20 points we want to fit.

In [14]:

```python
A = np.vander(x_p, 6)
```

In [15]:

```python
ATA = (A.T).dot(A)
c_ls = np.linalg.solve(ATA, (A.T).dot(y_p))
p_ls = np.poly1d(c_ls)
print(p_ls)
```


Plotting the polynomial:

In [16]:

```python
# Evaluate polynomial at some points
y_ls = p_ls(x_fit)

# Plot
plt.xlabel('$x$')
plt.ylabel('$f$')
plt.ylim(-1.1, 1.1)
plt.title('Least-squares fit of 20 points on the sine graph with a degree-5 polynomial')
plt.plot(x_p, y_p, 'ro', x_fit, y_ls, 'b-');
```


We can make this interactive, with a slider for the polynomial degree:

In [17]:

```python
@interact(order=(0, 19))
def plot(order):
    # Create Vandermonde matrix
    A = np.vander(x_p, order + 1)

    # Solve the normal equations
    ATA = (A.T).dot(A)
    c_ls = np.linalg.solve(ATA, (A.T).dot(y_p))
    p_ls = np.poly1d(c_ls)

    # Evaluate polynomial at some points
    y_ls = p_ls(x_fit)

    # Plot
    plt.xlabel('$x$')
    plt.ylabel('$f$')
    plt.ylim(-1.2, 1.2)
    plt.title('Least-squares fit of 20 points on the sine graph with a polynomial of degree {}'.format(order))
    plt.plot(x_p, y_p, 'ro', x_fit, y_ls, 'b-');
```


NumPy has the convenience function `polyfit` to shorten the code. Moreover, `polyfit` uses a different solution algorithm from the one above, namely a singular value decomposition, to compute the same problem with less susceptibility to round-off error. We will write a short function to make it easy to vary the polynomial degree:

In [18]:

```python
def plot(order=10):
    # Compute the coefficients of the degree-n polynomial that best fits
    # the data points (x_p, y_p) in the least-squares sense
    c_ls = np.polyfit(x_p, y_p, order)

    # Create a polynomial object from the coefficients
    p_ls = np.poly1d(c_ls)

    # Evaluate the polynomial at the plotting points
    y_ls = p_ls(x_fit)

    # Plot
    plt.xlabel('$x$')
    plt.ylabel('$f$')
    plt.ylim(-1.1, 1.1)
    plt.title('Least-squares fit of 20 points on the sine graph with a polynomial of degree {}.'.format(order))
    plt.plot(x_p, y_p, 'ro', x_fit, y_ls, 'b-');
```

Starting with degree 3:

In [19]:

```python
plot(3)
```


In [20]:

```python
plot(2)
```


Clearly the quadratic fit is very poor.
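We can compare fit quality quantitatively rather than by eye. A sketch using `np.polyfit`'s `full=True` option, which also returns the sum of squared residuals of the fit:

```python
import numpy as np

N = 20
x_p = np.linspace(-np.pi, np.pi, N)
y_p = np.sin(x_p)

# polyfit with full=True returns (coefficients, residuals, rank,
# singular values, rcond); residuals is the sum of squared residuals
rss = {}
for order in (2, 3, 5):
    coeffs, residuals, rank, sv, rcond = np.polyfit(x_p, y_p, order, full=True)
    rss[order] = residuals[0]
    print(order, residuals[0])
```

The residual drops sharply from degree 2 to degree 3, matching the plots: the odd cubic term captures the shape of the sine much better than any quadratic can.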

Creating an interactive plot with a slider for the polynomial degree:

In [21]:

```python
interact(plot, order=(0, 19));
```


We now repeat the least-squares fit with noisy data:

In [22]:

```python
# Compute least-squares fit to sine points with noise
def plot(order=10, noise=0.0):
    # Generate points on the sine graph with noise
    y_noise = y_p_noise(noise)

    # Compute the coefficients of the degree-n polynomial that best fits
    # the data points (x_p, y_noise) in the least-squares sense
    c_ls = np.polyfit(x_p, y_noise, order)

    # Create a polynomial object from the coefficients
    p_ls = np.poly1d(c_ls)

    # Evaluate the polynomial at the plotting points
    y_ls = p_ls(x_fit)

    # Plot
    plt.xlabel('$x$')
    plt.ylabel('$f$')
    plt.ylim(-1.1, 1.1)
    plt.title('Least-squares fit of 20 noisy points on the sine graph with a polynomial of degree {}.'.format(order))
    plt.plot(x_p, y_noise, 'ro', x_fit, y_ls, 'b-');
```

We start by fitting a polynomial of degree 12:

In [23]:

```python
plot(12, 0.02)
```


The fit looks very good, and note that there is no discernible effect of the noise at the ends of the interval.

We now make an interactive plot to explore the interaction between noise and polynomial degree.

In [24]:

```python
interact(plot, order=(0, 19), noise=(0.0, 0.4, 0.005));
```


Finally, we compare the condition number of the Vandermonde matrix $\boldsymbol{A}$ with that of $\boldsymbol{A}^{T}\boldsymbol{A}$:

In [25]:

```python
A = np.vander(x_p, 20)
print("Condition number of A (Vandermonde matrix, 20): {}".format(np.linalg.cond(A)))
print("Condition number of (A.T)A (Vandermonde matrix, 20): {}".format(np.linalg.cond((A.T).dot(A))))
```

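The comparison above illustrates why forming the normal equations explicitly is dangerous: the condition number of $\boldsymbol{A}^{T}\boldsymbol{A}$ is the square of that of $\boldsymbol{A}$. A self-contained sketch of this relationship for the smaller $20 \times 6$ matrix, together with `np.linalg.lstsq`, which solves the least-squares problem via an SVD of $\boldsymbol{A}$ itself and so avoids the squaring:

```python
import numpy as np

N = 20
x_p = np.linspace(-np.pi, np.pi, N)
A = np.vander(x_p, 6)

# The 2-norm condition number of A.T A is the square of that of A
cond_A = np.linalg.cond(A)
cond_ATA = np.linalg.cond(A.T @ A)
print(cond_A, cond_ATA)

# lstsq works with A directly (via an SVD), never forming A.T A,
# so the effective conditioning is cond(A), not cond(A)**2
c, res, rank, sv = np.linalg.lstsq(A, np.sin(x_p), rcond=None)
print(c)
```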