This notebook demonstrates the computation and use of some important concepts in linear algebra. NumPy is used for the numerical computations.
The $l_{p}$-norm of a vector $\boldsymbol{x} \in \mathbb{C}^{n}$ is
$$ \| \boldsymbol{x} \|_{p} = \left( \sum_{i=1}^{n} |x_{i}|^{p} \right)^{1/p} $$
Recall that when $p = \infty$, we have the maximum norm:
$$ \| \boldsymbol{x} \|_{\infty} = \max(|x_{1}|, \ldots , |x_{n}|) $$
NumPy can compute $l_{p}$ norms of vectors. To see how, we first import NumPy and create a random vector of length 10:
In [1]:
import numpy as np
# Seed the random generator so the results are reproducible
np.random.seed(2)
# Create a random complex vector of length 10
x = np.random.rand(10) + 1j*np.random.rand(10)
print(x)
We can now compute a number of $l_{p}$ norms of $\boldsymbol{x}$:
In [2]:
for p in range(1, 5):
    x_norm = np.linalg.norm(x, p)
    print("The l_{} norm of x is: {}".format(p, x_norm))
For the $l_{\infty}$ norm:
In [3]:
x_inf = np.linalg.norm(x, np.inf)
print("The max norm of x is: {}".format(x_inf))
Norms of matrices can also be computed. The more interesting (and abstract) norms are operator norms, also known as induced norms.
For an $n \times n$ matrix $\boldsymbol{A}$, the norm of the matrix is a measure of the 'maximum change' in relative length it can induce when applied to a vector. If we consider:
$$ \| \boldsymbol{A} \boldsymbol{x} \| \le C \| \boldsymbol{x}\| \quad \forall \boldsymbol{x} \in \mathbb{C}^{n}, $$
then the smallest possible $C$ is the norm of $\boldsymbol{A}$. The norm of $\boldsymbol{A}$ is denoted by $\|\boldsymbol{A}\|$:
$$ \| \boldsymbol{A} \boldsymbol{x} \| \le \| \boldsymbol{A}\| \| \boldsymbol{x}\| \quad \forall \boldsymbol{x} \in \mathbb{C}^{n}. $$
This can be rearranged to provide the usual definition of a matrix norm:
$$ \| \boldsymbol{A} \| = \max_{\boldsymbol{x} \in \mathbb{C}^{n} \setminus \{\boldsymbol{0}\}} \frac{\| \boldsymbol{A} \boldsymbol{x}\|}{\|\boldsymbol{x}\|} $$
To compute actual norms of a matrix, we need to choose how we measure the length of a vector, i.e. which norm to use. If we choose the $l_{2}$-norm, then:
$$ \| \boldsymbol{A} \|_{2} = \max_{\boldsymbol{x} \in \mathbb{C}^{n} \setminus \{\boldsymbol{0}\}} \frac{\| \boldsymbol{A} \boldsymbol{x}\|_{2}}{\|\boldsymbol{x}\|_{2}} $$
As discussed in the lectures, some norms are relatively inexpensive to compute for large matrices, and others are expensive. We can again use NumPy to compute some matrix norms. We first create a matrix filled with random numbers:
In [4]:
# Create a random 5 x 5 complex matrix
A = np.random.rand(5, 5) + 1j*np.random.rand(5, 5)
print(A)
and then compute some norms:
In [5]:
print("The 1-norm of A is: {}".format(np.linalg.norm(A, 1)))
print("The 2-norm of A is: {}".format(np.linalg.norm(A, 2)))
print("The max-norm of A is: {}".format(np.linalg.norm(A, np.inf)))
It is sometimes convenient to work with matrix norms that are similar to vector norms. A commonly used matrix norm is the Frobenius norm. It is analogous to the $l_{2}$ norm of a vector, and is defined by:
$$ \|\boldsymbol{A} \|_{F} = \left( \sum_{i}\sum_{j} |a_{ij}|^{2} \right)^{1/2}. $$
To compute the Frobenius norm:
In [6]:
A_frobenius = np.linalg.norm(A, 'fro')
print("The Frobenius norm of A is: {}".format(A_frobenius))
The condition number of a matrix is important when working with matrices numerically, because it tells us something about how stable algorithms will be with respect to round-off errors, and how fast some iterative techniques will converge. Recall that the condition number $\kappa$ of a matrix $\boldsymbol{A}$ is defined as:
$$ \kappa(\boldsymbol{A}) = \| \boldsymbol{A} \| \|\boldsymbol{A}^{-1}\| $$
If we use the 2-norm, it was shown in the lectures that:
$$ \kappa_{2}(\boldsymbol{A}) = \frac{\sqrt{\lambda_{\max}(\boldsymbol{A}^{T}\boldsymbol{A})}}{\sqrt{\lambda_{\min}(\boldsymbol{A}^{T}\boldsymbol{A})}} $$
It was also shown that when solving $\boldsymbol{A} \boldsymbol{x} = \boldsymbol{b}$, if the condition number of $\boldsymbol{A}$ is large then small errors in $\boldsymbol{b}$ can manifest themselves as large errors in the solution $\boldsymbol{x}$. We explore this now for the notoriously ill-conditioned Hilbert matrix. Entries of the Hilbert matrix $\boldsymbol{H}$ are given by
$$ H_{ij} = \frac{1}{i + j - 1}. $$
We can use a SciPy function to create an $n \times n$ Hilbert matrix:
In [7]:
import scipy.linalg as la

# Create a 6 x 6 Hilbert matrix
H = la.hilbert(6)
print(H)
print("Condition number is: {}".format(np.linalg.cond(H, 2)))
Even for this small Hilbert matrix, the condition number is large.
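The 2-norm condition number of $\boldsymbol{H}$ equals the ratio of its largest to its smallest singular value, so we can cross-check NumPy's result (a minimal sketch using NumPy's SVD):
In [ ]:
# Singular values are returned in descending order
sigma = np.linalg.svd(H, compute_uv=False)
print("Condition number via SVD: {}".format(sigma[0]/sigma[-1]))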
We now experiment with solving $\boldsymbol{A} (\boldsymbol{x}+ \delta \boldsymbol{x}) = \boldsymbol{b} + \delta \boldsymbol{b}$, and compare the error $\|\delta{\boldsymbol{x}}\| / \|\boldsymbol{x}\|$ to $\|\delta{\boldsymbol{b}}\| / \|\boldsymbol{b}\|$. We will presume that the NumPy linear solvers can cope with the exact system $\boldsymbol{A}\boldsymbol{x} =\boldsymbol{b}$ (in practice, round-off error when solving the exact system is also an issue).
We first construct $\boldsymbol{b}$, $\delta\boldsymbol{b}$ and $\boldsymbol{b} + \delta\boldsymbol{b}$:
In [8]:
# Right-hand side vector
b = np.ones(H.shape[0])
# Small random perturbation
b_delta = 1.0e-6*np.random.rand(H.shape[0])
# Perturbed RHS
b1 = b + b_delta
We now solve for $\boldsymbol{A} \boldsymbol{x}= \boldsymbol{b}$ and $\boldsymbol{A} (\boldsymbol{x}+ \delta \boldsymbol{x}) = \boldsymbol{b} + \delta \boldsymbol{b}$:
In [9]:
# Solve the exact and the perturbed systems
x = np.linalg.solve(H, b)
x1 = np.linalg.solve(H, b1)
We now compare $\|\delta{\boldsymbol{x}}\| / \|\boldsymbol{x}\|$ and $\|\delta{\boldsymbol{b}}\| / \|\boldsymbol{b}\|$ using the $l_{2}$-norm:
In [10]:
error_x = np.linalg.norm(x - x1, 2)/np.linalg.norm(x, 2)
error_b = np.linalg.norm(b_delta, 2)/np.linalg.norm(b, 2)
print("Relative error in x and b: {}, {}".format(error_x, error_b))
Even for this small Hilbert matrix, a small error in $\boldsymbol{b}$ leads to a much larger error in $\boldsymbol{x}$. This will get worse with problem size. We'll now put the test inside a loop to test larger matrix sizes:
In [11]:
for n in (20, 100, 1000):
    H = la.hilbert(n)
    print("- For {} x {} matrix, the condition number is: {}".format(n, n, np.linalg.cond(H, 2)))
    b = np.ones(H.shape[0])
    b_delta = 1.0e-5*np.random.rand(H.shape[0])
    b1 = b + b_delta
    x = np.linalg.solve(H, b)
    x1 = np.linalg.solve(H, b1)
    error_x = np.linalg.norm(x - x1, 2)/np.linalg.norm(x, 2)
    error_b = np.linalg.norm(b_delta, 2)/np.linalg.norm(b, 2)
    print("  Relative error in x and b: {}, {}".format(error_x, error_b))
It was discussed in lectures that the condition number of a matrix and its determinant are not necessarily related. Some small examples were presented. Here we consider some larger problems.
We consider an $n \times n$ upper triangular matrix with ones on the diagonal and twos everywhere above the diagonal:
$$ \boldsymbol{A} = \begin{bmatrix} 1 & 2 & \ldots & 2 \\ & 1 & \ddots & \vdots \\ & & \ddots & 2 \\ & & & 1 \end{bmatrix} $$
This matrix has a determinant of one, and a condition number that grows with $n$. We can explore this with NumPy for increasing $n$.
In [12]:
def test_matrix(n):
    # Fill the upper triangle (including the diagonal) with 2,
    # then reset the diagonal entries to 1
    A = np.zeros((n, n))
    A[np.triu_indices(n)] = 2.0
    np.fill_diagonal(A, 1.0)
    return A

for n in (2, 10, 100, 500):
    A = test_matrix(n)
    print("- Matrix size: {} x {}".format(n, n))
    print("  * l_2 condition number is: {}".format(np.linalg.cond(A, 2)))
    print("  * determinant is: {}".format(np.linalg.det(A)))