Vectors, matrices and norms

The notebook demonstrate the computation and use of some important concepts in linear algebra. NumPy is used for the numerical computations.

Vector norms

The $l_{p}$-norm,of a vector $\boldsymbol{x} \in \mathbb{C}^{n}$ is

$$ \| \boldsymbol{x} \|_{p} = \left( \sum_{i=1}^{n} |x_{i}|^{p} \right)^{1/p} $$

Recall that when $p = \infty$, we have have the maxiumum norm:

$$ \| \boldsymbol{x} \|_{\infty} = \max(|x_{1}|, \ldots , |x_{n}|) $$

NumPy can compute $l_{p}$ norms of vectors. To see how, we first import NumPy and create a random vectors of length 10:


In [1]:
import numpy as np
np.random.seed(2)

x = np.random.rand(10) + 1j*np.random.rand(10)
print(x)


[0.4359949 +0.62113383j 0.02592623+0.52914209j 0.54966248+0.13457995j
 0.43532239+0.51357812j 0.4203678 +0.18443987j 0.33033482+0.78533515j
 0.20464863+0.85397529j 0.61927097+0.49423684j 0.29965467+0.84656149j
 0.26682728+0.07964548j]

We can now compute a number of $l_{p}$ norms of $\boldsymbol{x}$:


In [2]:
for p in range(1, 5):
    x_norm = np.linalg.norm(x, p)
    print("The l_{} norm of x is: {}".format(p, x_norm))


The l_1 norm of x is: 6.685801095190599
The l_2 norm of x is: 2.202178795407456
The l_3 norm of x is: 1.547749584847131
The l_4 norm of x is: 1.3088747768586464

For the $l_{\infty}$ norm:


In [3]:
x_inf = np.linalg.norm(x, np.inf)
print("The max norm of x is: {}".format(x_inf))


The max norm of x is: 0.8980307744980833

Matrix norms

Norms of matrices can also be computed. The more interesting (and abstract) norms are operator norms. These are also known as induced norms.

Operator norms

For an $n \times n$ matrix $\boldsymbol{A}$, the norm of the matrix is a measure of the 'maximum change' in relative length it can induce when applied to a vector. If we consider:

$$ \| \boldsymbol{A} \boldsymbol{x} \| \le C \| \boldsymbol{x}\| \quad \forall \boldsymbol{x} \in \mathbb{C}^{d}, $$

then the smallest possible $C$ is the norm of $\boldsymbol{A}$. The norm of $\boldsymbol{A}$ is denoted by $\|\boldsymbol{A}\|$:

$$ \| \boldsymbol{A} \boldsymbol{x} \| \le \| \boldsymbol{A}\| \| \boldsymbol{x}\| \quad \forall \boldsymbol{x} \in \mathbb{C}^{d}, $$

This can be rearranged to provide the usual definition of a matrix norm:

$$ \| \boldsymbol{A} \| = \max_{\boldsymbol{x} \in \mathbb{C}^{n} \backslash \boldsymbol{0}} \frac{\| \boldsymbol{A} \boldsymbol{x}\|}{\|\boldsymbol{x}\| } $$

To compute actual norms of a matrix, we need to choose how we measure the length of a vector, i.e. which norm to use. If we choose the $l_{2}$-norm, then:

$$ \| \boldsymbol{A} \|_{2} = \max_{\boldsymbol{x} \in \mathbb{C}^{n} \backslash \boldsymbol{0}} \frac{\| \boldsymbol{A} \boldsymbol{x}\|_{2}}{\|\boldsymbol{x}\|_{2} } $$

As discussed in the lectures, some norms are relatively inexpensive to compute for large matrices, and others are expensive. We can again use NumPy to compute some matrix norms. We first create a matrix filled with random numbers:


In [4]:
A = np.random.rand(5, 5) + 1j*np.random.rand(5, 5)
print(A)


[[0.50524609+0.42754596j 0.0652865 +0.43674726j 0.42812233+0.77655918j
  0.09653092+0.53560417j 0.12715997+0.95374223j]
 [0.59674531+0.54420816j 0.226012  +0.08209492j 0.10694568+0.3663424j
  0.22030621+0.8508505j  0.34982629+0.40627504j]
 [0.46778748+0.02720237j 0.20174323+0.24717724j 0.64040673+0.06714437j
  0.48306984+0.99385201j 0.50523672+0.97058031j]
 [0.38689265+0.80025835j 0.79363745+0.60181712j 0.58000418+0.76495986j
  0.1622986 +0.16922545j 0.70075235+0.29302323j]
 [0.96455108+0.52406688j 0.50000836+0.35662428j 0.88952006+0.04567897j
  0.34161365+0.98315345j 0.56714413+0.44135492j]]

and then compute some norms:


In [5]:
print("The 1-norm of A is: {}".format(np.linalg.norm(A, 1)))
print("The 2-norm of A is: {}".format(np.linalg.norm(A, 2)))
print("The max-norm of A is: {}".format(np.linalg.norm(A, np.inf)))


The 1-norm of A is: 4.0707141285209385
The 2-norm of A is: 3.5641387076264954
The max-norm of A is: 4.362031308991134

Vector-like norms

It sometimes convenient to work with matrix norms that are similar to vector norms. A commonly used matrix norm is the Frobenius norm. It is analogous to the $l_{2}$ norm of a vector, and is defined by:

$$ \|\boldsymbol{A} \|_{F} = \left( \sum_{i}\sum_{i} a_{ij}^{2} \right)^{1/2}. $$

To compute the Frobenius norm:


In [6]:
A_frobenius = np.linalg.norm(A, 'fro')
print("The Frobenius norm of A is: {}".format(A_frobenius))


The Frobenius norm of A is: 3.8758610051584017

Condition number

The condition number of a matrix is important when working with matrices numerically because is tells us something about how stable algorithms will be with respect to round-off errors, and how fast some iterative techniques will converge. Recall that the condition number $\kappa$ of a matrix $\boldsymbol{A}$ is defined as:

$$ \kappa(\boldsymbol{A}) = \| \boldsymbol{A} \| \|\boldsymbol{A}^{-1}\| $$

If we use the 2-norm, it was shown that:

$$ \kappa_{2}(\boldsymbol{A}) = \frac{\sqrt{\lambda_{\max}(\boldsymbol{A}^{T}\boldsymbol{A})}}{\sqrt{\lambda_{\min}(\boldsymbol{A}^{T}\boldsymbol{A})}} $$

Effect of poor conditioning on errors

It was shown in lectures that when solving $\boldsymbol{A} \boldsymbol{x} = \boldsymbol{b}$, if the condition number of $\boldsymbol{A}$ is large then small errors in $\boldsymbol{b}$ can manifest themselves as large errors in the solution, $\boldsymbol{x}$. We explore this now for the notoriously ill-conditioned Hilbert matrix. Entries of the Hilbert matrix $\boldsymbol{H}$ are given by

$$ H_{ij} = \frac{1}{i + j - 1}. $$

We can use a SciPy function to create a $n \times n$ Hilbert matrix:


In [7]:
import scipy.linalg as la

H = la.hilbert(6)
print(H)
print("Condition number is: {}".format(np.linalg.cond(A, 2)))


[[1.         0.5        0.33333333 0.25       0.2        0.16666667]
 [0.5        0.33333333 0.25       0.2        0.16666667 0.14285714]
 [0.33333333 0.25       0.2        0.16666667 0.14285714 0.125     ]
 [0.25       0.2        0.16666667 0.14285714 0.125      0.11111111]
 [0.2        0.16666667 0.14285714 0.125      0.11111111 0.1       ]
 [0.16666667 0.14285714 0.125      0.11111111 0.1        0.09090909]]
Condition number is: 20.69957742847763

Even for this small Hilbert matrix, the condition number is large.

We now experiment with solving $\boldsymbol{A} (\boldsymbol{x}+ \delta \boldsymbol{x}) = \boldsymbol{b} + \delta \boldsymbol{b}$, and compare the error $\|\delta{\boldsymbol{x}}\| / \|\boldsymbol{x}\|$ to $\|\delta{\boldsymbol{b}}\| / \|\boldsymbol{b}\|$. We will presume that the NumPy linear solvers can cope with the exact system $\boldsymbol{A}\boldsymbol{x} =\boldsymbol{b}$ (in practice this will be an issue).

We first construct $\boldsymbol{b}$, $\delta\boldsymbol{b}$ and $\boldsymbol{b} + \delta\boldsymbol{b}$:


In [8]:
b = np.ones(H.shape[0])
b_delta = 1.0e-6*np.random.rand(H.shape[0])

# Perturbed RHS
b1  = b + b_delta

We now solve for $\boldsymbol{A} \boldsymbol{x}= \boldsymbol{b}$ and $\boldsymbol{A} (\boldsymbol{x}+ \delta \boldsymbol{x}) = \boldsymbol{b} + \delta \boldsymbol{b}$:


In [9]:
x  = np.linalg.solve(H, b)
x1 = np.linalg.solve(H, b1)

We now compare $\|\delta{\boldsymbol{x}}\| / \|\boldsymbol{x}\|$ and $\|\delta{\boldsymbol{b}}\| / \|\boldsymbol{b}\|$ using the $l_{2}$-norm:


In [10]:
error_x = np.linalg.norm(x - x1, 2)/np.linalg.norm(x, 2)
error_b = np.linalg.norm(b_delta, 2)/np.linalg.norm(b, 2)
print("Relative error in x and b: {}, {}".format(error_x, error_b))


Relative error in x and b: 0.00019481894129697632, 5.49377662028406e-07

Even for this small Hilbert matrix, a small error in $\boldsymbol{b}$ leads to a much larger error in $\boldsymbol{x}$. This will get worse with problem size. We'll now put the test in side a loop to test larger matrix sizes:


In [11]:
for n in (20, 100, 1000):
    H = la.hilbert(n)
    print("- For {} x {} matrix, the condition number is: {}".format(n, n, np.linalg.cond(H, 2)))
    
    b       = np.ones(H.shape[0])
    b_delta = 1.0e-5*np.random.rand(H.shape[0])
    b1  = b + b_delta
    
    x  = np.linalg.solve(H, b)
    x1 = np.linalg.solve(H, b1)
    
    error_x = np.linalg.norm(x - x1, 2)/np.linalg.norm(x, 2)
    error_b = np.linalg.norm(b_delta, 2)/np.linalg.norm(b, 2)
    print("    Relative error in x and b: {}, {}".format(error_x, error_b))


- For 20 x 20 matrix, the condition number is: 1.3553657908688225e+18
    Relative error in x and b: 343.4261381702514, 4.941473864651237e-06
- For 100 x 100 matrix, the condition number is: 2.0170220389490536e+19
    Relative error in x and b: 194.95957972574547, 5.44736914896816e-06
- For 1000 x 1000 matrix, the condition number is: 7.028291839626664e+20
    Relative error in x and b: 752.8792223883698, 5.69814859779505e-06

Condition number versus determinant

It was discussed in lectures that the condition number of a matrix and its determinant are not necessarily related. Some small examples were presented. Here we consider some larger problems.

We consider an $n \times n$ upper triangular matrix filled with two, and one on the diagonal:

$$ \boldsymbol{A} = \begin{bmatrix} 1 & 2 & \ldots & 2 \\ & 1 & \ddots & \vdots \\ & & \ddots & 2 \\ & & & 1 \end{bmatrix} $$

This matrix has a determinant of one, and a condition number that grows with $n$. We can explore this with NumPy for increasing $n$.


In [12]:
def test_matrix(n):
    A = np.zeros((n, n))
    A[np.triu_indices(n)] = 2.0
    np.fill_diagonal(A, 1.0)
    return A

for n in (2, 10, 100, 500):
    A = test_matrix(n)
    print("- Matrix size: {} x {}".format(n, n))
    print("   * l_2 condition number is: {}".format(np.linalg.cond(A, 2)))
    print("   * determinant is: {}".format(np.linalg.det(A)))


- Matrix size: 2 x 2
   * l_2 condition number is: 5.828427124746189
   * determinant is: 1.0
- Matrix size: 10 x 10
   * l_2 condition number is: 161.44763879758878
   * determinant is: 1.0
- Matrix size: 100 x 100
   * l_2 condition number is: 16210.722720219908
   * determinant is: 1.0
- Matrix size: 500 x 500
   * l_2 condition number is: 405284.0679029256
   * determinant is: 1.0