Today we will talk about **matrix factorizations**

- LU decomposition and Gaussian elimination -- already covered
- QR decomposition and Gram-Schmidt algorithm
- Schur decomposition and QR-algorithm
- A few words about the SVD

We already introduced the QR decomposition some time ago, but now we are going to study it in more detail.

These are the basic **matrix factorizations** in numerical linear algebra.

In numerical linear algebra we need to solve different tasks, for example:

- Solve linear systems $Ax = f$
- Compute eigenvalues / eigenvectors
- Compute singular values / singular vectors
- Compute inverses, even sometimes determinants
- Compute **matrix functions** like $\exp(A), \cos(A)$ (these are not elementwise functions)
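For instance, here is a minimal sketch of the last point using SciPy's `expm` (the $2 \times 2$ matrix is just a convenient example):

```python
import numpy as np
from scipy.linalg import expm

# matrix exponential vs. the elementwise exponential: not the same thing
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(expm(A))    # [[1, 1], [0, 1]]: since A^2 = 0, exp(A) = I + A
print(np.exp(A))  # [[1, 2.718...], [1, 1]]: elementwise exp, a different object
```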

In order to do this, we represent the matrix as a sum and/or a product of matrices with **simpler structure**, such that the tasks above can be solved faster / in a more stable way.

What is a **simpler structure**?

For **sparse matrices**, sparsity constraints are often incorporated into the factorizations.

For **Toeplitz matrices**, an important class is the class of matrices with **low displacement rank**, which is built on low-rank matrices.

The classes of **low-rank matrices** and **block low-rank matrices** appear everywhere.

Any nonsingular matrix can be factored as

$$A = P L U,$$

where $P$ is a permutation matrix, $L$ is a **lower triangular matrix**, and $U$ is an **upper triangular matrix**.

The **main goal** of the LU decomposition is to solve linear systems: $Ax = f$ becomes $P L U x = f$, which reduces to the solution of two triangular linear systems

$$L y = P^* f, \quad U x = y.$$
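A minimal sketch of this pipeline in NumPy/SciPy (the random test matrix is only for illustration):

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
f = rng.standard_normal(n)

P, L, U = lu(A)                               # A = P @ L @ U
y = solve_triangular(L, P.T @ f, lower=True)  # L y = P^* f (forward substitution)
x = solve_triangular(U, y)                    # U x = y    (back substitution)
print(np.linalg.norm(A @ x - f))              # residual should be ~1e-15
```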

The next decomposition is the **QR** decomposition. As the name suggests, a matrix is represented as a product

$$
A = Q R,
$$
where $Q$ is an **orthogonal (unitary)** matrix and $R$ is **upper triangular**.

The matrix sizes: $Q$ is $n \times m$, $R$ is $m \times m$ if $n\geq m$.

QR-factorization is defined for **any rectangular matrix**.

This algorithm plays a crucial role in many problems:

- Computing orthogonal bases in a linear space
- Used in the preprocessing step for the SVD
- QR-algorithm for the computation of eigenvectors and eigenvalues (one of the 10 most important algorithms of the 20th century) is based on the QR-decomposition
- Solving overdetermined systems of linear equations

Suppose we need to solve

$$\Vert A x - f \Vert_2 \rightarrow \min_x,$$

where $A$ is $n \times m$, $n \geq m$. We factorize

$$A = Q R,$$

and, denoting $z = R x$, we obtain

$$\Vert Q z - f \Vert_2 \rightarrow \min_z,$$

thus

$$z = (Q^* Q)^{-1} Q^* f = Q^* f,$$

and $x$ can be recovered from

$$R x = z.$$

Note that this is a square system of linear equations.
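A minimal sketch of least squares via the thin QR, checked against NumPy's built-in solver (the random data is only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 100, 5
A = rng.standard_normal((n, m))
f = rng.standard_normal(n)

Q, R = np.linalg.qr(A)       # thin QR: Q is n x m, R is m x m
z = Q.T @ f                  # z = Q^* f
x = np.linalg.solve(R, z)    # solve the square triangular system R x = z

x_ref, *_ = np.linalg.lstsq(A, f, rcond=None)   # reference solution
print(np.linalg.norm(x - x_ref))                # should be tiny
```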

One of the efficient ways to solve a really overdetermined ($n\gg m$) system of linear equations is the Kaczmarz method.

Instead of solving all equations at once, pick one at random; it reads

$$a^{\top}_i x = f_i.$$

Given an approximation $x_k$, find $x_{k+1}$ as

$$x_{k+1} = \arg \min_x \Vert x - x_k \Vert, \quad \mbox{s.t.} \quad a^{\top}_i x = f_i.$$

A simple analysis gives

$$x_{k+1} = x_k - \frac{(a_i, x_k) - f_i}{(a_i, a_i)} a_i.$$

This is a cheap update, but the convergence analysis is quite involved. If time permits, we will cover randomized algorithms of linear algebra in a separate lecture.
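A minimal sketch of the randomized Kaczmarz iteration on a synthetic consistent system (sizes and iteration count are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 500, 10
A = rng.standard_normal((n, m))
x_true = rng.standard_normal(m)
f = A @ x_true                # consistent system, so the iterates converge to x_true

x = np.zeros(m)
for _ in range(5000):
    i = rng.integers(n)                          # pick a random equation
    ai = A[i]
    x = x - (ai @ x - f[i]) / (ai @ ai) * ai     # project x onto {x : a_i^T x = f_i}
print(np.linalg.norm(x - x_true))                # decreases as iterations proceed
```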

**Theorem:**
Every rectangular $n \times m$ matrix with $n \geq m$ has a QR decomposition.

There are several ways to prove it and compute it:

- Theoretical: using the Gram matrices and Cholesky factorization
- Geometrical: using the Gram-Schmidt orthogonalization
- Practical: using Householder/Givens transformations

Assume that $A$ has **full column rank**. Then $A^* A$ is positive definite:

$$x^* (A^* A) x = \Vert A x \Vert_2^2 > 0 \quad \text{for any } x \neq 0.$$

Therefore, the Cholesky factorization $A^* A = R^* R$ exists, with $R$ upper triangular and nonsingular.

Then the matrix $A R^{-1}$ has orthonormal columns:

$$(A R^{-1})^* (A R^{-1}) = R^{-*} A^* A R^{-1} = R^{-*} R^* R R^{-1} = I.$$

When an $n \times m$ matrix does not have full column rank, it is said to be **rank-deficient**.

The QR decomposition, however, still exists.

For any **rank-deficient matrix** there is a sequence of full-column rank matrices $A_k$ such that $A_k \rightarrow A$ (why?).

Each $A_k$ can be decomposed as $A_k = Q_k R_k$.

The set of all unitary matrices is compact, thus there exists a converging subsequence $Q_{n_k} \rightarrow Q$ (why?),

and $Q^*_{n_k} A_{n_k} = R_{n_k} \rightarrow Q^* A = R$, which is triangular as a limit of triangular matrices.

In [1]:

```python
import numpy as np
n = 20
r = 8
# an ill-conditioned Hilbert-like matrix makes the loss of orthogonality visible
#a = np.random.randn(n, r)
a = [[1.0 / (i + j + 0.5) for i in range(r)] for j in range(n)]
a = np.array(a)
q, Rmat = np.linalg.qr(a)
e = np.eye(r)
print('Built-in QR orth:', np.linalg.norm(np.dot(q.T, q) - e))
# "theoretical" QR: Cholesky of the Gram matrix, A^* A = R^* R, then Q = A R^{-1}
gram_matrix = a.T.dot(a)
Rmat1 = np.linalg.cholesky(gram_matrix)   # lower triangular L, hence R = L^T
q1 = np.dot(a, np.linalg.inv(Rmat1.T))
print('Via Cholesky:', np.linalg.norm(np.dot(q1.T, q1) - e))
```

QR-decomposition is a mathematical way of writing down the Gram-Schmidt orthogonalization process.

Given a sequence of vectors $a_1, \ldots, a_m$, we want to find an orthonormal basis $q_1, \ldots, q_m$ such that every $a_i$ is a linear combination of $q_1, \ldots, q_i$.

**Gram-Schmidt:**

- $q_1 := a_1/\Vert a_1 \Vert$
- $q_2 := a_2 - (a_2, q_1) q_1, \quad q_2 := q_2/\Vert q_2 \Vert$
- $q_3 := a_3 - (a_3, q_1) q_1 - (a_3, q_2) q_2, \quad q_3 := q_3/\Vert q_3 \Vert$
- and so on

Note that the transformation from $A$ to $Q$ has triangular structure, since from the $k$-th vector we subtract only the previous ones; the transformation from $Q$ back to $A$ is then also triangular, because the inverse of a nonsingular triangular matrix is again triangular.

Gram-Schmidt can be very unstable (i.e., the produced vectors will not be orthogonal), especially if $q_k$ has a small norm before normalization.

This is called **loss of orthogonality**.

There is a remedy, called the **modified Gram-Schmidt** method.
Instead of computing

$$q_k := a_k - (a_k, q_1) q_1 - \ldots - (a_k, q_{k-1}) q_{k-1}$$

in one shot, we do it step-by-step. First we set $q_k := a_k$ and orthogonalize sequentially:

$$q_k := q_k - (q_k, q_1) q_1, \quad q_k := q_k - (q_k, q_2) q_2, \ldots$$

In exact arithmetic, this is the same. In floating point arithmetic it is absolutely different!
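A minimal sketch contrasting the two variants on the same ill-conditioned matrix as above; classical Gram-Schmidt loses orthogonality noticeably, modified Gram-Schmidt much less so:

```python
import numpy as np

def classical_gs(A):
    """Classical Gram-Schmidt: subtract all projections at once."""
    n, m = A.shape
    Q = np.zeros((n, m))
    for k in range(m):
        q = A[:, k] - Q[:, :k] @ (Q[:, :k].T @ A[:, k])
        Q[:, k] = q / np.linalg.norm(q)
    return Q

def modified_gs(A):
    """Modified Gram-Schmidt: orthogonalize against one q_j at a time."""
    n, m = A.shape
    Q = np.zeros((n, m))
    for k in range(m):
        q = A[:, k].copy()
        for j in range(k):
            q -= (q @ Q[:, j]) * Q[:, j]
        Q[:, k] = q / np.linalg.norm(q)
    return Q

A = np.array([[1.0 / (i + j + 0.5) for i in range(8)] for j in range(20)])
e = np.eye(8)
print('CGS loss of orthogonality:', np.linalg.norm(classical_gs(A).T @ classical_gs(A) - e))
print('MGS loss of orthogonality:', np.linalg.norm(modified_gs(A).T @ modified_gs(A) - e))
```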

Note that the complexity of Gram-Schmidt is $\mathcal{O}(nm^2)$ operations.

If $A = QR$, then

$$
R = Q^* A,
$$
and we need to find a certain orthogonal matrix $Q$ that brings the matrix into upper triangular form.

For simplicity, we will look for an $n \times n$ matrix such that
$$
Q^* A = \begin{bmatrix}
* & * & * \\
0 & * & * \\
0 & 0 & * \\
& 0_{(n-m) \times m} &
\end{bmatrix}
$$

We will do it column-by-column.

First, we find a Householder matrix $H_1 = (I - 2 uu^{\top})$ such that (we illustrate on a $4 \times 3$ matrix)

$$
H_1 A = \begin{bmatrix}
* & * & * \\
0 & * & * \\
0 & * & * \\
0 & * & *
\end{bmatrix}.
$$

Then,

$$
H_2 H_1 A = \begin{bmatrix}
* & * & * \\
0 & * & * \\
0 & 0 & * \\
0 & 0 & *
\end{bmatrix},
$$

where

$$
H_2 = \begin{bmatrix}
1 & 0 \\
0 & H'_2
\end{bmatrix}
$$

and $H'_2$ is a $3 \times 3$ Householder matrix.

Finally,

$$
H_3 H_2 H_1 A = \begin{bmatrix}
* & * & * \\
0 & * & * \\
0 & 0 & * \\
0 & 0 & 0
\end{bmatrix},
$$

so that $Q^* = H_3 H_2 H_1$.

You can try to implement it yourself; it is simple. A sketch is given below.
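A minimal unblocked sketch (no safeguards for zero columns, and $Q$ is accumulated explicitly, which a practical code would avoid):

```python
import numpy as np

def householder_qr(A):
    """Unblocked Householder QR: returns Q (n x n) and R (n x m)."""
    A = np.array(A, dtype=float)
    n, m = A.shape
    Q = np.eye(n)
    for k in range(m):
        x = A[k:, k]
        u = x.copy()
        # choose the sign to avoid cancellation when forming u
        u[0] += (1.0 if x[0] >= 0 else -1.0) * np.linalg.norm(x)
        u /= np.linalg.norm(u)
        A[k:, :] -= 2.0 * np.outer(u, u @ A[k:, :])   # apply H_k = I - 2 u u^T from the left
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ u, u)   # accumulate Q = H_1 H_2 ... H_m
    return Q, A

A = np.random.randn(6, 4)
Q, R = householder_qr(A)
print(np.linalg.norm(Q @ R - A))         # ~1e-15: A = QR
print(np.linalg.norm(np.tril(R, -1)))    # ~1e-16: R is upper triangular
```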

In reality, since this is a dense matrix factorization, you should implement the algorithm in terms of blocks.

Instead of a rank-1 Householder transformation, we use a **block Householder** transformation of the form

$$H = I - 2 U U^*,$$

where $U^* U = I$.

This allows us to use BLAS-3 operations.

The QR decomposition can also be used to compute the (numerical) rank of a matrix; see

Rank-Revealing QR Factorizations and the Singular Value Decomposition, Y. P. Hong; C.-T. Pan

It is done via so-called **rank-revealing** factorization.

It is based on the representation

$$A P = Q R,$$

where $P$ is a permutation matrix (it permutes columns), and $R$ has the block form

$$R = \begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22}\end{bmatrix}.$$

The goal is to find $P$ such that the norm of $R_{22}$ is **small**; then the numerical rank can be read off from the size of $R_{11}$.

An estimate is $\sigma_{r+1} \leq \Vert R_{22} \Vert_2$ (check why).
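In practice, a column-pivoted QR such as `scipy.linalg.qr(..., pivoting=True)` exposes the numerical rank through the decaying diagonal of $R$; a minimal sketch on a synthetic rank-5 matrix:

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
n, m, true_rank = 50, 20, 5
A = rng.standard_normal((n, true_rank)) @ rng.standard_normal((true_rank, m))

Q, R, p = qr(A, pivoting=True)   # column-pivoted QR: A[:, p] = Q @ R
d = np.abs(np.diag(R))
print(d / d[0])                  # sharp drop after the first true_rank entries
```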

LU and QR decompositions can be computed using **direct methods** in a finite number of operations.

What about Schur form and SVD?

They cannot be computed by **direct methods** (why?); they can only be computed by **iterative methods**.

Nevertheless, iterative methods still have $\mathcal{O}(n^3)$ complexity in floating point arithmetic, thanks to fast convergence.

To compute the Schur form, we can write down the following iterative process:

$$Q_{k+1} R_{k+1} = A Q_k, \quad \text{so that} \quad Q_{k+1}^* A = R_{k+1} Q^*_k.$$

Introduce

$$A_k = Q^* _k A Q_k = Q^*_k Q_{k+1} R_{k+1} = \widehat{Q}_k R_{k+1}, \quad \widehat{Q}_k = Q^*_k Q_{k+1},$$

and the new approximation reads

$$A_{k+1} = Q^*_{k+1} A Q_{k+1} = R_{k+1} Q^*_k Q_{k+1} = R_{k+1} \widehat{Q}_k.$$

So we arrive at the standard form of the QR algorithm.

The final formulas are then written in the classical **QRRQ**-form:

- Start from $A_0 = A$.
- Compute QR factorization of $A_k = Q_k R_k$.
- Set $A_{k+1} = R_k Q_k$.

Iterate until $A_k$ is **triangular enough** (e.g. norm of subdiagonal part is small enough).

**Statement**

Matrices $A_k$ are unitarily similar to $A$:

$A_k = Q^*_{k-1} A_{k-1} Q_{k-1} = (Q_0 Q_1 \ldots Q_{k-1})^* A (Q_0 Q_1 \ldots Q_{k-1}),$

and the product of unitary matrices is a unitary matrix.

Complexity of each step is $\mathcal{O}(n^3)$, if a general QR-decomposition is done.

Our hope is that $A_k$ will be **very close to a triangular matrix** for sufficiently large $k$.

In [5]:

```python
import numpy as np
n = 4
a = np.array([[1.0 / (i + j + 0.5) for i in range(n)] for j in range(n)])
niters = 100
for k in range(niters):
    q, rmat = np.linalg.qr(a)   # A_k = Q_k R_k
    a = rmat.dot(q)             # A_{k+1} = R_k Q_k
print('Leading 3x3 block of a:')
print(a[:3, :3])
```

The convergence of the QR-algorithm is from the **largest eigenvalues** to the smallest.

At least 2-3 iterations are needed per eigenvalue.

Each step is one QR-factorization and one matrix-by-matrix product. $\mathcal{O}(n^3)$ complexity.

Does this mean $\mathcal{O}(n^4)$ total complexity?

Fortunately, no.

We can also speed up the QR algorithm by using **shifts**, since $A_k - \lambda I$ has the same Schur vectors; a sketch is given below.
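A minimal sketch of the shifted iteration with the simplest shift choice $\sigma_k = (A_k)_{nn}$ (practical codes add Hessenberg reduction and deflation, omitted here):

```python
import numpy as np

def qr_algorithm_shifted(A, niters=30):
    """Shifted QR algorithm: A_k - sigma I = Q R, then A_{k+1} = R Q + sigma I."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for _ in range(niters):
        sigma = A[-1, -1]                     # simple (Rayleigh-quotient-like) shift
        Q, R = np.linalg.qr(A - sigma * np.eye(n))
        A = R.dot(Q) + sigma * np.eye(n)      # still unitarily similar to A
    return A

a = np.array([[1.0 / (i + j + 0.5) for i in range(4)] for j in range(4)])
ak = qr_algorithm_shifted(a)
print(np.diag(ak))            # approximate eigenvalues on the diagonal
print(np.linalg.eigvalsh(a))  # reference values (a is symmetric); order may differ
```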

We will discuss these details tomorrow.
