# Linear algebra background

### Questions

• What is a vector and a matrix?
• List the main properties of vectors and matrices
• What are the computational complexities of basic matrix and vector operations?
• What functions of vectors and matrices exist?
• What types of matrices do you know?

## Main objects

• A vector: $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$
• A matrix: $A = (a_{ij}) \in \mathbb{R}^{m \times n}$, $i=1,\ldots,m$, $j = 1,\ldots,n$
• A system of linear equations: $Ax = b$

### Properties

• Definition A finite set of nonzero vectors $\{x_1, \ldots, x_p \}$ is called linearly dependent if there exist scalars $c_1, \ldots, c_p$, not all zero, such that $$\sum\limits_{i=1}^p c_i x_i = 0.$$
• Definition The rank of a matrix $A$ is the number of its linearly independent columns
• Definition A symmetric matrix $A$ is called positive definite if for any nonzero vector $x$ the scalar $x^{\top}A x$ is positive: $$A \succ 0 \iff x^{\top}A x > 0$$ In the same way, one can define negative definite and positive/negative semidefinite matrices.
• Definition A square real matrix $A$ is called orthogonal if $$AA^{\top} = I$$
• Definition The trace of a square matrix $A$ is the sum of its diagonal elements: $$\mathrm{trace}(A) = \sum_{i=1}^n a_{ii}$$

### Theorems

• Theorem on matrix rank: the row rank of a matrix equals its column rank
• Sylvester's criterion: a symmetric matrix is positive definite iff all its leading principal minors are positive
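Sylvester's criterion translates directly into a short NumPy check. A minimal sketch (the helper name `is_positive_definite` and the use of determinants of leading submatrices are illustrative; in practice one would attempt a Cholesky factorization instead, which is cheaper and more stable):

```python
import numpy as np

def is_positive_definite(A):
    """Sylvester's criterion: a symmetric matrix is positive definite
    iff all leading principal minors det(A[:k, :k]) are positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # eigenvalues 3 and 1 -> positive definite
B = np.array([[1.0, 2.0],
              [2.0, 1.0]])   # eigenvalues 3 and -1 -> indefinite
print(is_positive_definite(A))  # True
print(is_positive_definite(B))  # False
```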

### Operations

• Addition and subtraction of vectors and multiplication of a vector by a scalar are performed elementwise - $O(n)$ operations (flops)
• Dot product: $$\langle x, y \rangle = \sum_{i=1}^n x_i y_i$$ - $O(n)$ operations
• Matrix by vector product: $$(Ax)_i = \sum_{k=1}^n a_{ik}x_k,$$ where $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$:
• for dense matrix $A$ — $O(mn)$
• for sparse matrix $A$ where $N \ll mn$ nonzero elements — $O(N)$
• for lowrank matrix $A = UV^{\top}$, where $U \in \mathbb{R}^{m \times p}, \; V \in \mathbb{R}^{n \times p}$ and $p \ll \min(n, m)$ — $O(p(n + m))$
• for some special matrices (Toeplitz matrix, circulant, $\mathcal{H}$ and $\mathcal{H}^2$ matrices) multiplication by vector can be performed faster
• Matrix by matrix multiplication: $$A = BC \qquad a_{ij} = \sum_{k=1}^p b_{ik}c_{kj},$$ where $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{m \times p}$ and $C \in \mathbb{R}^{p \times n}$
• for dense matrices $B$ and $C$ — $O(mnp)$ operations
• significantly faster if one or both matrices $B$ and $C$ are sparse
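The lowrank case above is worth seeing in code: for $A = UV^{\top}$ one never forms $A$ explicitly, but multiplies right-to-left. A minimal NumPy sketch (the sizes $m, n, p$ are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, p = 2000, 1500, 10            # p << min(m, n)
U = rng.standard_normal((m, p))
V = rng.standard_normal((n, p))
x = rng.standard_normal(n)

# Naive: form the dense m x n matrix A = U V^T first — O(mnp) + O(mn) flops
y_dense = (U @ V.T) @ x

# Lowrank trick: never form A; multiply right-to-left — O(p(n + m)) flops
y_fast = U @ (V.T @ x)

print(np.allclose(y_dense, y_fast))  # True
```

Both results agree to machine precision; only the flop count differs.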

## System of linear equations

$$Ax = b$$

For an arbitrary square matrix $A \in \mathbb{R}^{n \times n}$, solving the system of linear equations costs $O(n^3)$ operations

### Questions

• When is a system of linear equations feasible?
• What is the inverse matrix?
• How to solve a system of linear equations?
• For what matrices can a system of linear equations be solved faster than $O(n^3)$?

### For what matrices can a system of linear equations be solved faster than $O(n^3)$?

• Diagonal matrices $A$ — $O(n)$ operations
• Lower- and uppertriangular matrices — $O(n^2)$ operations. The methods for solving such systems are called forward and backward substitution, respectively
• Orthogonal matrices $A$: since $A^{-1} = A^{\top}$, the solution is $x = A^{\top}b$
• for an arbitrary orthogonal matrix — $O(n^2)$ operations
• if one knows the structure of the orthogonal matrix (for example, $A = I - 2uu^{\top}$, where $u$ is a vector) — $O(n)$ operations
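Forward substitution for a lowertriangular system can be written in a few lines. A minimal sketch (the function name `forward_substitution` is illustrative; in practice one would call `scipy.linalg.solve_triangular`):

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for lowertriangular L in O(n^2) flops:
    each x[i] uses the previously computed x[0..i-1]."""
    n = L.shape[0]
    x = np.zeros(n)
    for i in range(n):
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [4.0, 1.0, 5.0]])
b = np.array([2.0, 5.0, 14.0])
x = forward_substitution(L, b)
print(np.allclose(L @ x, b))  # True
```

Backward substitution for an uppertriangular system is the mirror image, running the loop from the last row to the first.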

Q: What is the geometric meaning of multiplying the matrix $A = I - 2uu^{\top}$ by a vector $x$?

## Factor-based method to solve system of linear equations $Ax = b$

1. Represent the matrix $A$ as a product of $k$ "simple" matrices (diagonal, lower- and uppertriangular, and others...) $$A = A_1A_2\ldots A_k$$
2. To get $x$, solve $k$ systems of linear equations $$A_1x_1 = b \quad A_2x_2 = x_1 \quad \ldots \quad A_kx = x_{k-1}$$ This can be done fast since the matrices $A_i$ are "simple".

This method is well suited to solving $m$ systems of linear equations with the same matrix $A$ and different right-hand sides $b_i, \; i=1,\ldots,m$:

• The factorization of the matrix $A$ into a product of simple matrices is performed only once
• After that, one can cheaply solve the $m$ systems of linear equations, one for each right-hand side $b_i$
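This factorize-once, solve-many pattern is what SciPy's `lu_factor`/`lu_solve` pair expresses. A minimal sketch, assuming SciPy is available (the sizes $n$ and $m$ are illustrative):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
n, m = 500, 20
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))      # m right-hand sides b_1, ..., b_m as columns

# Factorize once: O(n^3)
lu, piv = lu_factor(A)

# Each subsequent solve reuses the factors: O(n^2) per right-hand side
X = np.column_stack([lu_solve((lu, piv), B[:, i]) for i in range(m)])

print(np.allclose(A @ X, B))  # True
```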

### LU factorization

Theorem. Any nonsingular matrix $A$ can be represented in the following form $$A = PLU,$$ where $P$ is a permutation matrix, $L$ is a lowertriangular matrix and $U$ is an uppertriangular matrix

LU factorization requires $2n^3/3$ operations
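The three factors can be inspected directly with SciPy. A minimal sketch, assuming SciPy is available (the $4 \times 4$ random matrix is illustrative):

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
P, L, U = lu(A)                           # A = P L U

print(np.allclose(A, P @ L @ U))          # True
print(np.allclose(L, np.tril(L)))         # L is lowertriangular — True
print(np.allclose(U, np.triu(U)))         # U is uppertriangular — True
```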

Q: What is the complexity of solving the system of linear equations $Ax=b$ given an LU factorization of the matrix $A$?

Q: Why do we need the permutation matrix $P$?

### Cholesky factorization

Theorem. Any symmetric and positive definite matrix $A$ can be represented in the following form $$A = LL^{\top},$$ where $L$ is a lowertriangular matrix.

Cholesky factorization requires $n^3/3$ operations.

Q: What is the complexity of solving the system of linear equations $Ax=b$ given the Cholesky factorization of the matrix $A$?
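For a symmetric positive definite system, SciPy's `cho_factor`/`cho_solve` implement exactly this factor-based scheme. A minimal sketch, assuming SciPy is available (the construction $A = MM^{\top} + nI$ is just an easy way to produce a positive definite matrix):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(3)
n = 300
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # symmetric positive definite
b = rng.standard_normal(n)

c, low = cho_factor(A)               # ~n^3/3 flops, done once
x = cho_solve((c, low), b)           # two triangular solves, O(n^2)

print(np.allclose(A @ x, b))  # True
```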

## System of linear equations in the form $(A + BC)x = b$

• $A$ is a "simple" matrix, i.e. the system $Ax = b$ can be solved fast
• $BC$ is a representation of a lowrank matrix

How to solve such linear systems fast is something you will describe in your homework :)

## Eigenvectors and eigenvalues

### Questions

1. What are eigenvectors and eigenvalues?
2. Which matrices are diagonalizable and unitarily diagonalizable?
3. How to compute the full and partial spectrum of a matrix?

Definition. A nonzero vector $x$ is called an eigenvector of a transformation given by the matrix $A$ if $$Ax = \lambda x,$$ where $\lambda$ is the eigenvalue corresponding to the eigenvector $x$.

If the matrix $A$ has $n$ linearly independent eigenvectors, then it can be represented in the form of a spectral decomposition: $$A = X\Lambda X^{-1},$$ where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $X = [x_1, \ldots, x_n]$.
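The spectral decomposition can be verified numerically with NumPy. A minimal sketch on a small symmetric matrix (chosen so that the eigenvalues are real):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, X = np.linalg.eig(A)            # columns of X are eigenvectors

# Reassemble A = X diag(lam) X^{-1}
A_rec = X @ np.diag(lam) @ np.linalg.inv(X)
print(np.allclose(A, A_rec))  # True

# Check the defining relation A x_i = lambda_i x_i for each pair
for i in range(2):
    print(np.allclose(A @ X[:, i], lam[i] * X[:, i]))  # True, True
```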

### Theorems

1. A matrix is unitarily diagonalizable iff it is normal: $$AA^* = A^*A$$
2. $$\mathrm{tr}(A) = \sum_{i=1}^n \lambda_i \quad \det(A) = \prod_{i=1}^n \lambda_i$$
3. A symmetric matrix is positive definite iff all its eigenvalues are positive.

### How to compute eigenvectors and eigenvalues?

1. The complexity of computing all eigenvectors and eigenvalues is $O(n^3)$ - the Jacobi eigenvalue algorithm, the QR algorithm, methods based on the Schur decomposition...
2. But the full spectral decomposition is rarely needed. Usually only a few leading eigenvalues and eigenvectors are required. To solve this problem, one can use the power method and the inverse iteration method
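The power method fits in a few lines: repeated matrix-vector products make the iterate align with the leading eigenvector. A minimal sketch, assuming the matrix is symmetric with a unique largest-magnitude eigenvalue (the helper name `power_method` and the fixed iteration count are illustrative; a practical version would test convergence instead):

```python
import numpy as np

def power_method(A, num_iters=500, seed=0):
    """Approximate the leading eigenpair of symmetric A by power iteration.
    Each step is one matvec: O(n^2) for dense A, O(N) for sparse A."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    for _ in range(num_iters):
        x = A @ x
        x /= np.linalg.norm(x)       # normalize to avoid overflow
    lam = x @ A @ x                  # Rayleigh quotient estimate
    return lam, x

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])           # eigenvalues (7 ± sqrt(5)) / 2
lam, x = power_method(A)
print(np.isclose(lam, (7 + np.sqrt(5)) / 2))  # True
```

Inverse iteration applies the same loop to $(A - \sigma I)^{-1}$, reusing one factorization of $A - \sigma I$ for every matvec, and converges to the eigenvalue closest to the shift $\sigma$.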

## Recap

1. Linear algebra is the main tool for analyzing optimization methods
2. Matrix and vector operations
3. Solving systems of linear equations
4. Computing eigenvalues and eigenvectors