Linear algebra background

Questions

  • What are vectors and matrices?
  • List the main properties of vectors and matrices
  • What are the computational complexities of basic matrix and vector operations?
  • What functions of vectors and matrices exist?
  • What types of matrices do you know?

Main objects

  • A vector: $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$
  • A matrix: $A = (a_{ij}) \in \mathbb{R}^{m \times n}$, $i=1,\ldots,m$, $j = 1,\ldots,n$
  • A system of linear equations: $Ax = b$

Properties

  • Definition A finite set of nonzero vectors $\{x_1, \ldots, x_p \}$ is called linearly dependent if there exist scalars $c_1, \ldots, c_p$, not all zero, such that $$ \sum\limits_{i=1}^p c_i x_i = 0. $$
  • Definition The rank of a matrix $A$ is the number of linearly independent columns
  • Definition A matrix $A$ is called positive definite if for any nonzero vector $x$ the scalar $x^{\top}A x$ is positive: $$ A \succ 0 \iff x^{\top}A x > 0 \quad \forall x \neq 0 $$ In the same way, one can define negative definite and positive/negative semidefinite matrices.
  • Definition A square real matrix $A$ is called orthogonal if $$ AA^{\top} = I $$
  • Definition The trace of a square matrix $A$ is the sum of its diagonal elements: $$ \mathrm{trace}(A) = \sum_{i=1}^n a_{ii} $$ These definitions are checked numerically in the sketch after this list.
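A minimal NumPy sketch of these checks; the matrix $A$ here is just an illustrative example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

print(np.linalg.matrix_rank(A))          # rank: number of linearly independent columns
print(np.trace(A))                       # trace: sum of diagonal elements

# A symmetric matrix is positive definite iff all its eigenvalues are positive
print(np.all(np.linalg.eigvalsh(A) > 0))

# The Q factor of a QR factorization is orthogonal: Q Q^T = I
Q, _ = np.linalg.qr(np.random.randn(3, 3))
print(np.allclose(Q @ Q.T, np.eye(3)))
```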

Theorems

  • Theorem on matrix rank: the row rank of a matrix equals its column rank
  • Sylvester's criterion: a symmetric matrix is positive definite iff all its leading principal minors are positive (see the sketch after this list)
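A minimal NumPy sketch of Sylvester's criterion; the brute-force computation of all leading principal minors is for illustration only, not how positive definiteness is checked in practice:

```python
import numpy as np

def is_positive_definite(A):
    """Sylvester's criterion: all leading principal minors are positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(is_positive_definite(A))   # True: the minors are 2 and 2*3 - 1*1 = 5
```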

Operations

  • Addition and subtraction of vectors and multiplication of a vector by a scalar are performed elementwise and cost $O(n)$ operations (flops)
  • Dot product: $$ \langle x, y \rangle = \sum_{i=1}^n x_i y_i $$ costs $O(n)$ operations
  • Matrix by vector product: $$ (Ax)_i = \sum_{k=1}^n a_{ik}x_k, $$ where $A \in \mathbb{R}^{m \times n}$ and $x \in \mathbb{R}^n$:
    • for a dense matrix $A$ — $O(mn)$ operations
    • for a sparse matrix $A$ with $N \ll mn$ nonzero elements — $O(N)$ operations
    • for a low-rank matrix $A = UV^{\top}$, where $U \in \mathbb{R}^{m \times p}, \; V \in \mathbb{R}^{n \times p}$ and $p \ll \min(n, m)$ — $O(p(n + m))$ operations
    • for some special matrices (Toeplitz, circulant, $\mathcal{H}$ and $\mathcal{H}^2$ matrices) multiplication by a vector can be performed even faster; the dense, sparse, and low-rank cases are illustrated in the sketch after this list
  • Matrix by matrix multiplication: $$ A = BC \qquad a_{ij} = \sum_{k=1}^p b_{ik}c_{kj}, $$ where $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{m \times p}$ and $C \in \mathbb{R}^{p \times n}$
    • for dense matrices $B$ and $C$ — $O(mnp)$ operations
    • significantly faster if one or both matrices $B$ and $C$ are sparse
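A minimal NumPy/SciPy sketch of the matrix-by-vector cases above; the sizes and the density are arbitrary illustrative values:

```python
import numpy as np
import scipy.sparse as sp

m, n, p = 2000, 1000, 10
x = np.random.randn(n)

A_dense = np.random.randn(m, n)
y_dense = A_dense @ x                           # dense matvec: O(mn)

A_sparse = sp.random(m, n, density=0.01, format="csr")
y_sparse = A_sparse @ x                         # sparse matvec: O(N), N = number of nonzeros

U, V = np.random.randn(m, p), np.random.randn(n, p)
y_lowrank = U @ (V.T @ x)                       # low-rank matvec: O(p(n + m));
                                                # never form U @ V.T explicitly
```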

System of linear equations

$$ Ax = b $$

For an arbitrary square matrix $A \in \mathbb{R}^{n \times n}$, solving a system of linear equations costs $O(n^3)$ operations
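For example, with NumPy (a sketch; a random matrix is nonsingular with probability 1):

```python
import numpy as np

n = 500
A = np.random.randn(n, n)
b = np.random.randn(n)

x = np.linalg.solve(A, b)        # LU-based LAPACK solver, O(n^3);
                                 # prefer this to computing np.linalg.inv(A) @ b
print(np.allclose(A @ x, b))
```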

Questions

  • When is a system of linear equations feasible?
  • What is the inverse of a matrix?
  • How does one solve a system of linear equations?
  • For which matrices can a system of linear equations be solved faster than $O(n^3)$?

For which matrices can a system of linear equations be solved faster than $O(n^3)$?

  • Diagonal matrices $A$ — $O(n)$ operations
  • Lower- and upper-triangular matrices — $O(n^2)$ operations. The methods for solving such systems are called forward and backward substitution, respectively
  • Orthogonal matrices $A$
    • for an arbitrary orthogonal matrix — $O(n^2)$ operations, since $A^{-1} = A^{\top}$ and hence $x = A^{\top}b$
    • if the structure of the orthogonal matrix is known (for example, $A = I - 2uu^{\top}$, where $u$ is a unit vector) — $O(n)$ operations (see the sketch after this list)
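A minimal SciPy sketch of the triangular and orthogonal cases; the shift `n * np.eye(n)` only keeps the random triangular matrix well conditioned:

```python
import numpy as np
from scipy.linalg import solve_triangular

n = 500
b = np.random.randn(n)

# Lower-triangular system: forward substitution, O(n^2)
L = np.tril(np.random.randn(n, n)) + n * np.eye(n)
x = solve_triangular(L, b, lower=True)
print(np.allclose(L @ x, b))

# Orthogonal system: x = A^T b, O(n^2), no factorization needed
Q, _ = np.linalg.qr(np.random.randn(n, n))
x = Q.T @ b
print(np.allclose(Q @ x, b))
```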

Q: What is the geometric meaning of multiplying an arbitrary vector $x$ by the matrix $A = I - 2uu^{\top}$?

Factor-based method to solve system of linear equations $Ax = b$

  1. Represent the matrix $A$ as a product of $k$ "simple" matrices (diagonal, lower- and upper-triangular, and others...): $$ A = A_1A_2\ldots A_k $$
  2. To get $x$, solve $k$ systems of linear equations: $$ A_1x_1 = b \quad A_2x_2 = x_1 \quad \ldots \quad A_kx = x_{k-1} $$ This can be done fast since the matrices $A_i$ are "simple".

This method is well suited to solving $m$ systems of linear equations with the same matrix $A$ and different right-hand sides $b_i, \; i=1,\ldots,m$:

  • Matrix $A$ factorization in product of simple matrices is performed only once
  • After that, one can solve the $m$ systems of linear equations, one for each right-hand side $b_i$ (see the sketch below)
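A minimal SciPy sketch of this factorize-once, solve-many pattern (here with an LU factorization, introduced in the next section):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n, m = 500, 20
A = np.random.randn(n, n)
bs = [np.random.randn(n) for _ in range(m)]    # m different right-hand sides

lu, piv = lu_factor(A)                         # factorize once: O(n^3)
xs = [lu_solve((lu, piv), b) for b in bs]      # each subsequent solve: O(n^2)
print(all(np.allclose(A @ x, b) for x, b in zip(xs, bs)))
```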

LU factorization

Theorem. Any nonsingular matrix $A$ can be represented in the following form $$ A = PLU, $$ where $P$ is a permutation matrix, $L$ is a lower-triangular matrix, and $U$ is an upper-triangular matrix

LU factorization requires $2n^3/3$ operations

Q: What is complexity of solving system of linear equations $Ax=b$ given LU factorization of matrix $A$?

Q: Why do we need the permutation matrix $P$?
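A minimal SciPy sketch of the factorization; the comment hints at the answer to the last question:

```python
import numpy as np
from scipy.linalg import lu

A = np.random.randn(4, 4)
P, L, U = lu(A)                    # A = P L U
print(np.allclose(A, P @ L @ U))
# P comes from row pivoting: rows are reordered so that Gaussian elimination
# never divides by a (near-)zero pivot, keeping the factorization stable.
```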

Cholesky factorization

Theorem. Any symmetric positive definite matrix $A$ can be represented in the following form $$ A = LL^{\top}, $$ where $L$ is a lower-triangular matrix.

Cholesky factorization requires $n^3/3$ operations.

Q: What is complexity of solving system of linear equations $Ax=b$ given Cholesky factorization of matrix $A$?
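A minimal SciPy sketch; the matrix is made symmetric positive definite by construction:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

n = 500
M = np.random.randn(n, n)
A = M @ M.T + n * np.eye(n)        # symmetric positive definite by construction
b = np.random.randn(n)

c, low = cho_factor(A)             # Cholesky factorization: ~ n^3/3 operations
x = cho_solve((c, low), b)         # two triangular solves: O(n^2)
print(np.allclose(A @ x, b))
```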

System of linear equations in the form $(A + BC)x = b$

  • $A$ is a "simple" matrix, i.e. the system $Ax = b$ can be solved fast
  • $BC$ is a representation of a low-rank matrix

How to solve such linear systems fast is for you to work out in your homework :)

Eigenvectors and eigenvalues

Questions

  1. What are eigenvectors and eigenvalues?
  2. What matrices are diagonalizable and unitarily diagonalizable?
  3. How can one compute the full and partial spectrum of a matrix?

Definition. A nonzero vector $x$ is called an eigenvector of the transformation given by a matrix $A$ if $$ Ax = \lambda x, $$ where $\lambda$ is the eigenvalue corresponding to the eigenvector $x$.

If a matrix $A$ has $n$ linearly independent eigenvectors, then it can be represented in the form of the spectral decomposition: $$ A = X\Lambda X^{-1}, $$ where $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $X = [x_1, \ldots, x_n]$.
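A minimal NumPy check of this decomposition:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, X = np.linalg.eig(A)          # eigenvalues and eigenvectors (columns of X)
print(np.allclose(A, X @ np.diag(lam) @ np.linalg.inv(X)))   # A = X Λ X^{-1}
```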

Theorems

  1. A matrix is unitarily diagonalizable iff it is normal: $$ AA^* = A^*A $$
  2. $$ \mathrm{tr}(A) = \sum_{i=1}^n \lambda_i \quad \det(A) = \prod_{i=1}^n \lambda_i $$
  3. A symmetric matrix is positive definite iff all its eigenvalues are positive (items 2 and 3 are checked numerically in the sketch after this list).
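A minimal NumPy check of these identities on a random matrix:

```python
import numpy as np

A = np.random.randn(4, 4)
lam = np.linalg.eigvals(A)                            # may be complex for nonsymmetric A
print(np.isclose(np.trace(A), lam.sum().real))        # tr(A) = sum of eigenvalues
print(np.isclose(np.linalg.det(A), lam.prod().real))  # det(A) = product of eigenvalues

S = A @ A.T + np.eye(4)                               # symmetric positive definite
print(np.all(np.linalg.eigvalsh(S) > 0))              # all eigenvalues positive
```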

How to compute eigenvectors and eigenvalues?

  1. Computing all eigenvectors and eigenvalues costs $O(n^3)$ operations: the Jacobi eigenvalue algorithm, the QR algorithm, methods based on the Schur decomposition...
  2. But the full spectral decomposition is rarely needed. Usually only a few leading eigenpairs are required. For this problem, one can use the power method and the inverse iteration method (a sketch of the power method follows)
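A minimal sketch of the power method for a symmetric matrix; the fixed iteration count is a crude illustrative choice (in practice one checks a convergence criterion):

```python
import numpy as np

def power_method(A, num_iter=500):
    """Leading eigenpair of a symmetric matrix A via power iteration."""
    x = np.random.randn(A.shape[0])
    for _ in range(num_iter):
        x = A @ x                    # one matvec per iteration
        x /= np.linalg.norm(x)       # normalize to avoid overflow
    return x @ A @ x, x              # Rayleigh quotient and eigenvector

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, x = power_method(A)
print(np.isclose(lam, np.linalg.eigvalsh(A)[-1]))    # largest eigenvalue
```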

Recap

  1. Linear algebra is the main tool in the analysis of optimization methods
  2. Matrix and vector operations
  3. Solving systems of linear equations
  4. Computing eigenvalues and eigenvectors