Eigenvalue Decomposition - Perturbation Theory

Prerequisites

The reader should be familiar with basic linear algebra concepts and facts about eigenvalue decomposition.

Competences

The reader should be able to understand and check the facts about perturbations of eigenvalues and eigenvectors.

Norms

In order to measure changes, we need to define norms. For more details and the proofs of the Facts below, see R. Byers and B. N. Datta, Vector and Matrix Norms, Error Analysis, Efficiency, and Stability and the references therein.

Definitions

A norm on a vector space $X$ is a real-valued function $\| \phantom{x} \| : X\to \mathbb{R}$ with the following properties:

  1. Positive definiteness. $\| x\|\geq 0$ and $\|x\|=0$ if and only if $x$ is the zero vector.
  2. Homogeneity. $\| \lambda x\|=|\lambda| \|x\|$ for every scalar $\lambda$.
  3. Triangle inequality. $\| x+y\| \leq \|x\|+\|y\|$.

Commonly encountered vector norms for $x\in\mathbb{C}^n$ are:

  • Hölder norm or $p$-norm: for $p\geq 1$, $\|x\|_p=\big(|x_1|^p+|x_2|^p+\cdots +|x_n|^p\big)^{1/p}$,
  • Sum norm or $1$-norm: $\|x\|_1=|x_1|+|x_2|+\cdots +|x_n|$,
  • Euclidean norm or $2$-norm: $\|x\|_2=\sqrt{|x_1|^2+|x_2|^2+\cdots +|x_n|^2}$,
  • Sup-norm or $\infty$-norm: $\|x\|_\infty = \max\limits_{i=1,\ldots,n} |x_i|$.

A vector norm is absolute if $\||x|\|=\|x\|$ for all $x$.

A vector norm is monotone if $|x|\leq |y|$ (componentwise) implies $\|x\|\leq \|y\|$.
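
A minimal check of the formulas above and of the absolute property for the $2$-norm (the vector is arbitrary):

using LinearAlgebra
x=[1.0,-2.0,3.0]
sum(abs,x)≈norm(x,1), maximum(abs,x)≈norm(x,Inf), norm(abs.(x))≈norm(x)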

From every vector norm we can derive a corresponding induced matrix norm (also called an operator norm or natural norm):

$$\|A\| = \max\limits_{x\neq 0} \frac{\|Ax\|}{\|x\|}=\max\limits_{\|x\|=1} \|Ax\|.$$
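
The maximum in this definition can be estimated by sampling; a minimal sketch in which the sample maximum of $\|Ax\|_2/\|x\|_2$ over random vectors is a lower bound approaching $\|A\|_2$ (the matrix size and sample count are arbitrary):

using LinearAlgebra
A=rand(5,5)
# sample maximum of ‖Ax‖₂/‖x‖₂ over random vectors - never exceeds opnorm(A)
maximum(norm(A*x)/norm(x) for x in eachcol(rand(5,1000))), opnorm(A)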

For matrix $A\in\mathbb{C}^{m\times n}$ we define:

  • Maximum absolute column sum norm: $\|A\|_1=\max\limits_{1\leq j \leq n} \sum_{i=1}^m |a_{ij}|$,
  • Spectral norm: $\|A\|_2=\sqrt{\rho(A^*A)}=\sigma_{\max}(A)$ (largest singular value of $A$),
  • Maximum absolute row sum norm: $\|A\|_{\infty}=\max\limits_{1\leq i \leq m} \sum_{j=1}^n |a_{ij}|$,
  • Euclidean norm or Frobenius norm: $\|A\|_F =\sqrt{\sum_{i,j} |a_{ij}|^2}=\sqrt{\mathop{\mathrm{tr}}(A^*A)}$.
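
These formulas can be verified directly; a minimal sketch (opnorm computes the induced norms and norm(A) the Frobenius norm, as in the examples below):

using LinearAlgebra
A=rand(3,4)
maximum(sum(abs,A,dims=1))≈opnorm(A,1), maximum(sum(abs,A,dims=2))≈opnorm(A,Inf), sqrt(tr(A'*A))≈norm(A)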

A matrix norm is consistent if $\|AB\|\leq \|A\| \, \| B\|$ whenever $A$ and $B$ are compatible for matrix multiplication.

A matrix norm is absolute if $\||A|\|=\|A\|$.

Examples


In [1]:
import Random
Random.seed!(425)
x=rand(-4:4,5)


Out[1]:
5-element Array{Int64,1}:
 -2
 -3
 -2
  4
  2

In [2]:
using LinearAlgebra
norm(x,1), norm(x), norm(x,Inf)


Out[2]:
(13.0, 6.082762530298219, 4.0)

In [3]:
A=rand(-4:4,7,5)


Out[3]:
7×5 Array{Int64,2}:
  0  -3   2   0   4
  1   1   4  -3  -3
  3   2   1  -3   0
 -4  -4  -2  -1  -3
  2   3  -4   2   0
 -1  -1   4  -3  -3
 -1  -3  -2   4   4

In [4]:
norm(A,1), norm(A), norm(A,2), norm(A,Inf), 
opnorm(A), opnorm(A,1), opnorm(A,Inf)


Out[4]:
(81.0, 15.7797338380595, 15.7797338380595, 4.0, 11.053558535027305, 19.0, 14.0)

In [5]:
# Frobenius norm
norm(vec(A))


Out[5]:
15.7797338380595

Facts

  1. $\|x\|_1$, $\|x\|_2$, $\|x\|_\infty$ and $\|x\|_p$ are absolute and monotone vector norms.
  2. A vector norm is absolute iff it is monotone.
  3. Convergence. $x_k\to x_*$ iff for any vector norm $\|x_k-x_*\|\to 0$.
  4. Any two vector norms are equivalent in the sense that, for all $x$ and some $\alpha,\beta>0$ $$ \alpha \|x\|_\mu \leq \|x\|_\nu \leq \beta \|x\|_\mu. $$ In particular:
    • $\|x\|_2 \leq \|x\|_1\leq \sqrt{n}\|x\|_2$,
    • $\|x\|_\infty \leq \|x\|_2\leq \sqrt{n}\|x\|_\infty$,
    • $\|x\|_\infty \leq \|x\|_1\leq n\|x\|_\infty$.
  5. Cauchy-Schwarz inequality. $|x^*y|\leq \|x\|_2\|y\|_2$.
  6. Hölder inequality. If $p,q\geq 1$ and $\displaystyle\frac{1}{p}+\frac{1}{q}=1$, then $$|x^*y|\leq \|x\|_p\|y\|_q. $$ Both inequalities are checked numerically after this list.
  7. $\|A\|_1$, $\|A\|_2$ and $\|A\|_\infty$ are induced by the corresponding vector norms.
  8. $\|A\|_F$ is not an induced norm.
  9. $\|A\|_1$, $\|A\|_2$, $\|A\|_\infty$ and $\|A\|_F$ are consistent.
  10. $\|A\|_1$, $\|A\|_\infty$ and $\|A\|_F$ are absolute. However, the spectral norm is not absolute: in general $\||A|\|_2\neq \|A\|_2$.
  11. Any two matrix norms are equivalent in the sense that, for all $A$ and some $\alpha,\beta>0$ $$ \alpha \|A\|_\mu \leq \|A\|_\nu \leq \beta \|A\|_\mu. $$ In particular:
    • $\frac{1}{\sqrt{n}}\|A\|_\infty \leq \|A\|_2\leq \sqrt{m}\|A\|_\infty$,
    • $\|A\|_2 \leq \|A\|_F\leq \sqrt{n}\|A\|_2$,
    • $\frac{1}{\sqrt{m}}\|A\|_1 \leq \|A\|_2\leq \sqrt{n}\|A\|_1$.
  12. $\|A\|_2\leq \sqrt{\|A\|_1 \|A\|_\infty}$.
  13. $\|AB\|_F\leq \|A\|_F\|B\|_2$ and $\|AB\|_F\leq \|A\|_2\|B\|_F$.
  14. If $A=xy^*$, then $\|A\|_2=\|A\|_F=\|x\|_2\|y\|_2$.
  15. $\|A^*\|_2=\|A\|_2$ and $\|A^*\|_F=\|A\|_F$.
  16. For a unitary matrix $U$ of compatible dimension, $$\|AU\|_2=\|A\|_2,\quad \|AU\|_F=\|A\|_F,\quad \|UA\|_2=\|A\|_2,\quad \|UA\|_F=\|A\|_F. $$
  17. For $A$ square, $\rho(A)\leq\|A\|$.
  18. For $A$ square, $A^k\to 0$ as $k\to\infty$ iff $\rho(A)<1$.
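
As announced in Facts 5 and 6, here is a quick numerical check of the Cauchy-Schwarz and Hölder inequalities (the choice $p=3$ is arbitrary):

using LinearAlgebra
x=rand(ComplexF64,5)
y=rand(ComplexF64,5)
# Fact 5: |x*y| ≤ ‖x‖₂‖y‖₂, and Fact 6 with p=3, q=3/2
p=3; q=p/(p-1)
abs(x⋅y), norm(x)*norm(y), norm(x,p)*norm(y,q)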

In [6]:
A


Out[6]:
7×5 Array{Int64,2}:
  0  -3   2   0   4
  1   1   4  -3  -3
  3   2   1  -3   0
 -4  -4  -2  -1  -3
  2   3  -4   2   0
 -1  -1   4  -3  -3
 -1  -3  -2   4   4

In [7]:
# Absolute norms
opnorm(A,1), opnorm(abs.(A),1), opnorm(A,Inf), opnorm(abs.(A),Inf), norm(A),
norm(abs.(A)),  opnorm(A),opnorm(abs.(A))


Out[7]:
(19.0, 19.0, 14.0, 14.0, 15.7797338380595, 15.7797338380595, 11.053558535027305, 14.024662857881664)

In [8]:
# Equivalence of norms
m,n=size(A)
opnorm(A,Inf)/sqrt(n), opnorm(A), sqrt(m)*opnorm(A,Inf)


Out[8]:
(6.260990336999411, 11.053558535027305, 37.04051835490427)

In [9]:
opnorm(A), norm(A), sqrt(n)*opnorm(A)


Out[9]:
(11.053558535027305, 15.7797338380595, 24.716508277594045)

In [10]:
opnorm(A,1)/sqrt(m), opnorm(A), sqrt(n)*opnorm(A,1)


Out[10]:
(7.181324987175317, 11.053558535027305, 42.48529157249601)

In [11]:
# Fact 12
opnorm(A), sqrt(opnorm(A,1)*opnorm(A,Inf))


Out[11]:
(11.053558535027305, 16.30950643030009)

In [12]:
# Fact 13
B=rand(n,rand(1:9))
norm(A*B), norm(A)*opnorm(B), opnorm(A)*norm(B)


Out[12]:
(26.159344584614846, 56.03006853665045, 42.775144883730995)

In [13]:
# Fact 14
x=rand(ComplexF64,10)
y=rand(ComplexF64,10)
A=x*y'
opnorm(A), norm(A), norm(x)*norm(y)


Out[13]:
(6.3784685576847435, 6.378468557684745, 6.378468557684745)

In [14]:
# Fact 15
A=rand(-4:4,7,5)+im*rand(-4:4,7,5)
opnorm(A), opnorm(A'), norm(A), norm(A')


Out[14]:
(16.043071925866585, 16.043071925866588, 22.47220505424423, 22.47220505424423)

In [15]:
# Unitary invariance - generate random unitary matrix U
U,R=qr(rand(ComplexF64,size(A)));

In [16]:
opnorm(A), opnorm(U*A), norm(A), norm(U*A)


Out[16]:
(16.043071925866585, 16.043071925866588, 22.47220505424423, 22.472205054244235)

In [17]:
# Spectral radius
A=rand(ComplexF64,7,7)
maximum(abs,eigvals(A)), opnorm(A,Inf), opnorm(A,1), opnorm(A), norm(A)


Out[17]:
(5.01817683943979, 5.917412407080195, 6.607140833160132, 5.22269135612056, 5.814158973116375)

In [18]:
# Fact 18
B=A/(maximum(abs,eigvals(A))+2)
@show maximum(abs,eigvals(B))
norm(B^100)


maximum(abs, eigvals(B)) = 0.7150257045732034
Out[18]:
2.8058388159466584e-15

Errors and condition numbers

We want to answer the question:

How much does the value of a function change with respect to a change in its argument?

Definitions

For function $f(x)$ and argument $x$, the absolute error with respect to the perturbation of the argument $\delta x$ is

$$ \| f(x+\delta x)-f(x)\| = \frac{\| f(x+\delta x)-f(x)\|}{\| \delta x \|} \|\delta x\| \equiv \kappa \|\delta x\|. $$

The condition or condition number $\kappa$ tells us by how much the perturbation of the argument is amplified in the result. (Its form resembles a derivative.)

Similarly, the relative error with respect to the relative perturbation of the argument is

$$ \frac{\| f(x+\delta x)-f(x)\|}{\| f(x) \|}= \frac{\| f(x+\delta x)-f(x)\|\cdot \|x\| }{\|\delta x\| \cdot\| f(x)\|} \cdot \frac{\|\delta x\|}{\|x\|} \equiv \kappa_{rel} \frac{\|\delta x\|}{\|x\|}. $$
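
For a smooth scalar function, $\kappa_{rel}$ can be estimated numerically. A minimal sketch using $f(x)=\sqrt{x}$, whose relative condition number is exactly $1/2$ (the point $x=2$ and the perturbation size are chosen only for illustration):

f(x)=sqrt(x)
x=2.0
δx=1e-8
# estimate of κ_rel; should be close to 1/2 for the square root
abs(f(x+δx)-f(x))*abs(x)/(abs(δx)*abs(f(x)))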

Perturbation bounds

Definitions

Let $A\in\mathbb{C}^{n\times n}$.

A pair $(\lambda,x)\in\mathbb{C}\times\mathbb{C}^{n}$ is an eigenpair of $A$ if $x\neq 0$ and $Ax=\lambda x$.

A triplet $(y,\lambda,x)\in\mathbb{C}^{n}\times\mathbb{C}\times\mathbb{C}^{n}$ is an eigentriplet of $A$ if $x,y\neq 0$, $Ax=\lambda x$, and $y^*A=\lambda y^*$.

The eigenvalue matrix of $A$ is the diagonal matrix $\Lambda=\mathop{\mathrm{diag}}(\lambda_1,\lambda_2,\ldots,\lambda_n)$.

If all eigenvalues are real, they can be ordered increasingly; $\Lambda^\uparrow$ is the eigenvalue matrix with increasingly ordered eigenvalues.

$\tau$ is a permutation of $\{1,2,\ldots,n\}$.

$\tilde A=A+\Delta A$ is a perturbed matrix, where $\Delta A$ is the perturbation. $(\tilde \lambda,\tilde x)$ are the eigenpairs of $\tilde A$.

The condition number of a nonsingular matrix $X$ is $\kappa(X)=\|X\| \|X^{-1}\|$.

Let $X,Y\in\mathbb{C}^{n\times k}$ with $\mathop{\mathrm{rank}}(X)=\mathop{\mathrm{rank}}(Y)=k$. The canonical angles between their column spaces, $\theta_i$, are defined by $\cos \theta_i=\sigma_i$, where $\sigma_i$ are the singular values of the matrix $$(Y^*Y)^{-1/2}Y^*X(X^*X)^{-1/2}.$$ The canonical angle matrix between $X$ and $Y$ is $$\Theta(X,Y)=\mathop{\mathrm{diag}}(\theta_1,\theta_2,\ldots,\theta_k). $$
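
When $X$ and $Y$ have orthonormal columns, the factors $(Y^*Y)^{-1/2}$ and $(X^*X)^{-1/2}$ are identities and the cosines are simply the singular values of $Y^*X$. A minimal sketch with two random subspaces (the dimensions are arbitrary):

using LinearAlgebra
# orthonormal bases of two random 3-dimensional subspaces
X=Matrix(qr(rand(8,3)).Q)
Y=Matrix(qr(rand(8,3)).Q)
# cos θᵢ are the singular values of Y'*X; clamp to guard against roundoff
θ=acos.(min.(svdvals(Y'*X),1.0))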

Facts

The bounds become sharper as the matrices acquire more structure. Many bounds have versions in both the spectral norm and the Frobenius norm. For more details and the proofs of the Facts below, see R.-C. Li, Matrix Perturbation Theory, and the references therein.

  1. There exists $\tau$ such that $$\|\Lambda- \tilde\Lambda_\tau\|_2\leq 4(\|A\|_2+\|\tilde A\|_2)^{1-1/n}\|\Delta A\|_2^{1/n}.$$

  2. First-order perturbation bounds. Let $(y,\lambda,x)$ be an eigentriplet of a simple $\lambda$. $\Delta A$ changes $\lambda$ to $\tilde\lambda=\lambda+ \delta\lambda$, where $$ \delta\lambda=\frac{y^*(\Delta A)x}{y^*x}+O(\|\Delta A\|_2^2). $$

  3. Let $\mu$ be a semisimple eigenvalue of $A$ with multiplicity $k$, and let $X,Y\in \mathbb{C}^{n\times k}$ be the matrices of the corresponding right and left eigenvectors, that is, $AX=\mu X$ and $Y^*A=\mu Y^*$, such that $Y^*X=I_k$. $\Delta A$ changes the $k$ copies of $\mu$ to $\tilde \mu_i=\mu+\delta\mu_i$, where $\delta\mu_i$ are the eigenvalues of $Y^*(\Delta A) X$ up to $O(\|\Delta A\|_2^2)$.

  4. Perturbations and the inverse: if $\|A\|_p<1$, then $I-A$ is nonsingular and (for proof see GVL p. 74 (98)) $$ (I-A)^{-1}=\sum\limits_{k=0}^\infty A^k $$ with $$ \|(I-A)^{-1}\|_p \leq \frac{1}{1-\|A\|_p},\qquad \|(I-A)^{-1}-I\|_p \leq \frac{\|A\|_p}{1-\|A\|_p}. $$ A numerical check follows this list.

  5. Geršgorin Circle Theorem. If $X^{-1} A X=D+F$, where $D=\mathop{\mathrm{diag}}(d_1,\ldots,d_n)$ and $F$ has zero diagonal entries, then (for proof see GVL p.357 (381)) $$\sigma(A) \subseteq \bigcup\limits_{i=1}^n D_i,$$ where $$D_i=\big\{z\in\mathbb{C} : |z-d_i| \leq \sum\limits_{j=1}^n |f_{ij}| \big\}. $$ Moreover, by continuity, if a connected component of the union of the disks $D_i$ consists of $k$ disks, it contains exactly $k$ eigenvalues. A sketch checking this with $X=I$ follows this list.

  6. Bauer-Fike Theorem. If $A$ is diagonalizable and $A=X\Lambda X^{-1}$ is its eigenvalue decomposition, then (for proof see GVL p.357 (381)) $$ \max_i\min_j |\tilde \lambda_i - \lambda_j|\leq \|X^{-1}(\Delta A)X\|_p\leq \kappa_p(X)\|\Delta A\|_p. $$

  7. If $A$ and $\tilde A$ are diagonalizable, there exists $\tau$ such that $$\|\Lambda-\tilde\Lambda_\tau\|_F\leq \sqrt{\kappa_2(X)\kappa_2(\tilde X)}\|\Delta A\|_F. $$ If $\Lambda$ and $\tilde\Lambda$ are real, then $$ \|\Lambda^\uparrow-\tilde\Lambda^\uparrow\|_{2,F} \leq \sqrt{\kappa_2(X)\kappa_2(\tilde X)}\|\Delta A\|_{2,F}. $$

  8. If $A$ is normal, there exists $\tau$ such that $\|\Lambda-\tilde\Lambda_\tau\|_F\leq\sqrt{n}\|\Delta A\|_F$.

  9. Hoffman-Wielandt Theorem. If $A$ and $\tilde A$ are normal, there exists $\tau$ such that $\|\Lambda-\tilde\Lambda_\tau\|_F\leq\|\Delta A\|_F$.

  10. If $A$ and $\tilde A$ are Hermitian, for any unitarily invariant norm $\|\Lambda^\uparrow-\tilde\Lambda^\uparrow\| \leq \|\Delta A\|$. In particular, \begin{align*} \max_i|\lambda^\uparrow_i-\tilde\lambda^\uparrow_i|&\leq \|\Delta A\|_2,\\ \sqrt{\sum_i(\lambda^\uparrow_i-\tilde\lambda^\uparrow_i)^2}&\leq \|\Delta A\|_F. \end{align*}

  11. Residual bounds. Let $A$ be Hermitian. For some $\tilde\lambda\in\mathbb{R}$ and $\tilde x\in\mathbb{C}^n$ with $\|\tilde x\|_2=1$, define residual $r=A\tilde x-\tilde\lambda\tilde x$. Then $|\tilde\lambda-\lambda|\leq \|r\|_2$ for some $\lambda\in\sigma(A)$.

  12. Let, in addition, $\tilde\lambda=\tilde x^* A\tilde x$, let $\lambda$ be closest to $\tilde\lambda$ and $x$ be its unit eigenvector, and let $$\eta=\mathop{\mathrm{gap}}(\tilde\lambda)= \min_{\lambda\neq\mu\in\sigma(A)}|\tilde\lambda-\mu|.$$ If $\eta>0$, then $$ |\tilde\lambda-\lambda|\leq \frac{\|r\|_2^2}{\eta},\quad \sin\theta(x,\tilde x)\leq \frac{\|r\|_2}{\eta}. $$

  13. Let $A$ be Hermitian, $X\in\mathbb{C}^{n\times k}$ have full column rank, and let $M$ be a Hermitian $k\times k$ matrix with eigenvalues $\mu_1\leq\mu_2\leq\cdots\leq\mu_k$. Set $R=AX-XM$. Then there exist $\lambda_{i_1}\leq\lambda_{i_2}\leq\cdots\leq\lambda_{i_k}\in\sigma(A)$ such that \begin{align*} \max_{1\leq j\leq k} |\mu_j-\lambda_{i_j}|& \leq \frac{\|R\|_2}{\sigma_{\min}(X)},\\ \sqrt{\sum_{j=1}^k (\mu_j-\lambda_{i_j})^2}&\leq \frac{\|R\|_F}{\sigma_{\min}(X)}. \end{align*} (The indices $i_j$ need not be the same in the above formulae.)

  14. If, additionally, $X^*X=I$ and $M=X^*AX$, and if all but $k$ of $A$'s eigenvalues differ from every one of $M$'s eigenvalues by at least $\eta>0$, then $$\sqrt{\sum_{j=1}^k (\mu_j-\lambda_{i_j})^2}\leq \frac{\|R\|_F^2}{\eta\sqrt{1-\|R\|_F^2/\eta^2}}.$$

  15. Let $A=\begin{bmatrix} M & E^* \\ E & H \end{bmatrix}$ and $\tilde A=\begin{bmatrix} M & 0 \\ 0 & H \end{bmatrix}$ be Hermitian, and set $\eta=\min |\mu-\nu|$ over all $\mu\in\sigma(M)$ and $\nu\in\sigma(H)$. Then $$ \max_j |\lambda_j^\uparrow -\tilde\lambda_j^\uparrow| \leq \frac{2\|E\|_2^2}{\eta+\sqrt{\eta^2+4\|E\|_2^2}}. $$

  16. Let $$ \begin{bmatrix} X_1^*\\ X_2^* \end{bmatrix} A \begin{bmatrix} X_1 & X_2 \end{bmatrix}= \begin{bmatrix} A_1 & \\ & A_2 \end{bmatrix}, \quad \begin{bmatrix} X_1 & X_2 \end{bmatrix} \textrm{unitary}, \quad X_1\in\mathbb{C}^{n\times k}. $$ Let $Q\in\mathbb{C}^{n\times k}$ have orthonormal columns and for a Hermitian $k\times k$ matrix $M$ set $R=AQ-QM$. Let $\eta=\min|\mu-\nu|$ over all $\mu\in\sigma(M)$ and $\nu\in\sigma(A_2)$. If $\eta > 0$, then $$ \|\sin\Theta(X_1,Q)\|_F\leq \frac{\|R\|_F}{\eta}. $$
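
A quick numerical check of Fact 4; the scaling by $0.1$ merely ensures $\|A\|_2<1$:

using LinearAlgebra
A=0.1*rand(4,4)
# Fact 4 - partial sums of the Neumann series converge to (I-A)⁻¹
S=sum(A^k for k=0:50)
opnorm(inv(I-A)-S), opnorm(inv(I-A)), 1/(1-opnorm(A))

And a sketch checking the Geršgorin Circle Theorem (Fact 5) with $X=I$, so that the disk centers are the diagonal entries of $A$:

A=rand(-4:4,5,5)
radii=[sum(abs,A[i,:])-abs(A[i,i]) for i=1:5]
# Fact 5 - every eigenvalue lies in at least one Geršgorin disk
all(any(abs(z-A[i,i])<=radii[i] for i=1:5) for z in eigvals(A))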

Example - Nondiagonalizable matrix


In [19]:
A=[-3 7 -1; 6 8 -2; 72 -28 19]


Out[19]:
3×3 Array{Int64,2}:
 -3    7  -1
  6    8  -2
 72  -28  19

In [20]:
# (Right) eigenvectors
λ,X=eigen(A)
λ


Out[20]:
3-element Array{Float64,1}:
 -6.000000000000005
 14.999999758522048
 15.000000241477958

In [21]:
X


Out[21]:
3×3 Array{Float64,2}:
  0.235702  -0.218218   0.218218
 -0.235702  -0.436436   0.436436
 -0.942809   0.872872  -0.872872

In [22]:
cond(X)


Out[22]:
9.091581993893509e7

In [23]:
X[:,2]+X[:,3]


Out[23]:
3-element Array{Float64,1}:
 -2.007421329164316e-8
 -4.0148426638797474e-8
 -2.5092766420264923e-8

In [24]:
# Left eigenvectors
λ1,Y=eigen(Matrix(A'))


Out[24]:
Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
values:
3-element Array{Complex{Float64},1}:
 -5.999999999999998 + 0.0im
 14.999999999999993 - 2.0088262607214127e-7im
 14.999999999999993 + 2.0088262607214127e-7im
vectors:
3×3 Array{Complex{Float64},2}:
     0.894427+0.0im      0.970143-0.0im             0.970143+0.0im
    -0.447214+0.0im  -7.58506e-16+1.62404e-8im  -7.58506e-16-1.62404e-8im
 -6.07504e-17+0.0im      0.242536-4.0601e-9im       0.242536+4.0601e-9im

In [25]:
cond(Y)


Out[25]:
1.1178754628722218e8

In [26]:
# Try k=2,3
k=3
Y[:,k]'*A-λ[k]*Y[:,k]'


Out[26]:
1×3 Adjoint{Complex{Float64},Array{Complex{Float64},1}}:
 -2.34268e-7-1.94885e-7im  9.60124e-15-3.9217e-15im  -5.8567e-8-4.87212e-8im

In [27]:
ΔA=rand(3,3)/20
B=A+ΔA


Out[27]:
3×3 Array{Float64,2}:
 -2.96795    7.04707  -0.999704
  6.01817    8.02801  -1.96442
 72.0172   -27.9568   19.0161

In [28]:
norm(ΔA)


Out[28]:
0.08966213190999892

In [29]:
μ,Z=eigen(B)


Out[29]:
Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
values:
3-element Array{Complex{Float64},1}:
 -5.959852183114043 + 0.0im
 15.018000044707051 - 0.7293167022721797im
 15.018000044707051 + 0.7293167022721797im
vectors:
3×3 Array{Complex{Float64},2}:
 -0.236204+0.0im  -0.212242-0.0389745im  -0.212242+0.0389745im
   0.23407+0.0im  -0.421248-0.0775078im  -0.421248+0.0775078im
   0.94309+0.0im   0.877483-0.0im         0.877483+0.0im

In [30]:
# Fact 2
δλ=μ[1]-λ[1]


Out[30]:
0.040147816885962584 + 0.0im

In [31]:
k=1
Y[:,k]'*ΔA*X[:,k]/(Y[:,k]⋅X[:,k])


Out[31]:
0.03991290270173844 + 0.0im

Example - Jordan form


In [32]:
n=10
c=0.5
J=Bidiagonal(c*ones(n),ones(n-1),'U')


Out[32]:
10×10 Bidiagonal{Float64,Array{Float64,1}}:
 0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅    ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0   ⋅ 
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5  1.0
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   0.5

In [33]:
# Accurately defined eigenvalues
λ=eigvals(J)


Out[33]:
10-element Array{Float64,1}:
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5
 0.5

In [34]:
# Only one eigenvector
X=eigvecs(J)


Out[34]:
10×10 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [35]:
x=eigvecs(J)[:,1]
y=eigvecs(J')[:,1]


Out[35]:
10-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 1.0

In [36]:
y'*Matrix(J)-0.5*y'


Out[36]:
1×10 Adjoint{Float64,Array{Float64,1}}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [37]:
# Just one perturbed element in the lower left corner
ΔJ=sqrt(eps())*[zeros(n-1);1]*Matrix(I,1,n)


Out[37]:
10×10 Array{Float64,2}:
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0.0         0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 1.49012e-8  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

In [38]:
B=J+ΔJ
μ=eigvals(B)


Out[38]:
10-element Array{Complex{Float64},1}:
 0.3350615111920276 + 0.0im
 0.3665619595167544 - 0.09694841124280963im
 0.3665619595167544 + 0.09694841124280963im
 0.4490312038888047 - 0.15686582456677023im
 0.4490312038888047 + 0.15686582456677023im
 0.5509687960268455 - 0.15686582462828694im
 0.5509687960268455 + 0.15686582462828694im
 0.6334380405156275 - 0.09694841134170258im
 0.6334380405156275 + 0.09694841134170258im
 0.6649384889119079 + 0.0im

In [39]:
# Fact 1 - the eigenvalues change by roughly ‖ΔJ‖^(1/n)
maximum(abs,λ-μ)


Out[39]:
0.16493848891190788

In [40]:
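# Fact 2 does not apply: λ=0.5 is not simple and y⋅x=0, so the first-order
# estimate blows up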
y'*ΔJ*x/(y⋅x)


Out[40]:
Inf

However, since $B$ is diagonalizable, we can apply the Bauer-Fike theorem to it:


In [41]:
μ,Y=eigen(B)


Out[41]:
Eigen{Complex{Float64},Complex{Float64},Array{Complex{Float64},2},Array{Complex{Float64},1}}
values:
10-element Array{Complex{Float64},1}:
 0.3350615111920276 + 0.0im
 0.3665619595167544 - 0.09694841124280963im
 0.3665619595167544 + 0.09694841124280963im
 0.4490312038888047 - 0.15686582456677023im
 0.4490312038888047 + 0.15686582456677023im
 0.5509687960268455 - 0.15686582462828694im
 0.5509687960268455 + 0.15686582462828694im
 0.6334380405156275 - 0.09694841134170258im
 0.6334380405156275 + 0.09694841134170258im
 0.6649384889119079 + 0.0im
vectors:
10×10 Array{Complex{Float64},2}:
   -0.986304+0.0im     0.986304-0.0im          …     0.986304+0.0im
    0.162679+0.0im     -0.13161-0.0956206im          0.162679+0.0im
  -0.0268321+0.0im   0.00829158+0.0255188im         0.0268321+0.0im
  0.00442565+0.0im    0.0013676-0.00420904im       0.00442565+0.0im
 -0.00072996+0.0im  -0.00059055+0.000429059im      0.00072996+0.0im
 0.000120398+0.0im  0.000120398-6.48524e-14im  …  0.000120398+0.0im
 -1.98583e-5+0.0im  -1.60657e-5-1.16724e-5im       1.98583e-5+0.0im
   3.2754e-6+0.0im   1.01216e-6+3.11509e-6im        3.2754e-6+0.0im
  -5.4024e-7+0.0im   1.66943e-7-5.13799e-7im        5.4024e-7+0.0im
  8.91064e-8+0.0im  -7.20886e-8+5.23754e-8im       8.91064e-8+0.0im

In [42]:
cond(Y)


Out[42]:
1.1068834616372574e7

In [43]:
opnorm(inv(Y)*ΔJ*Y), cond(Y)*opnorm(ΔJ)


Out[43]:
(0.16493848884660864, 0.1649384888466086)

Example - Normal matrix


In [44]:
using SpecialMatrices

In [46]:
n=6
C=SpecialMatrices.Circulant(rand(-5:5,n))


Out[46]:
6×6 Circulant{Int64}:
  5  -3  -4   4   1   5
  5   5  -3  -4   4   1
  1   5   5  -3  -4   4
  4   1   5   5  -3  -4
 -4   4   1   5   5  -3
 -3  -4   4   1   5   5

In [47]:
λ=eigvals(Matrix(C))


Out[47]:
6-element Array{Complex{Float64},1}:
 -3.999999999999991 + 0.0im
 3.5000000000000004 - 11.258330249197702im
 3.5000000000000004 + 11.258330249197702im
  8.000000000000004 + 0.0im
                9.5 - 2.5980762113533156im
                9.5 + 2.5980762113533156im

In [48]:
ΔC=rand(n,n)*0.0001


Out[48]:
6×6 Array{Float64,2}:
 3.92691e-5  7.27366e-5  6.90261e-5  2.42908e-5  1.81437e-5  2.76241e-5
 7.05105e-5  7.89702e-5  2.29716e-5  2.9037e-5   1.53071e-5  8.42495e-5
 6.43087e-5  6.39203e-6  9.46879e-5  5.87238e-5  7.24736e-5  1.52988e-5
 2.8127e-5   1.03887e-5  5.02815e-5  8.84835e-6  6.11838e-6  7.97288e-5
 4.14654e-5  4.70945e-5  5.89819e-5  8.2312e-5   3.28422e-6  8.94333e-5
 8.22961e-5  4.63776e-5  7.59579e-5  6.51096e-5  2.87437e-5  7.06696e-5

In [49]:
@show opnorm(ΔC)
μ=eigvals(C+ΔC)


opnorm(ΔC) = 0.00030545005746327017
Out[49]:
6-element Array{Complex{Float64},1}:
 -3.9999781998620403 + 0.0im
  3.5000115132451928 - 11.258315332858457im
  3.5000115132451928 + 11.258315332858457im
   8.000289874208157 + 0.0im
   9.499980514227342 - 2.5980719544580677im
   9.499980514227342 + 2.5980719544580677im
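
A circulant matrix is normal, so the Hoffman-Wielandt theorem (Fact 9) applies. Here eigvals happens to return both spectra in matching order, so the bound can be checked directly:

# Fact 9 - Hoffman-Wielandt: ‖Λ-Λ̃‖_F ≤ ‖ΔC‖_F
norm(λ-μ), norm(ΔC)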

Example - Hermitian matrix


In [50]:
m=10
n=6
A=rand(m,n)
# Some scaling
D=Diagonal((rand(n).-0.5)*exp(20))
A=A*D


Out[50]:
10×6 Array{Float64,2}:
 1.25966e8       7.02544e6  -9.43559e7  …       -1.05693e6  8.73886e7
 9.47744e7       5.04155e6  -2.16113e8          -2.86674e6  1.83839e8
 1.6272e8        7.91704e6  -2.64161e7          -7.92199e6  9.82895e7
 2.79339e7       5.03135e6  -8.48784e7          -4.81438e6  1.02891e8
 1.15825e8       2.46463e6  -1.69725e8          -7.27184e6  1.35496e8
 1.41399e8       8.36392e6  -2.08999e8  …       -5.67275e6  3.04284e7
 1.02057e8       4.21508e6  -2.19684e8     -878544.0        2.25064e8
 2.01415e8       4.00323e6  -1.59822e8          -5.10901e6  1.58651e8
 1.61597e8  684769.0        -1.73175e8          -4.94004e6  2.2382e8
 1.99626e8       3.75969e6  -1.94627e8          -5.73346e6  1.32728e8

In [51]:
using Statistics
A=cor(A)


Out[51]:
6×6 Array{Float64,2}:
  1.0        -0.0972412  -0.108436   0.0413051  -0.327659    0.0368238
 -0.0972412   1.0         0.381665   0.271144    0.0222139  -0.767267
 -0.108436    0.381665    1.0       -0.315365   -0.30521    -0.405039
  0.0413051   0.271144   -0.315365   1.0         0.601295   -0.0240472
 -0.327659    0.0222139  -0.30521    0.601295    1.0         0.342861
  0.0368238  -0.767267   -0.405039  -0.0240472   0.342861    1.0

In [52]:
ΔA=cor(rand(m,n)*D)*1e-5


Out[52]:
6×6 Array{Float64,2}:
  1.0e-5       4.32061e-6  -1.75158e-6   1.50722e-6   2.5446e-6   -2.42234e-6
  4.32061e-6   1.0e-5      -3.58807e-6   2.37927e-6   3.40892e-7   2.09382e-6
 -1.75158e-6  -3.58807e-6   1.0e-5      -3.43031e-6   2.97274e-6  -1.24208e-6
  1.50722e-6   2.37927e-6  -3.43031e-6   1.0e-5       4.80153e-6  -4.98282e-6
  2.5446e-6    3.40892e-7   2.97274e-6   4.80153e-6   1.0e-5      -2.26834e-6
 -2.42234e-6   2.09382e-6  -1.24208e-6  -4.98282e-6  -2.26834e-6   1.0e-5

In [53]:
B=A+ΔA


Out[53]:
6×6 Array{Float64,2}:
  1.00001    -0.0972369  -0.108438   0.0413066  -0.327657    0.0368214
 -0.0972369   1.00001     0.381662   0.271146    0.0222142  -0.767265
 -0.108438    0.381662    1.00001   -0.315369   -0.305207   -0.405041
  0.0413066   0.271146   -0.315369   1.00001     0.6013     -0.0240522
 -0.327657    0.0222142  -0.305207   0.6013      1.00001     0.342859
  0.0368214  -0.767265   -0.405041  -0.0240522   0.342859    1.00001

In [54]:
λ,U=eigen(A) 
μ=eigvals(B)
[λ μ]


Out[54]:
6×2 Array{Float64,2}:
 0.134355  0.134367
 0.240809  0.240816
 0.552617  0.552625
 1.11466   1.11467
 1.75775   1.75776
 2.19981   2.19982

In [55]:
norm(ΔA)


Out[55]:
2.948088199478021e-5

In [56]:
# ?round

In [57]:
# Residual bounds (Fact 11) - how close are μ, y to λ[k], U[:,k]
k=3
μ=round(λ[k],sigdigits=2)
y=round.(U[:,k],sigdigits=2)
y=y/norm(y)


Out[57]:
6-element Array{Float64,1}:
  0.35953291062718784
 -0.039948101180798654
  0.7789879730255738
  0.14980537942799493
  0.30959778415118955
  0.3795069612175872

In [58]:
μ


Out[58]:
0.55

In [59]:
# Fact 11
r=A*y-μ*y


Out[59]:
6-element Array{Float64,1}:
 -7.610655168160951e-5
  0.0006875869884527333
  0.0008604983365066987
  0.0027992496470867073
  0.0030679917800974232
  0.0016941327135482631

In [60]:
minimum(abs,μ.-λ), norm(r)


Out[60]:
(0.0026171574694046074, 0.0046192513745848445)

In [61]:
# Fact 12 - μ is the Rayleigh quotient
μ=y⋅(A*y)
r=A*y-μ*y


Out[61]:
6-element Array{Float64,1}:
 -0.001020818291133485
  0.0007925549595029432
 -0.0011863770989723466
  0.0024056197556484166
  0.0022544900044583205
  0.0006969369885712895

In [62]:
η=min(abs(μ-λ[k-1]),abs(μ-λ[k+1]))


Out[62]:
0.3118185216602711

In [63]:
μ-λ[k], norm(r)^2/η


Out[63]:
(1.0451051339344097e-5, 4.628704108553157e-5)

In [64]:
# Eigenvector bound
# cos(θ)
cosθ=dot(y,U[:,k])
# sin(θ)
sinθ=sqrt(1-cosθ^2)
sinθ,norm(r)/η


Out[64]:
(0.003128568489950806, 0.01218368762036048)

In [65]:
# Residual bounds - Facts 13 and 14
U=eigvecs(A)
Q=round.(U[:,1:3],sigdigits=2)
# Orthogonalize
F=qr(Q)
X=Matrix(F.Q)
M=X'*A*X
R=A*X-X*M
μ=eigvals(M)
R


Out[65]:
6×3 Array{Float64,2}:
 -0.000456137  -0.00310462   0.000923572
  0.00284021   -0.00429453  -0.000566894
 -0.00155528   -0.00204963   0.00115593
  0.00503226   -0.00094277  -0.0023043
  0.00382348    0.00272604  -0.00247637
 -0.00118206    0.00484457  -0.000377551

In [66]:
λ


Out[66]:
6-element Array{Float64,1}:
 0.13435492470934074
 0.24080908686047292
 0.5526171574694047
 1.1146596965216333
 1.7577479353532806
 2.1998111990858704

In [67]:
μ


Out[67]:
3-element Array{Float64,1}:
 0.13438771985318887
 0.24085290581036162
 0.5526281413932816

In [68]:
# Is the ordering of μ guaranteed? Which algorithm was called?
issymmetric(M)


Out[68]:
false

In [69]:
M=Hermitian(M)
R=A*X-X*M
μ=eigvals(M)


Out[69]:
3-element Array{Float64,1}:
 0.13438771985318895
 0.24085290581036148
 0.5526281413932815

In [70]:
η=λ[4]-λ[3]


Out[70]:
0.5620425390522287

In [71]:
norm(λ[1:3]-μ), norm(R)^2/η


Out[71]:
(5.582354712031574e-5, 0.0002316576033240425)

In [72]:
A


Out[72]:
6×6 Array{Float64,2}:
  1.0        -0.0972412  -0.108436   0.0413051  -0.327659    0.0368238
 -0.0972412   1.0         0.381665   0.271144    0.0222139  -0.767267
 -0.108436    0.381665    1.0       -0.315365   -0.30521    -0.405039
  0.0413051   0.271144   -0.315365   1.0         0.601295   -0.0240472
 -0.327659    0.0222139  -0.30521    0.601295    1.0         0.342861
  0.0368238  -0.767267   -0.405039  -0.0240472   0.342861    1.0

In [73]:
# Fact 15
M=A[1:3,1:3]
H=A[4:6,4:6]
E=A[4:6,1:3]
# Block-diagonal matrix
B=cat(M,H,dims=(1,2))


Out[73]:
6×6 Array{Float64,2}:
  1.0        -0.0972412  -0.108436   0.0        0.0        0.0
 -0.0972412   1.0         0.381665   0.0        0.0        0.0
 -0.108436    0.381665    1.0        0.0        0.0        0.0
  0.0         0.0         0.0        1.0        0.601295  -0.0240472
  0.0         0.0         0.0        0.601295   1.0        0.342861
  0.0         0.0         0.0       -0.0240472  0.342861   1.0

In [74]:
η=minimum([abs(μ-ν) for μ in eigvals(M), ν in eigvals(H)])
μ=eigvals(B)
[λ μ]


Out[74]:
6×2 Array{Float64,2}:
 0.134355  0.297292
 0.240809  0.618158
 0.552617  0.951067
 1.11466   1.02069
 1.75775   1.43078
 2.19981   1.68202

In [75]:
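# Fact 15 states the bound with ‖E‖₂; norm(E) is the Frobenius norm, which
# dominates ‖E‖₂, so the computed quantity is still a valid upper bound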
2*norm(E)^2/(η+sqrt(η^2+4*norm(E)^2))


Out[75]:
1.028679107284606

In [76]:
# Eigenspace bounds - Fact 14
B=A+ΔA
μ,V=eigen(B)


Out[76]:
Eigen{Float64,Float64,Array{Float64,2},Array{Float64,1}}
values:
6-element Array{Float64,1}:
 0.1343671788674145
 0.24081590442837636
 0.5526254737571562
 1.1146730780095462
 1.7577617744522933
 2.1998165904852134
vectors:
6×6 Array{Float64,2}:
 -0.150289    0.267253   0.360222    0.842404   0.258021    0.00315969
  0.565226    0.465177  -0.0400703   0.116816  -0.472223   -0.475273
 -0.0700672  -0.216443   0.779651   -0.295464   0.0656684  -0.498775
  0.135886   -0.649499   0.148268    0.352715  -0.609192    0.205376
 -0.450652    0.482535   0.307121   -0.204349  -0.52861     0.385438
  0.656865    0.103191   0.380095   -0.152459   0.235745    0.578436

In [77]:
# sin(Θ(U[:,1:3],V[:,1:3]))
X=U[:,1:3]
Q=V[:,1:3]
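# By the definition of canonical angles, the factors should be (Q'*Q)^(-1/2)
# and (X'*X)^(-1/2); both are numerically the identity here since the columns
# of Q and X are orthonormal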
cosθ=svdvals(sqrt(Q'*Q)*Q'*X*sqrt(X'*X))
sinθ=sqrt.(1 .-cosθ.^2)


Out[77]:
3-element Array{Float64,1}:
 1.1920928955078125e-7
 1.971519923615451e-6
 5.452684898236986e-6

In [78]:
# Bound
M=Q'*A*Q


Out[78]:
3×3 Array{Float64,2}:
  0.134355    -3.7484e-6    3.68613e-6
 -3.7484e-6    0.240809    -2.59057e-6
  3.68613e-6  -2.59057e-6   0.552617

In [79]:
R=A*Q-Q*M


Out[79]:
6×3 Array{Float64,2}:
 -6.96578e-7  -2.86858e-6   2.09432e-6
  2.01828e-6  -2.31992e-7  -8.23208e-7
  5.43376e-7   1.04884e-6  -1.0602e-6
  1.40336e-6  -1.0944e-6    2.13044e-7
  8.96453e-7   7.28267e-7  -6.76272e-7
 -1.51341e-6   3.81222e-7   5.66406e-7

In [80]:
eigvals(M), λ


Out[80]:
([0.13435492471516355, 0.24080908687332286, 0.5526171574792029], [0.13435492470934074, 0.24080908686047292, 0.5526171574694047, 1.1146596965216333, 1.7577479353532806, 2.1998111990858704])

In [81]:
η=abs(eigvals(M)[3]-λ[4])
norm(sinθ), norm(R)/η


Out[81]:
(5.799385679827611e-6, 9.445799452076198e-6)