The reader should be familiar with basic linear algebra concepts and notebooks related to eigenvalue decomposition.
The reader should be able to understand and check the facts about the singular value decomposition.
There are many excellent books on the subject. Here we list a few:
J. W. Demmel, Applied Numerical Linear Algebra
G. H. Golub and C. F. Van Loan, Matrix Computations
N. Higham, Accuracy and Stability of Numerical Algorithms
L. Hogben, ed., Handbook of Linear Algebra
For more details and the proofs of the Facts below, see R. C. Li, Matrix Perturbation Theory and R. Mathias, Singular Values and Singular Value Inequalities and the references therein.
Let $A\in\mathbb{C}^{m\times n}$ and let $q=\min\{m,n\}$.
The singular value decomposition (SVD) of $A$ is
$$A=U\Sigma V^*,$$where $U\in\mathbb{C}^{m\times m}$ and $V\in\mathbb{C}^{n\times n}$ are unitary, and $\Sigma=\mathop{\mathrm{diag}}(\sigma_1,\sigma_2,\ldots)\in\mathbb{R}^{m\times n}$ with all $\sigma_j\geq 0$.
Here $\sigma_j$ is the singular value, $u_j\equiv U_{:,j}$ is the corresponding left singular vector, and $v_j\equiv V_{:,j}$ is the corresponding right singular vector.
The set of singular values is $\mathrm{sv}(A)=\{\sigma_1,\sigma_2,\ldots,\sigma_{q}\}$.
We assume that singular values are ordered, $\sigma_1\geq\sigma_2\geq\cdots\geq\sigma_q\geq 0$.
The Jordan-Wielandt matrix is the Hermitian matrix $$ J=\begin{bmatrix}0 & A \\ A^* & 0 \end{bmatrix} \in \mathbb{C}^{(m+n) \times (m+n)}. $$
There are many facts related to the singular value problem for general matrices. We state some basic ones:
If $A\in\mathbb{R}^{m\times n}$, then $U$ and $V$ can be chosen to be real (orthogonal).
Singular values are uniquely determined by the matrix; singular vectors, in general, are not.
$\sigma_j(A^T)=\sigma_j(A^*)=\sigma_j(\bar A)=\sigma_j(A)$ for $j=1,2,\ldots,q$.
$A v_j=\sigma_j u_{j}$ and $A^* u_{j}=\sigma_j v_{j}$ for $j=1,2,\ldots,q$.
$A=\sigma_1 u_{1} v_{1}^* + \sigma_2 u_{2} v_{2}^* +\cdots + \sigma_q u_{q} v_{q}^*$.
Unitary invariance. For any unitary $U\in\mathbb{C}^{m\times m}$ and $V\in\mathbb{C}^{n\times n}$, $\mathrm{sv}(A)=\mathrm{sv}(UAV)$.
For $B\in\mathbb{C}^{m\times n}$, there exist unitary matrices $U\in\mathbb{C}^{m\times m}$ and $V\in\mathbb{C}^{n\times n}$ such that $A=UBV$ if and only if $\mathrm{sv}(A)=\mathrm{sv}(B)$.
SVD of $A$ is related to eigenvalue decompositions of Hermitian matrices $A^*A=V\Sigma^T\Sigma V^*$ and $AA^*=U\Sigma\Sigma^TU^*$. Thus, $\sigma_j^2(A)=\lambda_j(A^*A)=\lambda_j(AA^*)$ for $j=1,2,\ldots,q$.
The eigenvalues of the Jordan-Wielandt matrix are $\pm \sigma_1(A), \pm\sigma_2(A), \ldots, \pm\sigma_q(A)$ together with $|m-n|$ zeros. The eigenvectors are obtained from an SVD of $A$. This relationship is used to deduce singular value results from results on eigenvalues of Hermitian matrices.
$\mathop{\mathrm{trace}}(|A|_{spr})=\sum_{i=1}^q \sigma_i$, where $|A|_{spr}=(A^*A)^{1/2}$.
If $A$ is square, then $|\det(A)|=\prod_{i=1}^n \sigma_i$.
If $A$ is square, then $A$ is singular $\Leftrightarrow$ $\sigma_j(A)=0$ for some $j$.
Min-max Theorem. For $k=1,\ldots,q$, \begin{align*} \sigma_k& =\max_{\dim(W)=k}\min_{x\in W, \ \|x\|_2=1} \|Ax\|_2\\ & =\min_{\dim(W)=n-k+1}\max_{x\in W, \ \|x\|_2=1} \|Ax\|_2, \end{align*} where $W$ ranges over subspaces of $\mathbb{C}^n$.
$\|A\|_2=\sigma_1(A)$.
For $B\in\mathbb{C}^{m\times n}$, $$ |\mathop{\mathrm{trace}}(AB^*)|\leq \sum_{j=1}^q \sigma_j(A)\sigma_j(B). $$
Interlace Theorems. Let $B$ denote $A$ with the one of its rows or columns deleted. Then $$ \sigma_{j+1}(A)\leq \sigma_j(B)\leq \sigma_j(A),\quad j=1,\ldots,q-1. $$ Let $B$ denote $A$ with a row and a column deleted. Then $$ \sigma_{j+2}(A)\leq \sigma_j(B)\leq \sigma_j(A),\quad j=1,\ldots,q-2. $$
Weyl Inequalities. For $B\in\mathbb{C}^{m\times n}$, \begin{align*} \sigma_{j+k-1}(A+B)& \leq \sigma_j(A)+\sigma_k(B), \quad j+k\leq q+1,\\ \sum_{j=1}^k \sigma_j(A+B)& \leq \sum_{j=1}^k \sigma_j(A) + \sum_{j=1}^k \sigma_j(B), \quad k=1,\ldots,q. \end{align*}
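Most of these facts are verified experimentally in the cells below; here is a small self-contained sketch (random seed, sizes, and all variable names are ad hoc) checking two that the later cells do not revisit, unitary invariance (Fact 6) and the min-max characterization (Fact 13):

```julia
using LinearAlgebra, Random
Random.seed!(123)
m, n = 5, 3
A = rand(m, n)
U, σ, V = svd(A)

# Fact 6: multiplying by unitary (here real orthogonal) matrices
# preserves the singular values.
Q₁ = Matrix(qr(randn(m, m)).Q)
Q₂ = Matrix(qr(randn(n, n)).Q)
norm(svdvals(Q₁ * A * Q₂) - σ)          # ≈ 0

# Fact 13: on the maximizing subspace W = span(v₁,…,v_k), every unit
# vector x = W*c has ‖Ax‖₂ ≥ σ_k, with equality attained at x = v_k.
k = 2
W = V[:, 1:k]
xs = [W * normalize(randn(k)) for _ = 1:100]  # random unit vectors in W
minimum(norm(A * x) for x in xs), norm(A * V[:, k]), σ[k]
```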
In [1]:
using SymPy
In [2]:
A=[ 3 2 1
-5 -1 -4
5 0 2]
Out[2]:
In [3]:
@vars x
Out[3]:
In [4]:
B=A'*A
Out[4]:
In [5]:
using LinearAlgebra
In [6]:
# Characteristic polynomial p_B(λ)
eye(n)=Matrix{Int}(I,n,n)
p(x)=simplify(det(B-x*eye(3)))
p(x)
Out[6]:
In [7]:
λ=solve(p(x),x)
Out[7]:
In [8]:
map(Float64,λ)
Out[8]:
In [9]:
V=Array{Float64}(undef,3,3)
for j=1:3
V[:,j]=nullspace(B-map(Float64,λ[j])*I)
end
V
Out[9]:
In [10]:
U=Array{Float64}(undef,3,3)
for j=1:3
U[:,j]=nullspace(A*A'-map(Float64,λ[j])*I)
end
U
Out[10]:
In [11]:
σ=sqrt.(λ)
Out[11]:
In [12]:
V'*A'*A*V
Out[12]:
In [13]:
V*=Diagonal(sign.(U'*A*V))
Out[13]:
In [14]:
A-U*Diagonal(map(Float64,σ))*V'
Out[14]:
In [15]:
S=svd(A)
Out[15]:
In [16]:
S.Vt
Out[16]:
In [17]:
typeof(S)
Out[17]:
In [18]:
U₁=S.U
σ₁=S.S
V₁=S.V
Out[18]:
In [19]:
A-S.U*Diagonal(S.S)*S.Vt
Out[19]:
Observe the signs of the columns!
In [20]:
V
Out[20]:
In [21]:
U₁
Out[21]:
In [22]:
U
Out[22]:
In [23]:
?svd;
In [24]:
import Random
Random.seed!(421)
m=5
n=3
q=min(m,n)
A=rand(ComplexF64,m,n)
Out[24]:
In [25]:
U,σ,V=svd(A,full=true)
U
Out[25]:
In [26]:
V
Out[26]:
In [27]:
U,σ,V=svd(A)
U
Out[27]:
In [28]:
σ
Out[28]:
In [29]:
V
Out[29]:
In [30]:
norm(A-U[:,1:q]*Diagonal(σ)*V'), norm(U'*U-I), norm(V'*V-I)
Out[30]:
In [31]:
# Fact 4
@show k=rand(1:q)
norm(A*V[:,k]-σ[k]*U[:,k],Inf), norm(A'*U[:,k]-σ[k]*V[:,k],Inf)
Out[31]:
In [32]:
λ₁,V₁=eigen(A'*A)
Out[32]:
In [33]:
sqrt.(λ₁)
Out[33]:
In [34]:
λU,U₁=eigen(A*A')
Out[34]:
In [35]:
V
Out[35]:
In [36]:
V₁
Out[36]:
In [37]:
abs.(V'*V₁)
Out[37]:
Explain non-uniqueness of $U$ and $V$!
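As a hint, one source of non-uniqueness in the complex case: each singular vector pair $(u_j,v_j)$ is determined only up to a common unit phase, since $(UD)\Sigma(VD)^*=U\Sigma DD^*V^*=U\Sigma V^*$ for any diagonal matrix $D$ of unit-modulus entries. A self-contained sketch (fresh random matrix, not tied to the cells above):

```julia
using LinearAlgebra, Random
Random.seed!(421)
A = rand(ComplexF64, 5, 3)
U, σ, V = svd(A)

# Multiply each pair (uⱼ, vⱼ) by a common random unit phase e^{iθⱼ}:
D = Diagonal(exp.(2π * im * rand(3)))
U₂ = U * D
V₂ = V * D
norm(A - U₂ * Diagonal(σ) * V₂')   # still ≈ 0: another valid SVD of A
```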
In [38]:
# Jordan-Wielandt matrix
J=[zero(A*A') A; A' zero(A'*A)]
Out[38]:
In [39]:
round.(abs.(J),digits=2)
Out[39]:
In [40]:
λJ,UJ=eigen(J)
Out[40]:
In [41]:
λJ
Out[41]:
In [42]:
m=8
n=5
q=min(m,n)
A=rand(-9:9,m,n)
Out[42]:
In [43]:
U,σ,V=svd(A)
Out[43]:
In [44]:
# Fact 10
tr(sqrt(A'*A)), sum(σ)
Out[44]:
In [45]:
# Fact 11
B=rand(n,n)
det(B), prod(svdvals(B))
Out[45]:
In [46]:
# Fact 14
norm(A), opnorm(A), σ[1]
Out[46]:
In [47]:
# Fact 15
B=rand(m,n)
abs(tr(A*B')), svdvals(A)⋅svdvals(B)
Out[47]:
In [48]:
# Interlace Theorems (repeat several times)
j=rand(1:q)
σBrow=svdvals(A[[1:j-1;j+1:m],:])
σBcol=svdvals(A[:,[1:j-1;j+1:n]])
j, σ, σBrow, σBcol
Out[48]:
In [49]:
σ[1:end].>=σBrow, σ[1:end-1].>=σBcol,
σ[2:end].<=σBrow[1:end-1], σ[2:end].<=σBcol
Out[49]:
In [50]:
# Weyl Inequalities
B=rand(m,n)
μ=svdvals(B)
γ=svdvals(A+B)
[γ σ μ]
Out[50]:
In [51]:
@show k=rand(1:q)
sum(γ[1:k]),sum(σ[1:k])+sum(μ[1:k])
Out[51]:
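The cell above checks only the second Weyl inequality (the sum bound) for a random $k$; the first one, $\sigma_{j+k-1}(A+B)\leq\sigma_j(A)+\sigma_k(B)$, can be checked over all admissible index pairs at once. A self-contained sketch (fresh random matrices of the same sizes, not the $A$ and $B$ above):

```julia
using LinearAlgebra, Random
Random.seed!(421)
m, n = 8, 5
q = min(m, n)
A = rand(-9:9, m, n)
B = rand(m, n)
σ = svdvals(A); μ = svdvals(B); γ = svdvals(A + B)

# First Weyl inequality for every pair (j,k) with j + k ≤ q + 1
# (small tolerance added for floating-point roundoff)
all(γ[j + k - 1] ≤ σ[j] + μ[k] + 1e-12
    for j = 1:q, k = 1:q if j + k ≤ q + 1)   # true
```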
Let $A=U\Sigma V^*$, let $\tilde \Sigma$ be equal to $\Sigma$ except that $\tilde \Sigma_{jj}=0$ for $j>k$, and let $\tilde A=U\tilde \Sigma V^*$. Then $\mathop{\mathrm{rank}}(\tilde A)\leq k$ and
\begin{align*} \min\{\|A-B\|_2: \mathop{\mathrm{rank}}(B)\leq k\} & =\|A-\tilde A\|_2=\sigma_{k+1}(A)\\ \min\{\|A-B\|_F: \mathop{\mathrm{rank}}(B)\leq k\} & =\|A-\tilde A\|_F= \bigg(\sum_{j=k+1}^{q}\sigma_{j}^2(A)\bigg)^{1/2}. \end{align*}This is the Eckart-Young-Mirsky Theorem.
In [52]:
A
Out[52]:
In [53]:
σ
Out[53]:
In [54]:
# @show k=rand(1:q-1)
k=3
B=U*Diagonal([σ[1:k];zeros(q-k)])*V'
# B=U[:,1:k]*Diagonal(σ[1:k])*V[:,1:k]'
Out[54]:
In [55]:
A
Out[55]:
In [56]:
opnorm(A-B), σ[k+1]
Out[56]:
In [57]:
norm(A-B),norm(σ[k+1:q])
Out[57]: