This small notebook shows naive implementations of algorithms to compute the eigenvalues and eigenvectors of a matrix, written in the Julia programming language.
In [1]:
using LinearAlgebra  # for norm, eigen, eigvals, rank, diag, inv and the identity I

A = [ 1. 2 3; 4 5 6; 7 8 9 ]
Out[1]:
In [2]:
B = [12 -51 4; 6 167 -68; -4 24 -41]
Out[2]:
In [3]:
function check_DV(A, D, V)
    n = size(A, 1)
    Id = Matrix{Float64}(I, n, n)  # n×n identity matrix
    for i = 1:n
        check = (A - D[i] * Id) * V[:,i]
        println("For the ", i, "th eigenvalue and eigenvector, (A - lambda I) * v = ", check, "\n\twhich has a L2 norm = ", norm(check))
        @assert norm(check) <= 1e-14
    end
end
Out[3]:
In [4]:
D_A, V_A = eigen(A)
Out[4]:
In [5]:
check_DV(A, D_A, V_A)
In [6]:
D_B, V_B = eigen(B)
Out[6]:
In [7]:
check_DV(B, D_B, V_B)
In [8]:
"Power iteration method on A, to find eigenpair with largest eigen value."
function poweriteration(A, maxiter=20, userandomstart=true)
n = size(A)[1]
if userandomstart
X = randn((n, 1))
else
X = ones((n, 1))
end
for i = 1:maxiter
next_X = A * X
X = next_X / norm(next_X)
end
# lambda = (X^* ⋅ A ⋅ X) / (X^* ⋅ X)
lambda = (transpose(conj(X)) * (A * X)) / (transpose(conj(X)) * X)
lambda, X
end
Out[8]:
For instance, we correctly obtain the largest eigenpair for our example $A$ (up to a $-1$ sign in $v$).
In [9]:
poweriteration(A)
Out[9]:
In [10]:
D_A, V_A = eigen(A)
Out[10]:
In [11]:
(D_A[end], -1 * V_A[:,end])  # eigen() sorts the eigenvalues in increasing order, so the largest one is last
Out[11]:
What is the typical dependence on the number of iterations?
In [12]:
poweriteration
Out[12]:
In [13]:
println(poweriteration(A, 1))
println(poweriteration(A, 2))
println(poweriteration(A, 5))
println(poweriteration(A, 10))
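As a side note (this estimate is my addition, using the standard convergence result for power iteration): for a diagonalizable matrix, the error shrinks roughly like $|\lambda_2 / \lambda_1|^k$ after $k$ iterations, where $\lambda_1, \lambda_2$ are the two eigenvalues of largest modulus. A minimal sketch to estimate this rate for our $A$:

vals = sort(eigvals(A), by=abs, rev=true)  # eigenvalues sorted by decreasing modulus
println("Expected linear convergence rate |lambda_2 / lambda_1| ≈ ", abs(vals[2] / vals[1]))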
The QR method can be used to compute the eigenvalues, and then the inverse iteration method to compute an approximation of a corresponding eigenvector for each eigenvalue.
First, we need to implement the QR decomposition, which can be based on the Gram-Schmidt process.
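As a reminder, the Gram-Schmidt process builds, from the columns $a_1, \dots, a_n$ of $A$:

$$ u_i = a_i - \sum_{j=1}^{i-1} \mathrm{proj}_{u_j}(a_i), \qquad \mathrm{proj}_{u}(a) = \frac{\langle u, a \rangle}{\langle u, u \rangle} u, \qquad e_i = \frac{u_i}{\lVert u_i \rVert}, $$

and the $e_i$ form an orthonormal basis when $A$ is full rank.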
In [14]:
"Inner product of two vectors v, w. Can be vertical (n,1) or horizontal (1,n) vectors."
function inner(v, w)
assert(size(v) == size(w))
nm = size(v)
if length(nm) == 1
return transpose(conj(v)) * w
elseif nm[1] == 1
return conj(v) * transpose(w)
end
end
Out[14]:
It works for both $(1,n)$ and $(n,1)$ vectors:
In [15]:
inner([1 1 1], [2 3 4])
Out[15]:
In [16]:
inner([1; 1; 1], [2; 3; 4])
Out[16]:
Then a projection operator:
In [17]:
"projection(a, e): projection operator of vector a onto base vector e. Uses inner(e, a) and inner(e, e)."
function projection(a, e)
return (inner(e, a) / inner(e, e)) * e
end
Out[17]:
Then the Gram-Schmidt process:
In [18]:
"gramschmidt(A): Gram-Schmidt orthogonalization operator, returns us, es (unormalized matrix, base matrix)."
function gramschmidt(A, verbose=false)
assert(size(A)[1] == size(A)[2])
n = size(A)[1]
if verbose
println("n = ", n)
end
us = zeros((n, n))
es = zeros((n, n))
for i = 1:n
if verbose
println("i = ", i)
end
us[:,i] = A[:,i]
for j = 1:i-1
if verbose
println("\tj = ", j)
println("\tus[:,i] = ", us[:,i])
end
us[:,i] -= projection(A[:,i], us[:,j])
end
if verbose
println("us[:,i] = ", us[:,i])
end
es[:,i] = us[:,i] / norm(us[:,i])
if verbose
println("es[:,i] = ", es[:,i])
end
end
return us, es
end
Out[18]:
Example:
In [19]:
A
Out[19]:
In [20]:
us, es = gramschmidt(A)
Out[20]:
In [21]:
us
Out[21]:
In [22]:
es
Out[22]:
We can check that $a_1 = \langle e_1, a_1 \rangle e_1$, $a_2 = \langle e_1, a_2 \rangle e_1 + \langle e_2, a_2 \rangle e_2$, and so on:
In [23]:
inner(es[:,1], A[:,1]) * es[:,1]
Out[23]:
In [24]:
inner(es[:,1], A[:,2]) * es[:,1] + inner(es[:,2], A[:,2]) * es[:,2]
Out[24]:
It does not hold for the last column, as $A$ is not full rank:
In [25]:
rank(A)
Out[25]:
In [26]:
inner(es[:,1], A[:,3]) * es[:,1] + inner(es[:,2], A[:,3]) * es[:,2] + inner(es[:,3], A[:,3]) * es[:,3]
Out[26]:
$E$ is indeed an orthogonal matrix, at least on its first two columns (the third one is meaningless since $A$ has rank 2):
In [27]:
[ norm(es[:,i]) for i = 1:size(es)[1] ]
Out[27]:
In [28]:
(es' * es)[1:2,1:2]
Out[28]:
Second example:
In [29]:
us, es = gramschmidt(B)
Out[29]:
In [30]:
[ norm(es[:,i]) for i = 1:size(es)[1] ]
Out[30]:
In [31]:
es * es'
Out[31]:
Last example:
In [32]:
V = [3 2; 1 2]
Out[32]:
In [33]:
us, es = gramschmidt(V)
Out[33]:
In [34]:
es' * es
Out[34]:
In [35]:
"QR decomposition, returns (Q, R): Q is orthogonal and R is upper-triangular, such that A = Q R."
function QR(A)
assert(size(A)[1] == size(A)[2])
n = size(A)[1]
us, es = gramschmidt(A)
Q = copy(es)
R = zeros((n, n))
for i = 1:n
for j = i:n
R[i,j] = inner(es[:,i], A[:,j])
end
end
return Q, R
end
Out[35]:
First example:
In [36]:
Q, R = QR(A)
Out[36]:
In [37]:
A
Out[37]:
In [38]:
Q
Out[38]:
In [39]:
R
Out[39]:
In [40]:
Q * R
Out[40]:
Second example:
In [41]:
Q2, R2 = QR(B)
Out[41]:
In [42]:
B
Out[42]:
In [43]:
Q2
Out[43]:
In [44]:
R2
Out[44]:
In [45]:
Q2 * R2
Out[45]:
Due to numerical errors, $B \neq Q_2 R_2$ exactly, but almost:
In [46]:
B == Q2 * R2
Out[46]:
In [47]:
@assert norm(B - (Q2 * R2)) <= 1e-13
In [48]:
"Apply the QR method for maxiter steps. Should return a triangular matrix similar to A (same eigenvalues)."
function QR_method(A, maxiter=50)
Ak = A
for k = 1:maxiter
Qk, Rk = QR(Ak)
Ak = Rk * Qk
end
return Ak
end
Out[48]:
It should produce matrices which are almost upper triangular!
In [49]:
Ak = QR_method(A)
Out[49]:
In [50]:
"Truncate to zero values under the diagonal (have to be smaller than tolerance)"
function truncate_to_zero_below_diag(A, tolerance=1e-12)
assert(size(A)[1] == size(A)[2])
n = size(A)[1]
for j = 1:n
for i = j+1:n
assert(norm(A[i,j]) <= tolerance)
A[i,j] = 0
end
end
return A
end
Out[50]:
In [51]:
truncate_to_zero_below_diag(Ak)
Out[51]:
In [52]:
Ak = QR_method(B)
Out[52]:
In [53]:
truncate_to_zero_below_diag(Ak)
Out[53]:
Finally, we can extract the eigenvalues from the diagonal:
In [54]:
"Extract the diagonal from the QR method."
function QR_eigenvalues(A, maxiter=50)
return diag(QR_method(A, maxiter))
end
Out[54]:
It does not work for matrices which are not full rank (our Gram-Schmidt step divides by a nearly-zero norm):
In [55]:
QR_eigenvalues(A)
Out[55]:
In [56]:
eigvals(A)
Out[56]:
But it works fine for a full-rank matrix:
In [57]:
QR_eigenvalues(B)
Out[57]:
In [58]:
eigvals(B)
Out[58]:
Now, we have a numerical algorithm that gives an approximation of the eigenvalues. For each eigenvalue $\lambda$, we can use inverse iteration to find the corresponding eigenvector: the eigenvalues of $(A - \lambda I)^{-1}$ are the $1/(\lambda_i - \lambda)$, the largest of which (in modulus) corresponds to the $\lambda_i$ closest to $\lambda$, so power iteration on $(A - \lambda I)^{-1}$ converges to the desired eigenvector.
In [59]:
"Inverse iteration method to find the eigen vector corresponding to a given eigen value."
function inverseIteration(A, val, maxiter=20, userandomstart=true)
mu_I = val * eye(A)
inv_A_minus_muI = inv(A - mu_I)
n = size(A)[1]
if userandomstart
X = randn((n, 1))
else
X = ones((n, 1))
end
for i = 1:maxiter
next_X = inv_A_minus_muI * X
X = next_X / norm(next_X)
end
X
end
Out[59]:
Now, putting it all together:
In [60]:
"Approximation of D, V: D contains the eigen values (vector) and V the eigen vectors (column based)."
function QR_eigen(A, maxiter=20, userandomstart=true)
n = size(A)[1]
D = QR_eigenvalues(A, maxiter)
V = zeros((n,n))
for i = 1:n
V[:,i] = inverseIteration(A, D[i], maxiter, userandomstart)
end
return D, V
end
Out[60]:
In [61]:
D2, V2 = QR_eigen(B)
Out[61]:
In [62]:
D2s, V2s = eigen(B)
Out[62]:
Up to a permutation, and up to multiplication of some eigenvectors by $-1$, we get the right values!
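As an extra sanity check (this verification is my addition, not part of the original run), each computed pair should nearly satisfy $(B - \lambda_i I) v_i \approx 0$; the residuals are only approximately zero, since the QR method is iterative:

for i = 1:size(B, 1)
    residual = norm((B - D2[i] * I) * V2[:,i])  # I is the UniformScaling identity
    println("Residual for the ", i, "th eigenpair: ", residual)
end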
It is well known that finding eigenpairs and finding the roots of a polynomial are equivalent problems (thanks to companion matrices). The Abel-Ruffini theorem states that there is no general closed-form solution in radicals for the roots of polynomials of degree larger than $4$, and therefore there is no hope of finding an exact (finite) algorithm for the eigenpair problem.
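To illustrate this equivalence (a small example of my own, not from the original notebook): the eigenvalues of the companion matrix of a monic polynomial are exactly its roots. For instance, for $p(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)$:

# Companion matrix of p: sub-diagonal of ones, last column = [-c_0; -c_1; -c_2]
# for p(x) = x^3 + c_2 x^2 + c_1 x + c_0 (here c_2 = -6, c_1 = 11, c_0 = -6).
C = [0 0 6;
     1 0 -11;
     0 1 6]
eigvals(C)  # ≈ [1.0, 2.0, 3.0], the roots of p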
Hence, we focused on only one numerical (i.e., iterative) method.
It's sad that this notebook does not give more details about general algorithms for the eigenpair problem.
That's it for today, folks! See here for other notebooks I wrote.