Morten Hjorth-Jensen Email morten.hjorth-jensen@fys.uio.no, Department of Physics and Center of Mathematics for Applications, University of Oslo and National Superconducting Cyclotron Laboratory, Michigan State University
Date: Fall 2015
Eigenvalue problems and project 2.
Discussion of Jacobi's algorithm, chapters 7.1-7.4
Presentation of project 2.
Discussion of Householder's and Francis' algorithms, chapter 7.5
Power methods, chapter 7.6
Lanczos' method, chapter 7.7
Let us consider the matrix ${\bf A}$ of dimension $n$. The eigenvalues of ${\bf A}$ are defined through the matrix equation
where $\lambda^{(\nu)}$ are the eigenvalues and ${\bf x}^{(\nu)}$ the corresponding eigenvectors. Unless otherwise stated, when we use the wording eigenvector we mean the right eigenvector. The left eigenvalue problem is defined as
with ${\bf I}$ being the unity matrix. This equation has a non-trivial solution if and only if the determinant is zero, namely
or
The set of these roots is called the spectrum and is denoted as $\lambda({\bf A})$. If $\lambda({\bf A})=\left\{\lambda_1,\lambda_2,\dots ,\lambda_n\right\}$ then we have
and if we define the trace of ${\bf A}$ as
then
The Abel-Ruffini theorem (also known as Abel's impossibility theorem) states that there is no general solution in radicals to polynomial equations of degree five or higher.
The content of this theorem is frequently misunderstood. It does not assert that higher-degree polynomial equations are unsolvable. In fact, if the polynomial has real or complex coefficients, and we allow complex solutions, then every polynomial equation has solutions; this is the fundamental theorem of algebra. Although these solutions cannot always be computed exactly with radicals, they can be computed to any desired degree of accuracy using numerical methods such as the Newton-Raphson method or Laguerre method, and in this way they are no different from solutions to polynomial equations of the second, third, or fourth degrees.
The theorem only concerns the form that such a solution must take. The content of the theorem is that the solution of a higher-degree equation cannot in all cases be expressed in terms of the polynomial coefficients with a finite number of operations of addition, subtraction, multiplication, division and root extraction. Some polynomials of arbitrary degree, of which the simplest nontrivial example is the monomial equation $ax^n = b$, are always solvable with a radical.
The Abel-Ruffini theorem says that there are some fifth-degree equations whose solution cannot be so expressed. The equation $x^5 - x + 1 = 0$ is an example. Some other fifth degree equations can be solved by radicals, for example $x^5 - x^4 - x + 1 = 0$. The precise criterion that distinguishes between those equations that can be solved by radicals and those that cannot was given by Galois and is now part of Galois theory: a polynomial equation can be solved by radicals if and only if its Galois group is a solvable group.
Today, in the modern algebraic context, we say that second, third and fourth degree polynomial equations can always be solved by radicals because the symmetric groups $S_2, S_3$ and $S_4$ are solvable groups, whereas $S_n$ is not solvable for $n \ge 5$.
In the present discussion we assume that our matrix is real and symmetric, that is ${\bf A}\in {\mathbb{R}}^{n\times n}$. The matrix ${\bf A}$ has $n$ eigenvalues $\lambda_1\dots \lambda_n$ (distinct or not). Let ${\bf D}$ be the diagonal matrix with the eigenvalues on the diagonal
If ${\bf A}$ is real and symmetric then there exists a real orthogonal matrix ${\bf S}$ such that
and for $j=1:n$ we have ${\bf A}{\bf S}(:,j) = \lambda_j {\bf S}(:,j)$.
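As a small numerical check of this property (an illustration only, using the Armadillo library that we employ later in these notes), the snippet below builds a random symmetric matrix and verifies that the columns of the orthogonal matrix returned by eig_sym satisfy ${\bf A}{\bf S}(:,j) = \lambda_j {\bf S}(:,j)$ to machine precision.
#include <armadillo>
#include <iostream>
using namespace arma;
int main()
{
  int n = 5;
  mat A = randu<mat>(n, n);
  A = 0.5*(A + A.t());            // make the random matrix symmetric
  vec eigval; mat S;
  eig_sym(eigval, S, A);          // A = S D S^T with S orthogonal
  for (int j = 0; j < n; j++) {
    // each residual should be of the order of machine precision
    std::cout << norm(A*S.col(j) - eigval(j)*S.col(j)) << std::endl;
  }
  return 0;
}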
To obtain the eigenvalues of ${\bf A}\in {\mathbb{R}}^{n\times n}$, the strategy is to perform a series of similarity transformations on the original matrix ${\bf A}$, in order to reduce it either into a diagonal form as above or into a tridiagonal form.
We say that a matrix ${\bf B}$ is a similarity transform of ${\bf A}$ if
We multiply the first equation on the left by ${\bf S}^T$ and insert ${\bf S}^{T}{\bf S} = {\bf I}$ between ${\bf A}$ and ${\bf x}$. Then we get
which is the same as
Or apply subsequent similarity transformations so that ${\bf A}$ becomes tridiagonal (Householder) or upper/lower triangular (the QR method to be discussed later).
Thereafter, techniques for obtaining eigenvalues from tridiagonal matrices can be used.
Or use so-called power methods
Or use iterative methods (Krylov, Lanczos, Arnoldi). These methods are popular for huge matrix problems.
The general overview.
One speaks normally of two main approaches to solving the eigenvalue problem.
The first is the formal method, involving determinants and the characteristic polynomial. This proves how many eigenvalues there are, and is the way most of you learned about how to solve the eigenvalue problem, but for matrices of dimensions greater than 2 or 3, it is rather impractical.
The other general approach is to use similarity or unitary transformations to reduce a matrix to diagonal form. This is normally done in two steps: first reduce the matrix to, for example, tridiagonal form, and then to diagonal form. The main algorithms we will discuss in detail, Jacobi's and Householder's algorithms (so-called direct methods) and Lanczos' algorithm (an iterative method), follow this methodology.
Direct or non-iterative methods typically require $O(n^3)$ operations for matrices of dimensionality $n\times n$. These methods are normally called standard methods and are used for dimensionalities $n \sim 10^5$ or smaller.
A brief historical overview:
Year | $n$ | Library/Reference |
---|---|---|
1950 | $n=20$ | (Wilkinson) |
1965 | $n=200$ | (Forsythe et al.) |
1980 | $n=2000$ | Linpack |
1995 | $n=20000$ | Lapack |
2012 | $n\sim 10^5$ | Lapack |
Matrix | ${\bf A}{\bf x}={\bf b}$ | ${\bf A}{\bf x}=\lambda{\bf x}$ |
---|---|---|
${\bf A}={\bf A}^*$ | Conjugate gradient | Lanczos |
${\bf A}\ne {\bf A}^*$ | GMRES etc | Arnoldi |
The Numerical Recipes codes have been rewritten in Fortran 90/95 and C/C++ by us. The original source codes are taken from the widely used software package LAPACK, which follows two other popular packages developed in the 1970s, namely EISPACK and LINPACK.
LINPACK: package for linear equations and least squares problems.
LAPACK: package for solving symmetric, unsymmetric and generalized eigenvalue problems. From LAPACK's website http://www.netlib.org it is possible to download for free all source codes from this library. Both C/C++ and Fortran versions are available.
BLAS (I, II and III): (Basic Linear Algebra Subprograms) are routines that provide standard building blocks for performing basic vector and matrix operations. BLAS I handles vector operations, BLAS II matrix-vector operations and BLAS III matrix-matrix operations.
Consider an example of an ($n\times n$) orthogonal transformation matrix
A similarity transformation
results in
To demonstrate the algorithm, we consider the simple $2\times 2$ similarity transformation of the full matrix. The matrix is symmetric, we single out $ 1 \le k < l \le n$ and use the abbreviations $c=\cos\theta$ and $s=\sin\theta$ to obtain
This means that for our $2\times 2$ case we have
which leads to
since
using the identity $\cot 2\theta=1/2(\cot \theta-\tan\theta)$, we obtain the quadratic equation
resulting in
and $c$ and $s$ are easily obtained via
and $s=tc$. Convince yourself that we have $|\theta| \le \pi/4$. This has the effect
of minimizing the difference between the matrices ${\bf B}$ and ${\bf A}$ since
Choose a tolerance $\epsilon$, making it a small number, typically $10^{-8}$ or smaller.
Set up a while test where one compares the norm of the newly computed off-diagonal matrix elements
$$ \mathrm{off}({\bf A}) = \sqrt{\sum_{i=1}^n\sum_{j=1,j\ne i}^n a_{ij}^2} > \epsilon. $$
Choose the matrix element $a_{kl}$ with the largest absolute value, that is $|a_{kl}|=\mathrm{max}_{i\ne j} |a_{ij}|$.
Compute thereafter $\tau = (a_{ll}-a_{kk})/(2a_{kl})$, $\tan\theta$, $\cos\theta$ and $\sin\theta$.
Compute thereafter the similarity transformation for this set of values $(k,l)$, obtaining the new matrix ${\bf B}= {\bf S}(k,l,\theta)^T {\bf A}{\bf S}(k,l,\theta)$.
Compute the new norm of the off-diagonal matrix elements and continue until you have satisfied $\mathrm{off}({\bf B}) \le \epsilon$.
The convergence rate of the Jacobi method is however poor; one needs typically between $3n^2$ and $5n^2$ rotations, and each rotation requires $4n$ operations, resulting in a total of $12n^3$ to $20n^3$ operations in order to zero out the non-diagonal matrix elements.
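As a concrete example, for $n=100$ this amounts to roughly $3\times 10^4$ to $5\times 10^4$ rotations of $4n=400$ operations each, that is of the order of $10^7$ operations in total.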
We specialize to a symmetric $3\times 3 $ matrix ${\bf A}$. We start the process as follows (assuming that $a_{23}=a_{32}$ is the largest non-diagonal) with $c=\cos{\theta}$ and $s=\sin{\theta}$
We will choose the angle $\theta$ in order to have $a_{23}=a_{32}=0$. We get (symmetric matrix)
or
We repeat then assuming that $b_{12}$ is the largest non-diagonal matrix element and get a new matrix
We continue this process until all non-diagonal matrix elements are zero (ideally). You will notice when performing the above operations that the matrix element $b_{23}$, which was previously zero, becomes different from zero. This is one of the problems which slows down the Jacobi procedure.
The more general expression for the new matrix elements are
// we have defined a matrix A and a matrix R for the eigenvectors, both of dim n x n
// The final matrix R has the eigenvectors as its columns; it is initialized
// to the identity matrix (ones on the diagonal, zeros elsewhere).
....
double tolerance = 1.0E-10;
int iterations = 0;
int maxiter = 1000000;              // upper bound on the number of rotations
double maxnondiag = 2.0*tolerance;  // any value above tolerance so we enter the loop
while ( maxnondiag > tolerance && iterations <= maxiter)
{
int p, q;
maxnondiag = offdiag(A, p, q, n);
Jacobi_rotate(A, R, p, q, n);
iterations++;
}
...
// the offdiag function, using Armadillo
double offdiag(mat A, int &p, int &q, int n)
{
double max = 0.0;
for (int i = 0; i < n; ++i)
{
for ( int j = i+1; j < n; ++j)
{
double aij = fabs(A(i,j));
if ( aij > max)
{
max = aij; p = i; q = j;
}
}
}
return max;
}
// more statements
void Jacobi_rotate ( mat &A, mat &R, int k, int l, int n )
{
double s, c;
if ( A(k,l) != 0.0 ) {
double t, tau;
tau = (A(l,l) - A(k,k))/(2*A(k,l));
if ( tau >= 0 ) {
t = 1.0/(tau + sqrt(1.0 + tau*tau));
} else {
t = -1.0/(-tau +sqrt(1.0 + tau*tau));
}
c = 1/sqrt(1+t*t);
s = c*t;
} else {
c = 1.0;
s = 0.0;
}
double a_kk, a_ll, a_ik, a_il, r_ik, r_il;
a_kk = A(k,k);
a_ll = A(l,l);
A(k,k) = c*c*a_kk - 2.0*c*s*A(k,l) + s*s*a_ll;
A(l,l) = s*s*a_kk + 2.0*c*s*A(k,l) + c*c*a_ll;
A(k,l) = 0.0; // hard-coding non-diagonal elements by hand
A(l,k) = 0.0; // same here
for ( int i = 0; i < n; i++ ) {
if ( i != k && i != l ) {
a_ik = A(i,k);
a_il = A(i,l);
A(i,k) = c*a_ik - s*a_il;
A(k,i) = A(i,k);
A(i,l) = c*a_il + s*a_ik;
A(l,i) = A(i,l);
}
// And finally the new eigenvectors
r_ik = R(i,k);
r_il = R(i,l);
R(i,k) = c*r_ik - s*r_il;
R(i,l) = c*r_il + s*r_ik;
}
return;
} // end of function jacobi_rotate
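As a usage illustration, a minimal driver for the two functions above could look as follows; this is a sketch only, and the test matrix, its dimension and the value of maxiter are arbitrary choices.
// minimal driver sketch for offdiag and Jacobi_rotate (assumes Armadillo)
#include <armadillo>
#include <iostream>
using namespace arma;
double offdiag(mat A, int &p, int &q, int n);
void Jacobi_rotate(mat &A, mat &R, int k, int l, int n);
int main()
{
  int n = 4;
  mat A = randu<mat>(n, n); A = 0.5*(A + A.t()); // random symmetric test matrix
  mat R = eye<mat>(n, n);                        // eigenvector matrix, initialized to I
  double tolerance = 1.0E-10;
  int iterations = 0, maxiter = 100000;
  int p, q;
  double maxnondiag = offdiag(A, p, q, n);
  while (maxnondiag > tolerance && iterations <= maxiter) {
    Jacobi_rotate(A, R, p, q, n);
    maxnondiag = offdiag(A, p, q, n);
    iterations++;
  }
  vec eigenvalues = A.diag();
  eigenvalues = sort(eigenvalues);
  eigenvalues.print("Eigenvalues (diagonal of the rotated matrix):");
  std::cout << "Number of rotations: " << iterations << std::endl;
  return 0;
}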
with $i=1,2,\dots, n$. We need to add to this system the two boundary conditions $u(a) =u_0$ and $u(b) = u_{n+1}$. If we define a matrix
and the corresponding vectors ${\bf u} = (u_1, u_2, \dots,u_n)^T$ and ${\bf f}({\bf u}) = f(x_1,x_2,\dots, x_n,u_1, u_2, \dots,u_n)^T$ we can rewrite the differential equation including the boundary conditions as a system of linear equations with a large number of unknowns
In our case $V(r)$ is the harmonic oscillator potential $(1/2)kr^2$ with $k=m\omega^2$ and $E$ is the energy of the harmonic oscillator in three dimensions. The oscillator frequency is $\omega$ and the energies are
In project 2 we choose $l=0$. Inserting $V(\rho) = (1/2) k \alpha^2\rho^2$ we end up with
We multiply thereafter with $2m\alpha^2/\hbar^2$ on both sides and obtain
The constant $\alpha$ can now be fixed so that
or
Defining
we can rewrite Schroedinger's equation as
where $h$ is our step. Next we define minimum and maximum values for the variable $\rho$, $\rho_{\mathrm{min}}=0$ and $\rho_{\mathrm{max}}$, respectively. You need to check your results for the energies against different values $\rho_{\mathrm{max}}$, since we cannot set $\rho_{\mathrm{max}}=\infty$.
With a given number of steps, $n_{\mathrm{step}}$, we then define the step $h$ as
Define an arbitrary value of $\rho$ as
we can rewrite the Schroedinger equation for $\rho_i$ as
or in a more compact way
and the non-diagonal matrix element
In this case the non-diagonal matrix elements are given by a mere constant. All non-diagonal matrix elements are equal.
With these definitions the Schroedinger equation takes the following form
where $u_i$ is unknown. We can write the latter equation as a matrix eigenvalue problem
or if we wish to be more detailed, we can write the tridiagonal matrix as
Recall that the solutions are known via the boundary conditions at $i=n_{\mathrm{step}}$ and at the other end point, that is for $\rho_0$. The solution is zero in both cases.
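For reference, and consistent with the constants DiagConst and NondiagConst used in the program further below, the diagonal and non-diagonal matrix elements of this tridiagonal problem are
$$ d_i=\frac{2}{h^2}+V_i, \qquad e_i=-\frac{1}{h^2}, $$
with $V_i=\rho_i^2$ for the single-electron oscillator with $l=0$, so that row $i$ of the eigenvalue problem reads $d_iu_i+e_{i-1}u_{i-1}+e_{i+1}u_{i+1}=\lambda u_i$.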
We are going to study two electrons in a harmonic oscillator well which also interact via a repulsive Coulomb interaction. Let us start with the single-electron equation written as
where $E^{(1)}$ stands for the energy with one electron only. For two electrons with no repulsive Coulomb interaction, we have the following Schroedinger equation
Note that we deal with a two-electron wave function $u(r_1,r_2)$ and two-electron energy $E^{(2)}$.
With no interaction this can be written out as the product of two single-electron wave functions, that is, we have a solution in closed form.
We introduce the relative coordinate ${\bf r} = {\bf r}_1-{\bf r}_2$ and the center-of-mass coordinate ${\bf R} = 1/2({\bf r}_1+{\bf r}_2)$. With these new coordinates, the radial Schroedinger equation reads
We add then the repulsive Coulomb interaction between two electrons, namely a term
This equation is similar to the one we had previously in parts (a) and (b) and we introduce again a dimensionless variable $\rho = r/\alpha$. Repeating the same steps, we arrive at
and fix the constant $\alpha$ by requiring
or
we can rewrite Schroedinger's equation as
We treat $\omega_r$ as a parameter which reflects the strength of the oscillator potential.
Here we will study the cases $\omega_r = 0.01$, $\omega_r = 0.5$, $\omega_r =1$ and $\omega_r = 5$ for the ground state only, that is the lowest-lying state.
With no repulsive Coulomb interaction you should get a result which corresponds to the relative energy of a non-interacting system. Make sure your results are stable as functions of $\rho_{\mathrm{max}}$ and the number of steps.
We are only interested in the ground state with $l=0$. We omit the center-of-mass energy.
For specific oscillator frequencies, the above equation has analytic answers, see the article by M. Taut, Phys. Rev. A 48, 3561 - 3566 (1993). The article can be retrieved from the following web address http://prola.aps.org/abstract/PRA/v48/i5/p3561_1.
The following program uses the eigenvalue solver provided by Armadillo and returns the eigenvalues of the lowest-lying states. You can run this code interactively if you use the IPython notebook.
%install_ext https://raw.github.com/dragly/cppmagic/master/cppmagic.py
%load_ext cppmagic
%%cpp -I/usr/local/include -L/usr/local/lib -llapack -lblas -larmadillo
/*
Solves the one-particle Schrodinger equation
for a potential specified in function
potential(). This example is for the harmonic oscillator in 3d
*/
#include <cmath>
#include <iostream>
#include <fstream>
#include <iomanip>
#include <armadillo>
using namespace std;
using namespace arma;
double potential(double);
void output(double, double, int, vec& );
// Begin of main program
int main(int argc, char* argv[])
{
int i, j, Dim, lOrbital;
double RMin, RMax, Step, DiagConst, NondiagConst, OrbitalFactor;
// With spherical coordinates RMin = 0 always
RMin = 0.0;
RMax = 8.0; lOrbital = 0; Dim =2000;
mat Hamiltonian = zeros<mat>(Dim,Dim);
// Integration step length
Step = RMax/ Dim;
DiagConst = 2.0 / (Step*Step);
NondiagConst = -1.0 / (Step*Step);
OrbitalFactor = lOrbital * (lOrbital + 1.0);
// local memory for r and the potential w[r]
vec r(Dim); vec w(Dim);
for(i = 0; i < Dim; i++) {
r(i) = RMin + (i+1) * Step;
w(i) = potential(r(i)) + OrbitalFactor/(r(i) * r(i));
}
// Setting up tridiagonal matrix and brute diagonalization using Armadillo
Hamiltonian(0,0) = DiagConst + w(0);
Hamiltonian(0,1) = NondiagConst;
for(i = 1; i < Dim-1; i++) {
Hamiltonian(i,i-1) = NondiagConst;
Hamiltonian(i,i) = DiagConst + w(i);
Hamiltonian(i,i+1) = NondiagConst;
}
Hamiltonian(Dim-1,Dim-2) = NondiagConst;
Hamiltonian(Dim-1,Dim-1) = DiagConst + w(Dim-1);
// diagonalize and obtain eigenvalues
vec Eigval(Dim);
eig_sym(Eigval, Hamiltonian);
output(RMin , RMax, Dim, Eigval);
return 0;
} // end of main function
/*
The function potential()
calculates and returns the value of the
potential for a given argument x.
The potential here is the harmonic oscillator potential
in the dimensionless form used above, V(x) = x*x.
*/
double potential(double x)
{
return x*x;
} // End: function potential()
void output(double RMin , double RMax, int Dim, vec& d)
{
int i;
cout << "RESULTS:" << endl;
cout << setiosflags(ios::showpoint | ios::uppercase);
cout <<"Rmin = " << setw(15) << setprecision(8) << RMin << endl;
cout <<"Rmax = " << setw(15) << setprecision(8) << RMax << endl;
cout <<"Number of steps = " << setw(15) << Dim << endl;
cout << "Five lowest eigenvalues:" << endl;
for(i = 0; i < 5; i++) {
cout << setw(15) << setprecision(8) << d[i] << endl;
}
} // end of function output
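If you prefer to compile the program outside the notebook, a typical compile line (assuming Armadillo, LAPACK and BLAS are installed in standard locations; the file name is an arbitrary choice) is c++ -O2 -o schrodinger.x schrodinger.cpp -larmadillo -llapack -lblas.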
The drawbacks with Jacobi's method are rather obvious, with perhaps the most negative feature being the fact that we cannot tell a priori how many transformations are needed. Can we do better?
The answer to this is yes, and it is given by a clever algorithm outlined by Householder, which was ranked among the top ten algorithms of the previous century. We will discuss this algorithm in more detail below.
The first step consists in finding an orthogonal matrix ${\bf S}$ which is the product of $(n-2)$ orthogonal matrices
each of which successively transforms one row and one column of ${\bf A}$ into the required tridiagonal form. Only $n-2$ transformations are required, since the last two elements are already in tridiagonal form.
In order to determine each ${\bf S_i}$ let us see what happens after the first multiplication, namely,
where the primed quantities represent a matrix ${\bf A}'$ of dimension $n-1$ which will subsequently be transformed by ${\bf S_2}$.
The factor $e_1$ is a possibly non-vanishing element. The next transformation produced by ${\bf S_2}$ has the same effect as ${\bf S_1}$, but now on the submatrix ${\bf A^{'}}$ only.
Note that the effective size of the matrix on which we apply the transformation reduces for every new step. In the previous Jacobi method each similarity transformation is in principle performed on the full size of the original matrix.
After a series of such transformations, we end with a set of diagonal matrix elements
and off-diagonal matrix elements
with ${\bf 0^{T}}$ being a zero row vector, ${\bf 0^{T}} = \{0,0,\cdots\}$, of dimension $(n-1)$. The matrix ${\bf P}$ is symmetric with dimension $(n-1) \times (n-1)$, satisfying ${\bf P}^2={\bf I}$ and ${\bf P}^T={\bf P}$. A possible choice which fulfils the latter two requirements is
where ${\bf I}$ is the $(n-1)\times (n-1)$ unity matrix and ${\bf u}$ is an $(n-1)$-dimensional column vector with squared norm ${\bf u}^T{\bf u}$ (inner product).
Note that ${\bf u}{\bf u}^T$ is an outer product giving a matrix of dimension $(n-1) \times (n-1)$. Each matrix element of ${\bf P}$ then reads
where $i$ and $j$ range from $1$ to $n-1$. Applying the transformation
${\bf S}_1$ results in
where ${\bf v}^{T} = \{a_{21}, a_{31},\cdots, a_{n1}\}$ and ${\bf P}$ must satisfy $({\bf P}{\bf v})^{T} = \{k, 0, 0,\cdots \}$. Then
with ${\bf e^{T}} = \{1,0,0,\dots 0\}$.
Solving the latter equation gives us ${\bf u}$ and thus the needed transformation ${\bf P}$. We do first however need to compute the scalar $k$ by taking the scalar product of the last equation with its transpose and using the fact that ${\bf P}^2={\bf I}$. We get then
which determines the constant $ k = \pm v$.
Now we can rewrite Eq. (6) as
and taking the scalar product of this equation with itself we obtain
which finally determines
In solving Eq. (7) great care has to be exercised so as to choose those values which make the right-hand side largest, in order to avoid loss of numerical precision. The above steps are then repeated for every transformation until we have a tridiagonal matrix suitable for obtaining the eigenvalues.
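To make the above steps concrete, here is a minimal sketch of the Householder tridiagonalization using Armadillo. The function name is ours, and we use the explicit form ${\bf P} = {\bf I} - 2{\bf u}{\bf u}^T/({\bf u}^T{\bf u})$ with the sign of $k$ chosen to avoid cancellation; production codes never form ${\bf S}$ explicitly but apply the transformation directly to rows and columns.
// sketch of Householder tridiagonalization of a real symmetric matrix (Armadillo)
#include <armadillo>
#include <cmath>
using namespace arma;
mat householder_tridiag(mat A)
{
  int n = A.n_rows;
  for (int j = 0; j < n-2; j++) {
    vec v = A(span(j+1, n-1), j);            // elements below the diagonal in column j
    double k = -copysign(norm(v), v(0));     // sign chosen to avoid loss of precision
    vec u = v;
    u(0) -= k;                               // u = v - k e_1
    double unorm2 = dot(u, u);
    if (unorm2 < 1.0e-15) continue;          // column already in the required form
    mat P = eye<mat>(n-1-j, n-1-j) - 2.0*u*u.t()/unorm2;
    mat S = eye<mat>(n, n);
    S(span(j+1, n-1), span(j+1, n-1)) = P;   // S = diag(I, P)
    A = S.t()*A*S;                           // similarity transformation
  }
  return A;                                  // tridiagonal (up to round-off)
}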
Our Householder transformation has given us a tridiagonal matrix. We discuss here how one can use Householder's iterative procedure to obtain the eigenvalues. Let us specialize to a $4\times 4 $ matrix. The tridiagonal matrix takes the form
As a first observation, if any of the elements $e_{i}$ are zero the matrix can be separated into smaller pieces before diagonalization. Specifically, if $e_{1} = 0$ then $d_{1}$ is an eigenvalue.
Thus, let us introduce a transformation ${\bf S_{1}}$ which operates like
Then the similarity transformation
produces a matrix where the primed elements in ${\bf A'}$ have been changed by the transformation whereas the unprimed elements are unchanged. If we now choose $\theta$ to give the element $a_{21}^{'} = e^{'}= 0$ then we have the first eigenvalue $= a_{11}^{'} = d_{1}^{'}$. (This is actually what you are doing in project 2!!)
This procedure can be continued on the remaining three-dimensional submatrix for the next eigenvalue. Thus, after a few transformations we have the wanted diagonal form.
What we see here is just a special case of the more general procedure developed by Francis in two articles in 1961 and 1962.
The algorithm is based on the so-called QR method (or just QR-algorithm). It follows from a theorem by Schur which states that any square matrix can be written out in terms of an orthogonal matrix ${\bf Q}$ and an upper triangular matrix ${\bf U}$. Historically R was used instead of U since the wording right triangular matrix was first used. The method is based on an iterative procedure similar to Jacobi's method, by a succession of planar rotations. For a tridiagonal matrix it is simple to carry out in principle, but complicated in detail! We will discuss this in more detail during week 38.
Schur's theorem is used to rewrite any square matrix as a unitary matrix times an upper triangular matrix. We say that a square matrix is similar to a triangular matrix.
Householder's algorithm which we have derived is just a special case of the general Householder algorithm. For a symmetric square matrix we obtain a tridiagonal matrix.
There is a corollary to Schur's theorem which states that every Hermitian matrix is unitarily similar to a diagonal matrix.
It follows that we can define a new matrix
and multiply from the left with $\hat{Q}^{-1}$ we get
and multiply from the left with $\hat{Q}^{-1}$ resulting in
Suppose that $\hat{Q}$ consists of a series of planar Jacobi-like rotations acting on sub-blocks of $\hat{A}$ so that all elements below the diagonal are zeroed out
from Schur's theorem the matrix $\hat{B}$ is triangular and its eigenvalues are the same as those of $\hat{A}$, given by the diagonal matrix elements of $\hat{B}$. Why?
Our matrix $\hat{B}=\hat{U}\hat{Q}$.
The matrix $\hat{A}$ is transformed into a tridiagonal form and the last step is to transform it into a diagonal matrix giving the eigenvalues on the diagonal.
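A bare-bones sketch of the resulting iteration (the unshifted QR algorithm, here written with Armadillo's qr function; practical implementations add shifts and deflation, which are omitted) could read:
// unshifted QR iteration sketch: A_{k+1} = R_k Q_k = Q_k^T A_k Q_k
#include <armadillo>
using namespace arma;
vec qr_eigenvalues(mat A, int maxiter = 1000, double tol = 1.0E-10)
{
  for (int iter = 0; iter < maxiter; iter++) {
    mat Q, R;
    qr(Q, R, A);                       // factorize A = Q R
    A = R*Q;                           // similarity transformation
    vec sub = A.diag(-1);              // sub-diagonal elements
    if (norm(sub, "inf") < tol) break; // stop when they are negligible
  }
  vec eigs = A.diag();                 // eigenvalue estimates on the diagonal
  return eigs;
}
For a real symmetric (or tridiagonal) matrix the iterates converge to a diagonal matrix, so the diagonal returned above contains the eigenvalues.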
The eigenvalues of a matrix can be obtained using the characteristic polynomial
which rewritten in matrix form reads
We can solve this equation in an iterative manner. We let $P_k(\lambda)$ be the value of the $k$th subdeterminant of the above matrix of dimension $n\times n$. The polynomial $P_k(\lambda)$ is clearly a polynomial of degree $k$. Starting with $P_1(\lambda)$ we have $P_1(\lambda)=d_1-\lambda$. The next polynomial reads $P_2(\lambda)=(d_2-\lambda)P_1(\lambda)-e_1^2$. By expanding the determinant for $P_k(\lambda)$ in terms of the minors of the last column we arrive at the recursion relation
$$ P_k(\lambda)=(d_k-\lambda)P_{k-1}(\lambda)-e_{k-1}^2P_{k-2}(\lambda). $$
Together with the starting values $P_1(\lambda)$ and $P_2(\lambda)$ and good root searching methods we arrive at an efficient computational scheme for finding the roots of $P_n(\lambda)$. However, for large matrices this algorithm is rather inefficient and time-consuming.
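A direct transcription of this recursion (a sketch with zero-based indexing and the convention $P_0(\lambda)=1$, so that $P_2$ comes out as above) could read:
// evaluate P_n(lambda) for a symmetric tridiagonal matrix with diagonal d[0..n-1]
// and off-diagonal e[0..n-2], using the three-term recursion above
#include <vector>
double char_poly(const std::vector<double> &d, const std::vector<double> &e,
                 double lambda)
{
  double pkm1 = 1.0;                 // P_0 = 1
  double pk   = d[0] - lambda;       // P_1
  for (std::size_t k = 1; k < d.size(); ++k) {
    double pkp1 = (d[k] - lambda)*pk - e[k-1]*e[k-1]*pkm1;
    pkm1 = pk;
    pk   = pkp1;
  }
  return pk;                         // P_n(lambda)
}
The roots of $P_n(\lambda)$ can then be bracketed and refined with, for example, bisection or Newton's method.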
Basic features with a real symmetric matrix (and normally huge $n> 10^6$ and sparse) $\hat{A}$ of dimension $n\times n$:
Lanczos' algorithm generates a sequence of real tridiagonal matrices $T_k$ of dimension $k\times k$ with $k\le n$, with the property that the extremal eigenvalues of $T_k$ are progressively better estimates of $\hat{A}$'s extremal eigenvalues. The method converges to the extremal eigenvalues.
The similarity transformation is
with the first vector $\hat{Q}\hat{e}_1=\hat{q}_1$.
We are going to solve iteratively
with the first vector $\hat{Q}\hat{e}_1=\hat{q}_1$. We can write out the matrix $\hat{Q}$ in terms of its column vectors
can be written as
we can rewrite
as
we obtain
with $\beta_0\hat{q}_0=0$ for $k=1:n-1$. Remember that the vectors $\hat{q}_k$ are orthonormal and this implies
with $\beta_0\hat{q}_0=0$ for $k=1:n-1$ and
If
is non-zero, then
with $\beta_k=\pm ||\hat{r}_{k}||_2$.
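A minimal sketch of the resulting iteration with Armadillo is given below; this is an illustration only, and a realistic implementation needs re-orthogonalization of the vectors $\hat{q}_k$, which is omitted here. The extremal eigenvalues of the returned matrix $T_k$, obtained with for example eig_sym, approximate the extremal eigenvalues of $\hat{A}$.
// Lanczos sketch: build the k x k tridiagonal matrix T_k for a symmetric matrix A
#include <armadillo>
using namespace arma;
mat lanczos(const mat &A, int k)
{
  int n = A.n_rows;
  mat T = zeros<mat>(k, k);
  vec q = randu<vec>(n); q /= norm(q);   // normalized starting vector q_1
  vec qold = zeros<vec>(n);
  double beta = 0.0;                     // beta_0 = 0
  for (int j = 0; j < k; j++) {
    vec r = A*q - beta*qold;             // r_j = A q_j - beta_{j-1} q_{j-1}
    double alpha = dot(q, r);            // alpha_j = q_j^T A q_j
    r -= alpha*q;
    T(j, j) = alpha;
    beta = norm(r);                      // beta_j
    if (j < k-1) {
      T(j, j+1) = T(j+1, j) = beta;
      if (beta < 1.0e-14) break;         // invariant subspace found
      qold = q;
      q = r/beta;                        // q_{j+1}
    }
  }
  return T;   // eig_sym(T) approximates the extremal eigenvalues of A
}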