ILI285 - Computación Científica I / INF285 - Computación Científica

Generalized Minimal Residual Method

[[S]cientific [C]omputing [T]eam](#acknowledgements)

Version: 1.2


In [8]:
import numpy as np
import scipy as sp
from scipy import linalg as la
import matplotlib.pyplot as plt
import scipy.sparse.linalg
%matplotlib inline
#%load_ext memory_profiler
import matplotlib as mpl
mpl.rcParams['font.size'] = 14
mpl.rcParams['axes.labelsize'] = 20
mpl.rcParams['xtick.labelsize'] = 14
mpl.rcParams['ytick.labelsize'] = 14
M=8 # figure size (in inches) used in the plots below

Introduction

Welcome to another edition of our Jupyter Notebooks. A few notebooks back, we saw that the Conjugate Gradient Method, an iterative method, was very useful for solving $A\,\mathbf{x}=\mathbf{b}$, but it only worked when $A$ was symmetric and positive definite. So now we need an iterative method that handles nonsymmetric linear systems of equations, and for that we have the Generalized Minimal Residual Method (GMRes). It works really well for finding the solution of large, sparse (and dense as well) nonsymmetric linear systems of equations. Of course, it will still have trouble with ill-conditioned linear systems of equations, but it is really easy to add a left preconditioner, a right preconditioner, or both!
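As a taste of what preconditioning looks like in practice, here is a minimal sketch (not part of the algorithm we build below) using SciPy's built-in gmres, which accepts a preconditioner through its M parameter; M is expected to approximate $A^{-1}$, and the Jacobi (diagonal) preconditioner used here is just one simple choice of our own.

In [ ]:
# A minimal sketch of left preconditioning with SciPy's gmres.
# M approximates A^{-1}; here we use the Jacobi (diagonal) preconditioner.
import numpy as np
import scipy.sparse.linalg
A = np.random.rand(10,10) + 10*np.eye(10)
b = np.random.rand(10)
d = np.diag(A)
M = scipy.sparse.linalg.LinearOperator(A.shape, matvec=lambda v: v/d)
x, info = scipy.sparse.linalg.gmres(A, b, M=M)
print('info =', info, ', residual =', np.linalg.norm(b - np.dot(A, x)))  # info == 0 means convergence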

A quick review on Least Squares

Least Squares is used to solve overdetermined linear systems of equations $A\,\mathbf{x} = \mathbf{b}$, that is, linear systems with more equations than unknowns. It finds the best $\overline{\mathbf{x}}$ in the sense that $\overline{\mathbf{x}}$ minimizes the Euclidean length of the residual $\mathbf{r} = \mathbf{b} - A\,\mathbf{x}$.

So, you might be wondering, what does Least Squares have to do with GMRes? WELL, since you're dying to know, I'll tell you: at each iteration, GMRes minimizes the backward error $\|\mathbf{b} - A\,\mathbf{x}_k\|_2$, and it does so by solving a small Least Squares problem.
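As a quick illustration (a made-up $3\times 2$ system, not part of GMRes itself), np.linalg.lstsq computes exactly this minimizer; it is the same routine our implementation below uses for the small Least Squares subproblem.

In [ ]:
# A small overdetermined system: 3 equations, 2 unknowns.
import numpy as np
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([2.0, 3.0, 5.0])
# lstsq returns the x that minimizes ||b - A x||_2.
x_bar = np.linalg.lstsq(A, b, rcond=None)[0]
print('x_bar =', x_bar)
print('minimal residual norm =', np.linalg.norm(b - np.dot(A, x_bar)))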

GMRes

GMRes is a member of the family of Krylov methods. It finds an approximation of $\mathbf{x}$ restricted to live on the Krylov subspace $\mathcal{K}_k$, where $\mathcal{K}_k=\operatorname{span}\{\mathbf{r}_0, A\,\mathbf{r}_0, A^2\,\mathbf{r}_0, \cdots, A^{k-1}\,\mathbf{r}_0\}$ and $\mathbf{r}_0 = \mathbf{b} - A\,\mathbf{x}_0$ is the residual vector of the initial guess.

The idea behind this method is to look for improvements to the initial guess $\mathbf{x}_0$ in the Krylov subspace. At the $k$-th iteration, we enlarge the Krylov subspace by adding $A^k\,\mathbf{r}_0$, orthogonalize the new vector against the existing basis, and then use Least Squares to find the best improvement to add to $\mathbf{x}_0$.
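Why orthogonalize at all? The raw Krylov vectors $\mathbf{r}_0, A\,\mathbf{r}_0, A^2\,\mathbf{r}_0, \dots$ quickly align with a dominant direction, so a matrix built from them becomes numerically useless as a basis. A quick check (with a randomly generated matrix of our own choosing):

In [ ]:
# The raw Krylov basis becomes ill-conditioned very quickly,
# whereas an orthonormal basis has condition number 1.
import numpy as np
np.random.seed(0)
n = 10
A = np.random.rand(n,n) + 10*np.eye(n)
r0 = np.random.rand(n)
K = np.zeros((n,n))
K[:,0] = r0
for k in np.arange(1,n):
    K[:,k] = np.dot(A, K[:,k-1])
print('cond(K) =', np.linalg.cond(K))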

The algorithm is as follows:

Generalized Minimal Residual Method

$\mathbf{x}_0$ = initial guess
$\mathbf{r}$ = $\mathbf{b} - A\,\mathbf{x}_0$
$\mathbf{q}_1$ = $\mathbf{r} / \|\mathbf{r}\|_2$
for $k = 1, \dots, m$
$\qquad \mathbf{y} = A\,\mathbf{q}_k$
$\qquad$ for $j = 1, 2, \dots, k$
$\qquad \qquad$ $h_{jk} = \mathbf{q}_j^*\,\mathbf{y}$
$\qquad \qquad$ $\mathbf{y} = \mathbf{y} - h_{jk}\,\mathbf{q}_j$
$\qquad$ end
$\qquad h_{k+1,k} = \|\mathbf{y}\|_2 \qquad$ (If $h_{k+1,k} = 0$, skip the next line and terminate at the bottom.)
$\qquad \mathbf{q}_{k+1} = \mathbf{y}/h_{k+1,k}$
$\qquad$ Minimize $\left\|\widehat{H}_k\,\mathbf{c}_k - [\|\mathbf{r}\|_2 \ 0 \ 0 \ \dots \ 0]^T\right\|_2$ for $\mathbf{c}_k$
$\qquad$ $\mathbf{x}_k = Q_k\,\mathbf{c}_k + \mathbf{x}_0$
end

Here $\widehat{H}_k$ is the $(k+1)\times k$ upper Hessenberg matrix with entries $h_{ij}$, and $Q_k$ is the $n\times k$ matrix whose columns are $\mathbf{q}_1, \dots, \mathbf{q}_k$.
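The inner loop above is the Arnoldi iteration, and after $k$ steps it produces the identity $A\,Q_k = Q_{k+1}\,\widehat{H}_k$. This identity explains why the small Least Squares problem minimizes the true residual: for $\mathbf{x}_k = \mathbf{x}_0 + Q_k\,\mathbf{c}_k$,

\begin{align*} \|\mathbf{b} - A\,\mathbf{x}_k\|_2 = \|\mathbf{r}_0 - A\,Q_k\,\mathbf{c}_k\|_2 = \left\|Q_{k+1}\left(\|\mathbf{r}_0\|_2\,\mathbf{e}_1 - \widehat{H}_k\,\mathbf{c}_k\right)\right\|_2 = \left\|\widehat{H}_k\,\mathbf{c}_k - \|\mathbf{r}_0\|_2\,\mathbf{e}_1\right\|_2, \end{align*}

since $\mathbf{r}_0 = \|\mathbf{r}_0\|_2\,\mathbf{q}_1$ and the columns of $Q_{k+1}$ are orthonormal. This is also why, in the implementation below, 'norm_small' equals 'norm_full'.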

Now we have to implement it.


In [2]:
# This is a very instructive implementation of GMRes.
def GMRes(A, b, x0=None, m=10, flag_display=True, threshold=1e-12):
    n = len(b)
    if x0 is None:
        x0 = np.zeros(n)
    r0 = b - np.dot(A, x0)
    nr0 = np.linalg.norm(r0)
    out_res = np.array(nr0)
    Q = np.zeros((n,n))
    H = np.zeros((n,n))
    Q[:,0] = r0/nr0
    flag_break = False
    for k in np.arange(np.min((m,n))):
        y = np.dot(A, Q[:,k])
        if flag_display:
            print('||y||=', np.linalg.norm(y))
        for j in np.arange(k+1):
            H[j,k] = np.dot(Q[:,j], y)
            if flag_display:
                print('H[',j,'][',k,']=', H[j,k])
            y = y - H[j,k]*Q[:,j]
            if flag_display:
                print('||y||=', np.linalg.norm(y))
        # All but the last iteration are treated equally. Why?
        if k+1 < n:
            H[k+1,k] = np.linalg.norm(y)
            if flag_display:
                print('H[',k+1,'][',k,']=', H[k+1,k])
            if np.abs(H[k+1,k]) > 1e-16:
                Q[:,k+1] = y/H[k+1,k]
            else:
                print('flag_break has been activated')
                flag_break = True
            H_tilde = H[0:k+2, 0:k+1]
        else:
            H_tilde = H[0:k+1, 0:k+1]
        # Do you remember e_1? The canonical vector, with as many rows as H_tilde.
        e1 = np.zeros(H_tilde.shape[0])
        e1[0] = 1
        # Solving the 'SMALL' least squares problem.
        # This could be improved with Givens rotations!
        ck = np.linalg.lstsq(H_tilde, nr0*e1, rcond=None)[0]
        x = x0 + np.dot(Q[:,0:k+1], ck)
        # Why is 'norm_small' equal to 'norm_full'?
        norm_small = np.linalg.norm(np.dot(H_tilde, ck) - nr0*e1)
        out_res = np.append(out_res, norm_small)
        if flag_display:
            norm_full = np.linalg.norm(b - np.dot(A, x))
            print(r'..........||b-A\,x_k||=', norm_full)
            print(r'..........||H_k\,c_k-nr0*e1||', norm_small)
        if flag_break:
            if flag_display:
                print('EXIT: flag_break=True')
            break
        if norm_small < threshold:
            if flag_display:
                print('EXIT: norm_small<threshold')
            break
    return x, out_res

A very simple example


In [3]:
A = np.array([[1,1,0],[0,1,0],[0,1,1]])
b = np.array([1,2,3])
x0 = np.zeros(3)

In [4]:
# scipy gmres
x_scipy = scipy.sparse.linalg.gmres(A,b,x0)[0]
# our gmres
x_our, _ = GMRes(A, b)
# numpy solve
x_np= np.linalg.solve(A,b)

# Showing the solutions
print('--------------------------------')
print('x_scipy',x_scipy)
print('x_our',x_our)
print('x_np',x_np)


||y||= 1.647508942095828
H[ 0 ][ 0 ]= 1.5714285714285714
||y||= 0.4948716593053935
H[ 1 ][ 0 ]= 0.4948716593053935
..........||b-A\,x_k||= 1.1239029738980326
..........||H_k\,c_k-nr0*e1|| 1.1239029738980326
||y||= 0.7867957924694432
H[ 0 ][ 1 ]= -0.659828879073858
||y||= 0.4285714285714286
H[ 1 ][ 1 ]= 0.4285714285714286
||y||= 2.7755575615628914e-17
H[ 2 ][ 1 ]= 2.7755575615628914e-17
flag_break has been activated
..........||b-A\,x_k||= 9.992007221626409e-16
..........||H_k\,c_k-nr0*e1|| 8.896640829961128e-16
EXIT: flag_break=True
--------------------------------
x_scipy [-1.  2.  1.]
x_our [-1.  2.  1.]
x_np [-1.  2.  1.]

Another example: how many iterations does it need to converge?


In [5]:
A = np.array([[0,0,0,1],[1,0,0,0],[0,1,0,0],[0,0,1,0]])
b = np.array([1,0,1,0])
x_our, _ = GMRes(A, b, m=10)
norm_full=np.linalg.norm(b-np.dot(A,x_our))
print(norm_full)


||y||= 0.9999999999999999
H[ 0 ][ 0 ]= 0.0
||y||= 0.9999999999999999
H[ 1 ][ 0 ]= 0.9999999999999999
..........||b-A\,x_k||= 1.4142135623730951
..........||H_k\,c_k-nr0*e1|| 1.4142135623730951
||y||= 1.0
H[ 0 ][ 1 ]= 1.0
||y||= 1.5700924586837752e-16
H[ 1 ][ 1 ]= 0.0
||y||= 1.5700924586837752e-16
H[ 2 ][ 1 ]= 1.5700924586837752e-16
..........||b-A\,x_k||= 3.1401849173675503e-16
..........||H_k\,c_k-nr0*e1|| 2.2204460492503136e-16
EXIT: norm_small<threshold
3.1401849173675503e-16

In [6]:
A = np.random.rand(10,10)+10*np.eye(10)
b = np.random.rand(10)
x_our, out_res = GMRes(A, b, m=10,flag_display=True)
norm_full=np.linalg.norm(b-np.dot(A,x_our))
print(norm_full)


||y||= 14.487771528768095
H[ 0 ][ 0 ]= 14.344742606867056
||y||= 2.0307346977181284
H[ 1 ][ 0 ]= 2.0307346977181284
..........||b-A\,x_k||= 0.26758506784697417
..........||H_k\,c_k-nr0*e1|| 0.26758506784697406
||y||= 10.325521127879725
H[ 0 ][ 1 ]= 1.8871583428283905
||y||= 10.151601841649612
H[ 1 ][ 1 ]= 10.13285943150067
||y||= 0.616587133203184
H[ 2 ][ 1 ]= 0.616587133203184
..........||b-A\,x_k||= 0.016856746305809862
..........||H_k\,c_k-nr0*e1|| 0.016856746305809783
||y||= 10.232672730966014
H[ 0 ][ 2 ]= 1.0304478696356436
||y||= 10.180656580349758
H[ 1 ][ 2 ]= 0.03800772348427306
||y||= 10.180585632466062
H[ 2 ][ 2 ]= 10.1130360703164
||y||= 1.170822471792279
H[ 3 ][ 2 ]= 1.170822471792279
..........||b-A\,x_k||= 0.0019411472409557762
..........||H_k\,c_k-nr0*e1|| 0.0019411472409556554
||y||= 9.785598341190836
H[ 0 ][ 3 ]= -0.07512733832000817
||y||= 9.785309947986002
H[ 1 ][ 3 ]= -0.057690070192138565
||y||= 9.785139888318154
H[ 2 ][ 3 ]= 0.22688765460709504
||y||= 9.782509117099861
H[ 3 ][ 3 ]= 9.774642749541561
||y||= 0.3922291995459664
H[ 4 ][ 3 ]= 0.3922291995459664
..........||b-A\,x_k||= 7.85639590738749e-05
..........||H_k\,c_k-nr0*e1|| 7.856395907382939e-05
||y||= 10.417858239163616
H[ 0 ][ 4 ]= -0.3517791297571158
||y||= 10.41191729390781
H[ 1 ][ 4 ]= -0.3824429247330744
||y||= 10.404891116417229
H[ 2 ][ 4 ]= -0.28890535706337817
||y||= 10.400879426238834
H[ 3 ][ 4 ]= -0.043119416593389914
||y||= 10.400790044754817
H[ 4 ][ 4 ]= 10.389019581905623
||y||= 0.4946773512628294
H[ 5 ][ 4 ]= 0.4946773512628294
..........||b-A\,x_k||= 3.739503455684014e-06
..........||H_k\,c_k-nr0*e1|| 3.7395034557893774e-06
||y||= 9.524193603300604
H[ 0 ][ 5 ]= 0.07365777845469945
||y||= 9.523908773440938
H[ 1 ][ 5 ]= -0.175402055875033
||y||= 9.522293444523756
H[ 2 ][ 5 ]= -0.4337552744605673
||y||= 9.512409200906873
H[ 3 ][ 5 ]= 0.3116769698503101
||y||= 9.507301734559743
H[ 4 ][ 5 ]= 0.098519665409421
||y||= 9.506791264537698
H[ 5 ][ 5 ]= 9.481767020486746
||y||= 0.6893288886302117
H[ 6 ][ 5 ]= 0.6893288886302117
..........||b-A\,x_k||= 2.715680564724522e-07
..........||H_k\,c_k-nr0*e1|| 2.715680565282856e-07
||y||= 10.121921265643216
H[ 0 ][ 6 ]= -0.25383419557064535
||y||= 10.118737980056574
H[ 1 ][ 6 ]= 0.15884386283864504
||y||= 10.117491138433376
H[ 2 ][ 6 ]= -0.01890530757229364
||y||= 10.117473475409929
H[ 3 ][ 6 ]= -0.17743041808594073
||y||= 10.11591755464433
H[ 4 ][ 6 ]= 0.6913718578004022
||y||= 10.092264013916944
H[ 5 ][ 6 ]= 0.0707654975424965
||y||= 10.092015912143644
H[ 6 ][ 6 ]= 10.07327409138756
||y||= 0.614763573043042
H[ 7 ][ 6 ]= 0.614763573043042
..........||b-A\,x_k||= 1.6590904358712334e-08
..........||H_k\,c_k-nr0*e1|| 1.6590904305918905e-08
||y||= 10.256821010103376
H[ 0 ][ 7 ]= -0.2947266790639884
||y||= 10.252585694250302
H[ 1 ][ 7 ]= -0.5820209264340727
||y||= 10.23605222041871
H[ 2 ][ 7 ]= -0.12425043404453101
||y||= 10.235298084998723
H[ 3 ][ 7 ]= -0.05604416939450507
||y||= 10.235144646748056
H[ 4 ][ 7 ]= -0.29018770242551284
||y||= 10.231030106358618
H[ 5 ][ 7 ]= -0.03616959243063533
||y||= 10.23096617127629
H[ 6 ][ 7 ]= 0.18522462872153378
||y||= 10.229289351402413
H[ 7 ][ 7 ]= 10.21950649248306
||y||= 0.4472669055624262
H[ 8 ][ 7 ]= 0.4472669055624262
..........||b-A\,x_k||= 7.275886090747205e-10
..........||H_k\,c_k-nr0*e1|| 7.275885908239069e-10
||y||= 10.228796398579453
H[ 0 ][ 8 ]= -0.14549454198773726
||y||= 10.227761588042798
H[ 1 ][ 8 ]= -0.09037549141336809
||y||= 10.22736228811689
H[ 2 ][ 8 ]= 0.3748608991483119
||y||= 10.220490138867376
H[ 3 ][ 8 ]= -0.24163091565283923
||y||= 10.217633443184681
H[ 4 ][ 8 ]= -0.005316723155579961
||y||= 10.21763205991197
H[ 5 ][ 8 ]= 0.6615668951512613
||y||= 10.196192139959939
H[ 6 ][ 8 ]= -0.2712871571717721
||y||= 10.192582471254992
H[ 7 ][ 8 ]= 0.14666840701967876
||y||= 10.191527157973766
H[ 8 ][ 8 ]= 10.19019905987458
||y||= 0.16452638647962725
H[ 9 ][ 8 ]= 0.16452638647962725
..........||b-A\,x_k||= 1.1765402420371118e-11
..........||H_k\,c_k-nr0*e1|| 1.1765500607711468e-11
||y||= 9.553701482232057
H[ 0 ][ 9 ]= 0.20868632677439747
||y||= 9.551421989872527
H[ 1 ][ 9 ]= -0.12467257273152932
||y||= 9.550608293623444
H[ 2 ][ 9 ]= 0.4979784943435843
||y||= 9.537616903472282
H[ 3 ][ 9 ]= 0.26249326769051895
||y||= 9.534004063446657
H[ 4 ][ 9 ]= 0.2359382960387546
||y||= 9.531084230153446
H[ 5 ][ 9 ]= 0.015168371989830676
||y||= 9.531072160191155
H[ 6 ][ 9 ]= 0.10412662709339926
||y||= 9.530503353354483
H[ 7 ][ 9 ]= -0.2644844753837581
||y||= 9.526832743917678
H[ 8 ][ 9 ]= 0.0020438123906312278
||y||= 9.52683252468589
H[ 9 ][ 9 ]= 9.526832523984378
||y||= 0.00011561322825438349
..........||b-A\,x_k||= 1.2560739669470201e-15
..........||H_k\,c_k-nr0*e1|| 1.0433986559552725e-15
EXIT: norm_small<threshold
1.2560739669470201e-15

Plotting the residual over the iterations


In [7]:
plt.figure(figsize=(M,M))
plt.semilogy(out_res,'.k',markersize=20,label='residual')
plt.grid(True)
plt.xlabel(r'$k$')
plt.ylabel(r'$\|\mathbf{b}-A\,\mathbf{x}_k\|_2$')
plt.show()


Theoretical Problems

  1. Prove that in the GMRes method, the backward error $\|\mathbf{b} - A\,\mathbf{x}_k\|_2$ decreases monotonically with $k$.
  2. What would happen if we pass a singular matrix $A$ to the previous implementation of GMRes?
  3. Prove that for \begin{equation} A= \left[ \begin{array}{c|c} I & C \\ \hline 0 & I \end{array} \right] \end{equation} and any $\mathbf{x}_0$ and $\mathbf{b}$, GMRes converges to the exact solution after two steps. Here $C$ is an $m_1 \times m_2$ submatrix, $0$ denotes the $m_2 \times m_1$ matrix of zeros, and $I$ denotes the appropriately-sized identity matrix.

Practical Problems

  1. A possible improvement to the present algorithm consists of taking the Least Squares computation out of the loop, since the Krylov subspace spanned by $Q_k$ doesn't depend on the previous Least Squares calculations.
    • Verify the truth of the above statement.
    • Verify whether it is really an improvement.
    • Implement it.
    • Test both implementations using %timeit.
  2. The GMRes method is meant for huge $n\times n$ sparse matrices $A$. In most cases, the goal is to run the method for $k$ steps (with $k \ll n$), reducing the complexity of the Least Squares subproblems. Nevertheless, for values of $k$ that are too small, the solution $\mathbf{x}_k$ may not be as good as needed. So, to keep $k$ small and still avoid bad solutions, there exists a variation of the algorithm known as Restarted GMRes: if not enough progress is made toward the solution after $k$ iterations, discard $Q_k$ and start GMRes from the beginning, using the current best guess $\mathbf{x}_k$ as the new $\mathbf{x}_0$ (see the sketch after this list).
    • Implement the Restarted GMRes method. Introduce a tolerance parameter to stop restarting.
    • Compare the asymptotic operation count and storage requirements of GMRes and Restarted GMRes, for fixed $k$ and increasing $n$.
    • Execute it on a huge linear system $A\,\mathbf{x} = \mathbf{b}$, and compare the solution with the solution of standard GMRes. Keep the value of $k$ small, and count how many times Restarted GMRes has to restart. Perform benchmarks using %timeit and %memit and verify the results.
    • Describe an example in which Restarted GMRes can be expected to fail to converge, whereas GMRes succeeds.
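To get you started on Practical Problem 2, here is a minimal sketch of the restart loop described above. It reuses the GMRes function defined earlier; the helper name restarted_GMRes, the max_restarts cap, and the test matrix are our own illustrative choices, not part of the algorithm statement.

In [ ]:
# A minimal sketch of Restarted GMRes, reusing the GMRes function above.
# k is the (small) number of inner GMRes steps between restarts.
def restarted_GMRes(A, b, k=5, max_restarts=20, tol=1e-12):
    x = np.zeros(len(b))
    for r in np.arange(max_restarts):
        if np.linalg.norm(b - np.dot(A, x)) < tol:
            break  # Converged: stop restarting.
        # Discard the previous Krylov basis and restart from the best guess.
        x, _ = GMRes(A, b, x0=x, m=k, flag_display=False, threshold=tol)
    return x, r

A = np.random.rand(50,50) + 50*np.eye(50)
b = np.random.rand(50)
x, r = restarted_GMRes(A, b, k=5)
print('restarts:', r, ', residual:', np.linalg.norm(b - np.dot(A, x)))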

Acknowledgements

  • Material created by professor Claudio Torres (ctorres@inf.utfsm.cl) and assistants: Laura Bermeo, Alvaro Salinas, Axel Simonsen and Martín Villanueva. DI UTFSM. April 2016.
  • Material updated by professor Claudio Torres (ctorres@inf.utfsm.cl). DI UTFSM. June 2017.

In [ ]: