Making_a_Preconditioner-vectorized


This is functionally similar to the the other notebook. All the operations here have been vectorized. This results in much much faster code, but is also much unreadable. The vectorization also necessitated the replacement of the Gauss-Seidel smoother with under-relaxed Jacobi. That change has had some effect since GS is "twice as better" as Jacobi.

The Making of a Preconditioner ---Vectorized Version

This is a demonstration of a multigrid preconditioned krylov solver in python3. The code and more examples are present on github here. The problem solved is a Poisson equation on a rectangular domain with homogenous dirichlet boundary conditions. Finite difference with cell-centered discretization is used to get a second order accurate solution, that is further improved to 4th order using deferred correction.

The first step is a multigrid algorithm. This is the simplest 2D geometric multigrid solver.

1. Multigrid algorithm

We need some terminology before going further.

  • Approximation:
  • Residual:
  • Exact solution (of the discrete problem)
  • Correction

This is a geometric multigrid algorithm, where a series of nested grids are used. There are four parts to a multigrid algorithm

  • Smoothing Operator (a.k.a Relaxation)
  • Restriction Operator
  • Interpolation Operator (a.k.a Prolongation Operator)
  • Bottom solver

We will define each of these in sequence. These operators act of different quantities that are stored at the cell center. We will get to exactly what later on. To begin import numpy.


In [20]:
import numpy as np

1.1 Smoothing operator

This can be a certain number of Jacobi or a Gauss-Seidel iterations. Below is defined smoother that does under-relaxed Jacobi sweeps and returns the result along with the residual.


In [21]:
def Jacrelax(nx,ny,u,f,iters=1):
  '''
  under-relaxed Jacobi iteration
  '''
  dx=1.0/nx; dy=1.0/ny
  Ax=1.0/dx**2; Ay=1.0/dy**2
  Ap=1.0/(2.0*(Ax+Ay))

  #Dirichlet BC
  u[ 0,:] = -u[ 1,:]
  u[-1,:] = -u[-2,:]
  u[:, 0] = -u[:, 1]
  u[:,-1] = -u[:,-2]

  for it in range(iters):
    u[1:nx+1,1:ny+1] = 0.8*Ap*(Ax*(u[2:nx+2,1:ny+1] + u[0:nx,1:ny+1])
                             + Ay*(u[1:nx+1,2:ny+2] + u[1:nx+1,0:ny])
                             - f[1:nx+1,1:ny+1])+0.2*u[1:nx+1,1:ny+1]
    #Dirichlet BC
    u[ 0,:] = -u[ 1,:]
    u[-1,:] = -u[-2,:]
    u[:, 0] = -u[:, 1]
    u[:,-1] = -u[:,-2]

  res=np.zeros([nx+2,ny+2])
  res[1:nx+1,1:ny+1]=f[1:nx+1,1:ny+1]-(( Ax*(u[2:nx+2,1:ny+1]+u[0:nx,1:ny+1])
                                       + Ay*(u[1:nx+1,2:ny+2]+u[1:nx+1,0:ny])
                                       - 2.0*(Ax+Ay)*u[1:nx+1,1:ny+1]))
  return u,res

1.2 Interpolation Operator

This operator takes values on a coarse grid and transfers them onto a fine grid. It is also called prolongation. The function below uses bilinear interpolation for this purpose. 'v' is on a coarse grid and we want to interpolate it on a fine grid and store it in v_f.


In [22]:
def prolong(nx,ny,v):
  '''
  interpolate 'v' to the fine grid
  '''
  v_f=np.zeros([2*nx+2,2*ny+2])
  v_f[1:2*nx:2  ,1:2*ny:2  ] = 0.5625*v[1:nx+1,1:ny+1]+0.1875*(v[0:nx  ,1:ny+1]+v[1:nx+1,0:ny]  )+0.0625*v[0:nx  ,0:ny  ]
  v_f[2:2*nx+1:2,1:2*ny:2  ] = 0.5625*v[1:nx+1,1:ny+1]+0.1875*(v[2:nx+2,1:ny+1]+v[1:nx+1,0:ny]  )+0.0625*v[2:nx+2,0:ny  ]
  v_f[1:2*nx:2  ,2:2*ny+1:2] = 0.5625*v[1:nx+1,1:ny+1]+0.1875*(v[0:nx  ,1:ny+1]+v[1:nx+1,2:ny+2])+0.0625*v[0:nx  ,2:ny+2]
  v_f[2:2*nx+1:2,2:2*ny+1:2] = 0.5625*v[1:nx+1,1:ny+1]+0.1875*(v[2:nx+2,1:ny+1]+v[1:nx+1,2:ny+2])+0.0625*v[2:nx+2,2:ny+2]
  return v_f

1.3 Restriction

This is exactly the opposite of the interpolation. It takes values from the find grid and transfers them onto the coarse grid. It is kind of an averaging process. This is fundamentally different from interpolation. Each coarse grid point is surrounded by four fine grid points. So quite simply we take the value of the coarse point to be the average of 4 fine points. Here 'v' is the fine grid quantity and 'v_c' is the coarse grid quantity


In [23]:
def restrict(nx,ny,v):
  '''
  restrict 'v' to the coarser grid
  '''
  v_c=np.zeros([nx+2,ny+2])
  v_c[1:nx+1,1:ny+1]=0.25*(v[1:2*nx:2,1:2*ny:2]+v[1:2*nx:2,2:2*ny+1:2]+v[2:2*nx+1:2,1:2*ny:2]+v[2:2*nx+1:2,2:2*ny+1:2])
  return v_c

1.4 Bottom Solver

Note that we have looped over the coarse grid in both the cases above. It is easier to access the variables this way. The last part is the Bottom Solver. This must be something that gives us the exact/converged solution to what ever we feed it. What we feed to the bottom solver is the problem at the coarsest level. This has generally has very few points (e.g 2x2=4 in our case) and can be solved exactly by the smoother itself with few iterations. That is what we do here but, any other direct method can also be used. 50 Iterations are used here. If we coarsify to just one point, then just one iteration will solve it exactly.

1.5 V-cycle

Now that we have all the parts, we are ready to build our multigrid algorithm. First we will look at a V-cycle. It is self explanatory. It is a recursive function ,i.e., it calls itself. It takes as input an initial guess 'u', the rhs 'f', the number of multigrid levels 'num_levels' among other things. At each level the V cycle calls another V-cycle. At the lowest level the solving is exact.


In [24]:
def V_cycle(nx,ny,num_levels,u,f,level=1):

  if(level==num_levels):#bottom solve
    u,res=Jacrelax(nx,ny,u,f,iters=50)
    return u,res

  #Step 1: Relax Au=f on this grid
  u,res=Jacrelax(nx,ny,u,f,iters=1)

  #Step 2: Restrict residual to coarse grid
  res_c=restrict(nx//2,ny//2,res)

  #Step 3:Solve A e_c=res_c on the coarse grid. (Recursively)
  e_c=np.zeros_like(res_c)
  e_c,res_c=V_cycle(nx//2,ny//2,num_levels,e_c,res_c,level+1)

  #Step 4: Interpolate(prolong) e_c to fine grid and add to u
  u+=prolong(nx//2,ny//2,e_c)
  
  #Step 5: Relax Au=f on this grid
  u,res=Jacrelax(nx,ny,u,f,iters=1)
  return u,res

Thats it! Now we can see it in action. We can use a problem with a known solution to test our code. The following functions set up a rhs for a problem with homogenous dirichlet BC on the unit square.


In [25]:
#analytical solution
def Uann(x,y):
   return (x**3-x)*(y**3-y)
#RHS corresponding to above
def source(x,y):
  return 6*x*y*(x**2+ y**2 - 2)

Let us set up the problem, discretization and solver details. The number of divisions along each dimension is given as a power of two function of the number of levels. In principle this is not required, but having it makes the inter-grid transfers easy. The coarsest problem is going to have a 2-by-2 grid.


In [26]:
#input
max_cycles = 30
nlevels    = 6  
NX         = 2*2**(nlevels-1)
NY         = 2*2**(nlevels-1)
tol        = 1e-12

In [27]:
#the grid has one layer of ghost cellss
uann=np.zeros([NX+2,NY+2])#analytical solution
u   =np.zeros([NX+2,NY+2])#approximation
f   =np.zeros([NX+2,NY+2])#RHS

#calcualte the RHS and exact solution
DX=1.0/NX
DY=1.0/NY

xc=np.linspace(0.5*DX,1-0.5*DX,NX)
yc=np.linspace(0.5*DY,1-0.5*DY,NY)
XX,YY=np.meshgrid(xc,yc,indexing='ij')

uann[1:NX+1,1:NY+1]=Uann(XX,YY)
f[1:NX+1,1:NY+1]   =source(XX,YY)

Now we can call the solver


In [28]:
print('mgd2d.py solver:')
print('NX:',NX,', NY:',NY,', tol:',tol,'levels: ',nlevels)
for it in range(1,max_cycles+1):
  u,res=V_cycle(NX,NY,nlevels,u,f)
  rtol=np.max(np.max(np.abs(res)))
  if(rtol<tol):
    break
  error=uann[1:NX+1,1:NY+1]-u[1:NX+1,1:NY+1]
  print('  cycle: ',it,', L_inf(res.)= ',rtol,',L_inf(true error): ',np.max(np.max(np.abs(error))))

error=uann[1:NX+1,1:NY+1]-u[1:NX+1,1:NY+1]
print('L_inf (true error): ',np.max(np.max(np.abs(error))))


mgd2d.py solver:
NX: 64 , NY: 64 , tol: 1e-12 levels:  6
  cycle:  1 , L_inf(res.)=  0.891977476345 ,L_inf(true error):  0.0411816115789
  cycle:  2 , L_inf(res.)=  0.257779410083 ,L_inf(true error):  0.0116506189761
  cycle:  3 , L_inf(res.)=  0.0735673054651 ,L_inf(true error):  0.00330803624272
  cycle:  4 , L_inf(res.)=  0.0208583793969 ,L_inf(true error):  0.000930219571437
  cycle:  5 , L_inf(res.)=  0.00588946434527 ,L_inf(true error):  0.000247798905275
  cycle:  6 , L_inf(res.)=  0.00171344338378 ,L_inf(true error):  6.7168536506e-05
  cycle:  7 , L_inf(res.)=  0.000523285391864 ,L_inf(true error):  6.86431869779e-05
  cycle:  8 , L_inf(res.)=  0.000161594333349 ,L_inf(true error):  6.90600429546e-05
  cycle:  9 , L_inf(res.)=  5.09588276145e-05 ,L_inf(true error):  6.91786495929e-05
  cycle:  10 , L_inf(res.)=  1.62977007676e-05 ,L_inf(true error):  6.92125721805e-05
  cycle:  11 , L_inf(res.)=  5.2736240832e-06 ,L_inf(true error):  6.92223164925e-05
  cycle:  12 , L_inf(res.)=  1.72635327544e-06 ,L_inf(true error):  6.92251261566e-05
  cycle:  13 , L_inf(res.)=  5.71547388972e-07 ,L_inf(true error):  6.92259390797e-05
  cycle:  14 , L_inf(res.)=  1.91270373762e-07 ,L_inf(true error):  6.92261750469e-05
  cycle:  15 , L_inf(res.)=  6.46573425911e-08 ,L_inf(true error):  6.92262437573e-05
  cycle:  16 , L_inf(res.)=  2.20597939915e-08 ,L_inf(true error):  6.92262638279e-05
  cycle:  17 , L_inf(res.)=  7.58882379159e-09 ,L_inf(true error):  6.92262697094e-05
  cycle:  18 , L_inf(res.)=  2.62980393018e-09 ,L_inf(true error):  6.92262714386e-05
  cycle:  19 , L_inf(res.)=  9.17680154089e-10 ,L_inf(true error):  6.92262719488e-05
  cycle:  20 , L_inf(res.)=  3.21961124428e-10 ,L_inf(true error):  6.92262720999e-05
  cycle:  21 , L_inf(res.)=  1.13232090371e-10 ,L_inf(true error):  6.92262721448e-05
  cycle:  22 , L_inf(res.)=  4.0017766878e-11 ,L_inf(true error):  6.92262721582e-05
  cycle:  23 , L_inf(res.)=  1.40971678775e-11 ,L_inf(true error):  6.92262721622e-05
  cycle:  24 , L_inf(res.)=  5.45696821064e-12 ,L_inf(true error):  6.92262721634e-05
  cycle:  25 , L_inf(res.)=  1.81898940355e-12 ,L_inf(true error):  6.92262721638e-05
L_inf (true error):  6.92262721639e-05

True error is the difference of the approximation with the analytical solution. It is largely the discretization error. This what would be present when we solve the discrete equation with a direct/exact method like gaussian elimination. We see that true error stops reducing at the 5th cycle. The approximation is not getting any better after this point. So we can stop after 5 cycles. But, in general we dont know the true error. In practice we use the norm of the (relative) residual as a stopping criterion. As the cycles progress the floating point round-off error limit is reached and the residual also stops decreasing.

This was the multigrid V cycle. We can use this as preconditioner to a Krylov solver. But before we get to that let's complete the multigrid introduction by looking at the Full Multi-Grid algorithm. You can skip this section safely.

1.6 Full Multi-Grid

We started with a zero initial guess for the V-cycle. Presumably, if we had a better initial guess we would get better results. So we solve a coarse problem exactly and interpolate it onto the fine grid and use that as the initial guess for the V-cycle. The result of doing this recursively is the Full Multi-Grid(FMG) Algorithm. Unlike the V-cycle which was an iterative procedure, FMG is a direct solver. There is no successive improvement of the approximation. It straight away gives us an approximation that is within the discretization error. The FMG algorithm is given below.


In [29]:
def FMG(nx,ny,num_levels,f,nv=1,level=1):

  if(level==num_levels):#bottom solve
    u=np.zeros([nx+2,ny+2])  
    u,res=Jacrelax(nx,ny,u,f,iters=50)
    return u,res

  #Step 1: Restrict the rhs to a coarse grid
  f_c=restrict(nx//2,ny//2,f)

  #Step 2: Solve the coarse grid problem using FMG
  u_c,_=FMG(nx//2,ny//2,num_levels,f_c,nv,level+1)

  #Step 3: Interpolate u_c to the fine grid
  u=prolong(nx//2,ny//2,u_c)

  #step 4: Execute 'nv' V-cycles
  for _ in range(nv):
    u,res=V_cycle(nx,ny,num_levels-level,u,f)
  return u,res

Lets call the FMG solver for the same problem


In [30]:
print('mgd2d.py FMG solver:')
print('NX:',NX,', NY:',NY,', levels: ',nlevels)

u,res=FMG(NX,NY,nlevels,f,nv=1) 
rtol=np.max(np.max(np.abs(res)))

print(' FMG L_inf(res.)= ',rtol)
error=uann[1:NX+1,1:NY+1]-u[1:NX+1,1:NY+1]
print('L_inf (true error): ',np.max(np.max(np.abs(error))))


mgd2d.py FMG solver:
NX: 64 , NY: 64 , levels:  6
 FMG L_inf(res.)=  0.00659679691546
L_inf (true error):  6.66429014552e-05

It works wonderfully. The residual is large but the true error is within the discretization level. FMG is said to be scalable because the amount of work needed is linearly proportional to the the size of the problem. In big-O notation, FMG is $\mathcal{O}(N)$. Where N is the number of unknowns. Exact methods (Gaussian Elimination, LU decomposition ) are typically $\mathcal{O}(N^3)$

2. Stationary iterative methods as preconditioners

A preconditioner reduces the condition number of the coefficient matrix, thereby making it easier to solve. We dont explicitly need a matrix because we dont access the elements by index, coefficient matrix or preconditioner. What we do need is the action of the matrix on a vector. That is, we need only the matrix-vector product. The coefficient matrix can be defined as a function that takes in a vector and returns the matrix vector product.

Any stationary method has an iteration matrix associated with it. This is easily seen for Jacobi or GS methods. This iteration matrix can be used as a preconditioner. But we dont explicitly need it. The stationary iterative method for solving an equation can be written as a Richardson iteration. When the initial guess is set to zero and one iteration is performed, what you get is the action of the preconditioner on the RHS vector. That is, we get a preconditioner-vector product, which is what we want.

This allows us to use any blackbox stationary iterative method as a preconditioner

To repeat, if there is a stationary iterative method that you want to use as a preconditioner, set the initial guess to zero, set the RHS to the vector you want to multiply the preconditioner with and perform one iteration of the stationary method.

We can use the multigrid V-cycle as a preconditioner this way. We cant use FMG because it is not an iterative method.

The matrix as a function can be defined using LinearOperator from scipy.sparse.linalg. It gives us an object which works like a matrix in-so-far as the product with a vector is concerned. It can be used as a regular 2D numpy array in multiplication with a vector. This can be passed to CG(), GMRES() or BiCGStab() as a preconditioner.

Having a symmetric preconditioner would be nice because it will retain the symmetry if the original problem is symmetric and we can still use CG. If the preconditioner is not symmetric CG will not converge, and we would have to use a more general solver.

Below is the code for defining a V-Cycle preconditioner. The default is one V-cycle. In the V-cycle, the defaults are one pre-sweep, one post-sweep.


In [31]:
from scipy.sparse.linalg import LinearOperator,bicgstab,cg
def MGVP(nx,ny,num_levels):
  '''
  Multigrid Preconditioner. Returns a (scipy.sparse.linalg.) LinearOperator that can
  be passed to Krylov solvers as a preconditioner. 
  '''
  def pc_fn(v):
    u =np.zeros([nx+2,ny+2])
    f =np.zeros([nx+2,ny+2])
    f[1:nx+1,1:ny+1] =v.reshape([nx,ny]) #in practice this copying can be avoived
    #perform one V cycle
    u,res=V_cycle(nx,ny,num_levels,u,f)
    return u[1:nx+1,1:ny+1].reshape(v.shape)
  M=LinearOperator((nx*ny,nx*ny), matvec=pc_fn)
  return M

Let us define the Poisson matrix also as a LinearOperator


In [32]:
def Laplace(nx,ny):
  '''
  Action of the Laplace matrix on a vector v
  '''
  def mv(v):
    u =np.zeros([nx+2,ny+2])
  
    u[1:nx+1,1:ny+1]=v.reshape([nx,ny])
    dx=1.0/nx; dy=1.0/ny
    Ax=1.0/dx**2; Ay=1.0/dy**2
  
    #BCs. Needs to be generalized!
    u[ 0,:] = -u[ 1,:]
    u[-1,:] = -u[-2,:]
    u[:, 0] = -u[:, 1]
    u[:,-1] = -u[:,-2]

    ut = (Ax*(u[2:nx+2,1:ny+1]+u[0:nx,1:ny+1])
        + Ay*(u[1:nx+1,2:ny+2]+u[1:nx+1,0:ny])
        - 2.0*(Ax+Ay)*u[1:nx+1,1:ny+1])
    return ut.reshape(v.shape)
  A = LinearOperator((nx*ny,nx*ny), matvec=mv)
  return A

The nested function is required because "matvec" in LinearOperator takes only one argument-- the vector. But we require the grid details and boundary condition information to create the Poisson matrix. Now will use these to solve a problem. Unlike earlier where we used an analytical solution and RHS, we will start with a random vector which will be our exact solution, and multiply it with the Poisson matrix to get the Rhs vector for the problem. There is no analytical equation associated with the matrix equation.

The scipy sparse solve routines do not return the number of iterations performed. We can use this wrapper to get the number of iterations


In [33]:
def solve_sparse(solver,A, b,tol=1e-10,maxiter=500,M=None):
      num_iters = 0
      def callback(xk):
         nonlocal num_iters
         num_iters+=1
      x,status=solver(A, b,tol=tol,maxiter=maxiter,callback=callback,M=M)
      return x,status,num_iters

Lets look at what happens with and without the preconditioner.


In [34]:
A = Laplace(NX,NY)
#Exact solution and RHS
uex=np.random.rand(NX*NY,1)
b=A*uex

#Multigrid Preconditioner
M=MGVP(NX,NY,nlevels)

u,info,iters=solve_sparse(bicgstab,A,b,tol=1e-10,maxiter=500)
print('Without preconditioning. status:',info,', Iters: ',iters)
error=uex.reshape([NX,NY])-u.reshape([NX,NY])
print('error :',np.max(np.abs(error)))

u,info,iters=solve_sparse(bicgstab,A,b,tol=1e-10,maxiter=500,M=M)
print('With preconditioning. status:',info,', Iters: ',iters)
error=uex.reshape([NX,NY])-u.reshape([NX,NY])
print('error :',np.max(np.abs(error)))


Without preconditioning. status: 0 , Iters:  159
error : 1.68785727617e-08
With preconditioning. status: 0 , Iters:  7
error : 1.4983458918e-11

Without the preconditioner ~150 iterations were needed, where as with the V-cycle preconditioner the solution was obtained in far fewer iterations. Let's try with CG:


In [35]:
u,info,iters=solve_sparse(cg,A,b,tol=1e-10,maxiter=500)
print('Without preconditioning. status:',info,', Iters: ',iters)
error=uex.reshape([NX,NY])-u.reshape([NX,NY])
print('error :',np.max(np.abs(error)))

u,info,iters=solve_sparse(cg,A,b,tol=1e-10,maxiter=500,M=M)
print('With preconditioning. status:',info,', Iters: ',iters)
error=uex.reshape([NX,NY])-u.reshape([NX,NY])
print('error :',np.max(np.abs(error)))


Without preconditioning. status: 0 , Iters:  204
error : 1.56476173685e-09
With preconditioning. status: 0 , Iters:  14
error : 7.51643247643e-10

There we have it. A Multigrid Preconditioned Krylov Solver. We did all this without even having to deal with an actual matrix. How great is that! I think the next step should be solving a non-linear problem without having to deal with an actual Jacobian (matrix).