To solve 2 and 3 we need to solve auxiliary linear systems.
Types of solvers
One of the research directions for solving sparse linear systems
$$ A x = f $$with a given sparse matrix $A$ is the class of so-called direct solvers, which try to factorize the matrix $A$.
The simplest decomposition is the sparse LU factorization of the form
$$A = LU,$$where $L$ is sparse lower triangular, and $U$ is sparse upper triangular.
The crucial concept to analyze the efficiency of sparse matrix factorization is the sparse matrix graph.
The graph of a sparse matrix has a vertex for each index $i$, and an edge $(i, j)$ exists if $A_{ij} \ne 0$.
The pattern of the $L$ factor can be inferred from the symbolic operations in the sparse matrix graph.
The fill-in of a matrix are those entries which change from an initial zero to a nonzero value during the execution of an algorithm.
The fill-in is different for different permutations. So, before factorization we need to find reordering which produces the smallest fill-in.
For one order you get sparse factorization, for another - dense factors.
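This effect is easy to see on the classical "arrow" matrix, which is diagonal except for one dense row and column. A minimal sketch (the matrix and sizes are made up for illustration); eliminating the dense row first fills the whole factor, eliminating it last produces no fill at all:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 100
# "Arrow" matrix: dense first row and column, diagonal elsewhere (a made-up example)
A = sp.lil_matrix((n, n))
A.setdiag(2.0 * n)  # strongly dominant diagonal, so no pivoting is triggered
A[0, :] = 1.0
A[:, 0] = 1.0
A[0, 0] = 2.0 * n
A = A.tocsc()

# Reversing the ordering moves the dense row/column to the end
p = np.arange(n)[::-1]
B = A[p, :][:, p].tocsc()

lu_bad = spla.splu(A, permc_spec='NATURAL')   # dense row eliminated first: full fill-in
lu_good = spla.splu(B, permc_spec='NATURAL')  # dense row eliminated last: almost no fill-in
print(lu_bad.L.nnz, lu_good.L.nnz)
```

The nonzero counts of the two $L$ factors differ by an order of magnitude, although the two matrices are the same up to a permutation.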
In fact, for a Cholesky factorization you compute
$$A = P L L^{\top} P^{\top},$$where $P$ is a permutation matrix.
Theoretical results on the fill-in:
By fill-in I mean the number of non-zeros in the $L$ factor.
import numpy as np
import scipy as sp
import scipy.sparse
import scipy.sparse.linalg
from scipy.sparse import csc_matrix, csr_matrix, coo_matrix
import matplotlib.pyplot as plt
%matplotlib inline
n = 20
ex = np.ones(n)
lp1 = sp.sparse.spdiags(np.vstack((ex, -2*ex, ex)), [-1, 0, 1], n, n, 'csr')
e = sp.sparse.eye(n)
A = sp.sparse.kron(lp1, e) + sp.sparse.kron(e, lp1)  # 2D Laplacian on a 20 x 20 grid
A = csc_matrix(A)
T = scipy.sparse.linalg.splu(A, permc_spec='NATURAL')  # exact LU in the natural ordering, to show the fill-in
plt.spy(T.L, marker='.', color='k', markersize=8)
A separator in a graph $G$ is a set $S$ of vertices whose removal leaves at least two connected components.
It all boils down to finding good separators!
The idea is to eliminate rows and/or columns with fewer non-zeros, update fill-in and then repeat
Efficient implementation is an issue (adding/removing elements).
The current champion is the approximate minimum degree (AMD) algorithm by Amestoy, Davis, and Duff.
It is suboptimal even for 2D problems
In practice, often wins for medium-sized problems.
Computing separators is not a trivial task.
Graph partitioning heuristics have been an active research area for many years, often motivated by partitioning for parallel computation.
Existing approaches:
Many popular modern codes (e.g. Metis, Chaco) use multilevel iterative swapping.
The "cost" of the separator is defined in a very natural way as the sum over edges:
$$T(A, B) = \sum_{e} \{ \mbox{weight}(e): \mbox{ $e$ connects $A$ and $B$} \}.$$Given some initial partition, test some subsets $X$ and $Y$ of the same size, and if swapping them decreases the cost function, swap them.
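The cost of a partition is straightforward to evaluate from the adjacency matrix; a small sketch on a made-up toy graph (two triangles joined by the edge $(1, 3)$):

```python
import numpy as np

# Toy undirected graph: symmetric 0/1 weight matrix (a made-up example)
W = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]])

def cut_cost(W, part_A):
    """Sum of weights of edges with one endpoint in part_A and the other outside."""
    A_idx = np.asarray(part_A)
    B_idx = np.setdiff1d(np.arange(W.shape[0]), A_idx)
    return W[np.ix_(A_idx, B_idx)].sum()

cost = cut_cost(W, [0, 1, 2])
print(cost)  # only the edge (1, 3) crosses the cut, so the cost is 1
```

A Kernighan–Lin-style sweep would now try swapping equal-sized subsets between the two parts and keep any swap that lowers this cost.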
The idea of spectral bisection goes back to Fiedler.
We introduce the graph Laplacian of the matrix, which is defined as a symmetric matrix such that
$$L_{ii} = \mbox{degree of node $i$},$$$$L_{ij} = -1, \mbox{if $i \ne j$ and there is an edge},$$and $0$ otherwise.
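A minimal sketch of spectral bisection on a made-up toy graph (two triangles joined by one edge): the Fiedler vector, i.e. the eigenvector for the second-smallest eigenvalue of $L$, splits the vertices into the two natural clusters.

```python
import numpy as np

# Adjacency matrix of a toy graph: triangles {0,1,2} and {3,4,5} joined by edge (1,3)
W = np.array([[0, 1, 1, 0, 0, 0],
              [1, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 0, 1],
              [0, 0, 0, 1, 1, 0]], dtype=float)

# Graph Laplacian: degrees on the diagonal, -1 for every edge
L = np.diag(W.sum(axis=1)) - W

lam, V = np.linalg.eigh(L)  # eigenvalues in ascending order
fiedler = V[:, 1]           # Fiedler vector (second-smallest eigenvalue)

# Split the vertices by the ordering of the Fiedler vector entries
order = np.argsort(fiedler)
part_A = set(order[:3].tolist())
print(part_A)               # one of the two triangles
```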
For a general sparse matrix, a fast direct method is typically the method of choice.
For a sparse matrix coming from a PDE, the rows/columns can be associated with points (elements) in a $d$-dimensional space, just as in the $H$-matrix case.
This is additional structure that can be exploited in many ways, for example by approximating the factors in the $H$-matrix format.
import numpy as np
import scipy.sparse as spsp
n = 7
ex = np.ones(n)
a = spsp.spdiags(np.vstack((ex, -2*ex, ex)), [-1, 0, 1], n, n, 'csr')
a = a.toarray()  # the inverse of a banded matrix is dense
b = np.linalg.inv(a)
print(a)
print(b)
If we want to achieve $\mathcal{O}(N)$ complexity, then direct solvers are not appropriate.
If we want to solve a partial eigenproblem, the full eigendecomposition is too costly.
For both problems we will use iterative, Krylov subspace solvers, which treat the matrix as a black-box linear operator.
We now take a completely different view of the matrix: it is a linear operator that acts on a vector,
and this action can be computed in $\mathcal{O}(N)$ operations.
This is the only information we know about the matrix: the matrix-by-vector product (matvec).
Can we solve linear systems using only matvecs?
Of course, we could multiply by the columns of the identity matrix and recover the full matrix, but that is not what we need.
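What we can do is hand the matvec to a Krylov method directly. A minimal sketch using SciPy's `LinearOperator`, with a matrix-free 1D Poisson matvec (the problem and size here are chosen just for illustration):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

n = 100

def poisson_matvec(x):
    """Apply the 1D Poisson operator y_i = 2 x_i - x_{i-1} - x_{i+1} in O(N)."""
    y = 2.0 * x
    y[:-1] -= x[1:]
    y[1:] -= x[:-1]
    return y

A = LinearOperator((n, n), matvec=poisson_matvec, dtype=float)
b = np.ones(n)
x, info = cg(A, b)  # the Krylov solver only ever calls the matvec
print(info, np.linalg.norm(poisson_matvec(x) - b))
```

The solver never sees the matrix entries, only the action of the operator on a vector.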
We can apply the GMRES-like idea to speed up the convergence of a given fixed-point iteration
$$x_{k+1} = \Phi(x_k).$$This idea is actually older than GMRES, and is known as Direct Inversion in the Iterative Subspace (DIIS) in quantum chemistry, or Anderson acceleration.
Idea:
Use history for the update,
$$x_{k+1} = \Phi(x_k) + \sum_{s=1}^m \alpha_s (x_{k - s} - \Phi(x_{k - s})), $$and the parameters $\alpha_s$ are selected to minimize the norm of the residual.
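A minimal Anderson acceleration sketch. The scalar test equation $x = \cos x$ is just an illustration, and the least-squares formulation on residual differences used below is one standard variant of selecting the $\alpha_s$, not necessarily the exact one meant above:

```python
import numpy as np

def anderson(phi, x0, m=5, maxiter=100, tol=1e-12):
    """Anderson acceleration for the fixed-point iteration x = phi(x) (a sketch).

    Keeps the last m residuals r_k = phi(x_k) - x_k and combines the history
    so that the norm of the residual is minimized (in the least-squares sense).
    """
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    g = np.atleast_1d(phi(x))
    Gs, Fs = [g], [g - x]            # histories of phi-values and residuals
    for _ in range(maxiter):
        f = Fs[-1]
        if np.linalg.norm(f) < tol:
            break
        if len(Fs) > 1:
            # Differences of consecutive residuals and phi-values
            dF = np.column_stack([Fs[i + 1] - Fs[i] for i in range(len(Fs) - 1)])
            dG = np.column_stack([Gs[i + 1] - Gs[i] for i in range(len(Gs) - 1)])
            gamma = np.linalg.lstsq(dF, f, rcond=None)[0]
            x = Gs[-1] - dG @ gamma  # mixed update from the history
        else:
            x = Gs[-1]               # plain fixed-point step
        g = np.atleast_1d(phi(x))
        Gs.append(g); Fs.append(g - x)
        Gs, Fs = Gs[-(m + 1):], Fs[-(m + 1):]  # keep only m past steps
    return x

root = anderson(np.cos, 0.0)
print(root)  # the fixed point of cos, approximately 0.739085
```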
The condition number problem is unavoidable if only the matrix-by-vector product is used.
Thus we need an army of preconditioners to solve it.
There are several general purpose preconditioners that we can use,
but often for a particular problem a special design is needed.
The general concept of the preconditioner is simple:
Given a linear system
$$A x = f,$$we want to find a matrix $P_R$ (or $P_L$) that is easy to invert and approximates $A$, so that the preconditioned matrix is close to the identity.
Then we solve for (right preconditioner)
$$ AP_R^{-1} y = f \quad \Rightarrow \quad P_R x = y$$or (left preconditioner)
$$ P_L^{-1} A x = P_L^{-1}f,$$or even both $$ P_L^{-1} A P_R^{-1} y = P_L^{-1}f \quad \Rightarrow \quad P_R x = y.$$
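A sketch of right preconditioning with the simplest (Jacobi, i.e. diagonal) choice of $P_R$; the matrix here is a made-up well-conditioned example, and the diagonal preconditioner is chosen purely for illustration:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 200
# A diagonally dominant tridiagonal test matrix (a made-up example)
A = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
f = np.ones(n)

# Right Jacobi preconditioner P_R = diag(A):
# solve A P_R^{-1} y = f, then recover x = P_R^{-1} y
Pinv = sp.diags(1.0 / A.diagonal())
AP = spla.LinearOperator((n, n), matvec=lambda v: A @ (Pinv @ v))
y, info = spla.gmres(AP, f)
x = Pinv @ y
print(info, np.linalg.norm(A @ x - f))
```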
The best choice is of course $P = A,$ but this does not make life easier.
One of the ideas is to use other iterative methods (beside Krylov) as preconditioners.
Suppose you want to eliminate a variable $x_1$,
and the equations have the form
$$5 x_1 + x_4 + x_{10} = 1, \quad 3 x_1 + x_4 + x_8 = 0, \ldots,$$and $x_1$ is not present in any other equation.
After the elimination, only $x_{10}$ will additionally enter the second equation (new fill-in):
$$x_4 + x_8 + 3(1 - x_4 - x_{10})/5 = 0.$$In the incomplete LU case (actually, ILU(0)) we simply throw away the new fill-in.
We run the usual LU-decomposition cycle, but avoid inserting non-zeros other than the initial non-zero pattern.
L = np.zeros((n, n))
U = np.zeros((n, n))
for k in range(n):  # eliminate one row
    L[k, k] = 1
    for i in range(k + 1, n):
        L[i, k] = a[i, k] / a[k, k]
        for j in range(k + 1, n):
            a[i, j] = a[i, j] - L[i, k] * a[k, j]  # new fill-ins appear here
    for j in range(k, n):
        U[k, j] = a[k, j]
The idea of ILU(k) is very instructive and is based on the connection between sparse matrices and graphs.
Suppose you have an $n \times n$ matrix $A$ and a corresponding adjacency graph.
Then we eliminate one variable (vertex) and get a smaller system of size $(n-1) \times (n-1)$.
New edges (= fill-in) appear between higher-order neighbours.
A new edge can appear only between vertices that had a common neighbour: that means they are second-order neighbours. This is also the sparsity pattern of the matrix
$$A^2.$$The ILU(k) idea is to keep in $L$ and $U$ only the elements that correspond to $k$-th order neighbours in the original graph.
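This can be checked directly on a small made-up example, a star graph: eliminating the center vertex connects all of its neighbours, and every new nonzero lies inside the pattern of $A^2$.

```python
import numpy as np

n = 5
# Star graph (a toy example): vertex 0 is connected to all the others
A = np.eye(n)
A[0, :] = 1.0
A[:, 0] = 1.0
A[0, 0] = 2.0

# Eliminate vertex 0: Schur complement on the remaining vertices
S = A[1:, 1:] - np.outer(A[1:, 0], A[0, 1:]) / A[0, 0]

# Fill-in connects former neighbours of vertex 0 (second-order neighbours),
# so the pattern of S must be contained in the pattern of A^2
inside = np.all((np.abs(S) < 1e-14) | ((A @ A)[1:, 1:] != 0))
print(inside)  # True
```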
The ILU(2) is very efficient, but for some reason it has been almost completely abandoned (e.g. there is no implementation in MATLAB or SciPy).
There is the original SPARSKIT software by Saad, which works quite well.
A much more popular approach is based on the so-called thresholded LU.
You do the standard Gaussian elimination with fill-in, but either:
throw away elements that are smaller than a given threshold, and/or control the amount of non-zeros you are allowed to store.
The smaller the threshold, the better the preconditioner, but the more memory it takes.
It is denoted ILUT($\tau$).
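SciPy's `spilu` exposes exactly this kind of thresholded incomplete LU (via SuperLU) through its `drop_tol` and `fill_factor` parameters; a sketch of using it as a preconditioner for GMRES on the 2D Laplacian (the sizes and tolerances are illustrative):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 50
ex = np.ones(n)
lp1 = sp.spdiags(np.vstack((ex, -2 * ex, ex)), [-1, 0, 1], n, n)
A = sp.csc_matrix(sp.kron(lp1, sp.eye(n)) + sp.kron(sp.eye(n), lp1))
f = np.ones(n * n)

# Thresholded ILU: drop small entries, cap the memory by fill_factor
ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)
M = spla.LinearOperator(A.shape, matvec=ilu.solve)

x, info = spla.gmres(A, f, M=M)
print(info, np.linalg.norm(A @ x - f))
```

Decreasing `drop_tol` (and increasing `fill_factor`) makes the preconditioner closer to the exact factorization, at the price of memory.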
There is a more efficient (but much less popular, due to the scarcity of open-source implementations) second-order incomplete factorization proposed by I. Kaporin.
The idea is to approximate the matrix in the form
$$A \approx U_2^{\top} U_2 + U_2^{\top} R_2 + R_2^{\top} U_2,$$which is just the first-order expansion of $(U_2 + R_2)^{\top} (U_2 + R_2)$ with respect to the perturbation $R_2$ of $U_2$, where $U_2$ is triangular and sparse, whereas $R_2$ is small with respect to the drop tolerance parameter.