In [4]:
from IPython.html.services.config import ConfigManager
from IPython.utils.path import locate_profile
cm = ConfigManager(profile_dir=locate_profile(get_ipython().profile))
cm.update('livereveal', {
              'theme': 'sky',
              'transition': 'zoom',
              'start_slideshow_at': 'selected',
})


Out[4]:
{u'start_slideshow_at': 'selected', u'theme': 'sky', u'transition': 'zoom'}

Lecture 13. The algebra of hierarchical matrices

Previous lecture

  • Introduction to hierarchical matrices ($\mathcal{H}, \mathcal{H}^2$) as an algebraic interpretation of the FMM
  • The concept of block rows and nested bases
  • The concept of splitting of the matrix into blocks

Today's lecture

  • How to construct the hierarchical approximation (in both the $\mathcal{H}$- and $\mathcal{H}^2$-cases)

Book

A good introductory book is S. Börm, "Efficient numerical methods for non-local operators: H2-matrix compression, algorithms and analysis".

Hierarchical matrices

  • Split the matrix into blocks $A(t, s)$ corresponding to the mosaic partitioning, approximate "far" blocks with low-rank matrices.
  • $\mathcal{H}^2$ matrix: the block row (i.e. the interaction of a box with everything outside it) is of low rank, with nested bases.
  • Computation of the factorization requires the treatment of block matrices.

Simple case: H-matrices

The case of H-matrices is simple: the matrix is represented as a collection of low-rank blocks, so we have to approximate each block independently,

$$A(t, s) \approx U(t, s) V(t, s)^{\top}.$$

How can we do that?

NLA Flashback: approximation of low-rank matrices

A rank-$r$ matrix can be represented as

$$A = C \widehat{A}^{-1} R, $$

where $C$ are some columns of the matrix $A$, $R$ are some rows of the matrix $A$, $\widehat{A}$ is the submatrix on their intersection.

Approximate case:

If $\widehat{A}$ is the $r \times r$ submatrix of maximal volume (where volume is the absolute value of the determinant), then the skeleton approximation is quasi-optimal in the Chebyshev (maximal-element) norm:

$$\|A - C \widehat{A}^{-1} R\|_C \leq (r+1) \sigma_{r+1}(A),$$

a result by Goreinov and Tyrtyshnikov.
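A quick numerical sketch of the skeleton decomposition (not part of the lecture code): for an exactly rank-$r$ matrix, any $r$ rows and $r$ columns with a nonsingular intersection reproduce the matrix.

```python
import numpy as np

np.random.seed(0)
n, r = 8, 3
# Build an exactly rank-r matrix A = X Y^T
X = np.random.rand(n, r)
Y = np.random.rand(n, r)
A = X @ Y.T

# Pick any r rows and r columns; generically their intersection is nonsingular
rows = [0, 3, 5]
cols = [1, 4, 7]
C = A[:, cols]                 # n x r chosen columns
R = A[rows, :]                 # r x n chosen rows
A_hat = A[np.ix_(rows, cols)]  # r x r intersection submatrix

A_rec = C @ np.linalg.solve(A_hat, R)   # C Ahat^{-1} R
print(np.linalg.norm(A - A_rec) / np.linalg.norm(A))  # close to machine precision
```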

Cross-approximation

Idea of the cross approximation: select the submatrix to maximize the determinant, i.e. in a greedy fashion.

The term "cross" comes from the rank-$1$ update formula

$$A_{ij} := A_{ij} - \frac{A_{i j^*} A_{i^* j}}{A_{i^* j^*}},$$

where the pivot $(i^*, j^*)$ has to be selected in such a way that $|A_{i^* j^*}|$ is as large as possible.

Pivoting strategies

  • Row/column pivoting (select a column, find the maximal element in it).
  • Rook pivoting
  • Additionally, do some random sampling
  • There is a result by L. Demanet et al. on the class of matrices for which the approximation exists
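A minimal sketch of the cross approximation (assumed helper, not the lecture's implementation), using the simplest pivoting: full pivoting, which scans the whole residual at every step; the cheaper strategies above avoid this cost.

```python
import numpy as np

def cross_approx(A, r):
    """Greedy cross approximation: r rank-1 cross updates with full pivoting."""
    E = A.astype(float).copy()        # residual matrix
    U = np.zeros((A.shape[0], r))
    V = np.zeros((A.shape[1], r))
    for k in range(r):
        # Pivot (i*, j*): the largest-modulus entry of the residual
        i, j = np.unravel_index(np.abs(E).argmax(), E.shape)
        if E[i, j] == 0:
            break
        U[:, k] = E[:, j]
        V[:, k] = E[i, :] / E[i, j]
        # Rank-1 update: E_ij -= E_{i j*} E_{i* j} / E_{i* j*}
        E -= np.outer(U[:, k], V[:, k])
    return U, V

np.random.seed(1)
A = np.random.rand(50, 3) @ np.random.rand(3, 40)   # exactly rank 3
U, V = cross_approx(A, 3)
print(np.linalg.norm(A - U @ V.T) / np.linalg.norm(A))  # close to machine precision
```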

Concept of approximation for H-matrices

  1. Create a list of blocks
  2. Sample rows/columns to get the low-rank factorization
  3. Profit!

$H^2$-matrices

For the $H^2$ matrices the situation is much more complicated.

The standard way to go is to first compress into the $\mathcal{H}$-form, and then do recompression into the $\mathcal{H}^2$ form.

Such recompression can be done in $\mathcal{O}(n \log n)$ operations.

Can we do it directly?

Nested cross approximation

  • A block row is of low rank $\Rightarrow$ there exist basis rows that span its row space.
  • If we join the basis rows of the children, we get candidate basis rows for the father.
  • This requires the SVD of a $2r \times N$ matrix, which gives $\mathcal{O}(N^2)$ complexity (although better than the $\mathcal{O}(N^3)$ for a direct SVD of big blocks)

Solution: Approximate the "far zone" with few receivers ("representor set").

Demo


In [16]:
import os
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'

from h2py.tree.quadtree import ParticlesQuadTree as PQT
from h2py.tree.octtree import ParticlesOctTree as POT
from h2py.tree.inertial_tree import ParticlesInertialTree as PIT
from h2py.data.particles_data import ParticlesData as PD
from h2py.problem import Problem
from h2py.data.particles import log_distance
inv_distance = log_distance
import numpy as np
from time import time
import sys

N = 20000
np.random.seed(0)
particles = np.random.rand(2, N)
data = PD(particles)
tree = PIT(data, block_size = 20)
problem = Problem(inv_distance, tree, tree)
problem.build()
problem.gen_queue(symmetric=0)

print('H2 1e-5')
problem.factorize('h2', tau=1e-5, onfly=0, verbose=1, iters=1)


1.95954990387
H2 1e-5
('Function calls:', 36682)
('Function values computed:', 33309887)
('Function time:', 1.7329978942871094)
('Average time per function value:', 5.202653177079554e-08)
('Maxvol time:', 6.476972341537476)
('Total MCBH time:', 9.493279933929443)

Representor set

  • Way 1: Select it using a priori knowledge (geometrical approach)
  • Way 2: For "good columns" it is sufficient to know good columns of the father! For details, see http://arxiv.org/abs/1309.1773

Inversion of the hierarchical matrices

Recall our goal is often to solve the integral equation (i.e., compute the inverse).

One of the possible ways is to use recursive block elimination

Block-LU (or Schur complement)

Consider the matrix $$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22}\end{bmatrix},$$

where the first group of variables corresponds to the unknowns in the first node, and the second group corresponds to the unknowns in the second node (a binary tree is implicitly assumed).

Then,

$$A \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}.$$

After eliminating the first group of unknowns we obtain the following equalities involving the Schur complement:

$$q_1 = A^{-1}_{11} f_1 - A^{-1}_{11} A_{12} q_2, \quad \underbrace{(A_{22} - A_{21} A^{-1}_{11} A_{12})}_{\mbox{Schur complement}} q_2 = f_2 - A_{21} A^{-1}_{11} f_1.$$

The core idea is recursion: if we can apply $A^{-1}_{11},$ then we can compute the Schur complement $S = A_{22} - A_{21} A^{-1}_{11} A_{12}$ and invert it as well.

The multiplication of H-matrices has $\mathcal{O}(N \log N)$ complexity, and is also (typically) implemented via recursion.
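A dense NumPy sketch of the recursive block elimination (the recursion only; an actual $\mathcal{H}$-matrix solver would keep $A_{12}, A_{21}$ in low-rank form and exploit the fast multiplication; `leaf` is an assumed cutoff parameter):

```python
import numpy as np

def block_solve(A, f, leaf=4):
    """Solve A q = f by recursive block elimination (Schur complement)."""
    n = A.shape[0]
    if n <= leaf:
        return np.linalg.solve(A, f)      # direct solve at the leaves
    m = n // 2
    A11, A12 = A[:m, :m], A[:m, m:]
    A21, A22 = A[m:, :m], A[m:, m:]
    f1, f2 = f[:m], f[m:]
    # Apply A11^{-1} recursively to f1 and to the columns of A12
    A11_inv_f1 = block_solve(A11, f1, leaf)
    A11_inv_A12 = np.column_stack([block_solve(A11, A12[:, k], leaf)
                                   for k in range(A12.shape[1])])
    S = A22 - A21 @ A11_inv_A12           # Schur complement
    q2 = block_solve(S, f2 - A21 @ A11_inv_f1, leaf)
    q1 = A11_inv_f1 - A11_inv_A12 @ q2
    return np.concatenate([q1, q2])

np.random.seed(2)
n = 32
A = np.random.rand(n, n) + n * np.eye(n)  # well-conditioned test matrix
f = np.random.rand(n)
q = block_solve(A, f)
print(np.linalg.norm(A @ q - f))          # small residual
```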

Multiplication of H-matrices

Consider a 1D partitioning and the multiplication of two matrices with $\mathcal{H}$-structure (i.e., the $(1,2)$ and $(2,1)$ blocks are of low rank):

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22}\end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22}\end{bmatrix}$$

The (1, 2) block of the result is

$$ \underbrace{A_{21} B_{11}}_{\mbox{low rank}} + \underbrace{A_{22} B_{21}}_{\mbox{low rank}} $$

(1, 1) and (2, 2) blocks are evaluated recursively, so it is a recursion inside recursion.
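The sum of two low-rank products above has rank at most $2r$; in $\mathcal{H}$-arithmetic it is rounded back to a prescribed rank. A sketch of this recompression via QR factorizations plus a small SVD (an assumed helper, not the lecture's code):

```python
import numpy as np

def round_lowrank(U1, V1, U2, V2, r):
    """Recompress U1 V1^T + U2 V2^T back to rank r via QR + small SVD."""
    U = np.hstack([U1, U2])               # n x 2r
    V = np.hstack([V1, V2])               # m x 2r
    Qu, Ru = np.linalg.qr(U)
    Qv, Rv = np.linalg.qr(V)
    # SVD of the small 2r x 2r core
    W, s, Zt = np.linalg.svd(Ru @ Rv.T)
    return Qu @ W[:, :r] * s[:r], Qv @ Zt[:r].T

np.random.seed(3)
n, m, r = 30, 25, 2
U1, V1 = np.random.rand(n, r), np.random.rand(m, r)
U2, V2 = np.random.rand(n, r), np.random.rand(m, r)
# Rounding to rank 2r is exact, so the error should be tiny
Ur, Vr = round_lowrank(U1, V1, U2, V2, 2 * r)
err = np.linalg.norm(U1 @ V1.T + U2 @ V2.T - Ur @ Vr.T)
print(err)
```

Truncating to a smaller rank instead drops the trailing singular values of the sum, which is how the rank growth is kept under control inside the recursion.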

Summary

  • Use block elimination (by nodes of the tree)
  • Problem is reduced to the multiplication of $H$-matrices
  • Constant is high
  • Can compute $LU$-factorization instead.

Fast direct solvers for sparse matrices

Sparse matrices coming from PDEs are in fact H-matrices as well!

So, we can compute the inverse by the same block factorization technique.

Works very well for the "1D" partitioning (i.e., off-diagonal blocks are of low rank); does not work for 2D/3D problems with optimal complexity (but the constants can be really good).

We also have our own idea (and since it is unpublished, we will use the whiteboard magic here :)

Summary

  • Nested cross approximation
  • Block Schur elimination idea for the inversion

Next lecture (week)

  • We will talk about high-frequency problems (which bring difficulties of their own)
  • FFT on the non-uniform grid, butterfly algorithm
  • Precorrected FFT

In [4]:
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
css_styling()


Out[4]:
