In [4]:
from IPython.html.services.config import ConfigManager
from IPython.utils.path import locate_profile
cm = ConfigManager(profile_dir=locate_profile(get_ipython().profile))
cm.update('livereveal', {
              'theme': 'sky',
              'transition': 'zoom',
              'start_slideshow_at': 'selected',
})


Out[4]:
{u'start_slideshow_at': 'selected', u'theme': 'sky', u'transition': 'zoom'}

Lecture 13. The algebra of hierarchical matrices

Previous lecture

  • Introduction to hierarchical matrices ($\mathcal{H}, \mathcal{H}^2$) as an algebraic interpretation of the FMM
  • The concept of block rows and nested bases
  • The concept of splitting of the matrix into blocks

Today's lecture

  • How to construct the hierarchical approximation (in both the $\mathcal{H}$- and $\mathcal{H}^2$-cases)

Book

A good introductory book is S. Börm, "Efficient numerical methods for non-local operators: H2-matrix compression, algorithms and analysis".

Hierarchical matrices

  • Split the matrix into blocks $A(t, s)$ corresponding to the mosaic partitioning, approximate "far" blocks with low-rank matrices.
  • $\mathcal{H}^2$ matrix: the block row (i.e. the interaction of a box with everything outside it) is of low rank, with nested bases.
  • Computation of the factorization requires the treatment of block matrices.

Simple case: H-matrices

The case of H-matrices is simple: the matrix is represented as a collection of low-rank blocks, so we have to approximate each block independently,

$$A(t, s) \approx U(t, s) V(t, s)^{\top}.$$

How can we do that?

NLA Flashback: approximation of low-rank matrices

A rank-$r$ matrix can be represented as

$$A = C \widehat{A}^{-1} R, $$

where $C$ are some columns of the matrix $A$, $R$ are some rows of the matrix $A$, $\widehat{A}$ is the submatrix on their intersection.

Approximate case:

If $\widehat{A}$ is the $r \times r$ submatrix of maximal volume (where volume is the absolute value of the determinant), then the skeleton approximation is quasi-optimal in the Chebyshev (maximal-element) norm:

$$\|A - C \widehat{A}^{-1} R\|_C \leq (r+1) \sigma_{r+1}(A),$$

a result by Goreinov and Tyrtyshnikov.
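A quick numerical sketch of the skeleton decomposition (not part of the lecture code): for an exactly rank-$r$ matrix, any $r$ rows and $r$ columns with a nonsingular intersection reproduce the matrix.

```python
import numpy as np

np.random.seed(0)
n, r = 8, 3
# Build an exactly rank-r matrix A = X Y^T
X = np.random.rand(n, r)
Y = np.random.rand(n, r)
A = X @ Y.T

# Pick any r rows and r columns; generically their intersection is nonsingular
rows = [0, 3, 5]
cols = [1, 4, 7]
C = A[:, cols]                 # n x r chosen columns
R = A[rows, :]                 # r x n chosen rows
A_hat = A[np.ix_(rows, cols)]  # r x r intersection submatrix

A_rec = C @ np.linalg.solve(A_hat, R)   # C Ahat^{-1} R
print(np.linalg.norm(A - A_rec) / np.linalg.norm(A))  # close to machine precision
```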

Cross-approximation

Idea of the cross approximation: select the submatrix to maximize the determinant, i.e. in a greedy fashion.

The term "cross" comes from the rank-$1$ update formula

$$A_{ij} := A_{ij} - \frac{A_{i j^*} A_{i^* j}}{A_{i^* j^*}},$$

where the pivot $(i^*, j^*)$ has to be selected in such a way that $|A_{i^* j^*}|$ is as large as possible.

Pivoting strategies

  • Row/column pivoting (select a column, find the maximal element in it).
  • Rook pivoting
  • Additionally, do some random sampling
  • There is a result by L. Demanet et al. on the class of matrices for which the approximation exists
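A minimal sketch of the cross approximation (assumed helper, not the lecture's implementation), using the simplest pivoting: full pivoting, which scans the whole residual at every step; the cheaper strategies above avoid this cost.

```python
import numpy as np

def cross_approx(A, r):
    """Greedy cross approximation: r rank-1 cross updates with full pivoting."""
    E = A.astype(float).copy()        # residual matrix
    U = np.zeros((A.shape[0], r))
    V = np.zeros((A.shape[1], r))
    for k in range(r):
        # Pivot (i*, j*): the largest-modulus entry of the residual
        i, j = np.unravel_index(np.abs(E).argmax(), E.shape)
        if E[i, j] == 0:
            break
        U[:, k] = E[:, j]
        V[:, k] = E[i, :] / E[i, j]
        # Rank-1 update: E_ij -= E_{i j*} E_{i* j} / E_{i* j*}
        E -= np.outer(U[:, k], V[:, k])
    return U, V

np.random.seed(1)
A = np.random.rand(50, 3) @ np.random.rand(3, 40)   # exactly rank 3
U, V = cross_approx(A, 3)
print(np.linalg.norm(A - U @ V.T) / np.linalg.norm(A))  # close to machine precision
```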

Concept of approximation for H-matrices

  1. Create a list of blocks
  2. Sample rows/columns to get the low-rank factorization
  3. Profit!

$H^2$-matrices

For the $H^2$ matrices the situation is much more complicated.

The standard way to go is to first compress into the $\mathcal{H}$-form, and then do recompression into the $\mathcal{H}^2$ form.

Such recompression can be done in $\mathcal{O}(n \log n)$ operations.

Can we do it directly?

Nested cross approximation

  • A block row is of low rank $\Rightarrow$ there exist basis rows that span its row space.
  • If we join the basis rows of the children, we get candidate basis rows for the father.
  • This requires the SVD of a $2r \times N$ matrix, which gives $\mathcal{O}(N^2)$ complexity (although better than the $\mathcal{O}(N^3)$ for a direct SVD of big blocks)

Solution: Approximate the "far zone" with few receivers ("representor set").

Demo


In [16]:
import os
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_NUM_THREADS'] = '1'

from h2py.tree.quadtree import ParticlesQuadTree as PQT
from h2py.tree.octtree import ParticlesOctTree as POT
from h2py.tree.inertial_tree import ParticlesInertialTree as PIT
from h2py.data.particles_data import ParticlesData as PD
from h2py.problem import Problem
from h2py.data.particles import log_distance
inv_distance = log_distance
import numpy as np
from time import time
import sys

N = 20000
np.random.seed(0)
particles = np.random.rand(2, N)
data = PD(particles)
tree = PIT(data, block_size = 20)
problem = Problem(inv_distance, tree, tree)
problem.build()
problem.gen_queue(symmetric=0)

print('H2 1e-5')
problem.factorize('h2', tau=1e-5, onfly=0, verbose=1, iters=1)


1.95954990387
H2 1e-5
('Function calls:', 36682)
('Function values computed:', 33309887)
('Function time:', 1.7329978942871094)
('Average time per function value:', 5.202653177079554e-08)
('Maxvol time:', 6.476972341537476)
('Total MCBH time:', 9.493279933929443)

Representor set

  • Way 1: Select it using a priori knowledge (geometrical approach)
  • Way 2: For "good columns" it is sufficient to know good columns of the father! For details, see http://arxiv.org/abs/1309.1773

Inversion of the hierarchical matrices

Recall our goal is often to solve the integral equation (i.e., compute the inverse).

One of the possible ways is to use recursive block elimination

Block-LU (or Schur complement)

Consider the matrix $$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22}\end{bmatrix},$$

where the first group of variables corresponds to the unknowns in the first node, and the second group corresponds to the unknowns in the second node (a binary tree is implicitly assumed).

Then,

$$A \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix}.$$

After eliminating the first group of unknowns we obtain the following equalities involving the Schur complement:

$$q_1 = A^{-1}_{11} f_1 - A^{-1}_{11} A_{12} q_2, \quad \underbrace{(A_{22} - A_{21} A^{-1}_{11} A_{12})}_{\mbox{Schur complement}} q_2 = f_2 - A_{21} A^{-1}_{11} f_1.$$

The core idea is recursion: if we can apply $A^{-1}_{11},$ then we can compute the Schur complement $S = A_{22} - A_{21} A^{-1}_{11} A_{12}$ and invert it as well.

The multiplication of H-matrices has $\mathcal{O}(N \log N)$ complexity, and is also (typically) implemented via recursion.
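A dense NumPy sketch of the recursive block elimination (the recursion only; an actual $\mathcal{H}$-matrix solver would keep $A_{12}, A_{21}$ in low-rank form and exploit the fast multiplication; `leaf` is an assumed cutoff parameter):

```python
import numpy as np

def block_solve(A, f, leaf=4):
    """Solve A q = f by recursive block elimination (Schur complement)."""
    n = A.shape[0]
    if n <= leaf:
        return np.linalg.solve(A, f)      # direct solve at the leaves
    m = n // 2
    A11, A12 = A[:m, :m], A[:m, m:]
    A21, A22 = A[m:, :m], A[m:, m:]
    f1, f2 = f[:m], f[m:]
    # Apply A11^{-1} recursively to f1 and to the columns of A12
    A11_inv_f1 = block_solve(A11, f1, leaf)
    A11_inv_A12 = np.column_stack([block_solve(A11, A12[:, k], leaf)
                                   for k in range(A12.shape[1])])
    S = A22 - A21 @ A11_inv_A12           # Schur complement
    q2 = block_solve(S, f2 - A21 @ A11_inv_f1, leaf)
    q1 = A11_inv_f1 - A11_inv_A12 @ q2
    return np.concatenate([q1, q2])

np.random.seed(2)
n = 32
A = np.random.rand(n, n) + n * np.eye(n)  # well-conditioned test matrix
f = np.random.rand(n)
q = block_solve(A, f)
print(np.linalg.norm(A @ q - f))          # small residual
```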

Multiplication of H-matrices

Consider a 1D partitioning and the multiplication of two matrices with $\mathcal{H}$-structure (i.e., the $(1,2)$ and $(2,1)$ blocks are of low rank):

$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22}\end{bmatrix}\begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22}\end{bmatrix}$$

The (1, 2) block of the result is

$$ \underbrace{A_{21} B_{11}}_{\mbox{low rank}} + \underbrace{A_{22} B_{21}}_{\mbox{low rank}} $$

(1, 1) and (2, 2) blocks are evaluated recursively, so it is a recursion inside recursion.
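The sum of two low-rank products above has rank at most $2r$; in $\mathcal{H}$-arithmetic it is rounded back to a prescribed rank. A sketch of this recompression via QR factorizations plus a small SVD (an assumed helper, not the lecture's code):

```python
import numpy as np

def round_lowrank(U1, V1, U2, V2, r):
    """Recompress U1 V1^T + U2 V2^T back to rank r via QR + small SVD."""
    U = np.hstack([U1, U2])               # n x 2r
    V = np.hstack([V1, V2])               # m x 2r
    Qu, Ru = np.linalg.qr(U)
    Qv, Rv = np.linalg.qr(V)
    # SVD of the small 2r x 2r core
    W, s, Zt = np.linalg.svd(Ru @ Rv.T)
    return Qu @ W[:, :r] * s[:r], Qv @ Zt[:r].T

np.random.seed(3)
n, m, r = 30, 25, 2
U1, V1 = np.random.rand(n, r), np.random.rand(m, r)
U2, V2 = np.random.rand(n, r), np.random.rand(m, r)
# Rounding to rank 2r is exact, so the error should be tiny
Ur, Vr = round_lowrank(U1, V1, U2, V2, 2 * r)
err = np.linalg.norm(U1 @ V1.T + U2 @ V2.T - Ur @ Vr.T)
print(err)
```

Truncating to a smaller rank instead drops the trailing singular values of the sum, which is how the rank growth is kept under control inside the recursion.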

Summary

  • Use block elimination (by nodes of the tree)
  • Problem is reduced to the multiplication of $H$-matrices
  • Constant is high
  • Can compute $LU$-factorization instead.

Fast direct solvers for sparse matrices

Sparse matrices coming from PDEs are in fact H-matrices as well!

So, we can compute the inverse by the same block factorization technique.

Works very well for the "1D" partitioning (i.e., off-diagonal blocks are of low rank); does not work for 2D/3D problems with optimal complexity (but the constants can be really good).

We also have our own idea (and since it is unpublished, we will use the whiteboard magic here :)

Summary

  • Nested cross approximation
  • Block Schur elimination idea for the inversion

Next lecture (week)

  • We will talk about high-frequency problems (which bring difficulties of their own)
  • FFT on the non-uniform grid, butterfly algorithm
  • Precorrected FFT

In [4]:
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
css_styling()


Out[4]:
