- Complexity analysis for the FMM
- M2M, M2L, L2L operators via Taylor expansions
- Separability criteria and low-rank approximation
- What is the algebraic interpretation of the Barnes-Hut method
- What is the algebraic interpretation of the FMM
- The concept of equivalent charges
- $\mathcal{H}$, $\mathcal{H}^2$-matrices
- A brief recap on the uniform grid computations

Write down again the basic N-body problem.

$$V_i = \sum_{j} q_j F(x_i, y_j).$$- We construct trees for the
**sources**$y_j$ and**receivers**$x_i$. - We construct S2M, M2M, M2L, L2L, L2T operators
- We fill the "multipoles" in the sources tree using M2M operators (
**upward pass**) - We fill the "local" coefficients in the receivers tree using M2L operators (
**translation operator**) - We fill the "local" coefficients using L2L operators (
**downward pass**)

We will denote the tree as $\mathcal{T}$, its nodes as $t$.

The set **children(t)** is the set of all children of a given node (is empty for the leaf)

For each $t$ we will denote all pairs $(t, s)$ for all nodes $s$ that belong to the **interaction list**.

Definition: For box $X$ the white squares are the **interaction list**: the fathers do not fullfil the separability criteria, but the children are separable)

M2M operator has a very simple form for the Barnes-Hut algorithm.

Each node stores exactly **1 number** (total charge/mass).

Thus for a binary tree

$$q_{father} = \underbrace{\begin{bmatrix} 1 & 1 \end{bmatrix}}_{\mbox{M2M operator}} \begin{bmatrix} q_{leaf1} \\ q_{leaf2}\end{bmatrix}$$In general,

$$ M_{father} = M2M \begin{bmatrix} M_{leaf1} \\ M_{leaf2}\end{bmatrix} $$A simple complexity estimate for the FMM in 2D gives

$$\underbrace{N p}_{\mbox{upward pass}} + \underbrace{29 \frac{N}{s} p^2}_{\mbox{translation}} + \underbrace{N p}_{\mbox{downward pass}} + \underbrace{9 N s}_{\mbox{close interaction}}$$A good choice is $s \approx p$ and that gives the complexity $40 N p$ ($p$ is the number of multipole coefficients involved)

In 3D the complexity is

$$189 \frac{N}{s} p^4 + 2 N p^3 + 27 N s,$$i.e. for $s = 2p^2$ we have

$$150 N p^2.$$This can be too much for $p$. Many $\mathcal{O}(p)$ schemes have been developed.

If the interaction between two boxes is separable, then

$$V(x) = \sum_{j} f(x, y_j) q_j = \sum_{\alpha=1}^r u_{\alpha}(x) c_{\alpha},$$where the coefficients $c_{\alpha}$ are given by the formula

$$c_{\alpha} = \sum_{j} v_{\alpha}(y_j) q_j.$$Questions:

Is this enough for BH-type complexity?

Is this enough for the FMM-type complexity?

Algebraic interpretation of the separated representation is simple.

Given two nodes $s, t$ of the cluster trees of sources and receivers, the interaction between boxes containing $s$ and $t$ is in fact the interaction between **all particles** in $t$ and **all particles** in $s$.

This is then a (sub)matrix of the total interaction matrix $A$.

It is convenient to denote it by $A(t, s)$. The separability property is then reduced on the discrete level to

$$A(t, s) \approx U(t, s) V(t, s)^{\top}, \quad U(t, s) \in \mathbb{R}^{M_t \times r}, \quad V(t, s) \in \mathbb{R}^{N_s \times r},$$where $N_s$ is number of sources and $M_t$ is number of receivers in corresponding boxes $s$ and $t$. $r$ is rank of $A(t, s)$

**admissability condition** by low-rank matrices.

The interpretation by "block low-rank structure" is viable, but not enough;

A closer step to the FMM algorithm is to consider the decomposition

$$A(t, s) = U(t) \Psi(t, s) V^{\top}(s),$$where we have **common factor matrices** $U(t)$ and $V(s)$ for each node, and coefficient matrices $\Psi(t, s)$ which are **small**.

Still not an FMM: will have to store $n \times r$ matrices (there are nowhere in the FMM).

The algebraic characterization is that the **rank of the block row** is equal to $r$.

Then the bases are nested, i.e. from $U(t_1)$ and $U(t_2)$ we can recompute the basis for the father:

$$U(t) = \begin{bmatrix} U_{t_1} & 0 \\ 0 & U_{t_2} \end{bmatrix} \Phi((t_1,t_2)\rightarrow t).$$The transfer matrix $\Phi$ is of size $(r_{t_1} + r_{t_2}) \times r_t \approx 2r \times r$.

- Thus, we only need the matrices to recompute the basises from the sons to the father.
- The same holds for the columns.
- The transfer matrices $\Phi$ are the M2M operators, the matrix $\Psi(t, s)$ are the M2L operators, and the same column matrices are the $L2L$ operators!

Let us do some whiteboard magic to show that it is indeed the case.

Now when we know the algebraic interpretation of the FMM, we can easily describe how to speedup the computation of translation operators.

The **translation** is just the multiplication by the matrices $\Psi(t, s)$.

Thus, if we select such basis $U'(t)$ and $V'(s)$ (maybe even larger, that the minimal one!) so $\Psi(t, s)$ is **diagonal**, we are fine.

The splitting can be quite interesting, for example we can consider the simplest "interaction"

$$F(x, y) = 1, \quad \Vert x - y \Vert > \delta, \quad 0, \quad \mbox{otherwise}.$$Then in 1D this is the **excluded sum**.

The most efficient way is to compute the **suffix sum** and the **prefix sum**.

This also can be interpreted as the splitting of the matrix into blocks!

There is a well-known code HLIB for working with H-matrices. There is also **kernel-independent** variant.

Several "in-house" codes, we also develop one (http://bitbucket.org/muxas/h2tools)

The interaction between two boxes (clusters) is defined as a **submatrix** of the original matrix and "far field" loosely correponds to the low-rank of this matrix.

The equivalent charges representation follows from the **skeleton decomposition** of the form

```
In [4]:
```from IPython.core.display import HTML
def css_styling():
styles = open("./styles/custom.css", "r").read()
return HTML(styles)
css_styling()

```
Out[4]:
```