Write down again the basic N-body problem.
$$V_i = \sum_{j} q_j F(x_i, y_j).$$We will denote the tree as $\mathcal{T}$, its nodes as $t$.
The set children(t) is the set of all children of a given node (is empty for the leaf)
For each $t$ we will denote all pairs $(t, s)$ for all nodes $s$ that belong to the interaction list.
Definition: For box $X$ the white squares are the interaction list: the fathers do not fullfil the separability criteria, but the children are separable)
M2M operator has a very simple form for the Barnes-Hut algorithm.
Each node stores exactly 1 number (total charge/mass).
Thus for a binary tree
$$q_{father} = \underbrace{\begin{bmatrix} 1 & 1 \end{bmatrix}}_{\mbox{M2M operator}} \begin{bmatrix} q_{leaf1} \\ q_{leaf2}\end{bmatrix}$$In general,
$$ M_{father} = M2M \begin{bmatrix} M_{leaf1} \\ M_{leaf2}\end{bmatrix} $$A simple complexity estimate for the FMM in 2D gives
$$\underbrace{N p}_{\mbox{upward pass}} + \underbrace{29 \frac{N}{s} p^2}_{\mbox{translation}} + \underbrace{N p}_{\mbox{downward pass}} + \underbrace{9 N s}_{\mbox{close interaction}}$$A good choice is $s \approx p$ and that gives the complexity $40 N p$ ($p$ is the number of multipole coefficients involved)
In 3D the complexity is
$$189 \frac{N}{s} p^4 + 2 N p^3 + 27 N s,$$i.e. for $s = 2p^2$ we have
$$150 N p^2.$$This can be too much for $p$. Many $\mathcal{O}(p)$ schemes have been developed.
If the interaction between two boxes is separable, then
$$V(x) = \sum_{j} f(x, y_j) q_j = \sum_{\alpha=1}^r u_{\alpha}(x) c_{\alpha},$$where the coefficients $c_{\alpha}$ are given by the formula
$$c_{\alpha} = \sum_{j} v_{\alpha}(y_j) q_j.$$Questions:
Is this enough for BH-type complexity?
Is this enough for the FMM-type complexity?
Algebraic interpretation of the separated representation is simple.
Given two nodes $s, t$ of the cluster trees of sources and receivers, the interaction between boxes containing $s$ and $t$ is in fact the interaction between all particles in $t$ and all particles in $s$.
This is then a (sub)matrix of the total interaction matrix $A$.
It is convenient to denote it by $A(t, s)$. The separability property is then reduced on the discrete level to
$$A(t, s) \approx U(t, s) V(t, s)^{\top}, \quad U(t, s) \in \mathbb{R}^{M_t \times r}, \quad V(t, s) \in \mathbb{R}^{N_s \times r},$$where $N_s$ is number of sources and $M_t$ is number of receivers in corresponding boxes $s$ and $t$. $r$ is rank of $A(t, s)$
We split the matrix into blocks, and approximate blocks $A(t, s)$ that satisfy the admissability condition by low-rank matrices.
The interpretation by "block low-rank structure" is viable, but not enough;
A closer step to the FMM algorithm is to consider the decomposition
$$A(t, s) = U(t) \Psi(t, s) V^{\top}(s),$$where we have common factor matrices $U(t)$ and $V(s)$ for each node, and coefficient matrices $\Psi(t, s)$ which are small.
Still not an FMM: will have to store $n \times r$ matrices (there are nowhere in the FMM).
The algebraic characterization is that the rank of the block row is equal to $r$.
Then the bases are nested, i.e. from $U(t_1)$ and $U(t_2)$ we can recompute the basis for the father:
$$U(t) = \begin{bmatrix} U_{t_1} & 0 \\ 0 & U_{t_2} \end{bmatrix} \Phi((t_1,t_2)\rightarrow t).$$The transfer matrix $\Phi$ is of size $(r_{t_1} + r_{t_2}) \times r_t \approx 2r \times r$.
Let us do some whiteboard magic to show that it is indeed the case.
Now when we know the algebraic interpretation of the FMM, we can easily describe how to speedup the computation of translation operators.
The translation is just the multiplication by the matrices $\Psi(t, s)$.
Thus, if we select such basis $U'(t)$ and $V'(s)$ (maybe even larger, that the minimal one!) so $\Psi(t, s)$ is diagonal, we are fine.
The splitting can be quite interesting, for example we can consider the simplest "interaction"
$$F(x, y) = 1, \quad \Vert x - y \Vert > \delta, \quad 0, \quad \mbox{otherwise}.$$Then in 1D this is the excluded sum.
$$V_i = \sum_{j \ne i} q_j.$$The most efficient way is to compute the suffix sum and the prefix sum.
This also can be interpreted as the splitting of the matrix into blocks!
There is a well-known code HLIB for working with H-matrices. There is also kernel-independent variant.
Several "in-house" codes, we also develop one (http://bitbucket.org/muxas/h2tools)
The interaction between two boxes (clusters) is defined as a submatrix of the original matrix and "far field" loosely correponds to the low-rank of this matrix.
The equivalent charges representation follows from the skeleton decomposition of the form
$$ A(t, s)\approx \sum_{\widehat{t}, \widehat{s}} A(t, \widehat{s}) \widehat{A}^{-1}(\widehat{t}, \widehat{s}) A(\widehat{t}, s), $$
In [4]:
from IPython.core.display import HTML
def css_styling():
styles = open("./styles/custom.css", "r").read()
return HTML(styles)
css_styling()
Out[4]: