Fast PDE/IE course, Skoltech, Spring 2015

Problem 1 (70 pts). The $N$-Body Problem

The motion of more than two orbiting bodies (the $N$-body problem) may be described by the Newton second law $$ \frac{d^2}{dt^2} {\bf x}_j = {\bf g} ({\bf x}_j), $$ where $M_i$ are masses, ${\bf x}_i$ are coordinates of particles and ${\bf g}$ is the force field, which may be written in terms of the gravitational potential ${\bf g} \equiv - \nabla \phi$.

The $N$-body problem is a challenging problem that had eluded mathematicians since Newton’s time. In honour of his 60th birthday on January 21, 1889, Oscar II, King of Sweden and Norway, established a prize for anyone who could find the solution to the problem. The prize was awarded to Poincaré, even though he did not solve the original problem. Only a century later in 1990 was an analytical solution to the $n$-body problem finally found, in the form of a converging power series.

Fortunately, it is significantly easier to approximate the solution to this famous problem using numerical techniques. However, from the computational point of view this deceptively simple ODE is greatly complicated by two considerations. Firstly, ${\bf g}$ is affected by the presence of all particles within the system, and computing the forces upon $N$ particles due their interaction with $N$ particles by direct summation is an $\mathcal{O}(N^2)$ operation. Secondly, the particles are mobile, and the forces upon the $N$ particles must be repeatedly computed at each new time-step. Considering that galaxies are comprised of billions of stars, the complexity of computations quickly become prohibitive.

Fortunately, once accuracy is fixed, fast algorithms can be used. These algorithms are based on the idea of using approximate low-parametric representations of potentials. They yield $\mathcal{O}(N \log N)$ (e.g., Barnes–Hut algorithm) or even with $\mathcal{O}(N)$ complexity (fast multipole method). With the help of these algorithms recent simulations of galactic motion, such as the millenium simulation project by the Max Planck Institute for Astrophysics, have included over $300$ billion particles simulated over the equivalent span of more than $13$ billion years!

  • (5 pts) Estimate how much time requires to do one matrix-by-vector multiplication for $300$ billion particles with $\mathcal{O}(N^2)$ and $\mathcal{O}(N)$ complexity on a supercomputer

The Low-Rank nature of the FMM

Recently the low-rank interpretation of the Fast Multipole Method was developed. The $\mathcal{O}(N \log N)$ complexity can be obtained with $\mathcal{H}$-matrices and $\mathcal{O}(N)$ with $\mathcal{H^2}$-matrices. In this part of the problem set you will investigate the low-rank nature of the FMM and will implement matrix-by-vector multiplication with the help of $\mathcal{H}$-matrices.

In this example we will investigate low-rank nature of the interaction between well-separated clusters of particles. Let ${\bf y}_j$, $j=1\dots N$ be source point charges located randomly within the unit square $[0,1]^2$ and let ${\bf x}_j$, $j=1\dots N$ be receivers points randomly located within a specified target square.

Newton potential

Consider the Newtonial gravitational potential function $\phi$ as follows: $$ \phi({\bf x}) = \frac{1}{|{\bf x}|} $$ The dense coefficient matrix $G$ associated with the potential interaction between ${\bf x}_m$ and ${\bf y}_n$ under Newtonian gravity contains an element on the $j$-th row and $k$-th column equal to: $$ G_{mn} = \frac{1}{|{\bf x}_m - {\bf y}_n|} $$ Compute the relative rank of this matrix, by performing a singular value decomposition $G\approx U\Sigma V^T$, so that the relative spectral error of the reconstruction $\|G- U\Sigma V^T\|_2/\|G\|_2 \leqslant \epsilon$, where $\epsilon = 10^{-5}$. The relative rank of the matrix is equal to the number of non-zero singular values with spectral norms $\geqslant \sigma_1 \epsilon$.

  • (5 pts) Use 20 source points in the unit square with lower left corner at the origin, and 20 test points in a target unit square. Produce a plot of the rank of the associated $G$ matrix, as a function of the target square position. In particular, place the target square’s bottom left corner at $[2, p]$ and the top right corner at $[3, p+ 1]$. Vary $p$ from $0$ to $5$ in steps of $0.5$.
  • (5pts) With the bottom left corner of the target square placed at [2,4] and the top right corner placed at [3,5]. Vary the number of interacting points N from 20 to 200 in steps of 20. Does increasing the number of interactions and the size of the matrix G significantly affect its rank?

High-Frequency issues

Suppose that Newtonian gravity were instead governed by the Helmholtz Green’s function as follows: $$ \phi({\bf x}) = \frac{e^{ik|{\bf x}|}}{|{\bf x}|}. $$ The dense coefficient matrix $G$ associated with the interaction between ${\bf x}_m$ and ${\bf y}_n$ associated with this new potential function contains an element on the $m$-th row and $n$-th column equal to: $$ G_{mn} = \frac{e^{ik|{\bf x}_m - {\bf y}_n|}}{|{\bf x}_m - {\bf y}_n|}. $$

  • (5 pts) Use $20$ source points in the unit square with lower left corner at the origin, and $20$ test points in a target unit square. Let $k = 10$, produce a plot of the rank of the associated $G$ matrix, as a function of the target square position. In particular, place the target square’s bottom left corner at $[2, p]$ and the top right corner at $[3, p+ 1]$. Vary $p$ from $0$ to $5$ in steps of $0.5$.
  • (5 pts) Use $20$ source points and $20$ test points, and set the bottom left corner of the target square placed to $[2,4]$ and the top right corner to $[3,5]$. Produce a plot of the rank of the interaction as a function of $k$, varying from $1$ to $100$. What are the difficulties presented by having a high value of $k$?

In [ ]:

The $\mathcal{H}$-matrices

Let ${\bf y_j} $, $j=1\dots N$ be coordinates of randomly distributed particles in the 2D box $[0,1]^2$. Let reciever points ${\bf x}_j $ be the same points ${\bf y_j}$. Consider Newton potential from the previous part and implement matvec operation with the matrix $G$ in the $\mathcal{H}$-format with $\mathcal{O}(N \log N)$.

The idea of $\mathcal{H}$-matrices is quite simple. First, we separate our particles in a quad-tree structure of particle groups. If the group on some tree level is sufficiently far away from another group, we can approximate its gravitational effects by the low-rank approximation with some accuracy $\epsilon$. If it is not, we compute its influence recursively by using lower tree levels. Note that the group of sources and receivers are separated if the paramenter $\theta = s / d$ is small enough (here $s$ is the width of the region represented by the internal node, and $d$ is the distance between the body and the node's center-of-mass).

  • (8 pts) Propose an algorithm for the quadtree construction (here tree for sources = tree for receivers). How much memory the tree storage takes? Can you reach $\mathcal{O}(N)$ memory for the storage? Propose a way to store the tree in Python and write the program that computes the tree, given the location of the particles. What do you need to store in each node of the tree?
  • (17 pts) Write $\mathcal{H}$-matrix by vector multiplication. The program should consist of three parts:
    1. Tree and interaction list construction, given the location of the particles, geometric constant $\theta$ and $\epsilon$-truncation level
    2. Filling the information in the tree
    3. Computing the matvec
  • (10 pts) Compare the results computed by direct evaluation and by $\mathcal{H}$-matrices. Study the dependence of accuracy and computational cost on the geometric parameter $\theta$ and $\epsilon$
  • (5 pts) Explain in a nutshell the $\mathcal{H}^2$-matrices as compared with the $\mathcal{H}$-matrices
  • (5 pts) Explain how does $\mathcal{H}^2$-matrices relate to the FMM. Give exapmles where low-rank interpretation is preferable

In [ ]:


In [1]:
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
css_styling()


Out[1]: