The discretization of the integral equation leads to **dense matrices**.

The main question is how to compute the **matrix-by-vector product**,

i.e. the summation of the form:

$$\sum_{j=1}^M A_{ij} q_j = V_i, \quad i = 1, \ldots, N.$$

The matrix $A$ is **dense**, i.e. none of its elements can be omitted. The complexity of the direct product is $\mathcal{O}(N^2)$.

Can we make it faster?

The **N-Body problem** is the problem of computing interactions between $N$ bodies.

The model $N$-body problem reads:

$$V(y_i) = \sum_{j=1}^M \frac{q_j}{\Vert y_i - x_j \Vert}, \quad i=1,\dots,N,$$

i.e. given charges $q_j$ located at "sources" $x_j$, find the potentials at "receivers" $y_i$.

This summation appears in:

- Modelling of large systems of charges
- Astronomy: planet movement (where instead of charges $q_j$ we have **masses**)
- Discretization of integral equations (IEs)

It is called the **N-body problem**.

There is no problem with memory, since there are only two nested loops (for each receiver, sum over all sources).

In [1]:

```python
import numpy as np
import math
from numba import jit

@jit
def compute_nbody_direct(N, x, y, charges, res):
    # Direct summation: for each receiver y[i], loop over all sources x[j]
    for i in range(N):
        res[i] = 0.0
        for j in range(N):
            dist = (x[j, 0] - y[i, 0]) ** 2 + (x[j, 1] - y[i, 1]) ** 2
            dist = math.sqrt(dist)
            res[i] += charges[j] / dist
```

In [2]:

```python
N = 1000
x = np.random.randn(N, 2)
y = np.random.randn(N, 2)
charges = np.ones(N)
res = np.zeros(N)
print('N = {}'.format(N))
%timeit compute_nbody_direct(N, x, y, charges, res)

N = 10000
x = np.random.randn(N, 2)
y = np.random.randn(N, 2)
charges = np.ones(N)
res = np.zeros(N)
print('N = {}'.format(N))
%timeit compute_nbody_direct(N, x, y, charges, res)
```

$N^2$ scaling!

One of the most famous N-body computations is the **Millennium Run**:

- More than 10 billion particles ($2160^3$)
- More than **1 month** of computations, **25 Terabytes** of storage
- Each "particle" represents approximately a billion solar masses of dark matter
- Studies how the matter is distributed through the Universe (cosmology)

In [4]:

```python
from IPython.display import YouTubeVideo
YouTubeVideo('PrIk6dKcdoU') #UC5pDPY5Nz4
```

Particle systems can be used to model a lot of things.

For nice examples, see the website of Ron Fedkiw.

One of the application areas is so-called **smoothed particle hydrodynamics (SPH)**, where the fluid is modelled by a system of interacting particles.

In [5]:

```python
from IPython.display import YouTubeVideo
YouTubeVideo('6bdIHFTfTdU')
```

The **density** is represented as a **kernel sum** (a good reference is *Smoothed Particle Hydrodynamics: Things I wish my mother taught me*):

$$\rho(r) = \sum_{j=1}^{N_{\mathrm{neigh}}} m_j W( |r - r_j|, h).$$

The simplest smoothing kernel is a 3D Gaussian, $W(x, h) = e^{-\frac{x^2}{h^2}}$.
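As a sketch, this kernel sum is easy to write down directly (all particle data below are made up for illustration; a real SPH code would sum only over the $N_{\mathrm{neigh}}$ nearest neighbours within a few smoothing lengths $h$):

```python
import numpy as np

def gaussian_kernel(x, h):
    # The simplest smoothing kernel: W(x, h) = exp(-x^2 / h^2)
    return np.exp(-x**2 / h**2)

def sph_density(r, positions, masses, h):
    # rho(r) = sum_j m_j W(|r - r_j|, h), summed here over all particles
    dists = np.linalg.norm(positions - r, axis=1)
    return np.sum(masses * gaussian_kernel(dists, h))

rng = np.random.default_rng(0)
positions = rng.random((100, 3))   # hypothetical particle positions
masses = np.ones(100)              # hypothetical particle masses
rho = sph_density(np.array([0.5, 0.5, 0.5]), positions, masses, h=0.2)
```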

Combining this representation with the equations of motion and the first law of thermodynamics, we get the following system of ordinary differential equations (ODEs):

$$\frac{dv_i}{dt} = -\sum_j \left(\frac{P_i}{\rho^2_i} + \frac{P_j}{\rho^2_j}\right) \nabla W(|r_i - r_j|, h),$$

which is a discrete form of the Euler equation

$$ \frac{dv}{dt} = -\frac{\nabla P}{\rho}. $$

**Introducing dissipation is non-trivial** (one of the ways is through Brownian motion).

The N-body problem arises in many problems with **long-range** interactions:

- Cosmology (interacting masses)
- Electrostatics (interacting charges)
- Molecular dynamics (more complicated interactions, maybe even 3-4 body terms)
- Particle modelling (**smoothed particle hydrodynamics**)
- Integral equations (**indirectly**, after quadratures; the philosophy is the same)

Mathematically, the idea is as follows. Given a cluster of particles $K$, we approximate its **far field** (i.e. the potential at all points outside a sphere of radius $r > r_K$) using the **total charge** and the **center of mass**.
At $x$, far from $K$, we have

$$V(x) = \sum_{j \in K} \frac{q_j}{\Vert x - x_j \Vert} \approx \frac{Q}{\Vert x - y_{\mathrm{center}} \Vert}, \qquad Q = \sum_{j \in K} q_j, \qquad y_{\mathrm{center}} = \frac{1}{Q} \sum_{j \in K} q_j x_j.$$

To compute the interaction, it is sufficient to replace the whole cluster by its center of mass and total charge!

What about the error of such approximation?
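A quick numerical experiment gives a feel for the error (the cluster size and evaluation distance below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
sources = 0.1 * rng.standard_normal((100, 2))  # a cluster K near the origin
charges = rng.random(100)                       # positive charges
x = np.array([5.0, 0.0])                        # a point far from the cluster

# Exact far-field potential
exact = np.sum(charges / np.linalg.norm(x - sources, axis=1))

# Replace the cluster by its total charge at the center of mass
Q = charges.sum()
y_center = (charges[:, None] * sources).sum(axis=0) / Q
approx = Q / np.linalg.norm(x - y_center)

rel_err = abs(exact - approx) / abs(exact)
# The relative error decays with the ratio (cluster radius / distance)
```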

The points are not separated initially - what to do?

The idea of Barnes and Hut was to split the sources into big blocks using a **cluster tree**.

The construction of a cluster tree is simple and can be done by any clustering algorithm, e.g. **quadtree**, **kd-trees**, **inertial bisection**.

Generally, the particles are put into a **box**, and this box is split into several **children**.

If the box is small and/or contains **few particles**, it is not divided.

The complexity is $\mathcal{O}(N \log N)$ for a naive implementation.
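A naive quadtree construction along these lines might look as follows (the leaf size of 10 particles is an arbitrary threshold):

```python
import numpy as np

def build_quadtree(points, idx, center, half, leaf_size=10):
    # Each node stores its box (center, half-width) and the particle indices in it
    node = {'center': center, 'half': half, 'idx': idx, 'children': []}
    if len(idx) <= leaf_size:
        return node  # few particles: keep the box as a leaf
    for dx in (-1, 1):
        for dy in (-1, 1):
            # Assign each particle to exactly one of the four children
            mask = ((points[idx, 0] >= center[0]) == (dx > 0)) & \
                   ((points[idx, 1] >= center[1]) == (dy > 0))
            if mask.any():
                node['children'].append(
                    build_quadtree(points, idx[mask],
                                   center + 0.5 * half * np.array([dx, dy]),
                                   0.5 * half, leaf_size))
    return node

points = np.random.rand(1000, 2)
tree = build_quadtree(points, np.arange(1000), np.array([0.5, 0.5]), 0.5)
```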

The algorithm is recursive. Let $\mathcal{T}$ be the tree, and let $x$ be the point where we need to compute the potential.

- Set $K$ to the root node
- If $x$ and $K$ are separated (given some parameter, see problem set 2), then set $V(x) = \frac{Q_K}{\Vert x - y_{\mathrm{center}} \Vert}$, where $Q_K$ is the total charge of $K$
- If $x$ and $K$ are not separated, compute $V(x) = \sum_{C \in \mathrm{sons}(K)} V(C, x)$ by recursion

The complexity is $\mathcal{O}(\log N)$ for 1 point!

Another popular choice of the tree is the KD-tree

The construction is simple as well:

Split along the $x$-axis, then the $y$-axis, in such a way that the tree is **balanced** (i.e. the left and right children contain a similar number of points).

The tree is always balanced, but biased towards the coordinate axes.
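A median-split kd-tree can be sketched in a few lines (the leaf size is again an arbitrary threshold):

```python
import numpy as np

def build_kdtree(points, depth=0, leaf_size=10):
    # Alternate the splitting axis; split at the median so that the two
    # children contain (almost) the same number of points -> balanced tree
    if len(points) <= leaf_size:
        return {'points': points}
    axis = depth % points.shape[1]
    order = np.argsort(points[:, axis])
    mid = len(points) // 2
    return {'axis': axis,
            'split': points[order[mid], axis],
            'left': build_kdtree(points[order[:mid]], depth + 1, leaf_size),
            'right': build_kdtree(points[order[mid:]], depth + 1, leaf_size)}

tree = build_kdtree(np.random.rand(1000, 2))
```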

For each node of the tree, we need to compute its total charge and center of mass. If we do it naively (a full loop over the particles for every node), the complexity of the tree construction will be $\mathcal{O}(N^2)$.

However, it is easy to construct it in a smarter way.

- Start from the leaf boxes (which contain only a few particles) and fill them
- Bottom-to-top graph traversal: if we know the total charges and centers of mass of the children, we can cheaply compute them for the parent

Now you can actually code this (the minor things remaining are the **bounding box** and the separation criterion).
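Putting the pieces together, a toy Barnes-Hut evaluator might look as follows. The quadtree, the bottom-up computation of the moments, and the `size / distance < theta` separation criterion are all naive sketches under made-up parameters, not a tuned implementation:

```python
import numpy as np

def build(points, charges, idx, center, half, leaf_size=16):
    # Quadtree node: box (center, half-width), particle indices, children
    node = {'center': center, 'half': half, 'idx': idx, 'children': []}
    if len(idx) > leaf_size:
        for dx in (-1, 1):
            for dy in (-1, 1):
                mask = ((points[idx, 0] >= center[0]) == (dx > 0)) & \
                       ((points[idx, 1] >= center[1]) == (dy > 0))
                if mask.any():
                    node['children'].append(
                        build(points, charges, idx[mask],
                              center + 0.5 * half * np.array([dx, dy]),
                              0.5 * half, leaf_size))
    if node['children']:
        # Bottom-up: the moments of the parent come cheaply from the children
        node['Q'] = sum(c['Q'] for c in node['children'])
        node['com'] = sum(c['Q'] * c['com'] for c in node['children']) / node['Q']
    else:
        node['Q'] = charges[idx].sum()
        node['com'] = (charges[idx, None] * points[idx]).sum(axis=0) / node['Q']
    return node

def potential(node, points, charges, x, theta=0.5):
    d = np.linalg.norm(x - node['com'])
    if 2 * node['half'] < theta * d:   # well separated: use the monopole
        return node['Q'] / d
    if not node['children']:           # leaf: direct summation
        r = np.linalg.norm(points[node['idx']] - x, axis=1)
        return np.sum(charges[node['idx']] / r)
    return sum(potential(c, points, charges, x, theta)
               for c in node['children'])

rng = np.random.default_rng(0)
points = rng.random((2000, 2))
charges = rng.random(2000)
root = build(points, charges, np.arange(2000), np.array([0.5, 0.5]), 0.5)

x = np.array([5.0, 5.0])               # evaluation point outside the box
exact = np.sum(charges / np.linalg.norm(points - x, axis=1))
approx = potential(root, points, charges, x)
```

Smaller `theta` means stricter separation, hence more direct summation and higher accuracy.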

Are there any problems with this scheme? Well, several:

- The logarithmic term (we want $\mathcal{O}(N)$)
- Low accuracy: $\varepsilon = 10^{-2}$ is OK, but if we want $\varepsilon=10^{-5}$, we have to take a larger separation criterion

**Complexity**: to avoid the logarithmic term, we need to store two trees, one for the sources and one for the receivers.

**Accuracy**: instead of the piecewise-constant approximation (we used only the sum of the charges), which is inherent in the BH algorithm, use more accurate representations (multipole expansions).

Note that in the original scheme the sources and receivers are treated differently, while in the "Double Tree" version they are treated in the same way.

**Answer:** use higher-order expansions, i.e. **Taylor (multipole) expansions**!

For the Coulomb interaction $\frac{1}{r}$ we have the **multipole expansion**

$$V(R) = \frac{Q}{R} + \frac{P_{\alpha} R_{\alpha}}{R^3} + \frac{1}{2} \frac{Q_{\alpha \beta} R_{\alpha} R_{\beta}}{R^5} + \dots,$$

where $Q$ is the total charge, $P_{\alpha}$ is the **dipole moment**, and $Q_{\alpha \beta}$ is the **quadrupole moment** (actually, nothing more than a Taylor series expansion).
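As a small illustration with made-up charges (not the full FMM machinery), adding the dipole term to the monopole already improves the far-field accuracy:

```python
import numpy as np

# Three hypothetical charges forming a small cluster near the origin
sources = np.array([[0.3, 0.0, 0.0],
                    [-0.2, 0.1, 0.0],
                    [0.0, -0.1, 0.2]])
q = np.array([1.0, 2.0, -0.5])
R = np.array([10.0, 0.0, 0.0])          # far-away evaluation point
r = np.linalg.norm(R)

exact = np.sum(q / np.linalg.norm(R - sources, axis=1))

Q = q.sum()                             # total charge (monopole moment)
P = (q[:, None] * sources).sum(axis=0)  # dipole moment P_a = sum_j q_j x_{j,a}

monopole = Q / r                        # V ~ Q/R
dipole = monopole + P @ R / r**3        # V ~ Q/R + P_a R_a / R^3

err_mono = abs(exact - monopole)
err_dipole = abs(exact - dipole)        # smaller: the next error term is higher order in 1/R
```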

For cosmology this problem is so important that special hardware, the Gravity Pipe (GRAPE), was built for solving the N-body problem.

**Sidenote:** when you Google for "FMM", you will also encounter the **fast marching method** (e.g. in the scikit-fmm package).

Everyone uses their own **in-house** software, so a good open-source Python implementation is yet to be written.

This is also a perfect exercise in **GPU** programming (you can select it as a term project as well).
