The discretization of the integral equation leads to dense matrices.
The main question is how to compute the matrix-by-vector product,
i.e. the summation of the form:
$$\sum_{j=1}^M A_{ij} q_j = V_i, \quad i = 1, \ldots, N.$$ The matrix $A$ is dense, i.e. none of its elements can be neglected. The complexity of a direct evaluation is $\mathcal{O}(N^2)$ (assuming $M \sim N$).
Can we make it faster?
The N-Body problem is the problem of computing interactions between $N$ bodies.
The model $N$-body problem reads: $$V(y_i) = \sum_{j=1}^M \frac{q_j}{\Vert y_i - x_j \Vert}, \quad i=1,\dots,N,$$ i.e. given charges $q_j$ located at "sources" $x_j$, find the potentials at the "receivers" $y_i$.
This summation appears in many applications and is called the N-body problem.
There is no problem with memory, since there are only two nested loops (for each receiver we sum over all sources).
In [1]:
import numpy as np
import math
from numba import jit
@jit
def compute_nbody_direct(N, x, y, charges, res):
    # direct O(N^2) summation: potential at receiver y[i] from all sources x[j]
    for i in range(N):
        res[i] = 0.0
        for j in range(N):
            dist = (y[i, 0] - x[j, 0]) ** 2 + (y[i, 1] - x[j, 1]) ** 2
            dist = math.sqrt(dist)
            res[i] += charges[j] / dist
In [2]:
N = 1000
x = np.random.randn(N, 2);
y = np.random.randn(N, 2);
charges = np.ones(N)
res = np.zeros(N)
print('N = {}'.format(N))
%timeit compute_nbody_direct(N, x, y, charges, res)
N = 10000
x = np.random.randn(N, 2);
y = np.random.randn(N, 2);
charges = np.ones(N)
res = np.zeros(N)
print('N = {}'.format(N))
%timeit compute_nbody_direct(N, x, y, charges, res)
$N^2$ scaling!
One of the most famous N-body computations is the Millennium run
In [4]:
from IPython.display import YouTubeVideo
YouTubeVideo('PrIk6dKcdoU') #UC5pDPY5Nz4
Out[4]:
Particle systems can be used to model a lot of things.
For nice examples, see the website of Ron Fedkiw.
One of the application areas is so-called smoothed particle hydrodynamics (SPH), where the fluid is modelled by a system of interacting particles.
In [5]:
from IPython.display import YouTubeVideo
YouTubeVideo('6bdIHFTfTdU')
Out[5]:
The density is represented as a kernel sum (see Smoothed Particle Hydrodynamics: Things I wish my mother taught me):
$$\rho(r) = \sum_{j=1}^{N_{\mathrm{neigh}}} m_j W( |r - r_j|, h).$$ The simplest smoothing kernel is a 3D Gaussian, $W(x, h) = e^{-\frac{x^2}{h^2}}$.
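As a toy illustration of the kernel sum (a sketch with hypothetical data: 2D points, unit masses, and the unnormalized Gaussian kernel above):

```python
import numpy as np

def sph_density(r, positions, masses, h):
    """Kernel-sum density estimate at point r: sum_j m_j W(|r - r_j|, h),
    with the (unnormalized) Gaussian kernel W(x, h) = exp(-x**2 / h**2)."""
    d = np.linalg.norm(positions - r, axis=1)      # distances |r - r_j|
    return np.sum(masses * np.exp(-(d / h) ** 2))

# hypothetical toy data: 100 unit-mass particles in the unit square
rng = np.random.default_rng(0)
pos = rng.random((100, 2))
rho = sph_density(np.array([0.5, 0.5]), pos, np.ones(100), h=0.1)
```

A larger smoothing length $h$ lets more neighbours contribute, so the (unnormalized) density estimate grows with $h$.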
From this representation and the first law of thermodynamics, we get the following system of ordinary differential equations (ODEs):
$$\frac{dv_i}{dt} = -\sum_j \left(\frac{P_i}{\rho^2_i} + \frac{P_j}{\rho^2_j}\right) \nabla W(|r_i - r_j|, h),$$ which is a discrete form of the Euler equation
$$ \frac{dv}{dt} = -\frac{\nabla P}{\rho}. $$ Introducing dissipation is non-trivial (one of the ways is through Brownian motion).
The N-body problem arises in many different settings with long-range interactions.
Mathematically, the idea is as follows. Given a cluster of particles $K$ we approximate its far field (i.e. for all points outside a sphere of radius $r > r_K$) using the total charge and center of mass. At $x$, far from $K$ we have
$$V(x) = \sum_{j \in K} q_j F(x, y_j) \approx Q F(x, y_C),$$ where $$Q = \sum_{j \in K} q_j, \quad y_C = \frac{1}{Q} \sum_{j \in K} q_j y_j.$$ To compute the interaction, it is sufficient to replace the whole cluster by its total charge placed at the center of charge!
What about the error of such approximation?
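We can probe the error numerically. A minimal sketch with hypothetical random data: a small cluster of positive charges, evaluated at a far-away point (for positive charges the dipole term vanishes when expanding about the center of charge, so the error is quadratic in the separation ratio $r_K / R$):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.random((50, 2)) * 0.1            # cluster of sources near the origin
q = rng.random(50)                       # positive charges
Q = q.sum()                              # total charge
y_c = (q[:, None] * y).sum(axis=0) / Q   # center of charge

x = np.array([10.0, 0.0])                # a point far from the cluster
exact = np.sum(q / np.linalg.norm(x - y, axis=1))
approx = Q / np.linalg.norm(x - y_c)     # monopole approximation
rel_err = abs(exact - approx) / exact    # tiny, since x is well-separated
```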
The points are not separated initially - what to do?
The idea of Barnes and Hut was to split the sources into big blocks using a cluster tree.
The construction of a cluster tree is simple, and can be done by any clustering algorithm, e.g. quadtrees, kd-trees, or inertial bisection.
Generally, the particles are put into a box, and this box is split into several children.
If the box is small and/or contains few particles, it is not divided.
The complexity is $\mathcal{O}(N \log N)$ for a naive implementation.
The algorithm is recursive. Let $\mathcal{T}$ be the tree, and let $x$ be the point where we need to compute the potential.
The complexity is $\mathcal{O}(\log N)$ for 1 point!
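Putting the pieces together, here is a self-contained sketch of the Barnes–Hut scheme: a quadtree build plus the recursive evaluation with the usual size/distance opening criterion $\theta$. All data and parameter values are hypothetical, and the implementation is a minimal illustration, not production code:

```python
import numpy as np

def build(points, charges, center, size, leaf=8):
    """Quadtree node with total charge Q and center of charge y_c stored."""
    Q = charges.sum()
    y_c = (charges[:, None] * points).sum(axis=0) / Q
    node = {"Q": Q, "y_c": y_c, "size": size,
            "children": None, "points": points, "charges": charges}
    if len(points) > leaf:                        # split boxes with many points
        node["children"] = []
        quad = (points[:, 0] > center[0]).astype(int) * 2 \
             + (points[:, 1] > center[1]).astype(int)
        for k in range(4):
            m = quad == k
            if m.any():
                off = np.array([size / 4 if k >= 2 else -size / 4,
                                size / 4 if k % 2 == 1 else -size / 4])
                node["children"].append(
                    build(points[m], charges[m], center + off, size / 2, leaf))
        node["points"] = node["charges"] = None
    return node

def potential(node, x, theta=0.5):
    """If the box looks small from x (size/distance < theta), use the
    monopole Q/|x - y_c|; otherwise open the box and recurse."""
    if node["children"] is None:                  # leaf: direct summation
        r = np.linalg.norm(x - node["points"], axis=1)
        return np.sum(node["charges"] / r)
    d = np.linalg.norm(x - node["y_c"])
    if node["size"] / d < theta:
        return node["Q"] / d
    return sum(potential(c, x, theta) for c in node["children"])

# hypothetical data: 500 unit charges in the unit square, receiver outside
rng = np.random.default_rng(3)
src = rng.random((500, 2))
q = np.ones(500)
tree = build(src, q, np.array([0.5, 0.5]), 1.0)
x = np.array([3.0, 3.0])
approx = potential(tree, x)
exact = np.sum(1.0 / np.linalg.norm(x - src, axis=1))
rel_err = abs(approx - exact) / exact
```

Note that with $\theta = 0$ every box is opened and the scheme degenerates to the exact direct sum, so $\theta$ trades accuracy for speed.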
Another popular choice of the tree is the KD-tree
The construction is simple as well:
Split along the $x$-axis, then along the $y$-axis, in such a way that the tree is balanced (i.e. the left and right children contain a similar number of points).
The tree is always balanced, but it is biased towards the coordinate axes.
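A minimal median-split construction (a sketch on hypothetical random points; `leaf` is an illustrative leaf-capacity parameter):

```python
import numpy as np

def build_kdtree(points, depth=0, leaf=16):
    """Median split along alternating axes; each child gets half of the
    points, so the tree is balanced by construction."""
    if len(points) <= leaf:
        return {"points": points}                 # leaf node
    axis = depth % 2                              # alternate x-axis / y-axis
    order = np.argsort(points[:, axis])
    mid = len(points) // 2
    return {"axis": axis,
            "split": points[order[mid], axis],
            "left": build_kdtree(points[order[:mid]], depth + 1, leaf),
            "right": build_kdtree(points[order[mid:]], depth + 1, leaf)}

def tree_depth(node):
    if "points" in node:
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

def count_points(node):
    if "points" in node:
        return len(node["points"])
    return count_points(node["left"]) + count_points(node["right"])

pts = np.random.default_rng(4).random((1024, 2))
tree = build_kdtree(pts)
```

With 1024 points and a leaf capacity of 16, the balanced tree has depth $\log_2(1024/16) = 6$.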
For each node of the tree, we need to compute its total mass and its center of mass. If we do this naively (a loop over all particles for every node), the complexity of the tree construction will be $\mathcal{O}(N^2)$.
However, it is easy to do it in a smarter way: compute the masses and centers bottom-up, combining the values already computed for the children.
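The smarter way is a post-order pass: each node combines the masses and centers of its children in $\mathcal{O}(1)$, so the whole pass is linear in the number of nodes. A sketch on a hypothetical hand-built two-leaf tree:

```python
import numpy as np

def aggregate(node):
    """Post-order pass: a node's total mass and center of mass are
    combined from its children, never by re-scanning all particles."""
    if "children" not in node:                    # leaf: direct computation
        m = node["masses"].sum()
        c = (node["masses"][:, None] * node["points"]).sum(axis=0) / m
    else:                                         # internal: combine children
        parts = [aggregate(child) for child in node["children"]]
        m = sum(pm for pm, _ in parts)
        c = sum(pm * pc for pm, pc in parts) / m
    node["mass"], node["com"] = m, c
    return m, c

# hypothetical two-leaf tree: four unit-mass particles at the square corners
leaf1 = {"masses": np.ones(2), "points": np.array([[0., 0.], [1., 0.]])}
leaf2 = {"masses": np.ones(2), "points": np.array([[0., 1.], [1., 1.]])}
root = {"children": [leaf1, leaf2]}
M, com = aggregate(root)                          # total mass 4, center (0.5, 0.5)
```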
Now you can actually code this (minor things remaining are the bounding box and separation criteria).
Are there any problems with this scheme? Well, several.
Note that in the original scheme the sources and receivers are treated differently. In "Double Tree" they are treated in the same way.
Answer: use higher-order Taylor expansions!
$$f(x + \delta x, y + \delta y) \approx \sum_{k, l=0}^p \frac{1}{k!\, l!} \left(D_x^{k} D_y^{l} f\right)(x, y)\, \delta x^k \delta y^l + \ldots $$ For the Coulomb interaction $\frac{1}{r}$ we have the multipole expansion
$$ v(R) = \frac{Q}{R} + \frac{1}{R^3} \sum_{\alpha} P_{\alpha} R_{\alpha} + \frac{1}{6R^5} \sum_{\alpha, \beta} Q_{\alpha \beta} (3R_{\alpha} R_{\beta} - \delta_{\alpha \beta}R^2) + \ldots, $$ where $P_{\alpha}$ is the dipole moment and $Q_{\alpha \beta}$ is the quadrupole moment (actually, nothing more than a Taylor series expansion).
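We can sanity-check the first two terms of the expansion numerically (a sketch with hypothetical random data; $P$ here is the dipole moment about the origin, and adding the dipole term should shrink the error of the bare monopole):

```python
import numpy as np

rng = np.random.default_rng(5)
y = rng.random((100, 3)) * 0.2            # sources in a small box near the origin
q = rng.random(100)                       # positive charges

R = np.array([5.0, 0.0, 0.0])             # far-away evaluation point
Rn = np.linalg.norm(R)

exact = np.sum(q / np.linalg.norm(R - y, axis=1))
Q = q.sum()                               # monopole: total charge
P = (q[:, None] * y).sum(axis=0)          # dipole moment about the origin

err_mono = abs(exact - Q / Rn)
err_dip = abs(exact - (Q / Rn + P @ R / Rn**3))
```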
For cosmology this problem is so important that special-purpose hardware, the GRAPE (Gravity Pipe), was built for solving the N-body problem.
Side note: when you Google for "FMM", you will also encounter the fast marching method (implemented, e.g., in scikit-fmm), which is a different algorithm.
Everyone uses their own in-house software, so a good open-source Python package is yet to be written.
This is also a perfect test case for GPU programming (and can be selected as a term project as well).
In [2]:
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
css_styling()
Out[2]: