Lecture 3: FFT

Previous lecture

  • Basic discretization schemes (Galerkin, collocation, Nystrom)
  • Some approaches to compute singular integrals

Today's lecture

  • Fast methods: computation of the convolution via FFT, idea of precorrected FFT.
  • More integral equation kernels

Integral equations and non-local interactions

The main problem with integral equation methods (we talked about boundary integral equations, but there are also volume integral equations) is that we have to deal with non-local interactions.

After discretization, we have to compute the sum

$$v_i = \sum_{j=1}^N A_{ij} q_j,$$

and the entries $A_{ij}$ are in general not small (i.e., the matrix cannot be approximated by a sparse matrix).

Storage and complexity

  • The naive approach is to compute all matrix elements $A_{ij}$ and store them in a matrix.
  • The storage complexity is $N^2$ elements
  • The LU-complexity is $\mathcal{O}(N^3)$, but we can also go with iterative methods, and have $\mathcal{O}(N_{iter} N^2)$ complexity.

Do we need fast methods?

If we have a two-dimensional PDE, the integral equation will be defined on a curve.

For a second-order discretization, 1000 elements are enough, i.e. the matrix will be $10^{3} \times 10^{3}$ and the storage is just 8 megabytes, so fast methods are not really needed.

For 3D problems and surface integral equations, the storage problem becomes much more complicated:

For $h = 10^{-3}$ we have $10^6$ unknowns and the matrix is $10^6 \times 10^6$, i.e. about 8 terabytes of storage.

Solving the storage problem

One of the ways to deal with the storage problem is to solve the system using only matrix-by-vector products.

Apply an iterative method and compute

$$v_i = \sum_{j=1}^N A_{ij} y_j $$

on the fly.

The complexity of each matrix-by-vector product, however, is still $\mathcal{O}(N^2)$.

Moreover, the evaluation of matrix elements can be expensive, especially if Galerkin/collocation methods are used.
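
A minimal sketch of such a matrix-free iterative solve with SciPy is shown below; the kernel $1/\Vert x - y \Vert$, the random point set and the crude treatment of the singular self-term are illustrative assumptions, not the actual discretization discussed above.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

N = 1000
pts = np.random.rand(N, 3)                    # assumed "panel centers"

def matvec(q):
    # O(N^2) work per product: each row of A is recomputed when it is needed
    v = np.zeros(N)
    for i in range(N):
        r = np.linalg.norm(pts[i] - pts, axis=1)
        r[i] = np.inf                         # crude removal of the singular self-term
        v[i] = np.sum(q / r)
    return v

A = LinearOperator((N, N), matvec=matvec)     # the N x N matrix is never stored
q, info = gmres(A, np.ones(N))
```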

Translation-invariant case

One of the ways to solve this problem is to use the translation invariance of many popular integral equation kernels.

This, however, requires special properties of the geometry and of the basis functions.

Boundary integral equations with translation-invariant kernels

The first kind integral equation reads

$$\int_{\partial \Omega} \frac{q(y)}{\Vert x - y \Vert} dy = f(x), \quad x \in \partial \Omega,$$

where $\Omega$ is a certain domain in 3D.

Model problem

Consider the integral equation on a square:

$$\int_{\Omega} \frac{q(y)}{\Vert x - y \Vert} dy = f(x), $$

Let us discretize using shifts of the same basis functions,

$$q(y) = \sum_{ij} \phi_{ij} q_{ij}, $$

where $$\phi_{ij}(y_1, y_2) = \phi( y_1 - h i, y_2 - h j).$$

What would be the structure of the matrix?

Block Toeplitz with Toeplitz blocks

The matrix will have two-level block Toeplitz with Toeplitz blocks (BTTB) structure,

$$A_{ij} = A(i_1 i_2, j_1 j_2) = A(i_1 - j_1, i_2 - j_2),$$

i.e. it is defined by $(2 n - 1) \times (2n - 1) \approx 4 n^2 = 4 N$ parameters.

BTTB matrix by vector multiplication

Recall from numerical linear algebra that a BTTB matrix can be multiplied by a vector in $\mathcal{O}(N \log N)$ operations.

The idea is a generalization of the idea for a one-level Toeplitz matrix: any $N \times N$ Toeplitz matrix can be embedded into a $2N \times 2N$ circulant matrix.

Circulant matrices and spectral theorem

A matrix is called circulant if

$$A_{ij} = A( (i -j)\mod N),$$

i.e. the entries wrap around periodically.

Spectral theorem

Any circulant matrix can be represented as

$$C = F D F^*,$$

where $$ D = \mathrm{diag}(d), \quad d = F c, $$

where $c$ is the first column of $C$ and $F$ is the DFT matrix,

$$ F_{kl} = w^{kl}, \quad w = \exp\left( \frac{2 \pi i}{N}\right). $$
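
A quick numerical check of this fact (NumPy's FFT realizes multiplication by the DFT matrix, up to its own sign and normalization conventions):

```python
import numpy as np
from scipy.linalg import circulant

n = 8
c = np.random.rand(n)                         # first column of C
x = np.random.rand(n)

# circulant matvec = circular convolution of c and x = IFFT(FFT(c) * FFT(x))
y = np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real
print(np.allclose(circulant(c) @ x, y))       # True
```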

Fast Toeplitz matrix-by-vector product

  • Embed a Toeplitz matrix $T$ into a circulant matrix $C$
  • Pad the vector $q$ by zeros for matching dimensions
  • Use the fact that $$C \begin{bmatrix} q \\ 0 \end{bmatrix} = \begin{bmatrix} Tq \\ * \end{bmatrix}.$$
  • Compute circulant matrix-by-vector product using two FFTs and multiplication by diagonal matrix.
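
A minimal NumPy sketch of these steps for a symmetric Toeplitz matrix; the first column used below is just an illustrative kernel:

```python
import numpy as np
from scipy.linalg import toeplitz

N = 16
t = 1.0 / (1.0 + np.arange(N))                # first column = first row (symmetric case)
q = np.random.rand(N)

# 1. first column of the 2N x 2N circulant embedding (the extra entry is arbitrary)
c = np.concatenate([t, [0.0], t[:0:-1]])
# 2. pad q with zeros to length 2N
q_pad = np.concatenate([q, np.zeros(N)])
# 3.-4. circulant matvec via FFTs; the first N entries give T q
v = np.fft.ifft(np.fft.fft(c) * np.fft.fft(q_pad)).real[:N]

print(np.allclose(v, toeplitz(t) @ q))        # True
```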

Two-level (BTTB) case

For the two-level model problem on a square, the matrix is not Toeplitz, but block Toeplitz with Toeplitz blocks.

It can be embedded into a block circulant matrix with circulant blocks, which can be diagonalized in $\mathcal{O}(N \log N)$ operations using the two-dimensional FFT:

$$F_2 = F \otimes F.$$
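
A sketch of the two-level case with NumPy; the smooth translation-invariant kernel $a(k_1, k_2)$ below is only an illustration:

```python
import numpy as np

n = 20
k = np.arange(-(n - 1), n)                    # all shifts -(n-1), ..., n-1
K1, K2 = np.meshgrid(k, k, indexing="ij")
a = 1.0 / (1.0 + K1**2 + K2**2)               # a(i_1 - j_1, i_2 - j_2), (2n-1) x (2n-1) values

q = np.random.rand(n, n)                      # charges on the n x n grid

# embed into a block circulant with circulant blocks: put a into "wrapped" (FFT) order,
# zero-pad q, and perform a circular 2D convolution with FFTs
pad = np.zeros((2 * n - 1, 2 * n - 1))
pad[:n, :n] = q
v = np.fft.ifft2(np.fft.fft2(np.fft.ifftshift(a)) * np.fft.fft2(pad)).real[:n, :n]

# O(N^2) check against the dense BTTB matrix (feasible for small n only)
i1, i2 = [idx.ravel() for idx in np.meshgrid(np.arange(n), np.arange(n), indexing="ij")]
A = 1.0 / (1.0 + (i1[:, None] - i1[None, :]) ** 2 + (i2[:, None] - i2[None, :]) ** 2)
print(np.allclose(v.ravel(), A @ q.ravel()))  # True
```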

Putting it all together

For a translation-invariant case with a square uniform grid we have a matrix with BTTB structure, which can be:

  • Stored with $\mathcal{O}(N)$ memory
  • Multiplied with $\mathcal{O}(N \log N)$ complexity

But if the domain is not a square (e.g. a sphere or a more complicated domain), exploiting the Toeplitz structure becomes very difficult.

What to do? This is where the precorrected FFT comes in: the 2D surface structure is embedded into a regular 3D grid.

Precorrected FFT

Suppose we are given a set of conducting surfaces, each surface is discretized into panels, and piecewise-constant basis functions are used.

We solve our favourite first kind integral equation

$$\psi(x) = \int_{\text{surfaces}} \frac{q(y)}{\Vert x - y \Vert} dy.$$

Embedding the panels into a parallelepiped

Consider a 3D parallelepiped split into a $k \times l \times m$ array of small cubes in such a way that each cube contains only a small number of panels.

The key idea is that, for all evaluation points sufficiently distant from a particular cell, the potential due to the panel charges in that cell can be represented as a weighted sum of point charges located on a uniform grid.

pFFT approach

  1. Project the panel charges onto a uniform grid of point charges
  2. Compute the grid potentials due to grid charges using an FFT
  3. Interpolate the grid potentials onto the panels
  4. Directly compute nearby interactions

The name "precorrected" comes from the exact evaluation of close interactions.

Step 1: projection

In a cell, we want to match the potential created by the $m$ panels it contains with the potential of a uniform $p \times p \times p$ grid of point charges. To do so, we select a test surface of radius $r_c$ around the cell, pick test points $z_i$, $i=1, \ldots, N_{test}$, and require that

$$\psi_{uniform}(z_i) \approx \psi_{panels}(z_i), \quad i=1,\ldots,N_{test}$$

on the test points.

This gives $p^3$ unknowns and $N_{test}$ equations, which are solved in the least-squares sense.
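
A hedged sketch of the projection step for a single cell; the panel quadrature points and weights, the grid size $p$, the test radius $r_c$ and the number of test points are all illustrative assumptions:

```python
import numpy as np

p, r_c, n_test = 3, 1.5, 50
rng = np.random.default_rng(0)

# uniform p x p x p grid of point-charge locations inside the (unit) cell
g = np.linspace(-0.5, 0.5, p)
grid = np.array(np.meshgrid(g, g, g, indexing="ij")).reshape(3, -1).T    # (p^3, 3)

# a "panel" represented by a few quadrature points and weights (assumed)
panel_pts = rng.uniform(-0.3, 0.3, size=(4, 3))
panel_w = np.full(4, 0.25)

# test points z_i on a sphere of radius r_c around the cell
z = rng.normal(size=(n_test, 3))
z *= r_c / np.linalg.norm(z, axis=1, keepdims=True)

kernel = lambda x, y: 1.0 / np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)

# match psi_uniform(z_i) ~ psi_panels(z_i) in the least-squares sense
A = kernel(z, grid)                           # (n_test, p^3) potentials of unit grid charges
rhs = kernel(z, panel_pts) @ panel_w          # panel potential at the test points
q_grid, *_ = np.linalg.lstsq(A, rhs, rcond=None)
```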

Computing grid potentials

Once the uniform grid charges have been computed, we evaluate

$$V(i, j, k) = \sum_{i'j'k'} H(i-i', j-j',k-k') \widehat{q}(i', j', k'),$$

using the 3D FFT.

The main benefit is that highly optimized FFT implementations (including parallel versions) are readily available.
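
A sketch of this step with NumPy; the grid dimensions, the spacing and the choice $H = 1/r$ with the singular self-term simply set to zero are assumptions for illustration (the nearby values are fixed later by the precorrection anyway):

```python
import numpy as np

k, l, m = 16, 16, 8                           # number of cells in each direction (assumed)
h = 1.0                                       # grid spacing (assumed)
q_hat = np.random.rand(k, l, m)               # grid charges from the projection step

# kernel values for all shifts, stored directly in "wrapped" (FFT) order
def shifts(n):
    return np.fft.fftfreq(2 * n, d=1.0 / (2 * n))    # 0, 1, ..., n-1, -n, ..., -1

K1, K2, K3 = np.meshgrid(shifts(k), shifts(l), shifts(m), indexing="ij")
R = h * np.sqrt(K1**2 + K2**2 + K3**2)
H = np.zeros_like(R)
H[R > 0] = 1.0 / R[R > 0]                     # 1/r kernel; self-term left as zero here

# zero-pad the charges and compute the aperiodic convolution via the 3D FFT
pad = np.zeros((2 * k, 2 * l, 2 * m))
pad[:k, :l, :m] = q_hat
V = np.fft.ifftn(np.fft.fftn(H) * np.fft.fftn(pad)).real[:k, :l, :m]
```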

Interpolating grid potentials

To get the value at an evaluation point, it is sufficient to apply the transpose of the matrix that maps the charges from the non-uniform (panel) locations onto the uniform grid.

The projection and interpolation steps have similar accuracy.

For collocation, the result can be written as

$$\psi_{c} = V^{\top} H W q, $$

and for the Galerkin method:

$$\psi_{G} = W^{\top} H W q,$$

i.e. pFFT preserves the symmetry of the Galerkin method.

Final step: precorrecting

The main difficulty is that the projection-FFT-interpolation step does not accurately represent the nearby interactions.

Let $(k, l)$ be a pair of nearby cells.

Then, the grid approximation of this interaction is

$$\psi_{c (k, l)} = V^{\top}(k) H(k, l) W(l) q_l.$$

Subtracting this inaccurate contribution and adding the exact, directly computed one $P_{k, l} q_l$, we have

$$\psi_{(k, l)} = \psi_{c (k, l)} - V^{\top}(k) H(k, l) W(l) q_l + P_{k, l} q_l.$$

Precorrected grid operator

This can be achieved efficiently by introducing the precorrected operator

$$\widehat{P}_{k,l} = P_{k, l} - V^{\top}(k) H(k, l) W(l).$$

Final algorithm

  • Precomputation: for all pairs of close cells $(k, l)$, compute the sparse precorrected matrix $\widehat{P}$.
  • $\psi = \widehat{P} q + V^{\top} H W q$: a sparse matrix-by-vector product plus a product of three structured matrices, namely sparse $V$ and $W$ and a three-level Toeplitz matrix $H$ applied via the 3D FFT.
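
A conceptual sketch of this final matrix-by-vector product (Galerkin variant $W^{\top} H W$); the operators `W`, `P_hat` and `grid_convolve` are hypothetical placeholders for the sparse projection/interpolation matrix, the sparse precorrected near-field matrix and the FFT convolution of step 2:

```python
# W and P_hat are assumed to be (scipy.sparse) matrices, grid_convolve is the
# FFT-based convolution from step 2; all of them are hypothetical placeholders.
def pfft_matvec(q, W, P_hat, grid_convolve, grid_shape):
    q_grid = (W @ q).reshape(grid_shape)      # 1. project panel charges onto the grid
    V = grid_convolve(q_grid)                 # 2. grid potentials via the 3D FFT
    psi_far = W.T @ V.ravel()                 # 3. interpolate grid potentials back to panels
    return psi_far + P_hat @ q                # 4. add precorrected near-field interactions

# Wrapped as a scipy.sparse.linalg.LinearOperator, this matvec plugs directly
# into an iterative solver such as GMRES.
```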

More integral equation kernels

The simplicity of pFFT means that it can be applied very easily to other integral equation kernels, with similar efficiency and little additional programming.

A disadvantage is that for simple surfaces many of the grid charges will be zero, i.e. the representation can be redundant.

Finally, let us present some other integral equation kernels that are often used.

Stokes problem

The Stokes equations describe laminar flow in the low-Reynolds-number regime:

$$\begin{split} -\nabla p + \mu \Delta v &= -F \delta(r), \\ \nabla \cdot v &= 0. \end{split}$$

Here $p$ is the pressure and $v$ is the velocity.

The solution is the so-called Stokeslet, and it is a matrix (tensor) kernel:

$$\mathbb{J}_{ij}(\mathbf{r}) = \frac{1}{8 \pi \mu} \left( \frac{\delta_{ij}}{r} + \frac{r_i r_j}{r^3} \right), \quad v = \mathbb{J} F, \quad p = \frac{(F, \mathbf{r})}{4 \pi r^3}.$$
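
A small sketch that evaluates this kernel numerically; the evaluation points, the force and $\mu = 1$ are arbitrary choices for illustration:

```python
import numpy as np

def stokeslet_velocity(r, F, mu=1.0):
    """Velocity at points r (shape (n, 3)) induced by a point force F at the origin."""
    d = np.linalg.norm(r, axis=1)             # distances |r|
    J = (np.eye(3)[None] / d[:, None, None]
         + r[:, :, None] * r[:, None, :] / d[:, None, None] ** 3) / (8 * np.pi * mu)
    return np.einsum("nij,j->ni", J, F)       # v = J F at every point

r = np.random.rand(5, 3) + 1.0                # arbitrary points away from the origin
print(stokeslet_velocity(r, np.array([1.0, 0.0, 0.0])))
```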

Time-dependent Maxwell equations

$$ \begin{split} \nabla \times H = \frac{\partial D}{\partial t} + J, \quad \nabla \times E = -\frac{\partial B}{\partial t}, \\ \nabla \cdot D = \rho, \quad \nabla \cdot B = 0, \\ D = \varepsilon E, \quad B = \mu H. \end{split} $$

If we consider the time-harmonic case, i.e. the time dependence is $e^{i \omega t}$, we get the time-harmonic Maxwell equations.

However, for the Maxwell equations the most natural setting is scattering from inhomogeneous media, and volume integral (or volume integro-differential) equations are used instead.

Demo

We will illustrate how the BEM++ package works; it is quite a general package for solving boundary integral equations (unfortunately, installation can be painful).

Interior Laplace

Maxwell scattering

Summary

  • Precorrected FFT
  • Demo of BEM++

Next lecture

  • Elaborate on the idea of close/far interactions once more, in the Barnes-Hut method
