Quasi Monte-Carlo integration


In [ ]:
# Load libraries
library(ggplot2, quietly=TRUE);
library(randtoolbox, quietly=TRUE);

# Decrease plot sizes
options(repr.plot.width=4, repr.plot.height=4);

Table of contents

  • References
  • Monte Carlo integration (MC)
  • Sampling (QMC)
  • Low-discrepancy sequences
  • Sobol-sequences
  • High-dimensional Sobol-sequences and Brownian bridges

References

Monte Carlo integration

Monte Carlo integration provides a general method of computing multivariate integrals through random sampling. Many problems can be transformed into the computation of the integral of some function $f$ on the d-dimensional unit hypercube $[0,1]^d$:

\begin{equation} \int_{[0,1]^d} \! f(\mathbf{x}) \, \mathrm{d}\mathbf{x}. \quad \mathbf{x} = (x_1, \ldots, x_d) \end{equation}

We can estimate this integral by picking $N$ uniformly distributed points in $[0,1]^d$ by averaging:

\begin{equation} \frac{1}{n}\sum\limits_{i=1}^n\; f(\mathbf{x}_n), \end{equation}

And as we increase N, we increase the precision of our approximation:

\begin{equation} \lim_{n\to\infty}\; \frac{1}{n}\sum\limits_{i=1}^n\; f(\mathbf{x}_n) = \int_{[0,1]^s} \! f(\mathbf{x}) \, \mathrm{d}\mathbf{x}. \end{equation}

If we select the points uniformly from the hypercube, the approximation will converge faster. Thus to improve efficiency, the main problem is how to select the sampling points.

Sampling

We can use either of these approaches to generate points in the unit hypercube:

  • Uniform grid
  • Pseudo-random number generation (PRNG)
  • Quasi-random number generation (QRNG)
  • or a combination (e.g. using many PRNGs for different subintervals, this is called stratified sampling)

Uniform grid

The most basic approach will be to generate a uniform grid:


In [ ]:
w <- 20
ggplot(data = data.frame(x=((0:(w*w-1))%/%w)/w,y=((0:(w*w-1))%%w)/w), aes(x,y))+ geom_point()

The problem is that we with this approach have to select the resolution of the grid in advance. We usually don't know how long to iterate in advance, and often we want to the algorithm to continue until we reach some precision.

Pseudo-random number generation

With pseudo-random number generation we will be able to do this, but the points can clumb together or leave larger unfilled regions. Evaluating $f(\mathbf{x})$ at a point close to another one we have already used, will not give much new information and will thus not improve on our current approximation.


In [ ]:
n <- 400
set.generator("MersenneTwister", initialization="init2002", resolution=53, seed=12345)
mersenne_frame <- data.frame(x=runif(n), y=runif(n))
ggplot(data = mersenne_frame, aes(x,y))+ geom_point()

Quasi-random sequences

In contrast, Quasi-random number generators does not aim to be truly random, they instead generate points in a more structured fashion, filling more and more of unit hypercube. After calculating 400 2D-points of the Torus algorithm (Kronecker sequences), we obtain the following pattern:


In [ ]:
n <- 400
torus_sequence <- torus(n, dim = 2)
ggplot(data = data.frame(x=torus_sequence[,1],y=torus_sequence[,2]), aes(x,y))+ geom_point()

But unlike uniform grid, we can continue to fill in more and more points in-between those already generated:


In [ ]:
n <- 2000
torus_sequence <- torus(n, dim = 2)
ggplot(data = data.frame(x=torus_sequence[,1],y=torus_sequence[,2]), aes(x,y))+ geom_point()

Low-discrepancy sequences

Sequences, such as the one generated by the Torus algorithm above, are called low-discrepancy sequences and the algorithms for generating them are called quasi-random number generators (QRNGs). The basic idea for these algorithms is to generate proportionally equal amounts of samples in each subinterval, thus limiting the number of empty regions and regions with high sample density.

On Wikipedia we find the following definition of discrepancy:

The *discrepancy* of a set $P = \{ x_1, \ldots, x_N\}$ is defined, using Niederreiter's notation, as \begin{equation} D_N(P) = \sup_{B\in J} \left| \frac{A(B;P)}{N} - \lambda_s(B) \right| \end{equation} where $λ_s$ is the $s$-dimensional Lebesgue measure, $A(B;P)$ is the number of points in $P$ that fall into $B$, and $J$ is the set of $s$-dimensional intervals or boxes of the form \begin{equation} \prod_{i=1}^s [a_i, b_i) = \{ \mathbf{x} \in \mathbf{R}^s : a_i \le x_i < b_i \} \, \end{equation} where $0 \le a_i < b_i \le 1$.

This can be understood as finding the region in the point set, with the largest difference between observed size $\left(\frac{A(B;P)}{N}\right)$ and actual size $\left(\lambda_s(B)\right)$.

Sobol sequences

The Sobol-sequence is a method of generating quasi-random integers in a $D$-dimensional unit hypercube.

A very approachable description of Sobol generation algorithms is given by Bradley et al. in their paper "Parallelisation Techniques for Random Number Generators" https://people.maths.ox.ac.uk/gilesm/files/gems_rng.pdf But lookout, they use both 0- and 1-indexing, which also have confused themselves, leading to an indexing error in equation (4), it should say $m_{q_n + 1}$.

We will now build a simple library for generating Sobol-numbers, to illustrate the different algorithms.

A $D$-dimensional Sobol-sequence consists of $D$ one-dimensional Sobol-sequences. Thus we can generate each dimension in parallel, and we only need to describe how to generate one-dimensional sequences.

However, to test our generators, we will generate two-dimensional sequences and use them to calculate an estimate for $\pi$.

Direction vectors

To initialise each dimension, we need a so-called direction vector. We will not describe how these are generated (TODO: maybe we should actually do that at some point). We will use 30 direction numbers for each sequence:

\begin{equation} m_0, m_1, \ldots m_{29} \end{equation}

Each such sequence contain information to generate a Sobol-sequence of 30-bit quasi-random integers.


In [ ]:
sobol_bitwidth <- 30

The integers are later normalised into the hypercube by division with:


In [ ]:
sobol_divisor <- 2^30

We are going to use the following two direction vectors in our $\pi$ example later:


In [ ]:
sobol_directionVector <- 
  rbind(c(536870912,268435456,134217728,67108864,33554432,
          16777216,8388608,4194304,2097152,1048576,524288,
          262144,131072,65536,32768,16384,8192,4096,2048,
          1024,512,256,128,64,32,16,8,4,2,1),
        c(536870912,805306368,671088640,1006632960,570425344,
          855638016,713031680,1069547520,538968064,808452096,
          673710080,1010565120,572653568,858980352,715816960,
          1073725440,536879104,805318656,671098880,1006648320,
          570434048,855651072,713042560,1069563840,538976288,
          808464432,673720360,1010580540,572662306,858993459))

Independent formula

The simplest way of generating a Sobol-number is through the formula:

\begin{equation} y_i = b_0m_0 \oplus b_2m_2 \oplus \ldots \oplus b_{29}m_{29} \end{equation}

Where $y_i$ is the Sobol-number at index $i$, $b_0, b_1, \ldots b_{29}$ is the 30-bit binary representation of the index $i$ and $\oplus$ is the bitwise exclusive or operation.

In the recursive formulation, that we will discuss later, it is required that we instead of the binary representaion $i$, use the binary representation of the gray code of $i$. Gray codes are computed like this:


In [ ]:
grayCode <- function(i) (bitwXor(i, bitwShiftR(i, 1)))

Let $g_0, g_1, \ldots g_{29}$ be the binary representation fo the gray code, now the independent formula is:

\begin{equation} y_n = g_0m_0 \oplus g_2m_2 \oplus \ldots \oplus g_{29}m_{29} \end{equation}

Let us implement this in R. First we need a function for converting an integer $x$ into it's binary representation, $b_0, b_1, \ldots \b_{w}$ where $w$ is the bit width.


In [ ]:
# Returns TRUE if the n'th bit of x is set
testBit <- function(x,n) (bitwAnd(x, bitwShiftL(1, n)) != 0)

# This is similar to the intToBits(x) function in R, but is not
# restricted to 32-bit integers
bits <- function(x, bitwidth) {
     mapply(function (b) (testBit (x, b)), 0:(bitwidth-1))
}

The independent formula for generating Sobol-sequences can then be written as:


In [ ]:
sobol_independent <- function (i, m) {
  g <- grayCode(i)
  gbits <- bits(g, sobol_bitwidth)
  Reduce(bitwXor, m * gbits, 0, right = FALSE, accumulate = FALSE)
}

Let us test it:


In [ ]:
sobol_independent(5,sobol_directionVector[1,])

To generate the complete one-dimensional sequence, we just have to map this formula over every index into the sequence:


In [ ]:
sobol_independent_1D <- function(n, m) (mapply (function (i) (sobol_independent (i, m)), 0:(n-1)))
sobol_independent_1D(10, sobol_directionVector[1,])

Similarly, we can map it over the individual Sobol-vectors:


In [ ]:
sobol_independent_ND <- function (n, ms) {
    apply(ms, 1, function(m) (sobol_independent_1D(n,m)))
}
sobol_independent_ND(10, sobol_directionVector)

Recursive formula

A much more efficient formulation was found by ... TODO

Leap-frogging

To accomodate GPU computation, and even more efficient formulation was discovered by Bradley et al., which allows for coalesced writes.

TODO

High-dimensional Sobol-sequences and Brownian bridge

Sobol sequences with many dimensions expose a different problem, that needs to be tackled. For the first couple of dimensions, Sobol numbers have good space coverage:


In [ ]:
n <- 16000
dimensions = 30
sobol_sequence <- sobol(n, dim = dimensions)
sobol_frame <- data.frame(x=sobol_sequence[,1],y=sobol_sequence[,2])
ggplot(data = sobol_frame[0:4000,], aes(x,y))+ geom_point() + xlab("dimension 1") + ylab("dimension 2")

But on the higher dimensions of the Sobol-sequence, the quality decreases significantly. We can illustrate by showing the 29th and 30th dimension of the same sequence as was plotted above:


In [ ]:
sobol_frame <- data.frame(x=sobol_sequence[,29],y=sobol_sequence[,30])
ggplot(data = sobol_frame[0:4000,], aes(x,y)) + geom_point() + xlab("dimension 29") + ylab("dimension 30")

Later in the sequence, the gaps disappear. Here we plot the last dimensions with 8000 and 16000 points.


In [ ]:
ggplot(data = sobol_frame[0:8000,], aes(x,y)) + geom_point() + xlab("dimension 29") + ylab("dimension 30")
ggplot(data = sobol_frame[0:16000,], aes(x,y)) + geom_point() + xlab("dimension 29") + ylab("dimension 30")

Brownian bridge for path generation w. Sobol-numbers

When Sobol-sequences are used for path-generation, a strategy has been devised to tackle the above problem of little variation in the higher dimensions.

Instead of using each dimension in sequence to generate a path, the lower-dimension (higher quality) dimensions are used to generate the big steps of the sequence, while the latter dimensions are only used to "fill in the holes" (only taking small steps).

This can be done by using the Brownian bridge formula:

...


In [ ]: