Sampling from an $n$-dimensional distribution with multiple peaks using a standard Metropolis-Hastings algorithm can be difficult, if not impossible, as the Markov chain often gets stuck in one of the modes.
This problem can be avoided by running many (n_chains) Markov chains in parallel for (n_steps) steps. To speed this process up, we do not sample right away from the posterior distribution, but rather from a sequence of intermediate distributions, each similar to the previous one. Once the sampling for all the chains is finished, the algorithm enters a 'transitional stage'.
In this stage the similarity between the intermediate distributions is controlled by a tempering parameter (beta), which is automatically determined from the sampling results (the coefficient of variation, COV) of the previous intermediate distribution. If the COV is high, the cooling is slow, resulting in small steps in beta, and vice versa.
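Conceptually, each intermediate distribution is proportional to prior × likelihood^beta, with beta growing from 0 towards 1. A minimal sketch of how the next beta could be chosen from the COV of the importance weights follows; the helper next_beta and the target COV of 1 are illustrative assumptions, not the exact pymc3 implementation:
import numpy as np
from scipy.optimize import brentq

def next_beta(log_likes, beta, target_cov=1.0):
    # Hypothetical helper: choose the next tempering value such that the
    # importance weights exp((new_beta - beta) * log_like) of the current
    # samples have a coefficient of variation of target_cov.
    def cov_minus_target(new_beta):
        w = np.exp((new_beta - beta) * (log_likes - log_likes.max()))
        return np.std(w) / np.mean(w) - target_cov
    if cov_minus_target(1.0) <= 0:
        return 1.0  # the posterior can be reached in a single step
    return brentq(cov_minus_target, beta, 1.0)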
Based on the sampled parameter distributions, the covariance of the MultivariateProposal is updated, and new seed points for the next round of Markov chains are determined: the end points of the chains with the highest likelihoods are chosen as seeds for the next sampling stage.
The sampling of intermediate distributions is repeated until beta > 1, which means that the posterior distribution has been reached.
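The re-seeding step described above can be sketched as weighted resampling of the chain end points; the helper select_seeds below is illustrative and not the literal pymc3 code:
def select_seeds(end_points, log_likes, n_chains, rng=np.random):
    # Hypothetical helper: resample chain end points (an array of shape
    # (n_chains, n)) with probability proportional to their likelihoods,
    # so high-likelihood points seed several chains of the next stage.
    w = np.exp(log_likes - log_likes.max())
    idx = rng.choice(len(end_points), size=n_chains, p=w / w.sum())
    return end_points[idx]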
In [1]:
import pymc3 as pm
import numpy as np
from pymc3.step_methods import smc
import theano.tensor as tt
from matplotlib import pyplot as plt
from tempfile import mkdtemp
import shutil
%matplotlib inline
test_folder = mkdtemp(prefix='ATMIP_TEST')  # temporary directory for the sampling results
The number of Markov chains and the number of steps each Markov chain samples have to be defined, as well as the tune_interval and the number of processors to be used in the parallel sampling. In this very simple example, using only one processor is faster than forking the interpreter. However, as the computational cost of the model increases, it becomes more efficient to use many processors.
In [2]:
n_chains = 500      # number of parallel Markov chains
n_steps = 100       # steps sampled by each chain per stage
tune_interval = 25  # number of steps between proposal tuning
n_jobs = 1          # number of processors used for sampling
Define the number of dimensions of the multivariate Gaussians, their weights, and the covariance matrix.
In [3]:
n = 4                                   # number of dimensions
mu1 = np.ones(n) * (1. / 2)             # mean of the first mode
mu2 = -mu1                              # mean of the second mode
stdev = 0.1
sigma = np.power(stdev, 2) * np.eye(n)  # shared covariance matrix
isigma = np.linalg.inv(sigma)           # its inverse
dsigma = np.linalg.det(sigma)           # and its determinant
w1 = 0.1                                # weight of the first mode
w2 = (1 - w1)                           # weight of the second mode
The PyMC3 model. Note that we are making a mixture of two Gaussians, where the mode at mu2 carries w2 (90%) of the mass:
In [4]:
def two_gaussians(x):
    log_like1 = - 0.5 * n * tt.log(2 * np.pi) \
                - 0.5 * tt.log(dsigma) \
                - 0.5 * (x - mu1).T.dot(isigma).dot(x - mu1)
    log_like2 = - 0.5 * n * tt.log(2 * np.pi) \
                - 0.5 * tt.log(dsigma) \
                - 0.5 * (x - mu2).T.dot(isigma).dot(x - mu2)
    return tt.log(w1 * tt.exp(log_like1) + w2 * tt.exp(log_like2))
with pm.Model() as ATMIP_test:
    X = pm.Uniform('X',
                   shape=n,
                   lower=-2. * np.ones_like(mu1),
                   upper=2. * np.ones_like(mu1),
                   testval=-1. * np.ones_like(mu1),
                   transform=None)
    like = pm.Deterministic('like', two_gaussians(X))
    llk = pm.Potential('like', like)
Note: In contrast to other pymc3 samplers, here we have to define a Deterministic variable like that contains the model likelihood. The likelihood has to be stored in the sampling traces along with the model parameter samples, in order to determine the coefficient of variation (COV) in each transition stage.
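As a quick sanity check of the model above, the mixture log-density can be evaluated at the two modes; this snippet is illustrative and not part of the original example:
x = tt.vector('x')
logp = two_gaussians(x)
# the mode at mu2 carries 90% of the mass, so its log-density is higher
print(logp.eval({x: mu1}), logp.eval({x: mu2}))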
Now the sampler is initialised based on the previous specifications:
In [5]:
with ATMIP_test:
    step = smc.SMC(
        n_chains=n_chains, tune_interval=tune_interval,
        likelihood_name=ATMIP_test.deterministics[0].name)
Finally, the sampling is executed:
In [6]:
mtrace = smc.ATMIP_sample(
    n_steps=n_steps,
    step=step,
    n_jobs=n_jobs,
    progressbar=False,
    stage='0',
    homepath=test_folder,
    model=ATMIP_test,
    rm_flag=True)
Note: Complex models run for a long time and the sampling might stop for some reason before finishing. In order to restart the sampling at the stage where the sampler stopped, set the stage argument to the right stage number (e.g. stage='4'). The rm_flag determines whether existing results are deleted; there is NO additional warning, so the user should pay attention to this one!
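For instance, a resume call could look like the following sketch, where the stage number '4' is hypothetical and rm_flag=False keeps the results already on disk:
mtrace = smc.ATMIP_sample(
    n_steps=n_steps,
    step=step,
    n_jobs=n_jobs,
    progressbar=False,
    stage='4',  # hypothetical: resume at stage 4
    homepath=test_folder,
    model=ATMIP_test,
    rm_flag=False)  # keep the existing stage results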
Plotting the results using the traceplot:
In [7]:
_ = pm.traceplot(mtrace, combined=True)
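To verify numerically that both modes were found with roughly the right weights, the samples can be inspected directly; a small sketch using the standard trace API:
X_samples = mtrace.get_values('X', combine=True)
# with w1 = 0.1 and w2 = 0.9, the sample mean of each dimension
# should lie close to w1 * mu1 + w2 * mu2 = -0.4
print(X_samples.mean(axis=0))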
Finally, we delete the sampling result folder. This folder may occupy significant disk space (gigabytes), depending on the number of sampling parameters of complex models, so we advise the user to check in advance that there is enough space on the disk.
In [8]:
shutil.rmtree(test_folder)