Inference for data with autoregressive-moving-average (ARMA) errors

A standard assumption in time series analysis is the presence of IID (independent and identically distributed) Gaussian noise, which is supported in Pints using the pints.GaussianLogLikelihood class. However, there are also more sophisticated noise models in which each error term depends both on a random innovation and on the values of previous error terms. Two standard time series models that can be used as the error model in Pints are the autoregressive (AR) and autoregressive-moving-average (ARMA) models.

This example shows you how to perform Bayesian inference for a time series with AR(1) or ARMA(1,1) noise. It follows on from the first sampling example. Pints also includes some diagnostic plots which can help identify whether or not a correlated noise process such as AR(1) is appropriate, which are explained in another example notebook.

Generating data with AR(1) errors

Under the AR(1) (autoregressive of order 1) noise model, the observed data $y(t)$ follows

$$y(t) = f(t; \theta) + \epsilon(t).$$

The error terms $\epsilon(t)$ depend on a random white noise term $\nu(t)$ as well as the value of the error term at the previous time step:

$$\epsilon(t) = \rho \epsilon(t-1) + \nu(t).$$

The white noise term follows $\nu(t) \overset{i.i.d.}{\sim} \mathcal{N}(0, \sigma \sqrt{1 - \rho^2})$, where the second argument denotes the standard deviation. This scaling is chosen so that the marginal distribution of $\epsilon$ is

$$\epsilon\sim\mathcal{N}(0, \sigma).$$
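To make the recursion concrete, the sketch below shows one way AR(1) errors with this parameterisation could be drawn by hand. It is illustrative only: the helper name ar1_errors is an assumption for this example, and the code cells below use the built-in pints.noise.ar1 instead.

import numpy as np

def ar1_errors(rho, sigma, n, rng=None):
    # Illustrative AR(1) error generator (not the Pints implementation).
    # sigma is the marginal standard deviation of epsilon; the white noise
    # standard deviation is sigma * sqrt(1 - rho^2), matching the equations above.
    rng = np.random.default_rng() if rng is None else rng
    nu_std = sigma * np.sqrt(1 - rho ** 2)
    eps = np.zeros(n)
    eps[0] = rng.normal(0, sigma)  # first error drawn from the marginal distribution
    for t in range(1, n):
        eps[t] = rho * eps[t - 1] + rng.normal(0, nu_std)
    return eps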

Pints supports the AR(1) noise model using the pints.AR1LogLikelihood class. In the sections of code below, we generate a time series with AR(1) noise, and then perform Bayesian inference using the incorrect IID likelihood and the correct AR(1) likelihood.


In [1]:
import pints
import pints.toy as toy
import pints.plot
import numpy as np
import matplotlib.pyplot as plt

# Load a forward model
model = toy.LogisticModel()

# Create some toy data
real_parameters = [0.015, 500]
times = np.linspace(0, 1000, 100)
org_values = model.simulate(real_parameters, times)

# Add noise
noise = 40
rho = 0.9
errors = pints.noise.ar1(rho, noise, len(org_values))
values = org_values + errors

# Show the noisy data
plt.figure()
plt.plot(times, org_values)
plt.plot(times, values)
plt.xlabel('time')
plt.ylabel('y')
plt.legend(['true', 'observed'])
plt.show()


Fit model with independent errors to AR(1) data

This likelihood is not a correct representation of the noise process, so the posteriors will understate the true uncertainty.


In [2]:
# Get properties of the noise sample
noise_sample_mean = np.mean(values - org_values)
noise_sample_std = np.std(values - org_values)

# Create an object with links to the model and time series
problem = pints.SingleOutputProblem(model, times, values)

# Create a log-likelihood function (adds an extra parameter!)
log_likelihood = pints.GaussianLogLikelihood(problem)

# Create a uniform prior over both the parameters and the new noise variable
log_prior = pints.UniformLogPrior(
    [0.01, 400, noise * 0.1],
    [0.02, 600, noise * 100],
)

# Create a posterior log-likelihood (log(likelihood * prior))
log_posterior = pints.LogPosterior(log_likelihood, log_prior)

# Choose starting points for 6 MCMC chains
real_parameters1 = np.array(real_parameters + [noise])
xs = [
    real_parameters1 * 1.05,
    real_parameters1 * 1,
    real_parameters1 * 1.1,
    real_parameters1 * 0.9,
    real_parameters1 * 0.8,
    real_parameters1 * 0.85,
]

# Create mcmc routine
mcmc = pints.MCMCController(log_posterior, 6, xs, method=pints.DifferentialEvolutionMCMC)

# Add stopping criterion
mcmc.set_max_iterations(8000)

# Disable logging
mcmc.set_log_to_screen(False)

# Set number of iterations between which gamma=1 (for a single iteration)
mcmc.sampler().set_gamma_switch_rate(2000)

# Set to uniform error process rather than Gaussian
mcmc.sampler().set_gaussian_error(False)

# Run!
print('Running...')
chains = mcmc.run()
print('Done!')

# Check convergence and other properties of chains
results = pints.MCMCSummary(chains=chains, time=mcmc.time(), parameter_names=['growth rate', 'capacity', 'sigma'])
print(results)

# Show traces and histograms
pints.plot.trace(chains)

# Discard warm up
chains = chains[:, 5000:, :]

# Apply thinning
chains = chains[:, ::10, :]

# Look at distribution in chain 0
pints.plot.pairwise(chains[0], kde=True, ref_parameters=real_parameters1)

# Show graphs
plt.show()


Running...
Done!
param        mean    std.    2.5%    25%     50%     75%     97.5%    rhat    ess      ess per sec.
-----------  ------  ------  ------  ------  ------  ------  -------  ------  -------  --------------
growth rate  0.02    0.00    0.02    0.02    0.02    0.02    0.02     1.00    2273.45  467.60
capacity     452.40  4.22    444.94  449.81  452.35  454.84  459.94   1.00    2683.73  551.98
sigma        26.34   2.34    22.90   24.93   26.14   27.46   30.64    1.00    1522.92  313.23

Fit model with AR(1) errors

The posterior uncertainties here are more realistic.


In [3]:
# Get properties of the noise sample
noise_sample_mean = np.mean(values - org_values)
noise_sample_std = np.std(values - org_values)

# Create an object with links to the model and time series
problem = pints.SingleOutputProblem(model, times, values)

# Create a log-likelihood function (adds two extra parameters: rho and sigma)
log_likelihood = pints.AR1LogLikelihood(problem)

# Create a uniform prior over both the model parameters and the noise parameters
log_prior = pints.UniformLogPrior(
    [0.01, 400, 0, noise * 0.1],
    [0.02, 600, 1, noise * 100],
)

# Create a posterior log-likelihood (log(likelihood * prior))
log_posterior = pints.LogPosterior(log_likelihood, log_prior)

# Choose starting points for 6 MCMC chains
real_parameters = np.array(real_parameters + [rho, noise])
xs = [
    real_parameters * 1.05,
    real_parameters * 1,
    real_parameters * 1.1,
    real_parameters * 0.9,
    real_parameters * 0.8,
    real_parameters * 0.85,
]

# Create mcmc routine
mcmc = pints.MCMCController(log_posterior, 6, xs, method=pints.DifferentialEvolutionMCMC)

# Add stopping criterion
mcmc.set_max_iterations(8000)

# Disable logging
mcmc.set_log_to_screen(False)

# Set number of iterations between which gamma=1 (for a single iteration)
mcmc.sampler().set_gamma_switch_rate(2000)

# Set to uniform error process rather than normal
mcmc.sampler().set_gaussian_error(False)

# Run!
print('Running...')
chains = mcmc.run()
print('Done!')

# Check convergence and other properties of chains
results = pints.MCMCSummary(chains=chains, time=mcmc.time(), parameter_names=['growth rate', 'capacity', 'rho', 'sigma'])
print(results)

# Show traces and histograms
pints.plot.trace(chains)

# Discard warm up
chains = chains[:, 5000:, :]

# Apply thinning
chains = chains[:, ::10, :]

# Look at distribution in chain 0
pints.plot.pairwise(chains[0], kde=True, ref_parameters=real_parameters)

# Show graphs
plt.show()


Running...
Done!
param        mean    std.    2.5%    25%     50%     75%     97.5%    rhat    ess     ess per sec.
-----------  ------  ------  ------  ------  ------  ------  -------  ------  ------  --------------
growth rate  0.02    0.00    0.01    0.02    0.02    0.02    0.02     1.04    581.05  130.51
capacity     462.20  30.77   424.14  447.55  455.10  465.35  567.08   1.86    667.27  149.88
rho          0.81    0.09    0.67    0.75    0.79    0.87    0.98     1.35    278.90  62.65
sigma        39.58   23.67   23.12   26.79   29.43   36.65   109.84   2.10    322.71  72.49

Generating data with ARMA(1,1) errors

As before, we assume that the observed data $y(t)$ follows

$$y(t)= f(t; \theta) + \epsilon(t).$$

Under the ARMA(1,1) noise model, each error term $\epsilon(t)$ includes one autoregressive term and one moving-average term:

$$\epsilon(t) = \rho \epsilon(t-1) + \nu(t) + \phi \nu(t-1),$$

where the white noise term follows $\nu(t) \overset{i.i.d.}{\sim} \mathcal{N}(0, \sigma \sqrt{(1 - \rho^2) / (1 + 2 \rho \phi + \phi^2)})$, with the second argument again denoting the standard deviation. This scaling is chosen so that the marginal distribution of $\epsilon$ is

$$\epsilon\sim\mathcal{N}(0, \sigma).$$
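As before, here is a rough, hand-rolled sketch of this recursion; the helper name arma11_errors and its simple initialisation are assumptions for illustration, and the code cells below use the built-in pints.noise.arma11 instead.

import numpy as np

def arma11_errors(rho, phi, sigma, n, rng=None):
    # Illustrative ARMA(1,1) error generator (not the Pints implementation).
    # sigma is the marginal standard deviation of epsilon; the white noise
    # standard deviation uses the scaling in the equation above.
    rng = np.random.default_rng() if rng is None else rng
    nu_std = sigma * np.sqrt((1 - rho ** 2) / (1 + 2 * rho * phi + phi ** 2))
    nu = rng.normal(0, nu_std, size=n)
    eps = np.zeros(n)
    eps[0] = nu[0]  # simple start; ignores any burn-in towards stationarity
    for t in range(1, n):
        eps[t] = rho * eps[t - 1] + nu[t] + phi * nu[t - 1]
    return eps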

The ARMA(1,1) noise model is available in Pints as pints.ARMA11LogLikelihood. As before, the code below generates a time series with ARMA(1,1) noise and then performs Bayesian inference using the incorrect IID likelihood and the correct ARMA(1,1) likelihood.


In [4]:
# Create some toy data
real_parameters = [0.015, 500]
times = np.linspace(0, 1000, 100)
org_values = model.simulate(real_parameters, times)

# Add noise
noise = 40
rho = 0.9
phi = 0.95
errors = pints.noise.arma11(rho, phi, noise, len(org_values))
values = org_values + errors

# Show the noisy data
plt.figure()
plt.plot(times, org_values)
plt.plot(times, values)
plt.xlabel('time')
plt.ylabel('y')
plt.legend(['true', 'observed'])
plt.show()


Fit model with independent errors to ARMA(1,1) data

The posterior uncertainties here are understated even further relative to those obtained under the true data-generating process.


In [5]:
# Get properties of the noise sample
noise_sample_mean = np.mean(values - org_values)
noise_sample_std = np.std(values - org_values)

# Create an object with links to the model and time series
problem = pints.SingleOutputProblem(model, times, values)

# Create a log-likelihood function (adds an extra parameter!)
log_likelihood = pints.GaussianLogLikelihood(problem)

# Create a uniform prior over both the parameters and the new noise variable
log_prior = pints.UniformLogPrior(
    [0.01, 400, noise * 0.1],
    [0.02, 600, noise * 100],
)

# Create a posterior log-likelihood (log(likelihood * prior))
log_posterior = pints.LogPosterior(log_likelihood, log_prior)

# Choose starting points for 6 MCMC chains
real_parameters1 = np.array(real_parameters + [noise])
xs = [
    real_parameters1 * 1.05,
    real_parameters1 * 1,
    real_parameters1 * 1.1,
    real_parameters1 * 0.9,
    real_parameters1 * 0.8,
    real_parameters1 * 0.85,
]

# Create mcmc routine
mcmc = pints.MCMCController(log_posterior, 6, xs, method=pints.DifferentialEvolutionMCMC)

# Add stopping criterion
mcmc.set_max_iterations(8000)

# Disable logging
mcmc.set_log_to_screen(False)

# Set number of iterations between which gamma=1 (for a single iteration)
mcmc.sampler().set_gamma_switch_rate(2000)

# Set to uniform error process rather than normal
mcmc.sampler().set_gaussian_error(False)

# Run!
print('Running...')
chains = mcmc.run()
print('Done!')

# Check convergence and other properties of chains
results = pints.MCMCSummary(chains=chains, time=mcmc.time(), parameter_names=['growth rate', 'capacity', 'sigma'])
print(results)

# Show traces and histograms
pints.plot.trace(chains)

# Discard warm up
chains = chains[:, 5000:, :]

# Apply thinning
chains = chains[:, ::10, :]

# Look at distribution in chain 0
pints.plot.pairwise(chains[0], kde=True, ref_parameters=real_parameters1)

# Show graphs
plt.show()


Running...
Done!
param        mean    std.    2.5%    25%     50%     75%     97.5%    rhat    ess      ess per sec.
-----------  ------  ------  ------  ------  ------  ------  -------  ------  -------  --------------
growth rate  0.01    0.00    0.01    0.01    0.01    0.01    0.02     1.00    2970.75  683.02
capacity     500.27  5.32    490.10  496.85  500.24  503.66  510.38   1.00    3288.55  756.09
sigma        33.11   2.40    28.73   31.50   32.94   34.59   38.20    1.00    2526.81  580.95

Fit model with ARMA(1,1) errors

The posterior uncertainties are now considerably greater.


In [6]:
# Get properties of the noise sample
noise_sample_mean = np.mean(values - org_values)
noise_sample_std = np.std(values - org_values)

# Create an object with links to the model and time series
problem = pints.SingleOutputProblem(model, times, values)

# Create a log-likelihood function (adds three extra parameters: rho, phi and sigma)
log_likelihood = pints.ARMA11LogLikelihood(problem)

# Create a uniform prior over both the model parameters and the noise parameters
log_prior = pints.UniformLogPrior(
    [0.01, 400, 0, 0, noise * 0.1],
    [0.02, 600, 1, 1, noise * 100],
)

# Create a posterior log-likelihood (log(likelihood * prior))
log_posterior = pints.LogPosterior(log_likelihood, log_prior)

# Choose starting points for 6 MCMC chains
real_parameters = np.array(real_parameters + [rho, phi, noise])
xs = [
    real_parameters * 1.05,
    real_parameters * 1,
    real_parameters * 1.025,
    real_parameters * 0.9,
    real_parameters * 0.8,
    real_parameters * 0.85,
]

# Create mcmc routine
mcmc = pints.MCMCController(log_posterior, 6, xs, method=pints.DifferentialEvolutionMCMC)

# Add stopping criterion
mcmc.set_max_iterations(8000)

# Disable logging
mcmc.set_log_to_screen(False)

# Set number of iterations between which gamma=1 (for a single iteration)
mcmc.sampler().set_gamma_switch_rate(2000)

# Set to uniform error process rather than normal
mcmc.sampler().set_gaussian_error(False)

# Run!
print('Running...')
chains = mcmc.run()
print('Done!')

# Check convergence and other properties of chains
results = pints.MCMCSummary(chains=chains, time=mcmc.time(), parameter_names=['growth rate', 'capacity', 'rho', 'phi', 'sigma'])
print(results)

# Show traces and histograms
pints.plot.trace(chains)

# Discard warm up
chains = chains[:, 5000:, :]

# Apply thinning
chains = chains[:, ::10, :]

# Look at distribution in chain 0
pints.plot.pairwise(chains[0], kde=True, ref_parameters=real_parameters)

# Show graphs
plt.show()


This method should be run with n_chains >= 1.5 * n_parameters
Running...
Done!
param        mean    std.    2.5%    25%     50%     75%     97.5%    rhat    ess     ess per sec.
-----------  ------  ------  ------  ------  ------  ------  -------  ------  ------  --------------
growth rate  0.01    0.00    0.01    0.01    0.01    0.02    0.02     1.02    935.33  172.03
capacity     500.06  19.80   461.48  487.97  499.40  511.28  542.18   1.01    854.82  157.23
rho          0.72    0.10    0.52    0.65    0.72    0.79    0.91     1.01    551.22  101.39
phi          0.69    0.09    0.49    0.64    0.70    0.76    0.88     1.01    778.65  143.21
sigma        25.96   5.27    19.20   22.40   24.72   28.16   39.92    1.01    488.27  89.81