This example shows how to use autocorrelation plots of the residuals to check the assumptions of the noise model.
Three cases are shown. In the first two, optimisation is used to obtain a best-fit parameter vector for a single output problem: in the first case the noise is correctly specified, and in the second it is misspecified. The third case demonstrates the same method on a multiple output problem with Bayesian inference.
For the first example, we will use optimisation to obtain the best-fit parameter vector. See Optimisation First Example for more details. We begin with a problem in which the noise is correctly specified: both the data generation and the model use independent Gaussian noise.
In [5]:
from __future__ import print_function
import pints
import pints.toy as toy
import pints.plot
import numpy as np
import matplotlib.pyplot as plt
# Use the toy logistic model
model = toy.LogisticModel()
real_parameters = [0.015, 500]
times = np.linspace(0, 1000, 100)
org_values = model.simulate(real_parameters, times)
# Add independent Gaussian noise
noise = 50
values = org_values + np.random.normal(0, noise, org_values.shape)
# Set up the problem and run the optimisation
problem = pints.SingleOutputProblem(model, times, values)
score = pints.SumOfSquaresError(problem)
boundaries = pints.RectangularBoundaries([0, 200], [1, 1000])
x0 = np.array([0.5, 500])
found_parameters, found_value = pints.optimise(
    score,
    x0,
    boundaries=boundaries,
    method=pints.XNES,
)
print('Score at true solution: ')
print(score(real_parameters))
print('Found solution:          True parameters:')
for k, x in enumerate(found_parameters):
    print(pints.strfloat(x) + ' ' + pints.strfloat(real_parameters[k]))
In [6]:
fig, ax = pints.plot.series(np.array([found_parameters]), problem, ref_parameters=real_parameters)
fig.set_size_inches(15, 7.5)
plt.show()
Next we look at the autocorrelation plot of the residuals to evaluate the noise model (using pints.residuals_diagnostics.plot_residuals_autocorrelation).
In [7]:
from pints.residuals_diagnostics import plot_residuals_autocorrelation
# Plot the autocorrelation
fig = plot_residuals_autocorrelation(np.array([found_parameters]),
                                     problem)
plt.show()
The figure shows no significant autocorrelation in the residuals. Therefore, the assumption of independent noise may be valid.
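If you want to check this numerically as well as visually, the short sketch below computes the residuals at the optimised parameters and their sample autocorrelations with NumPy, and flags any lag that exceeds the approximate 95% white-noise bound of ±1.96/√n. The helper residual_autocorrelation is purely illustrative and not part of PINTS.
# Numerical check of the residual autocorrelations (illustrative sketch, not a PINTS routine)
residuals = problem.values() - problem.evaluate(found_parameters)

def residual_autocorrelation(res, max_lag=10):
    # Sample autocorrelation of the residuals for lags 0..max_lag
    res = res - np.mean(res)
    acf = np.correlate(res, res, mode='full')[len(res) - 1:]
    return acf[:max_lag + 1] / acf[0]

acf = residual_autocorrelation(residuals)
bound = 1.96 / np.sqrt(len(residuals))  # approximate 95% bound for white noise
for lag, r in enumerate(acf[1:], start=1):
    print('Lag %2d: %+.3f %s' % (lag, r, '(significant)' if abs(r) > bound else ''))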
In [8]:
import pints.noise
# Use the toy logistic model
model = toy.LogisticModel()
real_parameters = [0.015, 500]
times = np.linspace(0, 1000, 100)
org_values = model.simulate(real_parameters, times)
# Add AR(1) noise
rho = 0.75
sigma = 50
values = org_values + pints.noise.ar1(rho, sigma, len(org_values))
# Set up the problem and run the optimisation
problem = pints.SingleOutputProblem(model, times, values)
score = pints.SumOfSquaresError(problem)
boundaries = pints.RectangularBoundaries([0, 200], [1, 1000])
x0 = np.array([0.5, 500])
found_parameters, found_value = pints.optimise(
    score,
    x0,
    boundaries=boundaries,
    method=pints.XNES,
)
print('Score at true solution: ')
print(score(real_parameters))
print('Found solution:          True parameters:')
for k, x in enumerate(found_parameters):
    print(pints.strfloat(x) + ' ' + pints.strfloat(real_parameters[k]))
In [9]:
fig, ax = pints.plot.series(np.array([found_parameters]), problem, ref_parameters=real_parameters)
fig.set_size_inches(15, 7.5)
plt.show()
In [10]:
# Plot the autocorrelation
fig = plot_residuals_autocorrelation(np.array([found_parameters]),
                                     problem)
plt.show()
Now the autocorrelation plot of the residuals shows high autocorrelation at small lags, which is typical of AR(1) noise. Therefore, this visualisation suggests that the assumption of independent Gaussian noise which we made during inference is invalid.
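Under the assumption that the noise really is AR(1), a rough follow-up, sketched below, is to estimate the autoregressive coefficient from the lag-1 sample autocorrelation of the residuals and compare it with the rho used to generate the data. This is an illustrative calculation, not a PINTS routine.
# Rough AR(1) check on the residuals (illustrative sketch)
residuals = problem.values() - problem.evaluate(found_parameters)
res = residuals - np.mean(residuals)
# The lag-1 sample autocorrelation estimates rho for an AR(1) process
rho_hat = np.sum(res[1:] * res[:-1]) / np.sum(res ** 2)
print('Estimated AR(1) coefficient: %.3f (true rho = %.2f)' % (rho_hat, rho))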
The plot_residuals_autocorrelation function also works with Bayesian inference and multiple output problems. For the final example, we demonstrate the same strategy in this setting.
For this example, the Lotka-Volterra model is used. See the Lotka-Volterra example for more details. As in the first case, the data is generated with independent Gaussian noise.
In [11]:
import numpy as np
import matplotlib.pyplot as plt
import pints
import pints.toy
model = pints.toy.LotkaVolterraModel()
times = np.linspace(0, 3, 50)
parameters = model.suggested_parameters()
model.set_initial_conditions([2, 2])
org_values = model.simulate(parameters, times)
# Add noise
sigma = 0.1
values = org_values + np.random.normal(0, sigma, org_values.shape)
# Create an object with links to the model and time series
problem = pints.MultiOutputProblem(model, times, values)
# Create a log posterior
log_prior = pints.UniformLogPrior([0, 0, 0, 0, 0, 0], [6, 6, 6, 6, 1, 1])
log_likelihood = pints.GaussianLogLikelihood(problem)
log_posterior = pints.LogPosterior(log_likelihood, log_prior)
# Run MCMC on the noisy data
x0 = [[4, 1, 2, 3, .1, .1]]*3
mcmc = pints.MCMCController(log_posterior, 3, x0)
mcmc.set_max_iterations(4000)
print('Running')
chains = mcmc.run()
print('Done!')
In [12]:
# Get the first MCMC chain
chain1 = chains[0]
# Cut off the burn-in samples
chain1 = chain1[2500:]
fig, ax = pints.plot.series(chain1, problem, ref_parameters=parameters)
fig.set_size_inches(15, 7.5)
plt.show()
In [13]:
# Plot the autocorrelation
fig = plot_residuals_autocorrelation(chain1, problem)
plt.show()
The plot_residuals_autocorrelation function generates one residuals plot for each output. Additionally, since Bayesian inference was performed and an MCMC chain was provided to the function, it plots the distribution of the autocorrelations at each lag over the MCMC samples. Each dot indicates the median autocorrelation, and the bars show the extent of the 95% posterior interval.
In both outputs, no significant autocorrelation in the residuals is seen, as expected since independent Gaussian noise was used to generate the data.