SIR Epidemiology model: Outbreak on Tristan de Cunha

This example shows how the SIR model can be used.

The SIR model describes a disease breaking out in a population split into a susceptible (S), infected (I), and recovered (R) part (see wikipedia).

The formulation shown here was used by Toni et al., in a paper in which they also give real data of a common-cold outbreak on the isle of Tristan de Cunha (originally from Hammond & Tyrrell and Shibli et al.). In this formulation, the susceptible population is unknown, the we can observe the infected and recovered population.

The model has three parameters: infection rate gamma, recovery rate v, and the initial susceptible population S0.


In [1]:
import pints
import pints.toy
import pints.plot
import numpy as np
import matplotlib.pyplot as plt

# Create a model
model = pints.toy.SIRModel()

# Run a simulation
parameters = model.suggested_parameters()
times = model.suggested_times()
values = model.simulate(parameters, times)

print('Parameters:')
print(parameters)

# Plot the results
plt.figure()
plt.xlabel('Time')
plt.ylabel('Population')
plt.plot(times, values)
plt.legend(['Infected', 'Recovered'])
plt.show()


Parameters:
[0.026, 0.285, 38]
<Figure size 640x480 with 1 Axes>

We can have a look at the real data too:


In [2]:
real_values = model.suggested_values()

# Plot the results
plt.figure()
plt.xlabel('Time')
plt.ylabel('Population')
plt.plot(times, real_values)
plt.legend(['Infected', 'Recovered'])
plt.show()


Next, we set up a problem. Because this model has multiple outputs (2), we use a MultiOutputProblem.


In [3]:
problem = pints.MultiOutputProblem(model, times, real_values)
score = pints.SumOfSquaresError(problem)

To create a probabilistic model, we'll assume the data can be perfectly described by the model + Gausian noise with an unknown standard deviation sigma1 in I and sigma2 in R.


In [4]:
loglikelihood = pints.GaussianLogLikelihood(problem)

We can now use MCMC to have a guess at the parameters! We'll try to infer two noise parameters at the same time.


In [5]:
# Starting points
x0 = [
    [0.001, 0.20, 52, 3, 3],
    [0.05, 0.34, 34, 3, 3],
    [0.02, 0.18, 20, 3, 3],
]

# Create MCMC routine
mcmc = pints.MCMCController(loglikelihood, 3, x0)
mcmc.set_max_iterations(3000)
mcmc.set_log_to_screen(False)
chains = mcmc.run()


/home/mrobins/git/pints/pints/_log_likelihoods.py:154: RuntimeWarning: invalid value encountered in log
  - np.sum(error**2, axis=0) / (2 * sigma**2))

We now have a look at the rhat and traces, to get an indication of how well the chains have converged:


In [6]:
# Check convergence using rhat criterion
print('R-hat:')
print(pints.rhat_all_params(chains))

plt.figure()
pints.plot.trace(chains)
plt.show()


R-hat:
[1.1404228253914508, 1.1663101654093089, 1.2341689027380545, 1.0649510899918277, 1.116114610283234]
<Figure size 432x288 with 0 Axes>

Finally, we compare some predictions based on the final samples to the real data:


In [7]:
samples = chains[0, -1000:]

plt.figure()
fig, axes = pints.plot.series(samples, problem)
axes[0].plot(times, real_values[:, 0], c='#ff7f0e')
axes[1].plot(times, real_values[:, 1], c='#ff7f0e')
plt.show()


<Figure size 432x288 with 0 Axes>