In [1]:
import os, sys
sys.path.append(os.path.abspath('../../main/python'))
import datetime as dt
import numpy as np
import numpy.testing as npt
import matplotlib.pyplot as plt
from thalesians.tsa.distrs import NormalDistr as N
import thalesians.tsa.filtering as filtering
import thalesians.tsa.filtering.kalman as kalman
import thalesians.tsa.numpyutils as npu
import thalesians.tsa.processes as proc
First, we create the process model: a one-dimensional Wiener process with a drift (mean) of 3 and a variance (cov) of 25 per unit time (here one day, so an hour corresponds to 1/24):
In [2]:
process = proc.WienerProcess.create_from_cov(mean=3., cov=25.)
This we pass to a newly created Kalman filter, along with the initial time and initial state. The latter takes the form of a normal distribution. We have chosen to use Python datetimes as our data type for time, but we could have chosen ints or something else.
In [3]:
t0 = dt.datetime(2017, 5, 12, 16, 18, 25, 204000)
kf = kalman.KalmanFilter(t0, state_distr=N(mean=100., cov=250.), process=process)
Next we create an observable, which incorporates a particular observation model. In this case the observation model is particularly simple: we are observing the entire (one-dimensional) state of the Kalman filter, so the observation matrix is a 1x1 identity:
In [4]:
observable = kf.create_observable(kalman.LinearGaussianObsModel.create(1.), process)
Let's roll forward the time by one hour:
In [5]:
t1 = t0 + dt.timedelta(hours=1)
What is our predicted observation at this time? Since we haven't yet seen any actual observations, this is our prior observation estimate:
In [6]:
prior_predicted_obs1 = observable.predict(t1)
prior_predicted_obs1
Out[6]:
We confirm that this is consistent with how our (linear-Gaussian) process model scales over time: the Wiener process's mean and covariance both grow linearly with time, and one hour is 1/24 of our unit time:
In [7]:
npt.assert_almost_equal(prior_predicted_obs1.distr.mean, 100. + 3./24.)
npt.assert_almost_equal(prior_predicted_obs1.distr.cov, 250. + 25./24.)
npt.assert_almost_equal(prior_predicted_obs1.cross_cov, prior_predicted_obs1.distr.cov)
Let us now supply an actual observation. Say the observed value is 100.35 and the observation noise covariance is 100.0:
In [8]:
observable.observe(time=t1, obs=N(mean=100.35, cov=100.0))
Out[8]:
Having seen an actual observation, let us obtain the posterior observation estimate:
In [9]:
posterior_predicted_obs1 = observable.predict(t1); posterior_predicted_obs1
Out[9]:
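Before moving on, it is worth sanity-checking these numbers by hand. Below is a minimal sketch of the standard scalar Kalman update equations in plain numpy (an illustration, not the library's internals); the resulting posterior mean and covariance reappear in the assertions of the next cell:
prior_mean = 100. + 3./24.             # prior mean at t1
prior_cov = 250. + 25./24.             # prior covariance at t1
obs_mean, obs_cov = 100.35, 100.0      # the observation we have just supplied
gain = prior_cov / (prior_cov + obs_cov)                      # Kalman gain
posterior_mean = prior_mean + gain * (obs_mean - prior_mean)  # posterior mean
posterior_cov = (1. - gain) * prior_cov                       # posterior covariance
npt.assert_almost_equal(posterior_mean, 100.28590504)
npt.assert_almost_equal(posterior_cov, 71.513353115)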
We can now fast-forward the time by, say, two hours, and repeat the process:
In [10]:
t2 = t1 + dt.timedelta(hours=2)
prior_predicted_obs2 = observable.predict(t2)
# The prior at t2 is the t1 posterior propagated forward by two hours:
npt.assert_almost_equal(prior_predicted_obs2.distr.mean, 100.28590504 + 2.*3./24.)
npt.assert_almost_equal(prior_predicted_obs2.distr.cov, 71.513353115 + 2.*25./24.)
npt.assert_almost_equal(prior_predicted_obs2.cross_cov, prior_predicted_obs2.distr.cov)
# A second observation at t2, followed by the posterior prediction:
observable.observe(time=t2, obs=N(mean=100.35, cov=100.0))
posterior_predicted_obs2 = observable.predict(t2)
npt.assert_almost_equal(posterior_predicted_obs2.distr.mean, 100.45709020)
npt.assert_almost_equal(posterior_predicted_obs2.distr.cov, 42.395213845)
npt.assert_almost_equal(posterior_predicted_obs2.cross_cov, posterior_predicted_obs2.distr.cov)
The real power of our Kalman filter interface reveals itself when the process model consists of several (independent) stochastic processes:
In [11]:
process1 = proc.WienerProcess.create_from_cov(mean=3., cov=25.)
process2 = proc.WienerProcess.create_from_cov(mean=[1., 4.], cov=[[36.0, -9.0], [-9.0, 25.0]])
Such models are common in finance, where, for example, the dynamics of a yield curve may be represented by a (multivariate) stochastic process, whereas the idiosyncratic spread for each bond may be an independent stochastic process.
Let us pass process1 and process2 as a (compound) process model to our Kalman filter, along with the initial time and state:
In [12]:
t0 = dt.datetime(2017, 5, 12, 16, 18, 25, 204000)
kf = kalman.KalmanFilter(
    t0,
    state_distr=N(
        mean=[100.0, 120.0, 130.0],
        cov=[[250.0, 0.0, 0.0],
             [0.0, 360.0, 0.0],
             [0.0, 0.0, 250.0]]),
    process=(process1, process2))
We shall now create several observables, each corresponding to a distinct observation model. The first one will observe the entire state:
In [13]:
state_observable = kf.create_observable(
    kalman.LinearGaussianObsModel.create(1.0, np.eye(2)),
    process1, process2)
The second observable will observe the first process only, which constitutes the first coordinate of the combined state:
In [14]:
coord0_observable = kf.create_observable(
    kalman.LinearGaussianObsModel.create(1.),
    process1)
The third, the first coordinate of the second process:
In [15]:
coord1_observable = kf.create_observable(
    kalman.LinearGaussianObsModel.create(npu.row(1., 0.)),
    process2)
The fourth, the second coordinate of the second process:
In [16]:
coord2_observable = kf.create_observable(
    kalman.LinearGaussianObsModel.create(npu.row(0., 1.)),
    process2)
The fifth will observe the sum of all three state coordinates (across the two processes):
In [17]:
sum_observable = kf.create_observable(
    kalman.LinearGaussianObsModel.create(npu.row(1., 1., 1.)),
    process1, process2)
And the sixth, a certain linear combination thereof:
In [18]:
lin_comb_observable = kf.create_observable(
    kalman.LinearGaussianObsModel.create(npu.row(2., 0., -3.)),
    process1, process2)
Fast-forward the time by one hour:
In [19]:
t1 = t0 + dt.timedelta(hours=1)
Let's predict the state at this time...
In [20]:
predicted_obs1_prior = state_observable.predict(t1)
predicted_obs1_prior
Out[20]:
And check that it is consistent with the scaling of the (multivariate) Wiener process with time:
In [21]:
npt.assert_almost_equal(predicted_obs1_prior.distr.mean,
                        npu.col(100.0 + 3.0/24.0, 120.0 + 1.0/24.0, 130.0 + 4.0/24.0))
npt.assert_almost_equal(predicted_obs1_prior.distr.cov,
                        [[250.0 + 25.0/24.0, 0.0, 0.0],
                         [0.0, 360.0 + 36.0/24.0, -9.0/24.0],
                         [0.0, -9.0/24.0, 250.0 + 25.0/24.0]])
npt.assert_almost_equal(predicted_obs1_prior.cross_cov, predicted_obs1_prior.distr.cov)
Suppose that a new observation arrives, and we observe each of the three coordinates individually:
In [22]:
state_observable.observe(time=t1, obs=N(mean=[100.35, 121.0, 135.0],
                                        cov=[[100.0, 0.0, 0.0],
                                             [0.0, 400.0, 0.0],
                                             [0.0, 0.0, 100.0]]));
Let's look at our (posterior) predicted state:
In [23]:
state_observable.predict(t1)
Out[23]:
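It is instructive to reproduce this posterior by hand. The sketch below applies the standard matrix-form Kalman update in plain numpy (an illustration, not the library's internals); here the observation matrix H is simply the 3x3 identity:
H = np.eye(3)                                        # we observe the full state
m = npu.col(100.0 + 3.0/24.0, 120.0 + 1.0/24.0, 130.0 + 4.0/24.0)  # prior mean
P = np.array([[250.0 + 25.0/24.0, 0.0, 0.0],
              [0.0, 360.0 + 36.0/24.0, -9.0/24.0],
              [0.0, -9.0/24.0, 250.0 + 25.0/24.0]])  # prior covariance
y = npu.col(100.35, 121.0, 135.0)                    # the observation
R = np.diag([100.0, 400.0, 100.0])                   # observation noise covariance
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)         # Kalman gain
posterior_mean = m + K @ (y - H @ m)
posterior_cov = P - K @ H @ P
posterior_mean, posterior_cov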
Let's also look at the predictions for the individual coordinates:
In [24]:
coord0_observable.predict(t1)
Out[24]:
In [25]:
coord1_observable.predict(t1)
Out[25]:
In [26]:
coord2_observable.predict(t1)
Out[26]:
The predicted sum:
In [27]:
sum_observable.predict(t1)
Out[27]:
And the predicted linear combination:
In [28]:
lin_comb_observable.predict(t1)
Out[28]:
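As a cross-check (reusing posterior_mean and posterior_cov from the hand-computed sketch above): for a linear observable with row vector h, the predicted observation has mean h @ mean and covariance h @ cov @ h.T, which should agree with the two predictions just shown:
h_sum = npu.row(1., 1., 1.)
h_lin_comb = npu.row(2., 0., -3.)
print(h_sum @ posterior_mean, h_sum @ posterior_cov @ h_sum.T)
print(h_lin_comb @ posterior_mean, h_lin_comb @ posterior_cov @ h_lin_comb.T)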
Let's now go 30 minutes into the future:
In [29]:
t2 = t1 + dt.timedelta(minutes=30)
And observe only the first coordinate of the second process, with pretty high confidence (note the low observation noise covariance):
In [30]:
coord1_observable.observe(time=t2, obs=N(mean=125.25, cov=4.))
Out[30]:
How does our predicted state change?
In [31]:
state_observable.predict(t2)
Out[31]:
Thirty minutes later...
In [32]:
t3 = t2 + dt.timedelta(minutes=30)
We observe the sum of the three coordinates, rather than the individual coordinates:
In [33]:
sum_observable.observe(time=t3, obs=N(mean=365.00, cov=9.))
Out[33]:
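Why does a single scalar reading of the sum move all three coordinates? The generic sketch below (a hypothetical helper, not a library function) spells out the standard Kalman update for a scalar linear observation y = h x + noise: the gain column P h' / (h P h' + r) distributes the innovation across every state coordinate that covaries with the observed quantity:
def scalar_update(m, P, h, y, r):
    # Standard Kalman update for one scalar observation y = h @ x + noise,
    # with m the prior mean (column), P the prior covariance, h a row vector.
    s = (h @ P @ h.T).item() + r           # innovation variance
    k = (P @ h.T) / s                      # gain column: one entry per coordinate
    m_post = m + k * (y - (h @ m).item())  # each coordinate receives a share
    P_post = P - k @ (h @ P)               # of the innovation via the gain
    return m_post, P_post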
How has our prediction of the state changed?
In [34]:
state_observable.predict(t3)
Out[34]:
And what is its predicted sum?
In [35]:
sum_observable.predict(t3)
Out[35]: