This is a Jupyter notebook. Lectures about Python, useful both for beginners and experts, can be found at http://scipy-lectures.github.io.
Open the notebook by (1) copying this file into a directory, (2) typing jupyter-notebook in that directory, and (3) selecting the notebook.
Written By: Riddhish Bhalodia
In this exercise, we will learn about and code the Kalman Filter, and look at a few of its applications.
Let me start this way: before Rudolf Kalman (co-inventor of Kalman filtering), problems were divided into two distinct classes, control problems (what value of acceleration should be provided to the car so that it climbs a certain incline at constant speed?) and filtering problems (damn this noisy accelerometer, I can't get a clean value even at a fixed point!).
You might have guessed (if you cared to read the brackets :P) that the two problems are not uncorrelated. For the uninitiated, take a scenario where the car has a noisy accelerometer and you want to control its speed on the incline: two problems in one. One way is to solve them independently, but that is too situation-dependent, and so there was a need for a dynamic solution that filters while controlling and vice versa, essentially bringing the two separate problems under one roof.
This is precisely what the Kalman Filter does! The Kalman Filter and its non-linear extensions are essential elements of modern control theory. Lots of different applications, ranging from filtering noisy sensor output to autonomous robot navigation, use the Kalman Filter.
In this tutorial we will first start off with an application, which we will code (yay!), and then move on to build up the Kalman filtering theory.
A classic and very intuitive example to start with: we have to measure a DC voltage from a faulty (noisy) voltmeter; it can't get any simpler than this. First off, let's import certain packages
In [ ]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import random
Now, to solve any computational problem we need to model the system (in layman's terms, a set of equations that describes the overall situation and how it varies with physical inputs and parameters). The simplest way to model the noisy voltmeter at every measurement instant is through the following equation \begin{equation} V_m = V_{m-1} + \omega_m \qquad (1) \end{equation}
Here, $V_m$ : Voltage at current time, $V_{m-1}$ : Voltage at previous time instant, $\omega_m$ : Random Noise (process noise)
We also have a measurement taken at each instant $m$, given by $Z_m$, which is corrupted by some sensor (measurement) noise $\nu_m$
$$Z_m = V_m + \nu_m \qquad (2)$$Usually, $\nu_m$ is introduced by faults in the voltmeter (sensor), and hence we usually know its characteristics (least count, precision ... ring any bells?). So here we will model $\nu_m$, as is usually done, as a Gaussian random variable with zero mean and standard deviation $\sigma_{\nu} = 0.3$ (for simulation's sake). $\omega_m$ is more difficult to predict: it is introduced by the non-ideality of our process equations (we know it's a constant DC voltage, but is it really!), but we still assume that we know it. Again, $\omega_m$ is modeled as a Gaussian random variable with zero mean and standard deviation $\sigma_{\omega} = 0.01$.
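Before we simulate the voltmeter itself, here is a minimal sketch of what equations (1) and (2) say; the variable names here are just for illustration and are not used later:
In [ ]:
import random
# equation (1) says the modeled voltage drifts by a tiny process noise,
# while equation (2) says the voltmeter reading is much noisier
sigma_omega = 0.01  # process noise standard deviation from above
sigma_nu = 0.3      # measurement noise standard deviation from above
V = 1.0             # start at the true voltage
for m in range(5):
    V = V + random.gauss(0, sigma_omega)  # equation (1)
    Z = V + random.gauss(0, sigma_nu)     # equation (2)
    print('step %d: model V = %.4f, measured Z = %.4f' % (m, V, Z))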
So before solving this, let's model the voltmeter. What parameters do we need? Let's say the true constant voltage is the variable true_voltage = 1, and we also need a noise level (this is the voltmeter's error, so it will feature in the measurement error), say the variable noise_level_sig = 0.2 (the KF works even when the noise estimates are off the mark...). Let's take measurements for 50 instants, stored in iter_max, and generate the measurement for each instant, which will simply be a sample from $\mathcal{N}(true\_voltage,\ noise\_level\_sig)$ (think about this :))
In [ ]:
true_voltage = 1
noise_level_sig = 0.2
iter_max = 50
measurements = []
true_voltage_mat = []
for i in range(iter_max):
    measured = random.gauss(true_voltage, noise_level_sig)
    measurements.append(measured)
    true_voltage_mat.append(true_voltage)
Let's plot how the measurements look as compared to the true voltage
In [ ]:
plt.plot(range(iter_max),true_voltage_mat,'b',range(iter_max), measurements,'r')
plt.xlabel('Time Instances')
plt.ylabel('Voltage')
plt.legend(('true voltage', 'measured'), loc=4)
Math Math Math.
That being said, to solve this problem iteratively we need two major steps: a prediction step and a correction step.
So we define two variables: $\hat{V}^-_m$, the prior estimate of the voltage given only the knowledge of the process (equation (1)), and $\hat{V}_m$, the posterior estimate of the voltage at step $m$ given the knowledge of the measurement $Z_m$.
So let's start firing off the equations. Just one comment before this: along with the estimate itself, we will also estimate the error in the estimate at every iteration.
$$e_m = V_m - \hat{V}_m \quad \textrm{and} \quad \sigma^2_m = \mathbb{E}[e_m^2]$$and
$$e^-_m = V_m - \hat{V}^-_m \quad \textrm{and} \quad \sigma^{2-}_m = \mathbb{E}[(e^-_m)^2]$$We have to minimize $\sigma^2_m$. So now, as any sane person (ok, statistician) would do, we model the posterior estimate $\hat{V}_m$ as a linear combination of the prior estimate $\hat{V}^-_m$ and the deviation of the measurement from the prior estimate (also called the innovation term), given as
$$y_m = Z_m - \hat{V}^-_m \qquad (3)$$Putting the above ramble into an equation, we have
$$\hat{V}_m = \hat{V}^-_m + k_my_m \qquad (4)$$subtracting $V_m$ from both sides we get
$$\hat{V}_m - V_m = \hat{V}^-_m - V_m + k_m(Z_m - \hat{V}^-_m)$$To compute $k_m$ we square both sides, take the expectation, and differentiate the resulting quadratic in $k_m$ to get something like (try this yourself)
$$k_m = \frac{\mathbb{E}[(V_m - \hat{V}^-_m)(y_m)]}{\mathbb{E}[y_m^2]}$$Now note that $y_m = Z_m - \hat{V}^-_m = e^-_m + \nu_m$. Expanding the numerator and the denominator, and taking into account that the measurement noise $\nu_m$ is independent of the prior error $e^-_m$ (think about this too), we get
$$k_m = \frac{\sigma^{2-}_m}{\sigma^{2-}_m + \sigma^2_{\nu}} \qquad (5)$$Along with this, from equation (1) we also have
$$\sigma_m^{2-} = \sigma_{m-1}^2 + \sigma_\omega^2 \qquad (6)$$Now, substituting (5) into the quadratic for $\mathbb{E}[e^2_m]$, we get the variance as
$$\sigma^2_m = (1 - k_m)\sigma_m^{2-} \qquad (7)$$Now we have everything :D, already! To summarize, every iteration runs two steps:
Prediction: $\hat{V}^-_m = \hat{V}_{m-1}$ (from equation (1), since $\omega_m$ has zero mean) and $\sigma^{2-}_m = \sigma^2_{m-1} + \sigma^2_\omega$ (equation (6)).
Correction: compute the gain $k_m$ from (5), correct the estimate with (4), and update the error with (7).
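To see the numbers move once, take the first iteration with the initial guess we will use below ($\hat{V}_0 = 3$, $\sigma^2_0 = 1$, $\sigma_\nu = 0.3$, $\sigma_\omega = 0.01$):
$$\sigma^{2-}_1 = 1 + 0.0001 = 1.0001, \qquad k_1 = \frac{1.0001}{1.0001 + 0.09} \approx 0.917$$$$\hat{V}_1 = 3 + 0.917(Z_1 - 3), \qquad \sigma^2_1 = (1 - 0.917) \times 1.0001 \approx 0.083$$The gain starts near 1 (the measurement is trusted far more than the poor initial guess) and shrinks as the error estimate falls.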
So hopefully you have got the hang of how this works. Let's code it up
In [ ]:
# Initialize the parameters
initial_guess = 3
initial_guess_error = 1
sig_nu = 0.3
sig_omega = 0.01
estimate_vector = []
estimate_vector.append(initial_guess)
error_estimate_vector = []
error_estimate_vector.append(initial_guess_error)
# Run the Filter
for i in range(iter_max-1):
    # first the prior estimation step
    volt_prior_est = estimate_vector[i]
    error_prior_est = error_estimate_vector[i] + sig_omega * sig_omega
    # estimate correction
    k = error_prior_est/(error_prior_est + sig_nu * sig_nu)
    volt_corrected_est = volt_prior_est + k * (measurements[i+1] - volt_prior_est)
    error_corrected_est = (1 - k) * error_prior_est
    estimate_vector.append(volt_corrected_est)
    error_estimate_vector.append(error_corrected_est)
Let us plot the results
In [ ]:
plt.figure()
plt.plot(range(iter_max),true_voltage_mat,'b',range(iter_max), measurements,'r', range(iter_max), estimate_vector,'g')
plt.xlabel('Time Instances')
plt.ylabel('Voltage')
plt.legend(('true voltage', 'measured', 'filtered'), loc=1)
Did you have your voila moment? :D Let's also look at the error of the estimate; let's plot it.
In [ ]:
plt.figure()
plt.plot(range(iter_max),error_estimate_vector)
plt.xlabel('Time Instances')
plt.ylabel('Voltage Error')
You might be wondering where the control is in all this. That is easily introduced in the actual formulation of the Kalman Filter, which we will look at now. In the actual Kalman Filter we deal with a multi-variable setting, unlike that of the voltmeter example. So let me describe the inputs, the outputs and the parameters
Inputs: the control vector $\pmb{U}_m$ and the measurement vector $\pmb{Z}_m$ at every instant
Outputs: the state estimate $\hat{\pmb{X}}_m$ and the estimate error covariance $P_m$ at every instant
Parameters: the state transition matrix $A$, the control matrix $B$, the observation matrix $H$, and the noise covariance matrices $Q$ and $R$
Now we are ready to write the basic equations for the KF; don't worry, much of the above will become clear as you look at the equations.
Let's start with the two basic equations; first is the process equation
$$\textbf{X}_m = A\textbf{X}_{m-1} + B\textbf{U}_{m-1} + \pmb{\omega}_m \qquad (8)$$and then we have the measurements equation
$$\textbf{Z}_m = H\textbf{X}_m + \pmb{\nu}_m \qquad (9)$$Here, we model the two error terms $\pmb{\omega}_m$ and $\pmb{\nu}_m$ as multi-variate Gaussian distributions with zero mean and covariance matrices $Q$ and $R$ respectively, i.e. $\pmb{\omega}_m \sim \mathcal{N}(0,Q)$ and $\pmb{\nu}_m \sim \mathcal{N}(0,R)$
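Just to fix the notation before the update equations, here is a minimal sketch of one step of equations (8) and (9); the toy two-state (position, velocity) system here is my own choice for illustration, with no control input:
In [ ]:
import numpy as np
A = np.matrix([[1.0, 0.1], [0.0, 1.0]])  # state transition: position gains 0.1 * velocity
H = np.matrix([[1.0, 0.0]])              # we only observe the first state
Q = 0.0001 * np.eye(2)                   # process noise covariance
R = np.matrix([[0.09]])                  # measurement noise covariance
X = np.matrix([[0.0], [1.0]])            # current state as a column vector
omega = np.matrix(np.random.multivariate_normal([0, 0], Q)).T
X = A * X + omega                        # equation (8), with no control term
nu = np.matrix(np.random.multivariate_normal([0], R)).T
Z = H * X + nu                           # equation (9)
print(X, Z)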
So now, following the exact same philosophy that we followed for the derivation in the voltmeter example, we get the update equations for the general Kalman Filter. I will just list them down; if you are interested, you can look them up in standard references on Kalman filtering
$$P_m^- = AP_{m-1}A^T + Q \qquad (10)$$
$$K_m = P_m^-H^T(HP_m^-H^T + R)^{-1} \qquad (11)$$
$$\pmb{y}_m = \pmb{Z}_m - H\hat{\pmb{X}}^-_m \qquad (12)$$
$$\hat{\pmb{X}}_m = \hat{\pmb{X}}^-_m + K_m\pmb{y}_m \qquad (13)$$
$$P_m = (I - K_mH)P_m^- \qquad (14)$$
Again summarizing, each iteration has two steps:
Prediction: compute the prior state estimate $\hat{\pmb{X}}^-_m = A\hat{\pmb{X}}_{m-1} + B\pmb{U}_{m-1}$ and the prior error covariance $P^-_m$ from (10).
Correction: compute the Kalman gain $K_m$ from (11) and the innovation $\pmb{y}_m$ from (12), then update the state estimate with (13) and the error covariance with (14).
Now enough of this rambling. I do hope you get this, but now we are going to make a Kalman filter class and then see how it fits our voltmeter example. So let's first create the class
In [ ]:
class kalmanFilter:
    def __init__(self, X0, P0, A, B, H, Q, R):
        self.A = A  # state transition matrix
        self.B = B  # control matrix
        self.H = H  # observation matrix
        self.Q = Q  # covariance of the process noise
        self.R = R  # covariance of the measurement noise
        self.current_estimate = X0  # initial guess of the state
        self.current_error_estimate = P0  # initial guess of the state estimate error

    def getEstimate(self):
        # returns the current state estimate
        return self.current_estimate

    def getErrorEstimate(self):
        # returns the current state error estimate
        return self.current_error_estimate

    def iteration(self, U, Z):
        # here is where the updates happen
        # U = control vector
        # Z = measurements vector
        # prior prediction step, equations (8) and (10)
        prior_estimate = self.A * self.current_estimate + self.B * U
        prior_error_estimate = (self.A * self.current_error_estimate) * np.transpose(self.A) + self.Q
        # innovation, equation (12), and its covariance
        y = Z - self.H * prior_estimate
        y_covariance = self.H * prior_error_estimate * np.transpose(self.H) + self.R
        # correction step, equations (11), (13) and (14)
        K = prior_error_estimate * np.transpose(self.H) * np.linalg.inv(y_covariance)
        self.current_estimate = prior_estimate + K * y
        # we need the size of the matrix so we can make an identity matrix
        size = self.current_error_estimate.shape[0]
        # eye(n) = n x n identity matrix
        self.current_error_estimate = (np.eye(size) - K * self.H) * prior_error_estimate
Now that we have this nice class set up, let's test its correctness by applying it to the voltmeter problem. First, we set the parameters for the voltmeter problem.
In [ ]:
A = np.matrix([1])
B = np.matrix([0])
H = np.matrix([1])
Q = np.matrix([0.0001]) # sigma_omega squared
R = np.matrix([0.09])   # sigma_nu squared
X0 = np.matrix([3])
P0 = np.matrix([1])
KF = kalmanFilter(X0, P0, A, B, H, Q, R)
estimate_vector_new = []
estimate_vector_new.append(initial_guess)
error_estimate_vector_new = []
error_estimate_vector_new.append(initial_guess_error)
# Run the Filter
for i in range(iter_max-1):
    U = np.matrix([0]) # there is no control here
    Z = np.matrix([measurements[i+1]])
    KF.iteration(U, Z)  # update first, so the stored estimate uses measurements[i+1]
    estimate_vector_new.append(KF.getEstimate()[0,0])
    error_estimate_vector_new.append(KF.getErrorEstimate()[0,0])
Now let's plot again to see whether we are good to go or not.
In [ ]:
plt.figure()
plt.plot(range(iter_max),true_voltage_mat,'b',range(iter_max), measurements,'r', range(iter_max), estimate_vector_new,'g')
plt.xlabel('Time Instances')
plt.ylabel('Voltage')
plt.legend(('true voltage', 'measured', 'filtered'), loc=1)
Well, it should exactly match the previous plot :P duh, big deal. But now that we have this nice class we can start dealing with cooler applications. So, after much searching, I have come up with this application to end this hopefully interesting notebook :D
Well, a simple but, I guess, not really wow-factor example: we have a ball thrown as a projectile, and we can measure its (x,y) position with a camera system and also its velocity (vx,vy) by the sensors on the ball (such camera systems are used in cricket for the LBW review, without the velocity part). Now we know that these cameras are noisy, so we finally want a filtered estimate of the state of the ball (here the state is a vector of x, y, vx, vy). So let's get on with the system modeling first.
Remember projectiles! Remember the JEE physics which you all did. All right, I am not going to explain the kinematics equations; just have a look and you will understand. We launch at an initial velocity $u$ and angle $\theta$ w.r.t. the horizontal, and we divide the measurements into time intervals of $\Delta t$.
$$Vx_{t} = Vx_{t-1}$$
$$Vy_{t} = Vy_{t-1} - g\Delta t$$
$$x_{t} = x_{t-1} + Vx_{t-1}\Delta t$$
$$y_{t} = y_{t-1} + Vy_{t-1}\Delta t - 0.5g\Delta t^2$$
The state of the system at instant $t$ is given by the vector $\pmb{X}_t = (x_t, Vx_t, y_t, Vy_t)$, and the control contribution is the additive term $\pmb{u}_t = (0, 0, -0.5g\Delta t^2, -g\Delta t)$. Think about this! Now we start defining matrices.
As our measurements are directly the state, there is no proportionality factor, and hence the H matrix is the identity ($H = I_4$). Also, we assume that our process has an error covariance of $Q = 0.0001 I_4$ and the measurements have an error covariance of $R = 0.3 I_4$ (the errors need not all be equal; they are taken equal here for convenience :D). We also need values for $\theta$ and $u$; let's say $\theta = 45°$ and $u = 100\ m/s$.
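Reading the kinematics equations off against the state vector $(x, Vx, y, Vy)$, the state transition matrix $A$ and the control contribution $B\pmb{u}_t$ (the form we will feed to the class below) come out as
$$A = \begin{pmatrix} 1 & \Delta t & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & \Delta t \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad B\pmb{u}_t = \begin{pmatrix} 0 \\ 0 \\ -0.5g\Delta t^2 \\ -g\Delta t \end{pmatrix}$$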
Now that we have all the matrices, let's start solving this using our kalmanFilter class.
In [ ]:
# Physics
# 1) sin(45)*100 = 70.710 and cos(45)*100 = 70.710
# vf = vo + at
# 0 = 70.710 + (-9.81)t
# t = 70.710/9.81 = 7.208 seconds for half
# 14.416 seconds for full journey
# distance = 70.710 m/s * 14.416 sec = 1019.36796 m
Due to the above calculations we have to be careful choosing $\Delta t$ and the number of iterations; $\Delta t = 0.1$ and max_iter = 145 make sense, since the full 14.416 s journey at 0.1 s per step needs about 145 instants (think). So now we create our simulated data
In [ ]:
del_t = 0.1
max_iter = 145
added_noise = 25
init_vel = 100
theta = np.pi/4
# now we define the measurement matrix
measurements = np.zeros((4,max_iter))
true_value = np.zeros((4,max_iter))
ux0 = init_vel * np.cos(theta)
uy0 = init_vel * np.sin(theta)
for i in range(max_iter):
    # we generate this by projectile equations and adding noise to it
    t = i * del_t
    true_value[0,i] = ux0 * t
    true_value[1,i] = ux0
    true_value[2,i] = uy0 * t - 0.5 * 9.8 * t * t
    true_value[3,i] = uy0 - 9.8 * t
    measurements[0,i] = random.gauss(true_value[0,i],added_noise)
    measurements[1,i] = random.gauss(true_value[1,i],added_noise)
    measurements[2,i] = random.gauss(true_value[2,i],added_noise)
    measurements[3,i] = random.gauss(true_value[3,i],added_noise)
Let's plot the position data and the measurements
In [ ]:
plt.figure()
plt.plot(true_value[0,:],true_value[2,:],'b',measurements[0,:],measurements[2,:],'r')
plt.xlabel('X Position')
plt.ylabel('Y Position')
plt.legend(('true trajectory', 'measured trajectory'), loc=2)
So this is how the cricket ball's trajectory is measured. Now we go to the filtering part
In [ ]:
A = np.matrix([[1,del_t,0,0],[0,1,0,0],[0,0,1,del_t],[0,0,0,1]])
B = np.matrix([[0,0,0,0],[0,0,0,0],[0,0,1,0],[0,0,0,1]])
u = np.matrix([[0],[0],[-0.5*9.8*del_t*del_t],[-9.8*del_t]]) # the control vector is constant; it does not depend on t
H = np.eye(4)
Q = 0.0001 * np.eye(4)
R = 0.3 * np.eye(4)
X0 = np.matrix([[0],[ux0],[500],[uy0]]) # set a little different from the true initial state just to show that the KF will still work
P0 = np.eye(4) # set arbitrarily to the identity
estimate_matrix = np.zeros((4,max_iter))
estimate_matrix[:,0] = np.asarray(X0)[:,0]
estimate_error = P0
KF = kalmanFilter(X0, P0, A, B, H, Q, R)
for i in range(max_iter-1):
    Z = np.matrix([[measurements[0,i+1]],[measurements[1,i+1]],[measurements[2,i+1]],[measurements[3,i+1]]])
    KF.iteration(u,Z)  # update first, so the stored estimate corresponds to measurements[:,i+1]
    estimate_matrix[:,i+1] = np.asarray(KF.getEstimate())[:,0]
In [ ]:
plt.figure()
plt.plot(true_value[0,:],true_value[2,:],'b',measurements[0,:],measurements[2,:],'r',estimate_matrix[0,:],estimate_matrix[2,:],'g')
plt.xlabel('X Position')
plt.ylabel('Y Position')
plt.legend(('true trajectory', 'measured trajectory', 'filtered trajectory'), loc=1)
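As a last sanity check, mirroring the error plot from the voltmeter example, here is a quick sketch (simply reusing the true_value and estimate_matrix arrays from above) of the absolute position error of the filtered estimate over time:
In [ ]:
plt.figure()
t_axis = np.arange(max_iter) * del_t
# absolute error of the filtered X and Y positions against the truth
plt.plot(t_axis, np.abs(estimate_matrix[0,:] - true_value[0,:]), 'b',
         t_axis, np.abs(estimate_matrix[2,:] - true_value[2,:]), 'g')
plt.xlabel('Time (s)')
plt.ylabel('Absolute Position Error')
plt.legend(('X error', 'Y error'), loc=2)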