This is a Jupyter notebook. Lectures about Python, useful both for beginners and experts, can be found at http://scipy-lectures.github.io.

Open the notebook by (1) copying this file into a directory, (2) in that directory typing jupyter-notebook and (3) selecting the notebook.


Written By: Riddhish Bhalodia


In this exercise we will learn about Kalman filtering, code it up, and look at a few of its applications.

Kalman Filtering

Motivation

Let me start with a bit of history. Before Rudolf Kalman (co-inventor of Kalman filtering), problems were divided into two distinct classes: control problems (what value of acceleration should be provided to the car so that it climbs a certain incline at constant speed?) and filtering problems (damn this noisy accelerometer, I can't get a clean value even at a fixed point!).

You might have guessed (if you cared to read the brackets :P) that the two problems are not uncorrelated. For the uninitiated, take a scenario where the car has a noisy accelerometer and you want to control its speed on the incline: two problems in one. One way is to solve them independently, but that is too situation-dependent, so there was a need for a dynamic solution that filters while controlling and vice versa, essentially bringing the two separate problems under one roof.

This is precisely what the Kalman filter does! The Kalman filter and its non-linear extensions are essential elements of modern control theory. Applications ranging from filtering noisy sensor output to autonomous robot navigation use the Kalman filter.

In this tutorial we will first start off with an application, which we will code (yay!), and then move on to build up the theory of Kalman filtering.

Faulty Voltmeter

A classic and very intuitive example to start with: we have to measure a DC voltage with a faulty (noisy) voltmeter; it can't get any simpler than this. First off, let's import some packages


In [ ]:
%matplotlib inline 

import numpy as np
import matplotlib.pyplot as plt
import random

Now, to solve any computational problem we need to model the system (in layman's terms, write down the set of equations that describes the overall situation and how it varies with physical inputs and parameters). The simplest way to model the noisy voltmeter at every measurement instant is through the following equation \begin{equation} V_m = V_{m-1} + \omega_m \qquad (1) \end{equation}

Here, $V_m$ : voltage at the current instant, $V_{m-1}$ : voltage at the previous instant, $\omega_m$ : random noise (process noise)

We also have a measurement taken at each instant m, given by $Z_m$ and corrupted by some sensor (measurement) noise $\nu_m$

$$Z_m = V_m + \nu_m \qquad (2)$$

Usually, $\nu_m$ is introduced by faults in the voltmeter (the sensor), and hence we usually know its characteristics (least count, precision ... ring any bells?). So here we will model $\nu_m$, as is usually done, as a Gaussian random variable with zero mean and standard deviation $\sigma_{\nu} = 0.3$ (for simulation's sake). $\omega_m$ is more difficult to predict: it is introduced by the non-ideality of our process equation (we know it's a constant DC voltage, but is it really!), but we still assume that we know it. Again, $\omega_m$ is modeled as a Gaussian random variable with zero mean and standard deviation $\sigma_{\omega} = 0.01$.
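To make equations (1) and (2) concrete, here is a tiny sketch (an aside, not part of the exercise itself) that simulates a few steps of the process and measurement models with the noise levels above:


In [ ]:
import random

V = 1.0                            # true voltage at instant 0
for m in range(5):
    V = V + random.gauss(0, 0.01)  # process equation (1): the "constant" drifts very slightly
    Z = V + random.gauss(0, 0.3)   # measurement equation (2): sensor noise dominates
    print(V, Z)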

So before solving this, let's simulate the voltmeter. What parameters do we need? The true constant voltage, say true_voltage = 1, and a noise level (this is the voltmeter's error, so it will feature in the measurements), say noise_level_sig = 0.2 (the KF works even when the noise estimates are off the mark...). Let's take measurements for 50 instants, stored in iter_max, and generate the measurement at each instant, which will simply be $\sim \mathcal{N}(true\_voltage, noise\_level\_sig)$ (think about why :))


In [ ]:
true_voltage = 1
noise_level_sig = 0.2
iter_max = 50
measurements = []
true_voltage_mat = []

for i in range(iter_max):
    measured = random.gauss(true_voltage , noise_level_sig)
    measurements.append(measured)
    true_voltage_mat.append(true_voltage)

Let's plot how the measurements look as compared to the true voltage


In [ ]:
plt.plot(range(iter_max),true_voltage_mat,'b',range(iter_max), measurements,'r')
plt.xlabel('Time Instances')
plt.ylabel('Voltage')
plt.legend(('true voltage', 'measured'), loc=4)

Now starts the actual thing: we want to filter this :D, using Kalman filtering. But first we need to derive it, so do get ready for a chunk of theory. Please be patient, because once we are done with this, the standard Kalman filter will be a piece of cake :)

Filtering!!

Math Math Math.

That being said, to solve this problem iteratively we need two major steps

  • Predict the voltage at the next instant using the previous estimate and equation (1)
  • Correct that estimate based on the measurement at that instant

So we define two variables: $\hat{V}^-_m$, the prior estimate of the voltage given only the knowledge of the process (equation (1)), and $\hat{V}_m$, the posterior estimate of the voltage at step m given the knowledge of the measurement $Z_m$.

So let's start firing off the equations. Just one comment before this: along with the estimate itself, we will also estimate the error in the estimate at every iteration.

$$e_m = V_m - \hat{V}_m \quad \textrm{and} \quad \sigma^2_m = \mathbb{E}[e_m^2]$$

and

$$e^-_m = V_m - \hat{V}^-_m \quad \textrm{and} \quad \sigma^{2-}_m = \mathbb{E}[(e^-_m)^2]$$

We have to minimize this $\sigma^2_m$. So now, as any sane person (OK, statistician) would do, we model the posterior estimate $\hat{V}_m$ as a linear combination of the prior estimate $\hat{V}^-_m$ and the deviation of the estimate from the measurement (also called the innovation), given as

$$y_m = Z_m - \hat{V}^-_m \qquad (3)$$

Putting the above ramble into an equation, we have

$$\hat{V}_m = \hat{V}^-_m + k_my_m \qquad (4)$$

subtracting $V_m$ from both sides we get

$$\hat{V}_m - V_m = \hat{V}^-_m - V_m + k_m(Z_m - \hat{V}^-_m)$$

To compute $k_m$ we square, take the expectation, and then differentiate the resulting quadratic with respect to $k_m$, setting the derivative to zero, to get (try this yourself)

$$k_m = \frac{\mathbb{E}[(V_m - \hat{V}^-_m)(y_m)]}{\mathbb{E}[y_m^2]}$$

Expanding the numerator and the denominator, and taking into account the independence of the measurement noise $\nu_m$ from the prior estimation error (think about this too), we get

$$k_m = \frac{\sigma^{2-}_m}{\sigma^{2-}_m + \sigma^2_{\nu}} \qquad (5)$$
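To see where (5) comes from, here is a quick sketch of the expansion: write $y_m = Z_m - \hat{V}^-_m = (V_m - \hat{V}^-_m) + \nu_m$; since $\nu_m$ has zero mean and is independent of the prior error,

$$\mathbb{E}[(V_m - \hat{V}^-_m)y_m] = \mathbb{E}[(e^-_m)^2] = \sigma^{2-}_m \quad \textrm{and} \quad \mathbb{E}[y_m^2] = \sigma^{2-}_m + \sigma^2_{\nu}$$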

Along with this, from equation (1) we also have

$$\sigma_m^{2-} = \sigma_{m-1}^2 + \sigma_\omega^2 \qquad (6)$$

Now, substituting (5) into the quadratic for $\mathbb{E}[e^2_m]$, we get the variance as

$$\sigma^2_m = (1 - k_m)\sigma_m^{2-} \qquad (7)$$
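As a quick numerical sanity check (using the numbers of this tutorial: initial error guess $\sigma^2_0 = 1$, $\sigma_\omega = 0.01$, $\sigma_\nu = 0.3$), the first iteration gives

$$\sigma^{2-}_1 = 1 + 0.0001 = 1.0001, \quad k_1 = \frac{1.0001}{1.0001 + 0.09} \approx 0.917, \quad \sigma^2_1 \approx (1 - 0.917)(1.0001) \approx 0.083$$

so the very first measurement already gets a large weight and the error estimate drops sharply.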

Now we have everything :D, already! It will become clear when you look at the summary below

  • Start with an initial guess for $V = V_0$
  • Get the prior estimate of the voltage and its error ($\hat{V}^-_m$ and $\sigma_m^{2-}$) from the process equation (1) and from (6)
  • Using the prior estimates and the measurement at instant m, get the posterior (read: corrected) estimates of the voltage and its error at instant m ($\hat{V}_m$ and $\sigma_m^2$)
  • Repeat this for several instants and we will converge to a solution (yes! there exists a proof of convergence, you can google it)

So hopefully you have gotten the hang of how this works. Let's code it up


In [ ]:
# Initialize the parameters

initial_guess = 3
initial_guess_error = 1
sig_nu = 0.3
sig_omega = 0.01
estimate_vector = []
estimate_vector.append(initial_guess)
error_estimate_vector = []
error_estimate_vector.append(initial_guess_error)

# Run the Filter

for i in range(iter_max-1):
    # first the prior estimation step
    
    volt_prior_est = estimate_vector[i]
    error_prior_est = error_estimate_vector[i] + sig_omega * sig_omega
    
    # estimate correction
    
    k = error_prior_est/(error_prior_est + sig_nu * sig_nu)
    volt_corrected_est = volt_prior_est + k * (measurements[i+1] - volt_prior_est)
    error_corrected_est = (1 - k) * error_prior_est
    estimate_vector.append(volt_corrected_est)
    error_estimate_vector.append(error_corrected_est)

Let's plot the results


In [ ]:
plt.figure()
plt.plot(range(iter_max),true_voltage_mat,'b',range(iter_max), measurements,'r', range(iter_max), estimate_vector,'g')
plt.xlabel('Time Instances')
plt.ylabel('Voltage')
plt.legend(('true voltage', 'measured', 'filtered'), loc=1)

Did you have your voilà moment? :D Let's also look at the error of the estimate and plot it.


In [ ]:
plt.figure()
plt.plot(range(iter_max),error_estimate_vector)
plt.xlabel('Time Instances')
plt.ylabel('Voltage Error')
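One more quick check before moving on (a small addition to the original): comparing the root-mean-square error of the raw measurements and of the filtered estimates against the true voltage shows how much the filter buys us.


In [ ]:
# RMSE of the noisy measurements vs. RMSE of the filtered estimates
meas_rmse = np.sqrt(np.mean((np.array(measurements) - true_voltage)**2))
est_rmse = np.sqrt(np.mean((np.array(estimate_vector) - true_voltage)**2))
print('RMSE of raw measurements  :', meas_rmse)
print('RMSE of filtered estimates:', est_rmse)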

Kalman Filter

You might be wondering where the control is in all this. It is easily introduced in the actual formulation of the Kalman filter, which we will look at now. In the actual Kalman filter we deal with a multi-variable setting, unlike the voltmeter example. So let me describe the inputs, the outputs and the parameters

Inputs

  • $\textbf{Z}_m : $ the measurement vector at each instant m
  • $\textbf{U}_m : $ this is new! It denotes the controls provided to the system at instant m (just like force being the control for the velocity of the car)

Outputs

  • $\textbf{X}_m : $ the newest estimate of the current state (the state can be thought of as a parameter vector)
  • $P_m :$ the newest estimate of the error covariance of the state

Parameters

  • A : state transition matrix, the constant matrix multiplying the previous estimate in the process equation
  • B : control matrix, the one multiplying the control vector in the process equation
  • H : observation matrix, the proportionality factor mapping the state to the measurements
  • Q : covariance matrix of the process error (again assumed to be known)
  • R : covariance matrix of the measurement error (again known)
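As a quick orientation (a sketch with hypothetical dimensions, not part of the derivation): if the state has dimension n, the measurement dimension k, and the control dimension c, the shapes work out as follows.


In [ ]:
import numpy as np

n, k, c = 4, 4, 4     # hypothetical state, measurement and control dimensions
A = np.zeros((n, n))  # state transition:          (n, n)
B = np.zeros((n, c))  # control matrix:            (n, c)
H = np.zeros((k, n))  # observation matrix:        (k, n)
Q = np.zeros((n, n))  # process error covariance:  (n, n)
R = np.zeros((k, k))  # measurement error cov:     (k, k)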

Now we are ready to write the basic equations of the KF. Don't worry, much of the above will become clear as you look at the equations.

Let's start with the two basic equations. First is the process equation

$$\textbf{X}_m = A\textbf{X}_{m-1} + B\textbf{U}_{m-1} + \pmb{\omega}_m \qquad (8)$$

and then we have the measurement equation

$$\textbf{Z}_m = H\textbf{X}_m + \pmb{\nu}_m \qquad (9)$$

Here, we model the two error terms $\pmb{\omega}_m$ and $\pmb{\nu}_m$ as multivariate Gaussian distributions with zero mean and covariance matrices Q and R respectively, i.e. $\pmb{\omega}_m \sim \mathcal{N}(0,Q)$ and $\pmb{\nu}_m \sim \mathcal{N}(0,R)$

Following exactly the same philosophy as in the voltmeter derivation, we get the update equations of the general Kalman filter. I will just list them; if you are interested, you can look up the references given below

$$ P_m^- = AP_{m-1}A^T + Q \qquad (10) $$

$$ K_m = P_m^-H^T(HP_m^-H^T + R)^{-1} \qquad (11) $$

$$ \pmb{y}_m = \pmb{Z}_m - H\hat{\pmb{X}}^-_m \qquad (12) $$

$$ \hat{\pmb{X}}_m = \hat{\pmb{X}}^-_m + K_m\pmb{y}_m \qquad (13) $$

$$ P_m = (I - K_mH)P_m^- \qquad (14) $$

Again summarizing

  • Start with an initial guess for $\pmb{X} = \pmb{X}_0$
  • Get the prior estimate of the state and its error ($\hat{\pmb{X}}^-_m$ and $P_m^{-}$) from the process equation (8) and from (10)
  • Using the prior estimates and the measurement at instant m, get the posterior (read: corrected) estimates of the state and its error at instant m ($\hat{\pmb{X}}_m$ and $P_m$, via (11)-(14))
  • Repeat this for several instants and we will converge to a solution (yes! there exists a proof of convergence, you can google it)

Now, enough of this rambling. I do hope you get this, because we are going to write a Kalman filter class and then see how it fits our voltmeter example. So let's first create the class


In [ ]:
class kalmanFilter:
    def __init__(self, X0, P0, A, B, H, Q, R):
        self.A = A                       # State Transition Matrix
        self.B = B                       # Control Matrix
        self.H = H                       # Observation Matrix
        self.Q = Q                       # Covariance for the process error
        self.R = R                       # Covariance for the measurements error
        self.current_estimate = X0       # this is the initial guess of the state
        self.current_error_estimate = P0 # initial guess for the state estimate error
        
    def getEstimate(self):
        # returns the current state estimate
        return self.current_estimate
    
    def getErrorEstimate(self):
        # returns the current state error estimate
        return self.current_error_estimate
    
    def iteration(self, U, Z):
        # here is where the updates happen
        # U = control vector
        # Z = measurements vector
        
        # prior prediction step
        prior_estimate = self.A * self.current_estimate + self.B * U
        prior_error_estimate = (self.A * self.current_error_estimate) * np.transpose(self.A) + self.Q
        
        # intermediate observation
        y = Z - self.H * prior_estimate
        y_covariance = self.H * prior_error_estimate * np.transpose(self.H) + self.R
        
        # Correction Step
        K = prior_error_estimate * np.transpose(self.H) * np.linalg.inv(y_covariance)
        self.current_estimate = prior_estimate + K * y
        # We need the size of the matrix so we can make an identity matrix.
        size = self.current_error_estimate.shape[0]
        # eye(n) = nxn identity matrix.
        self.current_error_estimate = (np.eye(size) - K * self.H) * prior_error_estimate

Now that we have this nice class set up, let's test its correctness by applying it to the voltmeter problem. As before, we set the parameters first.


In [ ]:
A = np.matrix([1])
B = np.matrix([0])
H = np.matrix([1])
Q = np.matrix([0.0001]) # the sigmas get squared: 0.01**2
R = np.matrix([0.09])   # 0.3**2
X0 = np.matrix([3])
P0 = np.matrix([1])

KF = kalmanFilter(X0, P0, A, B, H, Q, R)
estimate_vector_new = []
estimate_vector_new.append(initial_guess)
error_estimate_vector_new = []
error_estimate_vector_new.append(initial_guess_error)

# Run the Filter

for i in range(iter_max-1):
    U = np.matrix([0]) # there is no control here
    Z = np.matrix([measurements[i+1]])
    KF.iteration(U,Z)  # update first, then record, so the entries line up with the earlier loop
    estimate_vector_new.append(KF.getEstimate()[0,0])
    error_estimate_vector_new.append(KF.getErrorEstimate()[0,0])
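Before plotting, a quick sanity check (a small addition): the class-based run should reproduce the hand-rolled filter from earlier exactly, since both start from the same guess and see the same measurements.


In [ ]:
# both recursions are identical, so they should agree to floating point precision
print(np.allclose(estimate_vector, estimate_vector_new))
print(np.allclose(error_estimate_vector, error_estimate_vector_new))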

Now let's plot again to see whether we are good to go.


In [ ]:
plt.figure()
plt.plot(range(iter_max),true_voltage_mat,'b',range(iter_max), measurements,'r', range(iter_max), estimate_vector_new,'g')
plt.xlabel('Time Instances')
plt.ylabel('Voltage')
plt.legend(('true voltage', 'measured', 'filtered'), loc=1)

Well, it should exactly match the previous plot :P duh, big deal. But now that we have this nice class we can start dealing with a cooler application. So, after much searching, I have come up with the following application to end this hopefully interesting notebook :D

A simple example, though I guess without a real wow factor: a ball is thrown as a projectile, and we can measure its (x, y) position with a camera system and its velocity (vx, vy) with sensors on the ball (similar to what cricket uses for the LBW review system, minus the velocity part). These cameras are noisy, so we want a filtered estimate of the state of the ball (here the state is the vector $(x, Vx, y, Vy)$). So let's get on with the system modeling first.

Remember projectiles? Remember the JEE physics you all did? All right, I am not going to explain the kinematics equations; just have a look and you will understand. We launch at an initial velocity u and angle $\theta$ with respect to the horizontal, and we divide the measurements into time intervals of $\Delta t$.

$$Vx_{t} = Vx_{t-1}$$

$$Vy_{t} = Vy_{t-1} - g\Delta t$$

$$x_{t} = x_{t-1} + Vx_{t-1}\Delta t$$

$$y_{t} = y_{t-1} + Vy_{t-1}\Delta t - 0.5g\Delta t^2$$

The state at instant t is given by the vector $\pmb{X}_t = (x_t, Vx_t, y_t, Vy_t)$, and the control vector is the additional term $\pmb{u}_t = (0, 0, -0.5g\Delta t^2, -g\Delta t)$. Think about this! Now we start defining the matrices.

\begin{equation} \pmb{X}_t = A\pmb{X}_{t-1} + B\pmb{u}_t \end{equation}

$$A=\left(\begin{array}{cccc} 1 & \Delta t & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & \Delta t \\ 0 & 0 & 0 & 1 \end{array}\right) \qquad B = \left(\begin{array}{cccc} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right)$$
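As a quick sanity check (an extra sketch, not in the original flow), one step of $A\pmb{X} + B\pmb{u}$ should reproduce the kinematics equations above:


In [ ]:
import numpy as np

del_t, g = 0.1, 9.8  # same values used in the simulation below
A = np.array([[1, del_t, 0, 0],
              [0, 1,     0, 0],
              [0, 0,     1, del_t],
              [0, 0,     0, 1]])
B = np.array([[0, 0, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
u = np.array([0, 0, -0.5 * g * del_t**2, -g * del_t])
X = np.array([0.0, 70.71, 0.0, 70.71])  # (x, Vx, y, Vy) just after launch
# matches x + Vx*dt, Vx, y + Vy*dt - 0.5*g*dt^2, Vy - g*dt
print(A @ X + B @ u)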

As our measurements are the state directly, there is no proportionality factor, and hence the H matrix is the identity (H = $I_4$). Also, we assume that our process has an error covariance of $Q = 0.0001 I_4$ and that the measurements have an error covariance of $R = 0.3 I_4$ (the errors need not all be equal, but we take them so for convenience :D). We also need values for $\theta$ and $u$; let's say $\theta = 45°$ and $u = 100\ m/s$

Now we have all the matrices, so let's solve this using our kalmanFilter class.


In [ ]:
# Physics (using g = 9.8, as in the simulation below)
# 1) sin(45)*100 = 70.710 and cos(45)*100 = 70.710
# vf = vo + at
# 0 = 70.710 + (-9.8)t
# t = 70.710/9.8 = 7.215 seconds for half the journey
# 14.43 seconds for the full journey
# distance = 70.710 m/s * 14.43 s = 1020.35 m

Due to the above calculations we have to be careful choosing $\Delta t$ and the number of iterations: $\Delta t = 0.1$ and max_iter = 145 make sense (think about it). So now we create our simulated data
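If you don't want to do the arithmetic by hand, here is the same check in code (a small addition, using the same g = 9.8 as the simulation):


In [ ]:
import numpy as np

g, u, theta, del_t = 9.8, 100, np.pi/4, 0.1
t_flight = 2 * u * np.sin(theta) / g  # total time of flight, about 14.43 s
print(t_flight, t_flight / del_t)     # about 144.3 steps, so 145 iterations cover the whole flight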


In [ ]:
del_t = 0.1
max_iter = 145
added_noise = 25
init_vel = 100
theta = np.pi/4

# now we define the measurement matrix
measurements = np.zeros((4,max_iter))
true_value = np.zeros((4,max_iter))
ux0 = init_vel * np.cos(theta)
uy0 = init_vel * np.sin(theta)

for i in range(max_iter):
    # we generate this by projectile equations and adding noise to it
    t = i * del_t
    true_value[0,i] = ux0 * t
    true_value[1,i] = ux0
    true_value[2,i] = uy0 * t - 0.5 * 9.8 * t * t
    true_value[3,i] = uy0 - 9.8 * t
    measurements[0,i] = random.gauss(true_value[0,i],added_noise)
    measurements[1,i] = random.gauss(true_value[1,i],added_noise)
    measurements[2,i] = random.gauss(true_value[2,i],added_noise)
    measurements[3,i] = random.gauss(true_value[3,i],added_noise)

Let's plot the position data and the measurements


In [ ]:
plt.figure()
plt.plot(true_value[0,:],true_value[2,:],'b',measurements[0,:],measurements[2,:],'r')
plt.xlabel('X Position')
plt.ylabel('Y Position')
plt.legend(('true trajectory', 'measured trajectory'), loc=2)

So this is how the cricket ball's trajectory is measured; now we go to the filtering part


In [ ]:
A = np.matrix([[1,del_t,0,0],[0,1,0,0],[0,0,1,del_t],[0,0,0,1]])
B = np.matrix([[0,0,0,0],[0,0,0,0],[0,0,1,0],[0,0,0,1]])
u = np.matrix([[0],[0],[-0.5*9.8*del_t*del_t],[-9.8*del_t]]) # the control vector is constant: it does not depend on t
H = np.eye(4)
Q = 0.0001 * np.eye(4)
R = 0.3 * np.eye(4)
X0 = np.matrix([[0],[ux0],[500],[uy0]]) # set a little different from the true initial state, just to show that the KF still works
P0 = np.eye(4) # set arbitrary as identity

estimate_matrix = np.zeros((4,max_iter))
estimate_matrix[:,0] = np.asarray(X0)[:,0]
estimate_error = P0
KF = kalmanFilter(X0, P0, A, B, H, Q, R)
 
for i in range(max_iter-1):
    Z = np.matrix([[measurements[0,i+1]],[measurements[1,i+1]],[measurements[2,i+1]],[measurements[3,i+1]]])
    KF.iteration(u,Z)  # update first, then record, so column i+1 holds the estimate for instant i+1
    estimate_matrix[:,i+1] = np.asarray(KF.getEstimate())[:,0]

In [ ]:
plt.figure()
plt.plot(true_value[0,:],true_value[2,:],'b',measurements[0,:],measurements[2,:],'r',estimate_matrix[0,:],estimate_matrix[2,:],'g')
plt.xlabel('X Position')
plt.ylabel('Y Position')
plt.legend(('true trajectory', 'measured trajectory', 'filtered trajectory'), loc=1)

Zoom in if you can't see the difference. You can also look at the error terms and the velocity estimates by changing the code a little; for instance, here is a sketch (an addition to the original) plotting the vertical velocity:
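
In [ ]:
plt.figure()
plt.plot(range(max_iter), true_value[3,:], 'b',
         range(max_iter), measurements[3,:], 'r',
         range(max_iter), estimate_matrix[3,:], 'g')
plt.xlabel('Time Instances')
plt.ylabel('Y Velocity')
plt.legend(('true Vy', 'measured Vy', 'filtered Vy'), loc=1)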

This concludes it! Hopefully well :D


References