Change and Tracking

Kevin J. Walchko, created 30 September 2017


We are going to talk about change detection. Basically, we want to know when a scene has changed: maybe we are counting the number of people leaving a building, counting how many cars are on a road, or watching for someone to plant a bomb by leaving a bag in an airport. There are many ways to do this, and we will look at a couple.

Objectives

  • understand how to detect change between 2 images
  • understand how to use the OpenCV blob interface

References

  • P. KadewTraKuPong and R. Bowden, "An improved adaptive background mixture model for real-time tracking with shadow detection," 2001
  • Z. Zivkovic, "Improved adaptive Gaussian mixture model for background subtraction," 2004
  • Z. Zivkovic, "Efficient adaptive density estimation per image pixel for the task of background subtraction," 2006
  • A. B. Godbehere, A. Matsukawa, and K. Goldberg, "Visual Tracking of Human Visitors under Variable-Lighting Conditions for a Responsive Audio Art Installation," 2012

Setup


In [4]:
from __future__ import print_function
# these imports let you use opencv
import cv2          # opencv itself
# import common       # some useful opencv functions
# import video        # some video stuff
import numpy as np  # matrix manipulations

%matplotlib inline 
from matplotlib import pyplot as plt           # this lets you draw inline pictures in the notebooks

In [5]:
import pylab                                   # this allows you to control figure size 
pylab.rcParams['figure.figsize'] = (10.0, 18.0) # this controls figure size in the notebook

Background Subtraction

Background subtraction is a major preprocessing step in many vision-based applications. For example, consider a visitor counter where a static camera counts the number of visitors entering or leaving a room, or a traffic camera extracting information about vehicles. In all these cases, you first need to extract the people or vehicles alone. Technically, you need to extract the moving foreground from the static background.

If you have an image of the background alone, like an image of the room without visitors or an image of the road without vehicles, it is an easy job: just subtract the new image from the background and you get the foreground objects alone. But in most cases you may not have such an image, so we need to extract the background from whatever images we have. It becomes more complicated when there is a shadow from the vehicles on the road; since the shadow is also moving, simple subtraction will mark it as foreground too. Basically, things are complicated. Several algorithms have been introduced to OpenCV for this purpose.

Simple Subtraction

First let's do the simple method of change detection. Given an image of the background, do a simple subtraction pixel by pixel and determine if the scene has changed.
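
Here is a minimal sketch of the idea. The filenames background.png and current.png are just placeholders, and the threshold of 25 and the 1% change criterion are arbitrary starting points you would tune for your camera:

import cv2

# read the reference background and the new frame as grayscale
bkg = cv2.imread('background.png', cv2.IMREAD_GRAYSCALE)
img = cv2.imread('current.png', cv2.IMREAD_GRAYSCALE)

# per-pixel absolute difference; absdiff avoids uint8 wrap-around
diff = cv2.absdiff(img, bkg)

# keep only pixels that changed by more than the threshold
_, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

# declare change if more than 1% of the pixels changed
print('scene changed:', cv2.countNonZero(mask) > 0.01 * mask.size)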

This method is very simple and straightforward. However, it has several issues:

  • You need a background picture before you can start detecting change ... you don't always have that
  • You need the same lighting in the background picture and the picture that might contain change; otherwise, a lot of pixels will register change.
    • If you are doing this outdoors, you will have issues with the sun moving and shadows changing
    • You will also, obviously, have issues with day/night or full moon/new moon lighting conditions
  • Although most cameras are made fairly well, the same pixel capturing the same scene will still vary slightly from frame to frame (see the sketch after this list). This is a function of:
    • how photons hit the pixel during an integration time
    • camera focal plane noise
    • encoding or compression artifacts the camera introduces before your program gets the image
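
You can see this frame-to-frame sensor noise for yourself by differencing two back-to-back frames of a completely static scene. A minimal sketch, assuming camera 0 is your laptop camera and using an arbitrary threshold of 10:

import cv2

cap = cv2.VideoCapture(0)  # assumes camera 0 is your laptop camera
ok1, f1 = cap.read()
ok2, f2 = cap.read()
cap.release()

if ok1 and ok2:
    g1 = cv2.cvtColor(f1, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(f2, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(g1, g2)
    # even with nothing moving, some pixels will exceed a small threshold
    _, mask = cv2.threshold(diff, 10, 255, cv2.THRESH_BINARY)
    print('noisy pixels in a static scene:', cv2.countNonZero(mask))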

BackgroundSubtractorMOG

This one is missing from my build of OpenCV 3.3 on macOS; in OpenCV 3 it lives in the opencv_contrib bgsegm module.

It is a Gaussian Mixture-based Background/Foreground Segmentation Algorithm. It was introduced in the paper "An improved adaptive background mixture model for real-time tracking with shadow detection" by P. KadewTraKuPong and R. Bowden in 2001. It models each background pixel as a mixture of K Gaussian distributions (K = 3 to 5). The weights of the mixture represent the proportions of time that those colors stay in the scene. The probable background colors are the ones which stay longer and are more static.

While coding, we need to create a background subtractor object using the function cv2.bgsegm.createBackgroundSubtractorMOG() (this requires the opencv_contrib modules). It has some optional parameters like length of history, number of Gaussian mixtures, threshold, etc., which are all set to default values. Then, inside the video loop, use the apply() method to get the foreground mask.
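
A minimal sketch, assuming an OpenCV build with the contrib modules and the vtest.avi test video used in the GMG example below:

import cv2

cap = cv2.VideoCapture('vtest.avi')
fgbg = cv2.bgsegm.createBackgroundSubtractorMOG()  # requires opencv_contrib

while True:
    ret, frame = cap.read()
    if not ret:
        break
    fgmask = fgbg.apply(frame)  # foreground mask for this frame
    cv2.imshow('frame', fgmask)
    if cv2.waitKey(30) & 0xFF == 27:  # Esc quits
        break

cap.release()
cv2.destroyAllWindows()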

BackgroundSubtractorMOG2

It is also a Gaussian Mixture-based Background/Foreground Segmentation Algorithm. It is based on two papers by Z. Zivkovic, "Improved adaptive Gaussian mixture model for background subtraction" in 2004 and "Efficient adaptive density estimation per image pixel for the task of background subtraction" in 2006. One important feature of this algorithm is that it selects the appropriate number of Gaussian distributions for each pixel. (Remember, in the last case, we used K Gaussian distributions throughout the algorithm.) It provides better adaptability to varying scenes due to illumination changes etc.

As in the previous case, we have to create a background subtractor object. Here, you have the option of selecting whether shadows are detected or not. If detectShadows = True (the default), it detects and marks shadows, but this decreases the speed. Shadows are marked in gray.

  • cv2.createBackgroundSubtractorMOG2() creates a MOG2 object
  • MOG2.apply(image, learningRate=rate) produces a grayscale image [0, 255] with some noise/stray pixels showing change
    • learningRate: a value between 0 and 1 that indicates how fast the background model is updated. Remember, faster is not always better.
  • MOG2.setDetectShadows(*True or False*) toggles shadow detection. You may not always want to remove shadows ... why? Say I am detecting people in a video. Coupling the detections with the time of day and where the sun is gives me the ability to estimate a person's height, and from that you could estimate gender, age, etc. We do stuff like this in IMINT, and shadows tell us things (see the sketch below).
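
Here is a minimal sketch of separating the shadows back out of the mask; the learningRate value is just for illustration, and frame101.png is one of the test frames used in the example at the end of this lesson (assumed to exist):

import cv2

fgbg = cv2.createBackgroundSubtractorMOG2()  # shadow detection is on by default
frame = cv2.imread('frame101.png')           # assumed to exist
fgmask = fgbg.apply(frame, learningRate=0.01)

# MOG2 marks shadows as gray (127) and true foreground as white (255),
# so thresholding above 127 keeps only the foreground pixels
_, fg_only = cv2.threshold(fgmask, 127, 255, cv2.THRESH_BINARY)
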
# this function is to be run from the command line and not a jupyter notebook

def MOG2(cap):
    # this function takes in cv2.VideoCapture() object which is either a camera
    # or a video (i.e., mp4) and runs through the images performing background
    # subtraction
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3))
    print('kernel', kernel)
    fgbg = cv2.createBackgroundSubtractorMOG2()
    # fgbg.setDetectShadows(False)

    while True:
        ret, frame = cap.read()
        if not ret:
            break
        # find the change
        fgmask = fgbg.apply(frame)
        # clean up the image
        fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel)

        cv2.imshow('frame',fgmask)
        k = cv2.waitKey(10) & 0xFF  # mask to 8 bits so the key compare works on all platforms
        if k == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()
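
Calling it might look like this, where walking.mp4 is a hypothetical video file (pass 0 instead to use your laptop camera):

cap = cv2.VideoCapture('walking.mp4')  # or cv2.VideoCapture(0) for a camera
MOG2(cap)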

BackgroundSubtractorGMG

This algorithm combines statistical background image estimation and per-pixel Bayesian segmentation. It was introduced by Andrew B. Godbehere, Akihiro Matsukawa, Ken Goldberg in their paper “Visual Tracking of Human Visitors under Variable-Lighting Conditions for a Responsive Audio Art Installation” in 2012. As per the paper, the system ran a successful interactive audio art installation called “Are We There Yet?” from March 31 - July 31 2011 at the Contemporary Jewish Museum in San Francisco, California.

It uses the first few (120 by default) frames for background modelling. It employs a probabilistic foreground segmentation algorithm that identifies possible foreground objects using Bayesian inference. The estimates are adaptive; newer observations are weighted more heavily than old observations to accommodate variable illumination. Several morphological filtering operations like closing and opening are done to remove unwanted noise. You will get a black window during the first few frames.

import numpy as np
import cv2

cap = cv2.VideoCapture('vtest.avi')

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
# GMG also lives in the opencv_contrib bgsegm module in OpenCV 3
fgbg = cv2.bgsegm.createBackgroundSubtractorGMG()

while True:
    ret, frame = cap.read()
    if not ret:   # stop cleanly at the end of the video
        break

    fgmask = fgbg.apply(frame)
    fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel)

    cv2.imshow('frame', fgmask)
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()

Example

Let's grab some frames from a video and see MOG2 in action (sort of).


In [3]:
def MOG2(frames):
    # this function takes in a list of images (e.g., frames grabbed from a
    # cv2.VideoCapture() camera or video file) and runs each one through
    # background subtraction
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3))
    fgbg = cv2.createBackgroundSubtractorMOG2()
    # fgbg.setDetectShadows(False)
    
    ret = []

    for frame in frames:
        # find the change
        fgmask = fgbg.apply(frame)
        # clean up the image
        fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel)
        ret.append(fgmask)
    return ret


f0 = cv2.imread('frame101.png')
f1 = cv2.imread('frame102.png')
f2 = cv2.imread('frame103.png')
f3 = cv2.imread('frame104.png')
inputs = [f0, f1, f2, f3]

masks = MOG2(inputs)

for ii, i, m in zip([1, 3, 5, 7], inputs, masks):
    plt.subplot(4, 2, ii)
    plt.imshow(cv2.cvtColor(i, cv2.COLOR_BGR2RGB))  # OpenCV is BGR; matplotlib expects RGB
    plt.subplot(4, 2, ii+1)
    plt.imshow(m, cmap='gray')


Since the model is new and hasn't learned much from the first image, the change detection in the first mask isn't that great. However, there is already a big difference in the second and remaining images. We can see the people moving around, separated from the background much better.

Take a look at the bg.py file included with this lesson. By commenting or uncommenting code in it, you can select between the movie and your laptop's camera. You can also play with the learning rate and see how that changes the performance.

Exercise

  • Create a python program and have it capture video from your laptop camera. Then feed the video stream through the MOG2 background subtraction. First sit still and you should see a black image; then move around and watch it change.
  • Same as before, write a program that captures video from your laptop, but this time have it display the color imagery from the camera. Now try turning the output of the MOG2 filter into a mask and changing the color of your video images to highlight which pixels have changed. Human-machine interfaces (HMI) like this are typically used to alert operators to change.
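
If you get stuck on the second exercise, here is one possible approach (a sketch, not the only answer; assumes camera 0 is your laptop camera):

import cv2

cap = cv2.VideoCapture(0)  # assumes camera 0 is your laptop camera
fgbg = cv2.createBackgroundSubtractorMOG2()

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = fgbg.apply(frame)
    # paint every changed pixel red (BGR order) on the color image
    frame[mask > 0] = (0, 0, 255)
    cv2.imshow('change', frame)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()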

Questions

  • What is change detection? Identify some applications.
  • The first example is simple image subtraction. What are the pros and cons of this method?
  • What are the steps to do change detection using image subtraction?
  • How does the MOG2 algorithm overcome the issues of simple image subtraction?