Histogram Detection

Kevin J. Walchko, created 5 Dec 2016


Although color detection is simple and works pretty good, the world tends to be made up of objects composed of more than one color. What we want to do is build a color model and then statistically determine if an object is part of that color model or not.

References

Setup


In [1]:
%matplotlib inline

In [3]:
from __future__ import print_function
from __future__ import division
import numpy as np
from matplotlib import pyplot as plt
import cv2
import time

Histogram Detection

Finding a specific object is not hard under good conditions. Tyically it can be divided up into several steps:

  1. seach an image and find the object you want. You will need some sort of a priori model to find the image that was developed off line.
  2. assuming the object doesn't move in a non-linear motion or out of frame between 2 consequtive images. Start at the last known location of the object in the previous image and search the local area for the object again.
  3. Continue tracking through the sequence of images until you are done.

Tennis Anyone?

Let's take a look at trying to find some tennis balls. First we need to move switch between RGB and HSV color space.

  • The hue (H) of a color refers to which pure color it resembles. All tints, tones and shades of red have the same hue. Hues are described by a number that specifies the position of the corresponding pure color on the color wheel, as a fraction between 0 and 1. Value 0 refers to red; 1/6 is yellow; 1/3 is green; and so forth around the color wheel.
  • The saturation (S) of a color describes how white the color is. A pure red is fully saturated, with a saturation of 1; tints of red have saturations less than 1; and white has a saturation of 0.
  • The value (V) of a color, also called its lightness, describes how dark the color is. A value of 0 is black, with increasing lightness moving away from black

A good resource for understanding RGB and HSV is colorizer.org where you can play with some sliders and see how it changes the color in different color spaces.

For HSV, Hue range is [0,179], Saturation range is [0,255] and Value range is [0,255]. Different softwares use different scales. So if you are comparing OpenCV values with them, you need to normalize these ranges.

Why?

The big reason is that it separates color information (chroma) from intensity or lighting (luma). Because value is separated, you can construct a histogram or thresholding rules using only saturation and hue. This in theory will work regardless of lighting changes in the value channel. In practice it is just a nice improvement. Even by singling out only the hue you still have a very meaningful representation of the base color that will likely work much better than RGB. The end result is a more robust color thresholding over simpler parameters.

Hue is a continuous representation of color so that 0 and 360 are the same hue which gives you more flexibility with the buckets you use in a histogram. Geometrically you can picture the HSV color space as a cone or cylinder with H being the degree, saturation being the radius, and value being the height.


In [4]:
images = ['hist_pics/tennis_1.jpg', 'hist_pics/tennis_2.jpg', 'hist_pics/tennis_3.jpg']
im = []
for image in images:
    i = cv2.imread(image)
    i = cv2.cvtColor(i, cv2.COLOR_BGR2RGB)  # pretty images
    im.append(i)

plt.subplot(1,3,1)
plt.imshow(im[0]);
plt.subplot(1,3,2)
plt.imshow(im[1]);
plt.subplot(1,3,3)
plt.imshow(im[2]);



In [5]:
for i, image in enumerate(im):
    img = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    im[i] = img


plt.subplot(1,3,1)
plt.imshow(im[0]);
plt.subplot(1,3,2)
plt.imshow(im[1]);
plt.subplot(1,3,3)
plt.imshow(im[2]);


Let's define a region of interest (ROI) that will be used to find a tennis ball in another image.


In [6]:
roi = []
roi.append(im[0][60:110, 50:180])
# roi.append(im[0][50:190, 50:190])  # includes seam
roi.append(im[1][70:190, 90:190]) 
# roi.append(im[1][50:190, 50:190])  # includes seam
roi.append(im[2][70:170, 50:110])

plt.subplot(1,3,1)
plt.imshow(roi[0]);
plt.subplot(1,3,2)
plt.imshow(roi[1]);
plt.subplot(1,3,3)
plt.imshow(roi[2]);



In [7]:
class hsvHistogram(object):
    """
    This class holds the histogram information of a HSV image. 
    """
    hist = None
    bins = None
    def __init__(self, bins):
        self.bins = bins
        self.kernel = np.ones((5,5),np.uint8)
        
    def calcHist(self, im_array):
        hist = cv2.calcHist(
            im_array,
            [0, 1],
            None,
            [self.bins, self.bins],
            [0,180, 0 ,256]
        )
        cv2.normalize(hist,hist,0,32,cv2.NORM_MINMAX)
        self.hist = hist
        
    def find(self, test, threshold=3):
        if self.hist is None:
            print('Need to init histogram first!')
            return 1
        
        dst = cv2.calcBackProject(
            [test],
            [0, 1],
            self.hist,
            [0,180, 0 ,256],
            1
        )
        
        disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
        cv2.filter2D(dst,-1,disc,dst)

        ret, thresh = cv2.threshold(dst,threshold,255,cv2.THRESH_BINARY)

        # morphological/blobify --------
        # thresh = cv2.erode(thresh, self.kernel)
        # thresh = cv2.dilate(thresh, self.kernel)
        thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, self.kernel)
        return thresh
    
    def plot(self):
        plt.plot(np.linspace(0,180,roiHist.bins), roiHist.hist)
        plt.grid(True)
        plt.xlabel('Hue')

In [8]:
# hist = cv2.calcHist(roi,[0, 1],None,[32, 32],[0,180, 0 ,256])
# cv2.normalize(hist,hist,0,32,cv2.NORM_MINMAX)
# plt.plot(hist)
# plt.grid(True)
# print(hist)
# print(len(hist))

roiHist = hsvHistogram(32)
roiHist.calcHist(roi)
# plt.plot(np.linspace(0,180,roiHist.bins), roiHist.hist)
# plt.grid(True)
roiHist.plot()



In [9]:
print('h x w x channels: {} {} {}'.format(*im[0].shape))
# print(hist)


h x w x channels: 225 225 3

In [10]:
# h,s,v = cv2.split(roi)
# plt.hist(h.ravel(), 180,[0,180]);
# plt.grid(True);

In [11]:
ball2_rgb = cv2.imread('hist_pics/test5.jpg')
ball2_rgb = cv2.cvtColor(ball2_rgb, cv2.COLOR_BGR2RGB)  # make pretty for ipython
ball2 = cv2.cvtColor(ball2_rgb, cv2.COLOR_RGB2HSV)  # now convert to HSV

plt.subplot(1,2,1)
plt.imshow(ball2_rgb);
plt.subplot(1,2,2)
plt.imshow(ball2);



In [12]:
h,s,v = cv2.split(ball2)
plt.hist(h.ravel(), 32,[0,180], label='hue');
plt.hist(s.ravel(), 32,[0,255], label='saturation');
plt.hist(v.ravel(), 32,[0,255], label='value');
plt.grid(True)
plt.legend(loc='upper right');


Backprojection

What is backprojection actually in simple words? It is used for image segmentation or finding objects of interest in an image. In simple words, it creates an image of the same size (but single channel) as that of our input image, where each pixel corresponds to the probability of that pixel belonging to our object. In more simpler worlds, the output image will have our object of interest in more white compared to remaining part. Well, that is an intuitive explanation. (I can’t make it more simpler). Histogram Backprojection is used with camshift algorithm etc.

How do we do it? We create a histogram of an image containing our object of interest (in our case, the ground, leaving player and other things). The object should fill the image as far as possible for better results. And a color histogram is preferred over grayscale histogram, because color of the object is more better way to define the object than its grayscale intensity. We then “back-project” this histogram over our test image where we need to find the object, ie in other words, we calculate the probability of every pixel belonging to the ground and show it. The resulting output on proper thresholding gives us the ground alone.

Morphological Operations

  1. Erosion The basic idea of erosion is just like soil erosion only, it erodes away the boundaries of foreground object (Always try to keep foreground in white). So what does it do? The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) will be considered 1 only if all the pixels under the kernel is 1, otherwise it is eroded (made to zero).

    So what happends is that, all the pixels near boundary will be discarded depending upon the size of kernel. So the thickness or size of the foreground object decreases or simply white region decreases in the image. It is useful for removing small white noises (as we have seen in colorspace chapter), detach two connected objects etc.

  2. Dilation It is just opposite of erosion. Here, a pixel element is ‘1’ if atleast one pixel under the kernel is ‘1’. So it increases the white region in the image or size of foreground object increases. Normally, in cases like noise removal, erosion is followed by dilation. Because, erosion removes white noises, but it also shrinks our object. So we dilate it. Since noise is gone, they won’t come back, but our object area increases. It is also useful in joining broken parts of an object.

  3. Opening Opening is just another name of erosion followed by dilation. It is useful in removing noise, as we explained above. Here we use the function, cv2.morphologyEx()

  4. Closing Closing is reverse of Opening, Dilation followed by Erosion. It is useful in closing small holes inside the foreground objects, or small black points on the object.


In [13]:
test = ball2.copy()

thresh = roiHist.find(test, 20)
thresh = cv2.merge((thresh,thresh,thresh))  # make 3 channels

# plot ----------
res = cv2.bitwise_and(ball2_rgb,thresh)
pics = np.hstack((test,thresh,res))
plt.imshow(pics);
# plt.colorbar();



In [14]:
a,b,c = cv2.split(thresh)
img, contours, hierarchy = cv2.findContours(a,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
cnt = contours[0]
(x,y),radius = cv2.minEnclosingCircle(cnt)
center = (int(x),int(y))
radius = int(radius)
cv2.circle(res,center,radius,(0,255,0),2)
cv2.circle(res,center,1,(0,255,0),2)
plt.imshow(res);


So, what is color quantization?

Color quantization is the process of reducing the number of distinct colors in an image.

Normally, the intent is to preserve the color appearance of the image as much as possible, while reducing the number of colors, whether for memory limitations or compression. ref