Kevin J. Walchko, created 5 Dec 2016
Although color detection is simple and works pretty well, the world tends to be made up of objects composed of more than one color. What we want to do is build a color model and then statistically determine whether an object is part of that color model.
In [1]:
%matplotlib inline
In [3]:
from __future__ import print_function
from __future__ import division
import numpy as np
from matplotlib import pyplot as plt
import cv2
import time
Finding a specific object is not hard under good conditions. Typically it can be divided up into several steps:
Let's take a look at trying to find some tennis balls. First we need to switch between the RGB and HSV color spaces.
A good resource for understanding RGB and HSV is colorizer.org where you can play with some sliders and see how it changes the color in different color spaces.
For HSV in OpenCV (8-bit images), the Hue range is [0,179], the Saturation range is [0,255], and the Value range is [0,255]. Different software packages use different scales, so if you are comparing OpenCV values with theirs, you need to normalize these ranges.
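As a rough illustration of that normalization, the sketch below maps the scale used by many color pickers onto OpenCV's 8-bit ranges. It assumes the other tool reports hue in degrees and saturation/value in percent; the helper name `hsv_to_opencv` is hypothetical and not part of OpenCV.

```python
def hsv_to_opencv(h_deg, s_pct, v_pct):
    """Map H in degrees and S, V in percent to OpenCV's 8-bit HSV scale.

    Assumption: the source tool uses H in [0, 360] and S, V in [0, 100].
    """
    h = int(round(h_deg / 2.0)) % 180        # 360 degrees are packed into 0..179
    s = int(round(s_pct * 255 / 100.0))      # 0..100 % -> 0..255
    v = int(round(v_pct * 255 / 100.0))      # 0..100 % -> 0..255
    return h, s, v


# a fully saturated, fully bright yellow (60 degrees)
print(hsv_to_opencv(60, 100, 100))  # (30, 255, 255)
```

Going the other way is the same arithmetic in reverse: multiply OpenCV's hue by 2 and rescale saturation and value by 100/255.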
The big reason to use HSV is that it separates color information (chroma) from intensity or lighting (luma). Because value is separated, you can construct a histogram or thresholding rules using only saturation and hue. In theory this will work regardless of lighting changes in the value channel. In practice it is just a nice improvement. Even by singling out only the hue you still have a very meaningful representation of the base color that will likely work much better than RGB. The end result is more robust color thresholding with simpler parameters.
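To make the idea concrete, here is a minimal NumPy sketch of a threshold that looks only at hue and saturation. It assumes an OpenCV-style 8-bit HSV array; the helper `hs_mask` and the threshold values are made up for illustration.

```python
import numpy as np


def hs_mask(hsv, h_range, s_range):
    """Boolean mask of pixels whose hue and saturation fall inside the ranges.

    The value channel (hsv[..., 2]) is deliberately ignored.
    """
    h = hsv[..., 0]
    s = hsv[..., 1]
    return ((h >= h_range[0]) & (h <= h_range[1]) &
            (s >= s_range[0]) & (s <= s_range[1]))


# one row of three pixels: bright yellow, the same yellow in shadow, and blue
hsv = np.array([[[30, 200, 250], [30, 200, 60], [100, 200, 250]]], dtype=np.uint8)
mask = hs_mask(hsv, (25, 40), (100, 255))
print(mask)  # the bright and the shadowed yellow pixel both pass, blue does not
```

Real shadows also shift saturation somewhat, which is one reason the improvement is partial rather than complete.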
Hue is a continuous, circular representation of color, so 0 and 360 degrees are the same hue (0 and 180 on OpenCV's halved 8-bit scale), which gives you more flexibility with the buckets you use in a histogram. Geometrically you can picture the HSV color space as a cone or cylinder with hue being the angle, saturation being the radius, and value being the height.
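Because hue wraps around, the difference between two hues should be measured around the circle rather than by plain subtraction. A small sketch, assuming OpenCV's 0..179 hue scale (the function name `hue_distance` is hypothetical):

```python
def hue_distance(h1, h2, period=180):
    """Shortest distance between two hues on a circle of the given period."""
    d = abs(h1 - h2) % period
    return min(d, period - d)


# two reds on either side of the wrap-around point are close together
print(hue_distance(175, 5))   # 10
print(hue_distance(30, 90))   # 60
```

Pass `period=360` when working with hue in degrees.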
In [4]:
images = ['hist_pics/tennis_1.jpg', 'hist_pics/tennis_2.jpg', 'hist_pics/tennis_3.jpg']
im = []
for image in images:
    i = cv2.imread(image)
    i = cv2.cvtColor(i, cv2.COLOR_BGR2RGB) # pretty images
    im.append(i)
plt.subplot(1,3,1)
plt.imshow(im[0]);
plt.subplot(1,3,2)
plt.imshow(im[1]);
plt.subplot(1,3,3)
plt.imshow(im[2]);
In [5]:
for i, image in enumerate(im):
    img = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    im[i] = img
plt.subplot(1,3,1)
plt.imshow(im[0]);
plt.subplot(1,3,2)
plt.imshow(im[1]);
plt.subplot(1,3,3)
plt.imshow(im[2]);
Let's define a region of interest (ROI) that will be used to find a tennis ball in another image.
In [6]:
roi = []
roi.append(im[0][60:110, 50:180])
# roi.append(im[0][50:190, 50:190]) # includes seam
roi.append(im[1][70:190, 90:190])
# roi.append(im[1][50:190, 50:190]) # includes seam
roi.append(im[2][70:170, 50:110])
plt.subplot(1,3,1)
plt.imshow(roi[0]);
plt.subplot(1,3,2)
plt.imshow(roi[1]);
plt.subplot(1,3,3)
plt.imshow(roi[2]);
In [7]:
class hsvHistogram(object):
    """
    This class holds the histogram information of a HSV image.
    """
    hist = None
    bins = None

    def __init__(self, bins):
        self.bins = bins
        self.kernel = np.ones((5,5),np.uint8)  # kernel for the morphological opening

    def calcHist(self, im_array):
        # im_array is a list of HSV images. With channels [0, 1], cv2.calcHist
        # indexes only the first image of a multi-image list, so build one
        # hue/saturation histogram per image and sum them.
        hist = np.zeros((self.bins, self.bins), np.float32)
        for img in im_array:
            hist += cv2.calcHist(
                [img],
                [0, 1],
                None,
                [self.bins, self.bins],
                [0,180, 0 ,256]
            )
        cv2.normalize(hist,hist,0,32,cv2.NORM_MINMAX)
        self.hist = hist

    def find(self, test, threshold=3):
        if self.hist is None:
            print('Need to init histogram first!')
            return 1
        # per-pixel score of how well the pixel matches the color model
        dst = cv2.calcBackProject(
            [test],
            [0, 1],
            self.hist,
            [0,180, 0 ,256],
            1
        )
        # smooth the back projection with a disc-shaped kernel
        disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
        cv2.filter2D(dst,-1,disc,dst)
        ret, thresh = cv2.threshold(dst,threshold,255,cv2.THRESH_BINARY)
        # morphological/blobify --------
        # thresh = cv2.erode(thresh, self.kernel)
        # thresh = cv2.dilate(thresh, self.kernel)
        thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, self.kernel)
        return thresh

    def plot(self):
        # one curve per saturation bin, plotted against hue
        plt.plot(np.linspace(0,180,self.bins), self.hist)
        plt.grid(True)
        plt.xlabel('Hue')
In [8]:
# hist = cv2.calcHist(roi,[0, 1],None,[32, 32],[0,180, 0 ,256])
# cv2.normalize(hist,hist,0,32,cv2.NORM_MINMAX)
# plt.plot(hist)
# plt.grid(True)
# print(hist)
# print(len(hist))
roiHist = hsvHistogram(32)
roiHist.calcHist(roi)
# plt.plot(np.linspace(0,180,roiHist.bins), roiHist.hist)
# plt.grid(True)
roiHist.plot()
In [9]:
print('h x w x channels: {} {} {}'.format(*im[0].shape))
# print(hist)
In [10]:
# h,s,v = cv2.split(roi)
# plt.hist(h.ravel(), 180,[0,180]);
# plt.grid(True);
In [11]:
ball2_rgb = cv2.imread('hist_pics/test5.jpg')
ball2_rgb = cv2.cvtColor(ball2_rgb, cv2.COLOR_BGR2RGB) # make pretty for ipython
ball2 = cv2.cvtColor(ball2_rgb, cv2.COLOR_RGB2HSV) # now convert to HSV
plt.subplot(1,2,1)
plt.imshow(ball2_rgb);
plt.subplot(1,2,2)
plt.imshow(ball2);
In [12]:
h,s,v = cv2.split(ball2)
plt.hist(h.ravel(), 32,[0,180], label='hue');
plt.hist(s.ravel(), 32,[0,255], label='saturation');
plt.hist(v.ravel(), 32,[0,255], label='value');
plt.grid(True)
plt.legend(loc='upper right');
What is backprojection, in simple words? It is used for image segmentation or for finding objects of interest in an image. It creates an image of the same size as our input image (but single channel), where each pixel corresponds to the probability of that pixel belonging to our object. In even simpler words, the output image shows our object of interest whiter than the rest of the image. Histogram backprojection is also used with algorithms such as CamShift.
How do we do it? We create a histogram of an image containing our object of interest (in our case, the tennis ball ROIs defined above). The object should fill the image as far as possible for better results. A color histogram is preferred over a grayscale histogram because the color of the object defines it better than its grayscale intensity. We then “back-project” this histogram over the test image where we need to find the object; in other words, we calculate the probability of every pixel belonging to the tennis ball and show it. With proper thresholding, the resulting output gives us the tennis ball alone.
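The two steps described above (build a model histogram, then look every test pixel up in it) can be sketched in plain NumPy. This is only an illustration of the idea, not how `cv2.calcBackProject` is implemented; the helper names `hs_histogram` and `back_project` and the synthetic pixel values are assumptions made for the example.

```python
import numpy as np


def hs_histogram(hsv_roi, bins=32):
    """2D hue/saturation histogram of an 8-bit HSV patch, scaled to [0, 1]."""
    h = hsv_roi[..., 0].ravel()
    s = hsv_roi[..., 1].ravel()
    hist, _, _ = np.histogram2d(h, s, bins=bins, range=[[0, 180], [0, 256]])
    return hist / hist.max()


def back_project(hsv_img, hist):
    """Replace every pixel by the histogram value of its (hue, saturation) bin."""
    bins = hist.shape[0]
    h_idx = (hsv_img[..., 0].astype(np.int32) * bins) // 180
    s_idx = (hsv_img[..., 1].astype(np.int32) * bins) // 256
    return hist[h_idx, s_idx]


# model: a uniformly yellow-green patch (synthetic stand-in for a ball ROI)
roi = np.zeros((4, 4, 3), np.uint8)
roi[:] = (30, 200, 150)

# test image: one pixel with the model color (but darker), one blue pixel
test_img = np.array([[[30, 200, 10], [100, 50, 10]]], dtype=np.uint8)

bp = back_project(test_img, hs_histogram(roi))
print(bp)  # high score for the first pixel, zero for the second
```

The output is a single-channel image the same size as the test image, which is what gets smoothed and thresholded afterwards.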
Erosion: The basic idea of erosion is just like soil erosion: it erodes away the boundaries of the foreground object (always try to keep the foreground in white). So what does it do? The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) will be considered 1 only if all the pixels under the kernel are 1; otherwise it is eroded (made zero).
What happens is that all the pixels near the boundary are discarded, depending on the size of the kernel. So the thickness or size of the foreground object decreases, or simply the white region in the image shrinks. It is useful for removing small white noise, detaching two connected objects, etc.
Dilation: It is just the opposite of erosion. Here, a pixel is 1 if at least one pixel under the kernel is 1. So it increases the white region in the image, or the size of the foreground object. Normally, in cases like noise removal, erosion is followed by dilation, because erosion removes white noise but also shrinks our object. So we dilate it. Since the noise is gone, it won't come back, but our object area increases. It is also useful for joining broken parts of an object.
Opening: Opening is just another name for erosion followed by dilation. It is useful for removing noise, as explained above. Here we use the function cv2.morphologyEx().
Closing: Closing is the reverse of opening: dilation followed by erosion. It is useful for closing small holes inside the foreground objects, or small black points on the object.
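The four operations can be written out in a few lines of NumPy to show what the kernel does. This is a simplified sketch, assuming a binary 0/1 mask and a square kernel of ones; unlike OpenCV's default border handling, it treats everything outside the image as background, so it is not a drop-in replacement for `cv2.erode` or `cv2.dilate`.

```python
import numpy as np


def _windows(mask, k):
    """Stack the k*k shifted copies of a zero-padded mask (one per kernel cell)."""
    pad = k // 2
    padded = np.pad(mask, pad, mode='constant', constant_values=0)
    rows, cols = mask.shape
    return np.stack([padded[r:r + rows, c:c + cols]
                     for r in range(k) for c in range(k)])


def erode(mask, k=3):
    # a pixel stays 1 only if every pixel under the kernel is 1
    return _windows(mask, k).min(axis=0)


def dilate(mask, k=3):
    # a pixel becomes 1 if at least one pixel under the kernel is 1
    return _windows(mask, k).max(axis=0)


def opening(mask, k=3):
    return dilate(erode(mask, k), k)


def closing(mask, k=3):
    return erode(dilate(mask, k), k)


noisy = np.zeros((9, 9), np.uint8)
noisy[3:6, 3:6] = 1       # 3x3 foreground object
noisy[1, 7] = 1           # isolated white noise pixel
opened = opening(noisy)   # noise removed, object restored

holed = np.zeros((9, 9), np.uint8)
holed[2:7, 2:7] = 1       # 5x5 foreground object
holed[4, 4] = 0           # small black hole inside it
closed = closing(holed)   # hole filled, outline unchanged
```

Opening leaves only the 3x3 object, and closing fills the hole without growing the 5x5 object.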
In [13]:
test = ball2.copy()
thresh = roiHist.find(test, 20)
thresh = cv2.merge((thresh,thresh,thresh)) # make 3 channels
# plot ----------
res = cv2.bitwise_and(ball2_rgb,thresh)
pics = np.hstack((test,thresh,res))
plt.imshow(pics);
# plt.colorbar();
In [14]:
a,b,c = cv2.split(thresh)
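# findContours returns 3 values in OpenCV 3.x but 2 in 2.x and 4.x;
# the contour list is always second from the end
contours = cv2.findContours(a,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2]
cnt = max(contours, key=cv2.contourArea) # largest blob, assumed to be the ball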
(x,y),radius = cv2.minEnclosingCircle(cnt)
center = (int(x),int(y))
radius = int(radius)
cv2.circle(res,center,radius,(0,255,0),2)
cv2.circle(res,center,1,(0,255,0),2)
plt.imshow(res);
So, what is color quantization?
Color quantization is the process of reducing the number of distinct colors in an image.
Normally, the intent is to preserve the color appearance of the image as much as possible while reducing the number of colors, whether because of memory limitations or for compression.
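The notebook does not say which quantization method it has in mind, so the sketch below uses the simplest one, uniform quantization, where each channel is snapped to the center of one of a few equal-width bins. The function name `quantize_uniform` is hypothetical, and the code assumes an 8-bit image and a `levels` value that divides 256.

```python
import numpy as np


def quantize_uniform(img, levels=4):
    """Reduce each 8-bit channel to `levels` evenly spaced values."""
    assert 256 % levels == 0, "levels must divide 256 in this sketch"
    step = 256 // levels
    # integer-divide to get the bin index, then map to the bin center
    return (img // step) * step + step // 2


img = np.array([[[0, 63, 64], [200, 255, 128]]], dtype=np.uint8)
q = quantize_uniform(img, levels=4)
print(q)  # every value is now one of 32, 96, 160, 224
```

With 4 levels per channel a 3-channel image can contain at most 4 x 4 x 4 = 64 distinct colors. Clustering-based methods such as k-means usually preserve appearance better for the same number of colors.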
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.