Project Part 1

This notebook is an interactive session that walks through the operations pertinent to the project. It is useful for experimenting with and documenting snippets, recipes and the behaviour of OpenCV functions, and for prototyping implementations to be incorporated into, or ported to, C++ in the final stage.


In [1]:
%matplotlib inline

In [2]:
import matplotlib.pyplot as plt
import numpy as np
import cv2
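
Note that the cv2.SIFT constructor used later in this notebook ties the session to an OpenCV 2.4.x build under Python 2 (SIFT moved to the xfeatures2d contrib module in OpenCV 3). A quick check confirms which build is in play:

# confirm the OpenCV build this notebook runs against
cv2.__version__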

We are given the following sample video:


In [42]:
from IPython.display import HTML
from base64 import b64encode
with open("../inputs/clip_test.m4v", "rb") as f:
    video = f.read()
video_encoded = b64encode(video).decode('ascii')
video_tag = '<video controls alt="test" src="data:video/x-m4v;base64,{0}">'.format(video_encoded)
HTML(data=video_tag)


Out[42]:

And the following patch to match against:


In [43]:
from IPython.display import Image
Image(filename='../inputs/poster.png')


Out[43]:

We can use the cv2.VideoCapture class to read the video frame by frame:


In [127]:
cap = cv2.VideoCapture('../inputs/clip_test.m4v')

In [128]:
frames = []
while True:
    ret, frame = cap.read()
    if not ret:
        # read() returns (False, None) once the stream is exhausted
        break
    frames.append(frame)
cap.release()
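
As a sanity check, the capture object can report the container's own metadata before we decode anything. A minimal sketch, assuming the OpenCV 2.4 property constants under cv2.cv:

# query container metadata (OpenCV 2.4 keeps the CV_CAP_PROP_* constants under cv2.cv)
meta = cv2.VideoCapture('../inputs/clip_test.m4v')
print meta.get(cv2.cv.CV_CAP_PROP_FRAME_COUNT)  # frames the container claims to hold
print meta.get(cv2.cv.CV_CAP_PROP_FPS)          # nominal frame rate
meta.release()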

Let's see how many frames we have:


In [91]:
len(frames)


Out[91]:
1476

And let's inspect a few of these frames:


In [129]:
plt.imshow(frames[0][...,::-1])


Out[129]:
<matplotlib.image.AxesImage at 0x102a009d0>
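
The [...,::-1] slice reverses the channel axis: OpenCV decodes frames as BGR while matplotlib expects RGB. The explicit equivalent, for the record:

# explicit equivalent of frames[0][...,::-1]
plt.imshow(cv2.cvtColor(frames[0], cv2.COLOR_BGR2RGB))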

In [130]:
plt.imshow(frames[250][...,::-1])


Out[130]:
<matplotlib.image.AxesImage at 0x13c978b50>

In [131]:
plt.imshow(frames[500][...,::-1])


Out[131]:
<matplotlib.image.AxesImage at 0x1612b3850>

In [132]:
plt.imshow(frames[750][...,::-1])


Out[132]:
<matplotlib.image.AxesImage at 0x140b8f150>

Feature Detection / Description

Now let's read in the patch image as a NumPy array:


In [134]:
patch = cv2.imread('../inputs/poster.png')

In [136]:
plt.imshow(patch[...,::-1])


Out[136]:
<matplotlib.image.AxesImage at 0x16197b0d0>

In [133]:
sift = cv2.SIFT()

In [146]:
# detect keypoints with SIFT
keypoints = sift.detect(patch)

In [210]:
# number of keypoints
len(keypoints)


Out[210]:
257
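
Each cv2.KeyPoint records more than a position; a quick look at one shows the sub-pixel location, the scale and the orientation that SIFT assigned to it:

kp = keypoints[0]
# (x, y) location, region diameter, dominant orientation in degrees,
# detector response, and the pyramid octave it was found in
kp.pt, kp.size, kp.angle, kp.response, kp.octave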

Let's draw the detected keypoints on the image:


In [147]:
plt.imshow(cv2.drawKeypoints(patch, keypoints)[...,::-1])


Out[147]:
<matplotlib.image.AxesImage at 0x16810c410>

In [148]:
# show the different drawing options
filter(lambda s: s.startswith('DRAW_MATCHES'), dir(cv2))


Out[148]:
['DRAW_MATCHES_FLAGS_DEFAULT',
 'DRAW_MATCHES_FLAGS_DRAW_OVER_OUTIMG',
 'DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS',
 'DRAW_MATCHES_FLAGS_NOT_DRAW_SINGLE_POINTS']

In [235]:
# show the size of the keypoint, along with its orientation
plt.imshow(cv2.drawKeypoints(patch, keypoints, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)[...,::-1])


Out[235]:
<matplotlib.image.AxesImage at 0x16a3d9e50>

We can also detect keypoints and compute descriptors in one operation


In [151]:
keypoints, descriptors = sift.detectAndCompute(patch, mask=None)
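
This is equivalent to running detection and description as two steps, which is useful when the keypoints come from elsewhere. A sketch of the split form, assuming the separate compute method exposed by the 2.4 bindings:

# two-step form: detect first, then describe the detected keypoints
keypoints = sift.detect(patch)
keypoints, descriptors = sift.compute(patch, keypoints)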

In [154]:
descriptors


Out[154]:
array([[  12.,    0.,    0., ...,    2.,   14.,   37.],
       [   0.,    1.,   21., ...,    0.,    0.,    0.],
       [  13.,    7.,   10., ...,    4.,    0.,    4.],
       ..., 
       [   1.,    1.,    1., ...,  100.,   32.,    0.],
       [   0.,    0.,    0., ...,   32.,    3.,    3.],
       [   0.,    0.,    0., ...,   20.,    0.,    0.]], dtype=float32)

In [153]:
descriptors.shape


Out[153]:
(257, 128)
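
Each row is one keypoint's descriptor: SIFT summarises the neighbourhood of a keypoint as a 4x4 grid of cells with 8 orientation bins each, giving the 128 dimensions, so the shape is (number of keypoints, 128).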

Feature Matching

Let's try to match the features in the given patch to a few frames


In [158]:
keypoints1, descriptors1 = sift.detectAndCompute(patch, mask=None)

In [177]:
frame = frames[750][...,::-1]

In [178]:
keypoints2, descriptors2 = sift.detectAndCompute(frame, mask=None)

In [179]:
matcher = cv2.BFMatcher()

In [180]:
matches = matcher.knnMatch(descriptors1, descriptors2, k=2)

In [181]:
len(matches)


Out[181]:
257
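
knnMatch with k=2 returns, for each of the 257 patch descriptors, its two nearest neighbours among the frame descriptors under the matcher's norm. The default brute-force norm is L2, which is the appropriate choice for SIFT; it can be made explicit:

# explicit L2 norm (the BFMatcher default, and the right norm for SIFT descriptors)
matcher = cv2.BFMatcher(cv2.NORM_L2)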

In [182]:
a, b = matches[0]

In [183]:
a


Out[183]:
<DMatch 0x107090c90>

Unfortunately, DMatch objects don't have a very rich representation, so we have to dig deeper. Let's list the object's attributes:


In [184]:
dir(a)


Out[184]:
['__class__',
 '__delattr__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'distance',
 'imgIdx',
 'queryIdx',
 'trainIdx']
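
The interesting fields are distance (the L2 distance between the two descriptors), queryIdx (an index into the first set passed to knnMatch, i.e. the patch's keypoints1) and trainIdx (an index into the second set, the frame's keypoints2); imgIdx only matters when matching against a collection of images.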

In [185]:
a.distance


Out[185]:
398.069091796875

In [186]:
b.distance


Out[186]:
428.40167236328125

Now define the ratio-test threshold and filter the matches. Following Lowe's ratio test, a match is kept only when its nearest neighbour is significantly closer than the second nearest, which discards ambiguous correspondences:


In [187]:
T = 0.8

In [192]:
matches_filtered = [a for a, b in matches if a.distance < T*b.distance]

In [193]:
len(matches_filtered)


Out[193]:
73

In [191]:
cv2.drawMatches(patch, keypoints1, frame, keypoints2, matches_filtered)


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-191-1aeff3c59aeb> in <module>()
----> 1 cv2.drawMatches(patch, keypoints1, frame, keypoints2, matches_filtered)

AttributeError: 'module' object has no attribute 'drawMatches'

Whoops... this OpenCV Python binding does not expose drawMatches (another reason to implement the final product in C++). Let's just whip up our own real quick:


In [208]:
def draw_matches(img1, kps1, img2, kps2, matches):
    # stack the two images side by side on a shared canvas
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]
    view = np.zeros((max(h1, h2), w1 + w2, 3), np.uint8)
    view[:h1, :w1] = img1
    view[:h2, w1:] = img2
    # one randomly coloured line per match, offsetting the frame
    # keypoint by the width of the patch image
    for m in matches:
        color = [np.random.randint(0, 256) for _ in xrange(3)]
        p = tuple(map(int, kps1[m.queryIdx].pt))
        q = (int(kps2[m.trainIdx].pt[0] + w1), int(kps2[m.trainIdx].pt[1]))
        cv2.line(view, p, q, color)
    return view

# the patch is still BGR; reverse it so both halves of the canvas are RGB
view = draw_matches(patch[...,::-1], keypoints1, frame, keypoints2, matches_filtered)
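
Wrapping the stacking and line drawing in draw_matches keeps the later frames down to a one-line call; it is essentially what drawMatches does in the C++ API (and in the Python bindings of later OpenCV releases).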

In [211]:
plt.imshow(view)


Out[211]:
<matplotlib.image.AxesImage at 0x169825b50>

Decent. Let's try a few other frames:


In [221]:
frame = frames[1200][...,::-1]

In [222]:
plt.imshow(frame)


Out[222]:
<matplotlib.image.AxesImage at 0x169412c90>

In [214]:
keypoints2, descriptors2 = sift.detectAndCompute(frame, mask=None)

In [215]:
matches = matcher.knnMatch(descriptors1, descriptors2, k=2)

In [216]:
matches_filtered = [a for a, b in matches if a.distance < T*b.distance]

In [217]:
view = draw_matches(patch[...,::-1], keypoints1, frame, keypoints2, matches_filtered)

In [218]:
plt.imshow(view)


Out[218]:
<matplotlib.image.AxesImage at 0x169de13d0>

In [228]:
frame = frames[1000][...,::-1]

In [229]:
plt.imshow(frame)


Out[229]:
<matplotlib.image.AxesImage at 0x16a12aed0>

In [230]:
keypoints2, descriptors2 = sift.detectAndCompute(frame, mask=None)

In [231]:
matches = matcher.knnMatch(descriptors1, descriptors2, k=2)

In [232]:
matches_filtered = [a for a, b in matches if a.distance < T*b.distance]

In [233]:
view = draw_matches(patch[...,::-1], keypoints1, frame, keypoints2, matches_filtered)

In [234]:
plt.imshow(view)


Out[234]:
<matplotlib.image.AxesImage at 0x16a813cd0>