Project Part 1

This notebook is an interactive session that walks through the operations pertinent to the project. It is useful for experimenting with and documenting snippets, recipes and the behaviour of OpenCV functions, and for prototyping implementations to be incorporated into, or ported to, C++ in the final stage.


In [1]:
%matplotlib inline

In [2]:
import matplotlib.pyplot as plt
import numpy as np
import cv2
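
Note that the cv2.SIFT constructor used later in this notebook ties the session to an OpenCV 2.4.x build under Python 2 (SIFT moved to the xfeatures2d contrib module in OpenCV 3). A quick check confirms which build is in play:

# confirm the OpenCV build this notebook runs against
cv2.__version__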

We are given the following sample video:


In [42]:
from IPython.display import HTML
from base64 import b64encode
with open("../inputs/clip_test.m4v", "rb") as f:
    video = f.read()
video_encoded = b64encode(video).decode('ascii')
video_tag = '<video controls alt="test" src="data:video/x-m4v;base64,{0}">'.format(video_encoded)
HTML(data=video_tag)


Out[42]:

And the following patch to match against:


In [43]:
from IPython.display import Image
Image(filename='../inputs/poster.png')


Out[43]:

We can use the cv2.VideoCapture class to read the video frame by frame:


In [127]:
cap = cv2.VideoCapture('../inputs/clip_test.m4v')

In [128]:
frames = []
while True:
    ret, frame = cap.read()
    if not ret:
        # read() returns (False, None) once the stream is exhausted
        break
    frames.append(frame)
cap.release()
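
As a sanity check, the capture object can report the container's own metadata before we decode anything. A minimal sketch, assuming the OpenCV 2.4 property constants under cv2.cv:

# query container metadata (OpenCV 2.4 keeps the CV_CAP_PROP_* constants under cv2.cv)
meta = cv2.VideoCapture('../inputs/clip_test.m4v')
print meta.get(cv2.cv.CV_CAP_PROP_FRAME_COUNT)  # frames the container claims to hold
print meta.get(cv2.cv.CV_CAP_PROP_FPS)          # nominal frame rate
meta.release()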

Let's see how many frames we have:


In [91]:
len(frames)


Out[91]:
1476

And let's inspect a few of these frames:


In [129]:
plt.imshow(frames[0][...,::-1])


Out[129]:
<matplotlib.image.AxesImage at 0x102a009d0>
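
The [...,::-1] slice reverses the channel axis: OpenCV decodes frames as BGR while matplotlib expects RGB. The explicit equivalent, for the record:

# explicit equivalent of frames[0][...,::-1]
plt.imshow(cv2.cvtColor(frames[0], cv2.COLOR_BGR2RGB))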

In [130]:
plt.imshow(frames[250][...,::-1])


Out[130]:
<matplotlib.image.AxesImage at 0x13c978b50>

In [131]:
plt.imshow(frames[500][...,::-1])


Out[131]:
<matplotlib.image.AxesImage at 0x1612b3850>

In [132]:
plt.imshow(frames[750][...,::-1])


Out[132]:
<matplotlib.image.AxesImage at 0x140b8f150>

Feature Detection / Description

Now let's read in the patch image as a NumPy array:


In [134]:
patch = cv2.imread('../inputs/poster.png')

In [136]:
plt.imshow(patch[...,::-1])


Out[136]:
<matplotlib.image.AxesImage at 0x16197b0d0>

In [133]:
sift = cv2.SIFT()

In [146]:
# detect keypoints with SIFT
keypoints = sift.detect(patch)

In [210]:
# number of keypoints
len(keypoints)


Out[210]:
257
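
Each cv2.KeyPoint records more than a position; a quick look at one shows the sub-pixel location, the scale and the orientation that SIFT assigned to it:

kp = keypoints[0]
# (x, y) location, region diameter, dominant orientation in degrees,
# detector response, and the pyramid octave it was found in
kp.pt, kp.size, kp.angle, kp.response, kp.octave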

Let's draw the detected keypoints on the image:


In [147]:
plt.imshow(cv2.drawKeypoints(patch, keypoints)[...,::-1])


Out[147]:
<matplotlib.image.AxesImage at 0x16810c410>

In [148]:
# show the different drawing options
filter(lambda s: s.startswith('DRAW_MATCHES'), dir(cv2))


Out[148]:
['DRAW_MATCHES_FLAGS_DEFAULT',
 'DRAW_MATCHES_FLAGS_DRAW_OVER_OUTIMG',
 'DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS',
 'DRAW_MATCHES_FLAGS_NOT_DRAW_SINGLE_POINTS']

In [235]:
# show the size of the keypoint, along with its orientation
plt.imshow(cv2.drawKeypoints(patch, keypoints, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)[...,::-1])


Out[235]:
<matplotlib.image.AxesImage at 0x16a3d9e50>

We can also detect keypoints and compute descriptors in one operation


In [151]:
keypoints, descriptors = sift.detectAndCompute(patch, mask=None)
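
This is equivalent to running detection and description as two steps, which is useful when the keypoints come from elsewhere. A sketch of the split form, assuming the separate compute method exposed by the 2.4 bindings:

# two-step form: detect first, then describe the detected keypoints
keypoints = sift.detect(patch)
keypoints, descriptors = sift.compute(patch, keypoints)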

In [154]:
descriptors


Out[154]:
array([[  12.,    0.,    0., ...,    2.,   14.,   37.],
       [   0.,    1.,   21., ...,    0.,    0.,    0.],
       [  13.,    7.,   10., ...,    4.,    0.,    4.],
       ..., 
       [   1.,    1.,    1., ...,  100.,   32.,    0.],
       [   0.,    0.,    0., ...,   32.,    3.,    3.],
       [   0.,    0.,    0., ...,   20.,    0.,    0.]], dtype=float32)

In [153]:
descriptors.shape


Out[153]:
(257, 128)
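
Each row is one keypoint's descriptor: SIFT summarises the neighbourhood of a keypoint as a 4x4 grid of cells with 8 orientation bins each, giving the 128 dimensions, so the shape is (number of keypoints, 128).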

Feature Matching

Let's try to match the features in the given patch to a few frames


In [158]:
keypoints1, descriptors1 = sift.detectAndCompute(patch, mask=None)

In [177]:
frame = frames[750][...,::-1]

In [178]:
keypoints2, descriptors2 = sift.detectAndCompute(frame, mask=None)

In [179]:
matcher = cv2.BFMatcher()

In [180]:
matches = matcher.knnMatch(descriptors1, descriptors2, k=2)

In [181]:
len(matches)


Out[181]:
257
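
knnMatch with k=2 returns, for each of the 257 patch descriptors, its two nearest neighbours among the frame descriptors under the matcher's norm. The default brute-force norm is L2, which is the appropriate choice for SIFT; it can be made explicit:

# explicit L2 norm (the BFMatcher default, and the right norm for SIFT descriptors)
matcher = cv2.BFMatcher(cv2.NORM_L2)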

In [182]:
a, b = matches[0]

In [183]:
a


Out[183]:
<DMatch 0x107090c90>

Unfortunately, DMatch objects don't have a very rich representation, so we have to dig deeper. Let's list the object's attributes:


In [184]:
dir(a)


Out[184]:
['__class__',
 '__delattr__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'distance',
 'imgIdx',
 'queryIdx',
 'trainIdx']
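
The interesting fields are distance (the L2 distance between the two descriptors), queryIdx (an index into the first set passed to knnMatch, i.e. the patch's keypoints1) and trainIdx (an index into the second set, the frame's keypoints2); imgIdx only matters when matching against a collection of images.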

In [185]:
a.distance


Out[185]:
398.069091796875

In [186]:
b.distance


Out[186]:
428.40167236328125

Now define the ratio-test threshold and filter the matches. Following Lowe's ratio test, a match is kept only when its nearest neighbour is significantly closer than the second nearest, which discards ambiguous correspondences:


In [187]:
T = 0.8

In [192]:
matches_filtered = [a for a, b in matches if a.distance < T*b.distance]

In [193]:
len(matches_filtered)


Out[193]:
73

In [191]:
cv2.drawMatches(patch, keypoints1, frame, keypoints2, matches_filtered)


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-191-1aeff3c59aeb> in <module>()
----> 1 cv2.drawMatches(patch, keypoints1, frame, keypoints2, matches_filtered)

AttributeError: 'module' object has no attribute 'drawMatches'

Whoops... this OpenCV Python binding does not expose drawMatches (another reason to implement the final product in C++). Let's just whip up our own real quick:


In [208]:
def draw_matches(img1, kps1, img2, kps2, matches):
    # stack the two images side by side on a shared canvas
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]
    view = np.zeros((max(h1, h2), w1 + w2, 3), np.uint8)
    view[:h1, :w1] = img1
    view[:h2, w1:] = img2
    # one randomly coloured line per match, offsetting the frame
    # keypoint by the width of the patch image
    for m in matches:
        color = [np.random.randint(0, 256) for _ in xrange(3)]
        p = tuple(map(int, kps1[m.queryIdx].pt))
        q = (int(kps2[m.trainIdx].pt[0] + w1), int(kps2[m.trainIdx].pt[1]))
        cv2.line(view, p, q, color)
    return view

# the patch is still BGR; reverse it so both halves of the canvas are RGB
view = draw_matches(patch[...,::-1], keypoints1, frame, keypoints2, matches_filtered)
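
Wrapping the stacking and line drawing in draw_matches keeps the later frames down to a one-line call; it is essentially what drawMatches does in the C++ API (and in the Python bindings of later OpenCV releases).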

In [211]:
plt.imshow(view)


Out[211]:
<matplotlib.image.AxesImage at 0x169825b50>

Decent. Let's try a few other frames:


In [221]:
frame = frames[1200][...,::-1]

In [222]:
plt.imshow(frame)


Out[222]:
<matplotlib.image.AxesImage at 0x169412c90>

In [214]:
keypoints2, descriptors2 = sift.detectAndCompute(frame, mask=None)

In [215]:
matches = matcher.knnMatch(descriptors1, descriptors2, k=2)

In [216]:
matches_filtered = [a for a, b in matches if a.distance < T*b.distance]

In [217]:
view = draw_matches(patch[...,::-1], keypoints1, frame, keypoints2, matches_filtered)

In [218]:
plt.imshow(view)


Out[218]:
<matplotlib.image.AxesImage at 0x169de13d0>

In [228]:
frame = frames[1000][...,::-1]

In [229]:
plt.imshow(frame)


Out[229]:
<matplotlib.image.AxesImage at 0x16a12aed0>

In [230]:
keypoints2, descriptors2 = sift.detectAndCompute(frame, mask=None)

In [231]:
matches = matcher.knnMatch(descriptors1, descriptors2, k=2)

In [232]:
matches_filtered = [a for a, b in matches if a.distance < T*b.distance]

In [233]:
view = draw_matches(patch[...,::-1], keypoints1, frame, keypoints2, matches_filtered)

In [234]:
plt.imshow(view)


Out[234]:
<matplotlib.image.AxesImage at 0x16a813cd0>