Object Detection Using SSD

Object detection locates instances of certain object classes in digital images and videos (which can be regarded as sequences of images). In this notebook we demonstrate how to use a pretrained Analytics Zoo model to detect objects in a video.

We use one of the videos from YouTube-8M (link) for the demo, a scene of someone training a dog. We use our SSD-MobileNet model (downloadable from link), pretrained on the PASCAL VOC dataset (link), for detection. It can detect 20 classes, including person and dog (the full class list is sketched after the references below). In the video, the dogs and persons are identified with bounding boxes, and their class scores are shown on top.

References:

  • YouTube-8M: A Large and Diverse Labeled Video Dataset for Video Understanding Research (link).
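For reference, the 20 object classes of the PASCAL VOC dataset that the pretrained model can detect are sketched below as a plain Python list; the exact label strings used by the model come from its label map, which is queried later via config.label_map().

In [ ]:
# The 20 PASCAL VOC object classes (for reference only; the model's own label map is used for visualization).
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor"
]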

Initialization

  • import necessary libraries

In [1]:
import os
from IPython.display import Image, display
%pylab inline
from moviepy.editor import *


Populating the interactive namespace from numpy and matplotlib
  • import necessary modules

In [2]:
from zoo.common.nncontext import *
from zoo.feature.image import *
from zoo.models.image.objectdetection import *
  • init SparkContext

In [3]:
sc = init_nncontext("Object Detection Example")

Load pretrained Analytics Zoo model

  • Here we use an SSD-MobileNet model pretrained on the PASCAL VOC dataset.

Download model


In [4]:
try:
    model_path = os.getenv("ANALYTICS_ZOO_HOME")+"/apps/object-detection/analytics-zoo_ssd-mobilenet-300x300_PASCAL_0.1.0.model"
    model = ObjectDetector.load_model(model_path)
    print("load model done")
except Exception as e:
    print("The pretrain model doesn't exist")
    print("you can run $ANALYTICS_ZOO_HOME/apps/object-detection/download_model.sh to download the pretrain model")


load model done

Load data

  • Load the video and take a short clip. Decompose the clip into a sequence of frames at the given fps.

In [5]:
try:
    path = os.getenv("ANALYTICS_ZOO_HOME")+"/apps/object-detection/train_dog.mp4"
    myclip = VideoFileClip(path).subclip(8,18)
except Exception as e:
    print("The video doesn't exist.")
    print("Please prepare the video first.")

video_rdd = sc.parallelize(myclip.iter_frames(fps=5))
image_set = DistributedImageSet(video_rdd)


creating: createDistributedImageSet
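As an optional sanity check, you can preview a single frame of the clip before running detection. This is a minimal sketch using moviepy's get_frame and matplotlib's imshow (already available via %pylab inline); it is not part of the original pipeline.

In [ ]:
# Optional: preview the first frame of the clip (an RGB numpy array of shape (height, width, 3)).
first_frame = myclip.get_frame(0)
print(first_frame.shape)
imshow(first_frame)
axis('off')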

Predict and visualize detections back into the clip

  • Having prepared the model, we can start detecting objects.

Read the images as an ImageSet (local/distributed) -> Perform prediction -> Visualize the detections on the original images


In [6]:
output = model.predict_image_set(image_set)

config = model.get_config()
visualizer = Visualizer(config.label_map())
visualized = visualizer(output).get_image(to_chw=False).collect()


creating: createVisualizer
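Optionally, you can inspect a single visualized frame before writing the clip out. A minimal sketch, assuming visualized is a list of HWC numpy arrays (since to_chw=False) with boxes and scores already drawn on them:

In [ ]:
# Optional: look at one visualized frame; cast to uint8 in case the pixel values come back as floats.
print(len(visualized), visualized[0].shape)
imshow(visualized[0].astype('uint8'))
axis('off')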

Save clips to file

  • Assemble the sequence of visualized frames back into a clip at the given fps.

In [7]:
clip = ImageSequenceClip(visualized, fps=5)

output_path = '/tmp/out.mp4'
clip.write_videofile(output_path, audio=False)
clip.write_gif("train_dog.gif")


[MoviePy] >>>> Building video /tmp/out.mp4
[MoviePy] Writing video /tmp/out.mp4
100%|██████████| 50/50 [00:00<00:00, 190.64it/s]
[MoviePy] Done.
[MoviePy] >>>> Video ready: /tmp/out.mp4 


[MoviePy] Building file train_dog.gif with imageio
100%|██████████| 50/50 [00:02<00:00, 17.43it/s]

Display Object Detection Video

  • Display the model's predictions as an animated GIF.

In [8]:
with open("train_dog.gif",'rb') as f:
    display(Image(f.read()))
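
The GIF above embeds the result directly in the notebook. Alternatively, the mp4 written to /tmp/out.mp4 can be shown inline; a minimal sketch using IPython's Video display (assuming your IPython version provides it):

In [ ]:
from IPython.display import Video

# Embed the rendered mp4 (the path written by write_videofile above).
Video(output_path, embed=True)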