Author: Michael Gygli (Github, Twitter), 2016-01-13

Introduction

This example demonstrates how to compute C3D convolutional features using Lasagne and Theano. C3D can be used as a general video feature and has shown strong performance. You can find more information in the paper [1] or the caffe-based reference implementation [2].

[1]: D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning Spatiotemporal Features with 3D Convolutional Networks, ICCV 2015, http://vlg.cs.dartmouth.edu/c3d/c3d_video.pdf
[2]: http://vlg.cs.dartmouth.edu/c3d/

Preparation steps

This demo uses the pretrained C3D weights, as well as the c3d module in the Lasagne Recipes modelzoo. Thus, you will need to get the Recipes from github (https://github.com/Lasagne/Recipes) first.



In [1]:

    
# Import models and set path
import sys
model_dir='../modelzoo/' # Path to your recipes/modelzoo
sys.path.insert(0,model_dir)
import c3d
import lasagne
import theano









    



Using gpu device 0: GeForce GTX TITAN X (CNMeM is disabled, cuDNN 5103)



In [2]:

    
# Download the weights and mean of the model
!wget -N https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/c3d_model.pkl
!wget -N https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/snipplet_mean.npy    
    
# And the classes of Sports1m
!wget -N https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/labels.txt

# Finally, an example sniplet
!wget -N https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/example_snip.npy









    



converted 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/c3d_model.pkl' (ISO-8859-1) -> 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/c3d_model.pkl' (UTF-8)
--2017-01-09 19:10:34--  https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/c3d_model.pkl
Resolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.114.8
Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.114.8|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 319965140 (305M) [binary/octet-stream]
Saving to: 'c3d_model.pkl'

c3d_model.pkl       100%[=====================>] 305.14M  4.61MB/s   in 77s    

2017-01-09 19:11:53 (3.97 MB/s) - 'c3d_model.pkl' saved [319965140/319965140]

converted 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/snipplet_mean.npy' (ISO-8859-1) -> 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/snipplet_mean.npy' (UTF-8)
--2017-01-09 19:11:53--  https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/snipplet_mean.npy
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.0.179
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.0.179|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 8405088 (8.0M) [binary/octet-stream]
Saving to: 'snipplet_mean.npy'

snipplet_mean.npy   100%[=====================>]   8.02M  1.82MB/s   in 4.4s   

2017-01-09 19:11:58 (1.82 MB/s) - 'snipplet_mean.npy' saved [8405088/8405088]

converted 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/labels.txt' (ISO-8859-1) -> 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/labels.txt' (UTF-8)
--2017-01-09 19:11:58--  https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/labels.txt
Resolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.41.82
Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.41.82|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6504 (6.4K) [text/plain]
Saving to: 'labels.txt'

labels.txt          100%[=====================>]   6.35K  --.-KB/s   in 0s     

2017-01-09 19:11:59 (109 MB/s) - 'labels.txt' saved [6504/6504]

converted 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/example_snip.npy' (ISO-8859-1) -> 'https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/example_snip.npy' (UTF-8)
--2017-01-09 19:11:59--  https://s3.amazonaws.com/lasagne/recipes/pretrained/c3d/example_snip.npy
Resolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.41.82
Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.41.82|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3686496 (3.5M) [binary/octet-stream]
Saving to: 'example_snip.npy'

example_snip.npy    100%[=====================>]   3.52M  1.09MB/s   in 3.2s   

2017-01-09 19:12:03 (1.09 MB/s) - 'example_snip.npy' saved [3686496/3686496]



In [3]:

    
# Build model
net = c3d.build_model()

# Set the weights (takes some time)
c3d.set_weights(net['prob'],'c3d_model.pkl')









    



Load pretrained weights from c3d_model.pkl...
Set the weights...



In [4]:

    
# Load the video snipplet and show an example frame
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
snip=np.load('example_snip.npy')
plt.imshow(snip[0,:,:,:])









    Out[4]:





<matplotlib.image.AxesImage at 0x7f11cf492a90>



In [5]:

    
# Convert the video snipplet to the right format
# i.e. (nr in batch, channel, frameNr, y, x) and substract mean
caffe_snip=c3d.get_snips(snip,image_mean=np.load('snipplet_mean.npy'),start=0, with_mirrored=False)



In [6]:

    
# Compile prediction function
prediction = lasagne.layers.get_output(net['prob'], deterministic=True)
pred_fn = theano.function([net['input'].input_var], prediction, allow_input_downcast = True);



In [7]:

    
# Now we can get a prediction
probabilities=pred_fn(caffe_snip).mean(axis=0) # As we average over flipped and non-flipped



In [8]:

    
# Load labels
with open('labels.txt','r') as f:
    class2label=dict(enumerate([name.rstrip('\n') for name in f]))
    
# Show the post probable ones
print('Top 10 class probabilities:')
for class_id in (-probabilities).argsort()[0:10]:
    print('%20s: %.2f%%' % (class2label[class_id],100*probabilities[class_id]))









    



Top 10 class probabilities:
         wiffle ball: 29.87%
      knife throwing: 13.12%
             croquet: 11.36%
           disc golf: 5.30%
            kickball: 5.15%
            rounders: 4.48%
               bocce: 3.53%
           dodgeball: 2.25%
           boomerang: 1.71%
            tee ball: 1.39%

Comparison to C3D reference implementation

For this example, the Top 10 probabilities of the original C3D implementation are:

     wiffle ball: 29.91%
  knife throwing: 13.11%
         croquet: 11.27%
       disc golf: 5.29%
        kickball: 5.18%
        rounders: 4.48%
           bocce: 3.53%
       dodgeball: 2.27%
       boomerang: 1.71%
        tee ball: 1.39%



In [ ]:



In [ ]: