Using Convolutional Neural Networks

Welcome to the first week of the first deep learning certificate! We're going to use convolutional neural networks (CNNs) to allow our computer to see - something that is only possible thanks to deep learning.

Introduction to this week's task: 'Dogs vs Cats'

We're going to try to create a model to enter the Dogs vs Cats competition at Kaggle. There are 25,000 labelled dog and cat photos available for training, and 12,500 in the test set that we have to try to label for this competition. According to the Kaggle web-site, when this competition was launched (end of 2013): "State of the art: The current literature suggests machine classifiers can score above 80% accuracy on this task". So if we can beat 80%, then we will be at the cutting edge as of 2013!

Basic setup

There isn't too much to do to get started - just a few simple configuration steps.

This shows plots in the notebook itself - we always want to use this when working in a Jupyter notebook:


In [1]:
%matplotlib inline

Define path to data: (It's a good idea to put it in a subdirectory of your notebooks folder, and then exclude that directory from git control by adding it to .gitignore.)


In [2]:
path = "/input/sample/"
#path = "data/dogscats/sample/"

A few basic libraries that we'll need for the initial exercises:


In [3]:
from __future__ import division,print_function

import os, json
from glob import glob
import numpy as np
np.set_printoptions(precision=4, linewidth=100)
from matplotlib import pyplot as plt

We have created a file most imaginatively called 'utils.py' to store any little convenience functions we'll want to use. We will discuss these as we use them.


In [4]:
import utils; reload(utils)
from utils import plots


Using Theano backend.

Use a pretrained VGG model with our Vgg16 class

Our first step is simply to use a model that has been fully created for us, which can recognise a wide variety (1,000 categories) of images. We will use 'VGG', which achieved top results in the 2014 ImageNet competition, and is a very simple model to create and understand. The VGG ImageNet team created both a larger, slower, slightly more accurate model (VGG 19) and a smaller, faster model (VGG 16). We will be using VGG 16, since the much slower performance of VGG 19 is generally not worth the very minor improvement in accuracy.

We have created a python class, Vgg16, which makes using the VGG 16 model very straightforward.

Hacked Model from Lesson 1 and the redux notebook

Here's everything you need to do to get >97% accuracy on the Dogs vs Cats dataset - we won't analyze how it works behind the scenes yet, since at this stage we're just going to focus on the minimum necessary to actually do useful work.


In [5]:
# As large as you can, but no larger than 64 is recommended. 
# If you have an older or cheaper GPU, you'll run out of memory, so will have to decrease this.
batch_size=64
num_epochs=2

In [6]:
# Import our class, and instantiate
import vgg16; reload(vgg16)
from vgg16 import Vgg16

In [7]:
vgg = Vgg16()

In [12]:
weights_file = None
for epoch in range(num_epochs):
    print("Running epoch: %d" % epoch)
    
    # Grab a few images at a time for training and validation.
    # NB: They must be in subdirectories named based on their category
    batches = vgg.get_batches(path+'train', batch_size=batch_size)
    print('batches.nb_sample=%d' % (batches.nb_sample,))
    val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
    vgg.finetune(batches)
    vgg.fit(batches, val_batches, nb_epoch=1)
    
    weights_file = '/output/ft%d.h5' % epoch
    vgg.model.save_weights(weights_file)
print ("Completed %s fit operations" % num_epochs)


Running epoch: 0
Found 200 images belonging to 2 classes.
Found 50 images belonging to 2 classes.
Epoch 1/1
200/200 [==============================] - 261s - loss: 1.1918 - acc: 0.6750 - val_loss: 0.1564 - val_acc: 0.9400
Running epoch: 1
Found 200 images belonging to 2 classes.
Found 50 images belonging to 2 classes.
Epoch 1/1
200/200 [==============================] - 269s - loss: 1.0498 - acc: 0.6650 - val_loss: 0.1593 - val_acc: 0.9400

In [ ]:

The code above will work for any image recognition task, with any number of categories! All you have to do is put your images into one folder per category and run the code above.
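
As a quick sanity check (this cell is an addition, not part of the original notebook), you can list the class subdirectories that get_batches will read the labels from - each split directory should contain one folder per category:


In [ ]:
import os
# The folder names under train/ and valid/ become the class labels,
# e.g. train/cats and train/dogs for this dataset
for split in ('train', 'valid'):
    print(split, '->', sorted(os.listdir(path + split)))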

Let's take a look at how this works, step by step...

Prediction


In [72]:
batches, preds = vgg.test(path+'test', batch_size = batch_size*2)


Found 56 images belonging to 1 classes.

In [ ]:
#Save our test results arrays so we can use them again later
from utils import save_array  # convenience helper from utils.py
filenames = batches.filenames
save_array('/output/test_preds.dat', preds)
save_array('/output/filenames.dat', filenames)

In [73]:
#Grab the dog prediction column
isdog = preds[:,1]
print "Raw Predictions: " + str(isdog[:5])
print "Mid Predictions: " + str(isdog[(isdog < .6) & (isdog > .4)])
print "Edge Predictions: " + str(isdog[(isdog == 1) | (isdog == 0)])

Log loss is undefined for probability values of exactly 0 or 1, since it takes the log of the prediction (and we have many such values). Fortunately, Kaggle helps us by offsetting our 0s and 1s by a very small value. So if we upload our submission now we will have lots of .99999999 and .000000001 values. This seems good, right?

Not so. There is an additional twist due to how log loss is calculated: log loss rewards predictions that are confident and correct (p=.9999, label=1), but it punishes predictions that are confident and wrong far more severely (p=.0001, label=1). See the illustration below.
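
Since this version of the notebook has no plot here, a small numeric sketch (added for illustration, not part of the original code) makes the same point by computing per-sample log loss directly:


In [ ]:
import numpy as np

def log_loss_single(p, label):
    # Per-sample log loss for a predicted probability p and true label (0 or 1)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

# A confident wrong answer costs far more than a confident right answer saves
for p in [0.5, 0.9, 0.99, 0.9999]:
    print("p=%.4f  loss if correct: %.4f  loss if wrong: %.4f"
          % (p, log_loss_single(p, 1), log_loss_single(p, 0)))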


In [ ]:
#So to play it safe, we use a sneaky trick to pull our edge predictions away from 0 and 1
#Clip every prediction into the range [0.05, 0.95]
isdog = isdog.clip(min=0.05, max=0.95)

In [ ]:
#Extract imageIds from the filenames in our test/unknown directory
filenames = batches.filenames
# Filenames look like 'unknown/1234.jpg', so skip the 8-character folder prefix
# and keep the digits up to the first '.'
ids = np.array([int(f[8:f.find('.')]) for f in filenames])

Here we join the two columns into an array of [imageId, isDog] pairs.


In [ ]:
subm = np.stack([ids,isdog], axis=1)
subm[:5]

In [ ]:
np.savetxt('/output/kaggle_submission.csv', subm, fmt='%d,%.5f', header='id,label', comments='')

Now, submit to Kaggle.
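
If you have the Kaggle API command-line tool installed and authenticated, you can submit directly from the notebook. This is an optional extra (not in the original notebook), and the competition slug below is an assumption - check the competition URL for the exact name:


In [ ]:
# Assumes the Kaggle API CLI is installed and configured with your credentials;
# the competition slug is an assumption - verify it on the competition page
!kaggle competitions submit -c dogs-vs-cats-redux-kernels-edition -f /output/kaggle_submission.csv -m "VGG16 finetune"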


In [ ]: