Since a large CNN is very time-consuming to train (even on a GPU), and requires huge amounts of data, is there any way to use a pretrained one instead of retraining the whole thing from scratch?
This notebook shows how this can be done. And it works surprisingly well.
This notebook extracts a vector representation of a set of images using the GoogLeNet CNN pretrained on ImageNet. It then builds a simple SVM classifier on those vectors, so that new images can be classified directly. No retraining of the original CNN is required.
In [ ]:
import theano
import theano.tensor as T
import lasagne
import numpy as np
import scipy
import matplotlib.pyplot as plt
%matplotlib inline
import pickle
import time
CLASS_DIR='./images/cars'
Functions for building the GoogLeNet model with Lasagne and preprocessing the images are defined in models.imagenet_theano.googlenet.
Build the model and select layers we need - the features are taken from the final network layer, before the softmax nonlinearity.
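One plausible reason for reading the features before the softmax is that the softmax throws some information away. The sketch below is plain Python, independent of the notebook's Theano graph; softmax is a helper defined here (not a Lasagne call) and the score vectors are made up. It shows two different score vectors that map to the same probabilities.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Two hypothetical pre-softmax score vectors that differ by a constant offset.
scores_a = [2.0, 1.0, -1.0]
scores_b = [s + 5.0 for s in scores_a]

probs_a = softmax(scores_a)
probs_b = softmax(scores_b)

# The probabilities coincide even though the raw scores do not.
same_probs = all(abs(pa - pb) < 1e-12 for pa, pb in zip(probs_a, probs_b))
print(same_probs)
```

A classifier trained on the raw scores can still tell these two inputs apart, whereas one trained on the probabilities cannot.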
In [ ]:
from models.imagenet_theano import googlenet
cnn_layers = googlenet.build_model()
cnn_input_var = cnn_layers['input'].input_var
cnn_feature_layer = cnn_layers['loss3/classifier']
cnn_output_layer = cnn_layers['prob']
get_cnn_features = theano.function([cnn_input_var], lasagne.layers.get_output(cnn_feature_layer))
print("GoogLeNet Model defined")
Load the pretrained weights into the network :
In [ ]:
import os
import urllib.request
imagenet_theano = './data/imagenet_theano'
googlenet_pkl = imagenet_theano+'/blvc_googlenet.pkl'
if not os.path.isfile(googlenet_pkl):
    if not os.path.exists(imagenet_theano):
        os.makedirs(imagenet_theano)
    print("Downloading GoogLeNet parameter file")
    urllib.request.urlretrieve(
        'https://s3.amazonaws.com/lasagne/recipes/pretrained/imagenet/blvc_googlenet.pkl',
        googlenet_pkl)
params = pickle.load(open(googlenet_pkl, 'rb'), encoding='iso-8859-1')
model_param_values = params['param values']
imagenet_classes = params['synset words']
lasagne.layers.set_all_param_values(cnn_output_layer, model_param_values)
print("Loaded GoogLeNet params")
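The encoding='iso-8859-1' argument matters when a pickle written by Python 2 is read by Python 3, which is presumably the situation with this parameter file (an assumption, since the notebook does not say). A minimal sketch using a hand-written Python 2 style pickle, not the real parameter file:

```python
import pickle

# A hand-written protocol-0 pickle of the Python 2 byte string 'caf\xe9'
# (illustrative data, not taken from the GoogLeNet file).
py2_pickle = b"S'caf\\xe9'\n."

# Python 3 decodes Python 2 byte strings as ASCII by default, which fails
# on any byte above 0x7f.
try:
    pickle.loads(py2_pickle)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# Latin-1 maps every byte to a code point, so decoding always succeeds.
text = pickle.loads(py2_pickle, encoding='iso-8859-1')
print(default_ok, text)
```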
In [ ]:
import os
classes = sorted( [ d for d in os.listdir(CLASS_DIR) if os.path.isdir("%s/%s" % (CLASS_DIR, d)) ] )
classes  # Sorted for consistency
In [ ]:
train = dict(f=[], features=[], target=[])
t0 = time.time()
# Each subdirectory of CLASS_DIR is one class; every image inside it is a training example
for class_i,d in enumerate(classes):
    for f in os.listdir("%s/%s" % (CLASS_DIR, d,)):
        filepath = '%s/%s/%s' % (CLASS_DIR,d,f,)
        if os.path.isdir(filepath): continue
        im = plt.imread(filepath)
        rawim, cnn_im = googlenet.prep_image(im)
        # Pre-softmax activations of the pretrained network, used as the feature vector
        prob = get_cnn_features(cnn_im)
        train['f'].append(filepath)
        train['features'].append(prob[0])
        train['target'].append( class_i )
        plt.figure()
        plt.imshow(rawim.astype('uint8'))
        plt.axis('off')
        plt.text(320, 50, '{}'.format(f), fontsize=14)
        plt.text(320, 80, 'Train as class "{}"'.format(d), fontsize=12)
# Average time per training image
print("DONE : %6.2f seconds each" %(float(time.time() - t0)/len(train['f']),))
In [ ]:
#train['features'][0]
In [ ]:
from sklearn import svm
classifier = svm.LinearSVC()
classifier.fit(train['features'], train['target']) # learn from the data
In [ ]:
# Test images are the files sitting directly in CLASS_DIR (not in a class subdirectory)
test_image_files = [f for f in os.listdir(CLASS_DIR) if not os.path.isdir("%s/%s" % (CLASS_DIR, f))]
t0 = time.time()
for f in sorted(test_image_files):
    im = plt.imread('%s/%s' % (CLASS_DIR,f,))
    rawim, cnn_im = googlenet.prep_image(im)
    prob = get_cnn_features(cnn_im)
    prediction_i = classifier.predict([ prob[0] ])
    # Signed score from the linear SVM (a single number per image in the two-class case)
    decision = classifier.decision_function([ prob[0] ])
    plt.figure()
    plt.imshow(rawim.astype('uint8'))
    plt.axis('off')
    prediction = classes[ prediction_i[0] ]
    plt.text(350, 50, '{} : Distance from boundary = {:5.2f}'.format(prediction, decision[0]), fontsize=20)
    plt.text(350, 75, '{}'.format(f), fontsize=14)
# Average time per test image
print("DONE : %6.2f seconds each" %(float(time.time() - t0)/len(test_image_files),))
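The value plotted as 'Distance from boundary' comes from decision_function, which for a linear model is the raw score w.x + b. That score is proportional to the geometric distance, and dividing it by the norm of w gives the distance itself. A small sketch in plain Python, where w, b and x are made-up numbers standing in for a fitted classifier and a feature vector:

```python
import math

def decision_score(w, b, x):
    """Signed score of a linear classifier: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def distance_to_boundary(w, b, x):
    """Signed geometric distance from x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return decision_score(w, b, x) / norm

# Hypothetical weights and feature vector, chosen for easy arithmetic.
w = [3.0, 4.0]   # norm is 5
b = -5.0
x = [3.0, 4.0]

score = decision_score(w, b, x)            # 9 + 16 - 5
distance = distance_to_boundary(w, b, x)   # score divided by the norm of w
print(score, distance)
```

Since the same w is used for every test image, the scores are still comparable with each other: a larger magnitude means the image lies further from the boundary.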
Did it work?
The whole training regime here is based on the way the image directories are structured. So building your own example shouldn't be very difficult.
Suppose you wanted to classify pianos into Upright and Grand :
- Create a pianos directory and point the CLASS_DIR variable at it
- Within the pianos directory, create subdirectories for each of the classes (i.e. Upright and Grand). The directory names will be used as the class labels
- Put the training images for each class into its subdirectory
- Put the test images in the pianos directory itself (which is logical, since we don't know their classes yet)
Finally, re-run everything - checking that the training images are read in correctly, that there are no errors along the way, and that (finally) the class predictions on the test set come out as expected.
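The directory layout for the piano example can be sketched with a throwaway directory. The file names below are placeholders (empty files, not real images), and the two listings mirror how the notebook finds its classes and its test images:

```python
import os
import tempfile

# Build a throwaway 'pianos' tree; file names are hypothetical placeholders.
root = tempfile.mkdtemp()
class_dir = os.path.join(root, 'pianos')
for label in ('Upright', 'Grand'):
    os.makedirs(os.path.join(class_dir, label))
    for i in range(2):
        open(os.path.join(class_dir, label, 'train%d.jpg' % i), 'w').close()
open(os.path.join(class_dir, 'unknown.jpg'), 'w').close()

# Subdirectories become the class labels (sorted for a stable ordering).
classes = sorted(d for d in os.listdir(class_dir)
                 if os.path.isdir(os.path.join(class_dir, d)))

# Files sitting directly in the top-level directory are the test images.
test_image_files = sorted(f for f in os.listdir(class_dir)
                          if not os.path.isdir(os.path.join(class_dir, f)))

print(classes, test_image_files)
```

With real images in place of the empty files, pointing CLASS_DIR at this pianos directory is all the rest of the notebook needs.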
If/when it works - please let everyone know : We can add that as an example for next time...
In [ ]: