For easier use of pretrained models, we provide a wrapper specifically written for the case of ImageNet, so one can take an image and directly compute features or predictions from it. Both Python and Matlab wrappers are provided. We will describe the use of the Python wrapper here; the Matlab wrapper is used very similarly.
We assume that you have successfully compiled Caffe and set the correct PYTHONPATH. If not, please refer to the installation instructions. You will use our pre-trained ImageNet model, which you can download (232.57MB) by running examples/imagenet/get_caffe_reference_imagenet_model.sh. Note that this pre-trained model is licensed for academic research / non-commercial use only.
Ready? Let's start.
In [1]:
from caffe import imagenet
from matplotlib import pyplot
# Set the right path to your model file, pretrained model,
# and the image you would like to classify.
MODEL_FILE = 'imagenet/imagenet_deploy.prototxt'
PRETRAINED = 'imagenet/caffe_reference_imagenet_model'
IMAGE_FILE = 'images/lena.png'
Loading a network is easy: imagenet.ImageNetClassifier wraps everything. By default, the classifier crops the center and the four corners of an image, as well as their mirrored versions, creating a batch of 10 images. If you look at the provided MODEL_FILE you can see that we define the input batch size to be 10.
If you would like to use just the center crop, specify center_only=1 when constructing the classifier (sketched after the next cell), and also change the batch size from 10 to 1 in the prototxt.
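For reference, the input declaration near the top of the deploy prototxt looks roughly like this (a sketch, not copied verbatim from the shipped file; the crop dimensions here are an assumption):

input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227

Changing the first input_dim from 10 to 1 is the batch-size edit mentioned above.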
In [2]:
net = imagenet.ImageNetClassifier(
    MODEL_FILE, PRETRAINED)
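As noted above, if you only want the center crop, the constructor call would look like this (a sketch; remember to also set the batch size to 1 in the prototxt):

# Classify only the center crop instead of the full 10-crop batch.
net = imagenet.ImageNetClassifier(
    MODEL_FILE, PRETRAINED, center_only=1)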
We will set the phase to test since we are doing testing, and will first use CPU for the computation.
In [3]:
net.caffenet.set_phase_test()
net.caffenet.set_mode_cpu()
So now, we can do a prediction. Let's show some output as well:
In [4]:
prediction = net.predict(IMAGE_FILE)
print 'prediction shape:', prediction.shape
pyplot.plot(prediction)
Out[4]:
You can see that the prediction is 1000-dimensional and pretty sparse. Our pretrained model orders the synsets alphabetically, and the synset at the index with the maximum prediction score is "sombrero". A reasonable prediction, right?
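If you want to look up the top-scoring synset yourself, here is a minimal sketch (the synset list file below is hypothetical and not shipped with this example):

# prediction is a 1000-dimensional numpy vector; argmax gives the
# index of the highest-scoring synset.
top_index = prediction.argmax()
print 'top index:', top_index
# With an alphabetically sorted synset list (hypothetical file):
# synsets = [line.strip() for line in open('synset_words.txt')]
# print 'predicted synset:', synsets[top_index]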
Now, why don't we see how long it takes to perform the classification end to end? The timing below was obtained on an Intel i5 CPU, so you may observe different numbers on your machine.
In [5]:
%timeit net.predict(IMAGE_FILE)
It may look a little slow, but note that this also includes image loading, cropping, and Python interfacing time, and the network is processing 10 crops. If you really need faster predictions, you can optionally rewrite things in C and pipeline the image loading; for most applications, though, the current speed should be sufficient.
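If you want a rough sense of where the time goes, you can time the pieces yourself. A sketch (we use skimage purely for illustration; this is an assumption about how you might load images, not necessarily what the wrapper does internally):

import time
from skimage import io

# Time the image I/O alone.
start = time.time()
img = io.imread(IMAGE_FILE)
print 'image loading alone: %.4f s' % (time.time() - start)

# Time the full end-to-end prediction call for comparison.
start = time.time()
net.predict(IMAGE_FILE)
print 'full prediction call: %.4f s' % (time.time() - start)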
OK, so how about the GPU? It is actually pretty easy:
In [6]:
net.caffenet.set_mode_gpu()
Voila! Now we are in GPU mode. Let's see if the code gives the same result:
In [7]:
prediction = net.predict(IMAGE_FILE)
print 'prediction shape:', prediction.shape
pyplot.plot(prediction)
Out[7]:
Good, everything is the same. And how about time consumption? The following benchmark was obtained on the same machine with a K20 GPU:
In [8]:
%timeit net.predict(IMAGE_FILE)
Pretty fast, right? Not as fast as you expected? Indeed, in this Python demo you see only about a 4x speedup. But remember: the GPU code itself is very fast, and the data loading, transformation, and interfacing now take more time than the actual convnet computation itself!
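If you want to verify the speedup on your own machine, here is a quick back-to-back comparison using the mode switches shown above (a sketch; single-call timings are noisy, so average several runs in practice):

import time

# Run one prediction in each mode and report the wall-clock time.
for mode, setter in [('CPU', net.caffenet.set_mode_cpu),
                     ('GPU', net.caffenet.set_mode_gpu)]:
    setter()
    start = time.time()
    net.predict(IMAGE_FILE)
    print '%s: %.3f s per call' % (mode, time.time() - start)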
To fully utilize the power of GPUs, you really want to use one of these ideas: