By: Aris Budi Wibowo
First, we need to install Keras.
In [1]:
!pip2 install keras==1.2.2
Now, we configure Keras to use Theano as its default backend. We can change it to TensorFlow later if we want.
In [2]:
!mkdir -p ~/.keras
!echo '{"epsilon":1e-07,"floatx":"float32","backend":"theano","img_dim_ordering":"th"}' > ~/.keras/keras.json
Now we are done setting up Keras. Next, we restart the kernel so that Keras picks up the configuration we just wrote.
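After the restart, we can double-check which backend Keras picked up. This quick check is our addition, not part of the original steps; importing keras in version 1.x also prints the backend it is using.

import keras
print(keras.backend.backend())  # should print 'theano'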
Because we want to show plots inline too, we need to add an extra configuration line in Jupyter.
In [1]:
%matplotlib inline
Define the path of the data.
In [2]:
#path = "/datasets/fast.ai/dogscats/"
#path = "D:/datasets/dogscats/"
path = "D:/datasets/dogscats/sample/"
Next, we import the libraries that we will use later.
In [3]:
from __future__ import division,print_function
import os,json
from glob import glob
import numpy as np
np.set_printoptions(precision=4,linewidth=100)
from matplotlib import pyplot as plt
For convenience, we already have some helper functions, in the utils module, that we will use later.
In [4]:
import utils; reload(utils)
from utils import plots
Now, for simplicity, we need just seven lines of code to classify photos of cats and dogs. For VGG, we already have a pretrained model, so we just load it.
In [5]:
batch_size = 16
In [7]:
import vgg16; reload(vgg16)
from vgg16 import Vgg16
In [8]:
vgg = Vgg16()
In [9]:
batches = vgg.get_batches(path+'train',batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size)
vgg.finetune(batches)
vgg.fit(batches,val_batches,nb_epoch=10)
If you want to understand what batches and epochs are, you can read here.
Not bad, considering our limited data. :) If you are curious, you can use the full dogs vs. cats dataset (approx. 25,000 training images vs. 15,000 test images) and reach 97% accuracy with only one epoch.
We will try to use VGG16 to recognise the ImageNet categories in each image. That means we just want to know the breed in each photo, not to guess 'dog' or 'cat' directly (the model cannot do that, since ImageNet has no plain 'cat' or 'dog' category). First, we load the object.
In [10]:
vgg = Vgg16()
Now, we grab batches from our training data.
In [10]:
path = "D:/datasets/dogscats/"
In [12]:
batches = vgg.get_batches(path+'train',batch_size=4)
In [13]:
imgs, labels = next(batches)
Now, let's look at the images that we grabbed.
In [14]:
plots(imgs, titles=labels)
Let's predict the species of each image.
In [15]:
vgg.predict(imgs,True)
Out[15]:
The category indexes are based on the VGG model's ImageNet classes. If we want to see the first four of those categories:
In [16]:
vgg.classes[:4]
Out[16]:
Now, we will try to use VGG16 specifically to classify dog and cat images. Because VGG16 was trained on ImageNet by default, we will use a technique called "finetuning". Finetuning is the process of reusing a pretrained model for a second, similar task. Source
First, we set up the batches. A batch is a subset of the training data; we use batches to speed up training and to control GPU memory usage.
In [12]:
batch_size=16
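To make the effect of batch_size concrete, here is a quick illustrative calculation. The 23,000-image figure assumes the usual train/valid split of the full dataset; it is an assumption, not something computed in this notebook.

import math
# one epoch = one full pass over the training data, processed batch by batch
print(int(math.ceil(23000 / 16.)))   # -> 1438 weight updates per epoch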
In [13]:
batches = vgg.get_batches(path+'train',batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid',batch_size=batch_size)
Now, we call finetune() so the model is trained specifically on our 'cats vs dogs' data.
In [14]:
vgg.finetune(batches)
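Under the hood, finetune() swaps the 1000-way ImageNet output layer for a new one matching our classes. The sketch below is a rough approximation based on our reading of the fast.ai vgg16.py source, not the exact implementation; the name finetune_sketch and the optimizer settings are our assumptions.

from keras.layers.core import Dense
from keras.optimizers import Adam

def finetune_sketch(model, num_classes):
    # drop the 1000-way ImageNet softmax layer
    model.pop()
    # freeze the pretrained layers so only the new head is trained
    for layer in model.layers:
        layer.trainable = False
    # add a fresh softmax head for our classes (2 for cats vs dogs)
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(optimizer=Adam(lr=0.001),
                  loss='categorical_crossentropy', metrics=['accuracy'])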
Finally, we fit the model with our training data. It will predict, batch by batch, whether each image is a cat or a dog.
In [15]:
vgg.fit(batches,val_batches,nb_epoch=1)
Now we have learned the basic steps of using VGG16 for an image recognition task (e.g. diseases, paintings, etc.).
Next, we will try to understand what's inside VGG16.
First, we import all the libraries that we need.
In [5]:
from numpy.random import random, permutation
from scipy import ndimage, misc
from scipy.ndimage.interpolation import zoom
import keras
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential, Model
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers import Input
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD, RMSprop
from keras.preprocessing import image
Now, import the ImageNet category ids and descriptions. This is not strictly necessary for the model itself, but we will use it for plotting purposes.
In [7]:
FILES_PATH = 'http://files.fast.ai/models/'; CLASS_FILE = 'imagenet_class_index.json'
fpath = get_file(CLASS_FILE, FILES_PATH+CLASS_FILE, cache_subdir='models')
with open(fpath) as f: class_dict=json.load(f)
# Convert dictionaries to an array
classes= [class_dict[str(i)][1] for i in range(len(class_dict))]
To check that we downloaded the file successfully, we can inspect the array.
In [9]:
classes[:5]
Out[9]:
Now, we start with the basics of the VGG architecture.
VGG has just one type of convolutional block, and one type of fully connected ('dense') block. For the convolutional block:
In [16]:
def ConvBlock(layers, model, filters):
    for i in range(layers):
        model.add(ZeroPadding2D((1,1)))
        model.add(Convolution2D(filters, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))
For the fully connected block:
In [17]:
def FCBlock(model):
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
In 2014, the creators of VGG first subtracted the average of each of the three (R,G,B) channels, so that the data for each channel had a mean of zero. The problem is that their original software expected the channels in B,G,R order, while Python loads them as R,G,B. We need to preprocess our data to make both changes, so that it is compatible with the VGG model.
In [18]:
# Mean of each channel as provided by the VGG researchers
vgg_mean = np.array([123.68, 116.779, 103.939]).reshape((3,1,1))

def vgg_preprocess(x):
    x = x - vgg_mean     # subtract mean
    return x[:, ::-1]    # reverse channel axis: rgb -> bgr
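As a quick sanity check of our own (not in the original notebook): a fake batch whose pixel values equal the channel means should come out as all zeros, with the channel axis reversed.

fake = np.tile(vgg_mean.reshape(1,3,1,1), (1,1,2,2))  # one 2x2 "image", shape (1,3,2,2)
print(vgg_preprocess(fake))  # all zeros; channels are now in B,G,R order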
Now, it's time to build the VGG model architecture!
In [19]:
def VGG_16():
    model = Sequential()
    model.add(Lambda(vgg_preprocess, input_shape=(3,224,224)))

    ConvBlock(2, model, 64)
    ConvBlock(2, model, 128)
    ConvBlock(3, model, 256)
    ConvBlock(3, model, 512)
    ConvBlock(3, model, 512)

    model.add(Flatten())
    FCBlock(model)
    FCBlock(model)
    model.add(Dense(1000, activation='softmax'))
    return model
We'll learn what these blocks do later.
Now that we've defined the architecture, we can create the model like any Python object.
In [20]:
model = VGG_16()
We also need the weights that the VGG creators trained. The weights are the part of the model that is learnt from the data, whereas the architecture is pre-defined based on the nature of the problem.
It's preferable to download pre-trained weights rather than train the model ourselves, since training is very resource-demanding (time, dataset size, etc.).
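To see how much of VGG16 is learned weights versus fixed architecture, Keras can print a per-layer parameter count. This check is our addition; it assumes `model` was created with VGG_16() above.

model.summary()               # per-layer output shapes and parameter counts
print(model.count_params())   # roughly 138 million parameters in total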
In [23]:
#fpath = get_file('vgg16.h5', FILES_PATH+'vgg16.h5', cache_subdir='models')
#model.load_weights(fpath)
model.load_weights('vgg16.hdf')
The setup for our model is done! We are now ready to predict on cat and dog images. Again, we will grab a batch of images and call predict() on them.
In [24]:
batch_size=4
Keras provides functionality to create batches of data from directories containing images; all we have to do is define the size to resize the images to, what type of labels to create, whether to randomly shuffle the images, and how many images to include in each batch. We use this little wrapper to define some helpful defaults appropriate for ImageNet data.
In [27]:
def get_batches(dirname, gen=image.ImageDataGenerator(), shuffle=True,
                batch_size=batch_size, class_mode='categorical'):
    return gen.flow_from_directory(path+dirname, target_size=(224,224),
                class_mode=class_mode, shuffle=shuffle, batch_size=batch_size)
In [28]:
batches = get_batches('train',batch_size=batch_size)
val_batches = get_batches('valid',batch_size=batch_size)
imgs,labels = next(batches)
plots(imgs, titles=labels)
In [31]:
def pred_batch(imgs):
    preds = model.predict(imgs)
    idxs = np.argmax(preds, axis=1)

    print('Shape: {}'.format(preds.shape))
    print('First 5 classes: {}'.format(classes[:5]))
    print('First 5 probabilities: {}\n'.format(preds[0, :5]))
    print('Predictions prob/class: ')

    for i in range(len(idxs)):
        idx = idxs[i]
        print('  {:.4f}/{}'.format(preds[i, idx], classes[idx]))
In [33]:
pred_batch(imgs)