Retrain a CNN, part 3: fine-tuning the bottleneck layers


In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
%matplotlib inline
%pylab inline


Populating the interactive namespace from numpy and matplotlib

In [3]:
import matplotlib.pylab as plt
import numpy as np

In [4]:
from distutils.version import StrictVersion

In [5]:
import sklearn
print(sklearn.__version__)

assert StrictVersion(sklearn.__version__ ) >= StrictVersion('0.18.1')


0.18.1

In [6]:
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.ERROR)
print(tf.__version__)

assert StrictVersion(tf.__version__) >= StrictVersion('1.1.0')


1.2.1

In [7]:
import keras
print(keras.__version__)

assert StrictVersion(keras.__version__) >= StrictVersion('2.0.0')


Using TensorFlow backend.
2.0.6

This notebook follows the blog post "Building powerful image classification models using very little data" from blog.keras.io. It uses data that can be downloaded at: https://www.kaggle.com/c/dogs-vs-cats/data. In our setup, we:

  • created a data/ folder
  • created train/ and validation/ subfolders inside data/
  • created cats/ and dogs/ subfolders inside train/ and validation/
  • put the cat pictures with index 0-999 in data/train/cats
  • put the cat pictures with index 1000-1400 in data/validation/cats
  • put the dog pictures with index 12500-13499 in data/train/dogs
  • put the dog pictures with index 13500-13900 in data/validation/dogs

This gives us 1000 training examples per class and roughly 400 validation examples per class. In summary, this is our directory structure:
    data/
      train/
          dogs/
              dog001.jpg
              dog002.jpg
              ...
          cats/
              cat001.jpg
              cat002.jpg
              ...
      validation/
          dogs/
              dog001.jpg
              dog002.jpg
              ...
          cats/
              cat001.jpg
              cat002.jpg
              ...
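The file shuffling described above can be scripted. A minimal sketch, assuming the Kaggle archive was unpacked into a flat `raw/` folder with files named like `cat.0.jpg` and `dog.12500.jpg`; `plan_moves` and `apply_moves` are hypothetical helpers, not part of the original setup:

```python
import os
import shutil

def plan_moves(raw_dir='raw', data_dir='data'):
    """Build (src, dst) pairs arranging the Kaggle files into the
    train/validation layout described above, without touching the disk."""
    ranges = [
        ('train/cats', 'cat', range(0, 1000)),
        ('validation/cats', 'cat', range(1000, 1401)),
        ('train/dogs', 'dog', range(12500, 13500)),
        ('validation/dogs', 'dog', range(13500, 13901)),
    ]
    moves = []
    for subdir, species, idx_range in ranges:
        for i in idx_range:
            src = os.path.join(raw_dir, '%s.%d.jpg' % (species, i))
            dst = os.path.join(data_dir, subdir, '%s.%d.jpg' % (species, i))
            moves.append((src, dst))
    return moves

def apply_moves(moves):
    for src, dst in moves:
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copyfile(src, dst)
```

Note that if the index ranges 1000-1400 and 13500-13900 are treated as inclusive, each validation folder holds 401 images, which would explain the "Found 802 images" message from the validation generator later on.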

In [8]:
from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Model, Sequential
from keras.layers import Dropout, Flatten, Dense, Input

In [9]:
# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800

In [10]:
input_tensor = Input(shape=(img_width, img_height, 3))
base_model = applications.VGG16(weights='imagenet', include_top=False, input_tensor=input_tensor)

In [11]:
base_model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________

In [12]:
# this would be (None, None, 512), which is not specific enough for the Flatten layer further down...
bottleneck_output_shape = base_model.output_shape[1:]

In [13]:
# ...so we manually set it to the dimensions we know it really has, from the summary above
bottleneck_output_shape = (4, 4, 512)
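The 4 × 4 spatial size can also be derived rather than hard-coded: VGG16 halves the spatial resolution five times (one MaxPooling2D per block), with floor division at each step. A quick sanity check in pure arithmetic, no Keras needed; `vgg16_feature_shape` is a hypothetical helper:

```python
def vgg16_feature_shape(height, width, channels=512, n_pools=5):
    """Spatial size after VGG16's five 2x2 max-poolings (floor division each time)."""
    for _ in range(n_pools):
        height //= 2
        width //= 2
    return (height, width, channels)

print(vgg16_feature_shape(150, 150))  # (4, 4, 512), matching the summary above
```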

In [14]:
# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=bottleneck_output_shape))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))
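The parameter counts this top model reports can be verified by hand: Flatten yields 4·4·512 = 8192 features, Dense(256) has 8192·256 weights plus 256 biases, and the final Dense(1) has 256 weights plus 1 bias. As arithmetic:

```python
flat = 4 * 4 * 512            # Flatten output size
dense1 = flat * 256 + 256     # weights + biases of Dense(256)
dense2 = 256 * 1 + 1          # weights + biases of Dense(1)
print(flat, dense1, dense2, dense1 + dense2)  # 8192 2097408 257 2097665
```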

In [15]:
top_model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_1 (Flatten)          (None, 8192)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 256)               2097408   
_________________________________________________________________
dropout_1 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 257       
=================================================================
Total params: 2,097,665
Trainable params: 2,097,665
Non-trainable params: 0
_________________________________________________________________

In [16]:
# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model_weights_path = 'bottleneck_fc_model.h5'
top_model.load_weights(top_model_weights_path)

In [17]:
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))

In [18]:
model.layers


Out[18]:
[<keras.engine.topology.InputLayer at 0x7f7c845e5da0>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845e5fd0>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845f71d0>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c845f7438>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845b0710>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845b00b8>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c845c5c18>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845819e8>,
 <keras.layers.convolutional.Conv2D at 0x7f7c84581438>,
 <keras.layers.convolutional.Conv2D at 0x7f7c84528c88>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c84553d30>,
 <keras.layers.convolutional.Conv2D at 0x7f7c844fb080>,
 <keras.layers.convolutional.Conv2D at 0x7f7c844fb6a0>,
 <keras.layers.convolutional.Conv2D at 0x7f7c8450c518>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c844a4f60>,
 <keras.layers.convolutional.Conv2D at 0x7f7c8445b630>,
 <keras.layers.convolutional.Conv2D at 0x7f7c8445b048>,
 <keras.layers.convolutional.Conv2D at 0x7f7c84487550>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c8442efd0>,
 <keras.models.Sequential at 0x7f7c8f0cb710>]

In [19]:
len(model.layers)


Out[19]:
20

In [20]:
first_conv_layer = model.layers[1]

In [21]:
first_conv_layer.trainable


Out[21]:
True

In [22]:
first_max_pool_layer = model.layers[3]
first_max_pool_layer.trainable


Out[22]:
True

In [23]:
# set the first 15 layers (everything before the last conv block)
# to non-trainable (their weights will not be updated),
# so the generic features are kept and we (hopefully) avoid overfitting
non_trainable_layers = model.layers[:15]

In [24]:
non_trainable_layers


Out[24]:
[<keras.engine.topology.InputLayer at 0x7f7c845e5da0>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845e5fd0>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845f71d0>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c845f7438>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845b0710>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845b00b8>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c845c5c18>,
 <keras.layers.convolutional.Conv2D at 0x7f7c845819e8>,
 <keras.layers.convolutional.Conv2D at 0x7f7c84581438>,
 <keras.layers.convolutional.Conv2D at 0x7f7c84528c88>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c84553d30>,
 <keras.layers.convolutional.Conv2D at 0x7f7c844fb080>,
 <keras.layers.convolutional.Conv2D at 0x7f7c844fb6a0>,
 <keras.layers.convolutional.Conv2D at 0x7f7c8450c518>,
 <keras.layers.pooling.MaxPooling2D at 0x7f7c844a4f60>]

In [25]:
for layer in non_trainable_layers:
    layer.trainable = False

In [26]:
first_max_pool_layer.trainable


Out[26]:
False

In [27]:
first_conv_layer.trainable


Out[27]:
False

In [28]:
# compile the model with an SGD/momentum optimizer
# and a very low learningning rate: updates stay small and non-adaptive,
# so we do not ruin the previously learned features
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

In [29]:
model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
_________________________________________________________________
sequential_1 (Sequential)    (None, 1)                 2097665   
=================================================================
Total params: 16,812,353
Trainable params: 9,177,089
Non-trainable params: 7,635,264
_________________________________________________________________
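The trainable/non-trainable split in the summary above can be cross-checked: only the three block5 conv layers and the top classifier remain trainable. A 3×3 conv layer with c_in input channels and c_out filters has 3·3·c_in·c_out weights plus c_out biases; `conv_params` is a hypothetical helper for this check:

```python
def conv_params(c_in, c_out, k=3):
    """Parameter count of a k x k conv layer: weights plus biases."""
    return k * k * c_in * c_out + c_out

block5 = 3 * conv_params(512, 512)               # three 512->512 conv layers
top = (4 * 4 * 512) * 256 + 256 + 256 * 1 + 1    # Flatten -> Dense(256) -> Dense(1)
print(block5 + top)  # 9177089 trainable params, as in the summary above
```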

In [30]:
# this might actually take a while, even on a GPU
# ~92-94% validation accuracy seems to be realistic
epochs = 50
batch_size = 16

In [31]:
# ... and visualize progress in TensorBoard to see what is going on
!rm -rf tf_log/
tb_callback = keras.callbacks.TensorBoard(log_dir='./tf_log')

In [32]:
# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')


Found 2000 images belonging to 2 classes.
Found 802 images belonging to 2 classes.


In [33]:
# due to the very small learning rate this converges slowly:
# ~30s per epoch on an AWS K80, so 50 epochs take ~30 minutes
# on a CPU it might take up to 20 times longer

# fine-tune the model
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[tb_callback])


Epoch 1/50
125/125 [==============================] - 34s - loss: 0.4091 - acc: 0.8625 - val_loss: 0.3187 - val_acc: 0.8862
Epoch 2/50
125/125 [==============================] - 33s - loss: 0.2789 - acc: 0.8970 - val_loss: 0.3067 - val_acc: 0.8715
Epoch 3/50
125/125 [==============================] - 33s - loss: 0.1814 - acc: 0.9245 - val_loss: 0.2941 - val_acc: 0.8944
Epoch 4/50
125/125 [==============================] - 33s - loss: 0.1672 - acc: 0.9350 - val_loss: 0.3766 - val_acc: 0.9033
Epoch 5/50
125/125 [==============================] - 33s - loss: 0.1244 - acc: 0.9520 - val_loss: 0.4618 - val_acc: 0.8969
Epoch 6/50
125/125 [==============================] - 33s - loss: 0.1533 - acc: 0.9510 - val_loss: 0.3068 - val_acc: 0.9008
Epoch 7/50
125/125 [==============================] - 33s - loss: 0.0992 - acc: 0.9595 - val_loss: 0.3304 - val_acc: 0.8944
Epoch 8/50
125/125 [==============================] - 33s - loss: 0.0752 - acc: 0.9730 - val_loss: 0.3888 - val_acc: 0.9173
Epoch 9/50
125/125 [==============================] - 33s - loss: 0.1049 - acc: 0.9645 - val_loss: 0.4106 - val_acc: 0.9237
Epoch 10/50
125/125 [==============================] - 33s - loss: 0.0755 - acc: 0.9725 - val_loss: 0.3562 - val_acc: 0.9109
Epoch 11/50
125/125 [==============================] - 33s - loss: 0.0620 - acc: 0.9775 - val_loss: 0.3737 - val_acc: 0.9224
Epoch 12/50
125/125 [==============================] - 33s - loss: 0.0565 - acc: 0.9790 - val_loss: 0.3597 - val_acc: 0.9249
Epoch 13/50
125/125 [==============================] - 33s - loss: 0.0488 - acc: 0.9840 - val_loss: 0.4069 - val_acc: 0.9186
Epoch 14/50
125/125 [==============================] - 33s - loss: 0.0409 - acc: 0.9870 - val_loss: 0.4732 - val_acc: 0.9173
Epoch 15/50
125/125 [==============================] - 33s - loss: 0.0487 - acc: 0.9815 - val_loss: 0.3588 - val_acc: 0.9288
Epoch 16/50
125/125 [==============================] - 33s - loss: 0.0470 - acc: 0.9805 - val_loss: 0.3906 - val_acc: 0.9262
Epoch 17/50
125/125 [==============================] - 33s - loss: 0.0358 - acc: 0.9880 - val_loss: 0.4580 - val_acc: 0.9148
Epoch 18/50
125/125 [==============================] - 33s - loss: 0.0374 - acc: 0.9875 - val_loss: 0.3674 - val_acc: 0.9375
Epoch 19/50
125/125 [==============================] - 33s - loss: 0.0379 - acc: 0.9865 - val_loss: 0.3847 - val_acc: 0.9173
Epoch 20/50
125/125 [==============================] - 33s - loss: 0.0196 - acc: 0.9940 - val_loss: 0.3979 - val_acc: 0.9313
Epoch 21/50
125/125 [==============================] - 33s - loss: 0.0499 - acc: 0.9850 - val_loss: 0.3059 - val_acc: 0.9300
Epoch 22/50
125/125 [==============================] - 33s - loss: 0.0235 - acc: 0.9910 - val_loss: 0.3640 - val_acc: 0.9338
Epoch 23/50
125/125 [==============================] - 33s - loss: 0.0258 - acc: 0.9880 - val_loss: 0.4132 - val_acc: 0.9351
Epoch 24/50
125/125 [==============================] - 33s - loss: 0.0403 - acc: 0.9900 - val_loss: 0.3711 - val_acc: 0.9237
Epoch 25/50
125/125 [==============================] - 33s - loss: 0.0191 - acc: 0.9915 - val_loss: 0.4115 - val_acc: 0.9338
Epoch 26/50
125/125 [==============================] - 33s - loss: 0.0212 - acc: 0.9915 - val_loss: 0.3551 - val_acc: 0.9377
Epoch 27/50
125/125 [==============================] - 33s - loss: 0.0369 - acc: 0.9890 - val_loss: 0.3160 - val_acc: 0.9351
Epoch 28/50
125/125 [==============================] - 33s - loss: 0.0267 - acc: 0.9920 - val_loss: 0.3869 - val_acc: 0.9300
Epoch 29/50
125/125 [==============================] - 33s - loss: 0.0480 - acc: 0.9870 - val_loss: 0.2798 - val_acc: 0.9313
Epoch 30/50
125/125 [==============================] - 33s - loss: 0.0181 - acc: 0.9950 - val_loss: 0.3320 - val_acc: 0.9326
Epoch 31/50
125/125 [==============================] - 33s - loss: 0.0230 - acc: 0.9955 - val_loss: 0.3076 - val_acc: 0.9313
Epoch 32/50
125/125 [==============================] - 33s - loss: 0.0200 - acc: 0.9940 - val_loss: 0.3829 - val_acc: 0.9389
Epoch 33/50
125/125 [==============================] - 33s - loss: 0.0224 - acc: 0.9920 - val_loss: 0.2956 - val_acc: 0.9402
Epoch 34/50
125/125 [==============================] - 33s - loss: 0.0129 - acc: 0.9965 - val_loss: 0.5277 - val_acc: 0.9059
Epoch 35/50
125/125 [==============================] - 33s - loss: 0.0176 - acc: 0.9945 - val_loss: 0.3572 - val_acc: 0.9387
Epoch 36/50
125/125 [==============================] - 33s - loss: 0.0083 - acc: 0.9975 - val_loss: 0.4379 - val_acc: 0.9351
Epoch 37/50
125/125 [==============================] - 33s - loss: 0.0196 - acc: 0.9945 - val_loss: 0.3765 - val_acc: 0.9211
Epoch 38/50
125/125 [==============================] - 33s - loss: 0.0126 - acc: 0.9965 - val_loss: 0.4627 - val_acc: 0.9249
Epoch 39/50
125/125 [==============================] - 33s - loss: 0.0094 - acc: 0.9975 - val_loss: 0.3406 - val_acc: 0.9313
Epoch 40/50
125/125 [==============================] - 33s - loss: 0.0111 - acc: 0.9970 - val_loss: 0.4549 - val_acc: 0.9262
Epoch 41/50
125/125 [==============================] - 33s - loss: 0.0075 - acc: 0.9975 - val_loss: 0.4611 - val_acc: 0.9249
Epoch 42/50
125/125 [==============================] - 33s - loss: 0.0139 - acc: 0.9955 - val_loss: 0.3249 - val_acc: 0.9453
Epoch 43/50
125/125 [==============================] - 33s - loss: 0.0151 - acc: 0.9945 - val_loss: 0.4354 - val_acc: 0.9300
Epoch 44/50
125/125 [==============================] - 33s - loss: 0.0053 - acc: 0.9990 - val_loss: 0.4268 - val_acc: 0.9415
Epoch 45/50
125/125 [==============================] - 33s - loss: 0.0101 - acc: 0.9975 - val_loss: 0.3936 - val_acc: 0.9364
Epoch 46/50
125/125 [==============================] - 33s - loss: 0.0032 - acc: 0.9990 - val_loss: 0.3701 - val_acc: 0.9415
Epoch 47/50
125/125 [==============================] - 33s - loss: 0.0043 - acc: 0.9990 - val_loss: 0.4462 - val_acc: 0.9288
Epoch 48/50
125/125 [==============================] - 33s - loss: 0.0030 - acc: 0.9985 - val_loss: 0.4536 - val_acc: 0.9249
Epoch 49/50
125/125 [==============================] - 33s - loss: 0.0131 - acc: 0.9950 - val_loss: 0.3224 - val_acc: 0.9453
Epoch 50/50
125/125 [==============================] - 33s - loss: 0.0032 - acc: 0.9990 - val_loss: 0.4493 - val_acc: 0.9288
Out[33]:
<keras.callbacks.History at 0x7f7c8428f048>
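The `125/125` per-epoch count in the log above follows directly from the settings: `steps_per_epoch = nb_train_samples // batch_size`, and the same floor division gives the number of validation steps:

```python
nb_train_samples = 2000
nb_validation_samples = 800
batch_size = 16

print(nb_train_samples // batch_size)       # 125 steps per training epoch
print(nb_validation_samples // batch_size)  # 50 validation steps
```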

In [34]:
model.save('models/cat-dog-vgg-retrain.hdf5')

In [ ]: