Kaggle Dogs and Cats Image Identification Problem

Achieved 98.9% accuracy, averaged over two test sets. The data comes from the 25k images of the Kaggle cats vs. dogs competition: 16k images were used for training, 3k for validation, and 3k each for two test sets. Each set was balanced, 50% dogs, 50% cats. In the future I may further divide the test sets so that a mean and standard deviation of test-set accuracy can be calculated.
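The averaging described above can be sketched with NumPy. The accuracy values below are hypothetical placeholders standing in for the real per-set results, just to show the arithmetic:

```python
import numpy as np

# Hypothetical per-test-set accuracies (placeholders, not the actual results)
test_accs = np.array([0.989, 0.989])
print('mean accuracy: %.4f' % test_accs.mean())

# If each 3k test set were further split into, say, 6 chunks of 500 images,
# a mean and standard deviation could be reported instead of a single number:
chunk_accs = np.array([0.990, 0.986, 0.992, 0.988, 0.990, 0.988])  # hypothetical
print('mean: %.4f  std: %.4f' % (chunk_accs.mean(), chunk_accs.std(ddof=1)))
```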

TODO: Plot the training history to look for overfitting; with class and work commitments this will need to wait.

Note: The image size is smaller than the default (299x299) for the Xception base model because my GPU memory could not handle a full-size Xception model.

Set up imports


In [1]:
%matplotlib inline

import os
import numpy as np
import matplotlib.pyplot as plt

from keras.applications import Xception
from keras.preprocessing.image import ImageDataGenerator
from keras import models
from keras import layers
from keras import optimizers

import tensorflow as tf


C:\ProgramData\Anaconda3\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.

Get the train, validation, and two test data sets - the data was previously split by a Python script. The validation set is augmented so that it can effectively be doubled into a larger validation set. This works because each replicated image is randomly rotated, flipped, sheared, and shifted, and so is in a sense a 'different' image. Having two test data sets allows for some glimpse of the repeatability of the model on new data - in the future I may split these further so that a standard deviation of accuracy on the test sets can be determined.
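The split script itself is not shown in the notebook; a minimal sketch of how such a split might look, with file names and counts assumed from the description above (8k/1.5k/1.5k/1.5k per class), is:

```python
import random

def split_filenames(filenames, counts, seed=321):
    """Shuffle and partition filenames into consecutive groups of the given sizes."""
    rng = random.Random(seed)
    names = list(filenames)
    rng.shuffle(names)
    splits, start = [], 0
    for n in counts:
        splits.append(names[start:start + n])
        start += n
    return splits

# 12,500 cat images -> 8k train, 1.5k validation, 1.5k each for two test sets
cat_files = ['cat.%d.jpg' % i for i in range(12500)]
train, val, test, test2 = split_filenames(cat_files, [8000, 1500, 1500, 1500])
# Each group would then be copied into its directory with shutil.copy, e.g.
# into train/cats, validation/cats, test/cats, test2/cats; same for dogs.
```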


In [2]:
base_dir = r'C:\Users\Vette\Desktop\Regis\#MSDS686 Deep Learning\cats_dogs'

train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')
test_dir = os.path.join(base_dir, 'test')
test2_dir = os.path.join(base_dir, 'test2')

batch_size = 50
seed = 321

train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=30,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

validation_datagen = ImageDataGenerator(rescale=1./255,
                                        rotation_range=30,
                                        width_shift_range=0.2,
                                        height_shift_range=0.2,
                                        shear_range=0.2,
                                        zoom_range=0.2,
                                        horizontal_flip=True,
                                        fill_mode='nearest')

# Test images are only rescaled - no augmentation at test time
test_datagen = ImageDataGenerator(rescale=1./255)
test2_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(240, 240),
                                                    batch_size=batch_size,
                                                    class_mode='binary')

validation_generator = validation_datagen.flow_from_directory(validation_dir,
                                                              target_size=(240, 240),
                                                              batch_size=batch_size,
                                                              class_mode='binary')

test_generator = test_datagen.flow_from_directory(test_dir,
                                                  target_size=(240, 240),
                                                  batch_size=batch_size,
                                                  class_mode='binary')

test2_generator = test2_datagen.flow_from_directory(test2_dir,
                                                    target_size=(240, 240),
                                                    batch_size=batch_size,
                                                    class_mode='binary')


Found 16000 images belonging to 2 classes.
Found 3000 images belonging to 2 classes.
Found 3000 images belonging to 2 classes.
Found 3000 images belonging to 2 classes.

Set up the base model - I have had success on this problem with the Xception model. The convolutional base will be frozen for the first training phase, which trains only the added dense layers.


In [3]:
conv_base = Xception(weights='imagenet',
                     include_top=False,
                     input_shape=(240, 240, 3))
conv_base.summary()
conv_base.trainable = False  # freeze the base (the summary above was printed before freezing)


__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 240, 240, 3)  0                                            
__________________________________________________________________________________________________
block1_conv1 (Conv2D)           (None, 119, 119, 32) 864         input_1[0][0]                    
__________________________________________________________________________________________________
block1_conv1_bn (BatchNormaliza (None, 119, 119, 32) 128         block1_conv1[0][0]               
__________________________________________________________________________________________________
block1_conv1_act (Activation)   (None, 119, 119, 32) 0           block1_conv1_bn[0][0]            
__________________________________________________________________________________________________
block1_conv2 (Conv2D)           (None, 117, 117, 64) 18432       block1_conv1_act[0][0]           
__________________________________________________________________________________________________
block1_conv2_bn (BatchNormaliza (None, 117, 117, 64) 256         block1_conv2[0][0]               
__________________________________________________________________________________________________
block1_conv2_act (Activation)   (None, 117, 117, 64) 0           block1_conv2_bn[0][0]            
__________________________________________________________________________________________________
block2_sepconv1 (SeparableConv2 (None, 117, 117, 128 8768        block1_conv2_act[0][0]           
__________________________________________________________________________________________________
block2_sepconv1_bn (BatchNormal (None, 117, 117, 128 512         block2_sepconv1[0][0]            
__________________________________________________________________________________________________
block2_sepconv2_act (Activation (None, 117, 117, 128 0           block2_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block2_sepconv2 (SeparableConv2 (None, 117, 117, 128 17536       block2_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block2_sepconv2_bn (BatchNormal (None, 117, 117, 128 512         block2_sepconv2[0][0]            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 59, 59, 128)  8192        block1_conv2_act[0][0]           
__________________________________________________________________________________________________
block2_pool (MaxPooling2D)      (None, 59, 59, 128)  0           block2_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 59, 59, 128)  512         conv2d_1[0][0]                   
__________________________________________________________________________________________________
add_1 (Add)                     (None, 59, 59, 128)  0           block2_pool[0][0]                
                                                                 batch_normalization_1[0][0]      
__________________________________________________________________________________________________
block3_sepconv1_act (Activation (None, 59, 59, 128)  0           add_1[0][0]                      
__________________________________________________________________________________________________
block3_sepconv1 (SeparableConv2 (None, 59, 59, 256)  33920       block3_sepconv1_act[0][0]        
__________________________________________________________________________________________________
block3_sepconv1_bn (BatchNormal (None, 59, 59, 256)  1024        block3_sepconv1[0][0]            
__________________________________________________________________________________________________
block3_sepconv2_act (Activation (None, 59, 59, 256)  0           block3_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block3_sepconv2 (SeparableConv2 (None, 59, 59, 256)  67840       block3_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block3_sepconv2_bn (BatchNormal (None, 59, 59, 256)  1024        block3_sepconv2[0][0]            
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 30, 30, 256)  32768       add_1[0][0]                      
__________________________________________________________________________________________________
block3_pool (MaxPooling2D)      (None, 30, 30, 256)  0           block3_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 30, 30, 256)  1024        conv2d_2[0][0]                   
__________________________________________________________________________________________________
add_2 (Add)                     (None, 30, 30, 256)  0           block3_pool[0][0]                
                                                                 batch_normalization_2[0][0]      
__________________________________________________________________________________________________
block4_sepconv1_act (Activation (None, 30, 30, 256)  0           add_2[0][0]                      
__________________________________________________________________________________________________
block4_sepconv1 (SeparableConv2 (None, 30, 30, 728)  188672      block4_sepconv1_act[0][0]        
__________________________________________________________________________________________________
block4_sepconv1_bn (BatchNormal (None, 30, 30, 728)  2912        block4_sepconv1[0][0]            
__________________________________________________________________________________________________
block4_sepconv2_act (Activation (None, 30, 30, 728)  0           block4_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block4_sepconv2 (SeparableConv2 (None, 30, 30, 728)  536536      block4_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block4_sepconv2_bn (BatchNormal (None, 30, 30, 728)  2912        block4_sepconv2[0][0]            
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 15, 15, 728)  186368      add_2[0][0]                      
__________________________________________________________________________________________________
block4_pool (MaxPooling2D)      (None, 15, 15, 728)  0           block4_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 15, 15, 728)  2912        conv2d_3[0][0]                   
__________________________________________________________________________________________________
add_3 (Add)                     (None, 15, 15, 728)  0           block4_pool[0][0]                
                                                                 batch_normalization_3[0][0]      
__________________________________________________________________________________________________
block5_sepconv1_act (Activation (None, 15, 15, 728)  0           add_3[0][0]                      
__________________________________________________________________________________________________
block5_sepconv1 (SeparableConv2 (None, 15, 15, 728)  536536      block5_sepconv1_act[0][0]        
__________________________________________________________________________________________________
block5_sepconv1_bn (BatchNormal (None, 15, 15, 728)  2912        block5_sepconv1[0][0]            
__________________________________________________________________________________________________
block5_sepconv2_act (Activation (None, 15, 15, 728)  0           block5_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block5_sepconv2 (SeparableConv2 (None, 15, 15, 728)  536536      block5_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block5_sepconv2_bn (BatchNormal (None, 15, 15, 728)  2912        block5_sepconv2[0][0]            
__________________________________________________________________________________________________
block5_sepconv3_act (Activation (None, 15, 15, 728)  0           block5_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
block5_sepconv3 (SeparableConv2 (None, 15, 15, 728)  536536      block5_sepconv3_act[0][0]        
__________________________________________________________________________________________________
block5_sepconv3_bn (BatchNormal (None, 15, 15, 728)  2912        block5_sepconv3[0][0]            
__________________________________________________________________________________________________
add_4 (Add)                     (None, 15, 15, 728)  0           block5_sepconv3_bn[0][0]         
                                                                 add_3[0][0]                      
__________________________________________________________________________________________________
block6_sepconv1_act (Activation (None, 15, 15, 728)  0           add_4[0][0]                      
__________________________________________________________________________________________________
block6_sepconv1 (SeparableConv2 (None, 15, 15, 728)  536536      block6_sepconv1_act[0][0]        
__________________________________________________________________________________________________
block6_sepconv1_bn (BatchNormal (None, 15, 15, 728)  2912        block6_sepconv1[0][0]            
__________________________________________________________________________________________________
block6_sepconv2_act (Activation (None, 15, 15, 728)  0           block6_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block6_sepconv2 (SeparableConv2 (None, 15, 15, 728)  536536      block6_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block6_sepconv2_bn (BatchNormal (None, 15, 15, 728)  2912        block6_sepconv2[0][0]            
__________________________________________________________________________________________________
block6_sepconv3_act (Activation (None, 15, 15, 728)  0           block6_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
block6_sepconv3 (SeparableConv2 (None, 15, 15, 728)  536536      block6_sepconv3_act[0][0]        
__________________________________________________________________________________________________
block6_sepconv3_bn (BatchNormal (None, 15, 15, 728)  2912        block6_sepconv3[0][0]            
__________________________________________________________________________________________________
add_5 (Add)                     (None, 15, 15, 728)  0           block6_sepconv3_bn[0][0]         
                                                                 add_4[0][0]                      
__________________________________________________________________________________________________
block7_sepconv1_act (Activation (None, 15, 15, 728)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block7_sepconv1 (SeparableConv2 (None, 15, 15, 728)  536536      block7_sepconv1_act[0][0]        
__________________________________________________________________________________________________
block7_sepconv1_bn (BatchNormal (None, 15, 15, 728)  2912        block7_sepconv1[0][0]            
__________________________________________________________________________________________________
block7_sepconv2_act (Activation (None, 15, 15, 728)  0           block7_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block7_sepconv2 (SeparableConv2 (None, 15, 15, 728)  536536      block7_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block7_sepconv2_bn (BatchNormal (None, 15, 15, 728)  2912        block7_sepconv2[0][0]            
__________________________________________________________________________________________________
block7_sepconv3_act (Activation (None, 15, 15, 728)  0           block7_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
block7_sepconv3 (SeparableConv2 (None, 15, 15, 728)  536536      block7_sepconv3_act[0][0]        
__________________________________________________________________________________________________
block7_sepconv3_bn (BatchNormal (None, 15, 15, 728)  2912        block7_sepconv3[0][0]            
__________________________________________________________________________________________________
add_6 (Add)                     (None, 15, 15, 728)  0           block7_sepconv3_bn[0][0]         
                                                                 add_5[0][0]                      
__________________________________________________________________________________________________
block8_sepconv1_act (Activation (None, 15, 15, 728)  0           add_6[0][0]                      
__________________________________________________________________________________________________
block8_sepconv1 (SeparableConv2 (None, 15, 15, 728)  536536      block8_sepconv1_act[0][0]        
__________________________________________________________________________________________________
block8_sepconv1_bn (BatchNormal (None, 15, 15, 728)  2912        block8_sepconv1[0][0]            
__________________________________________________________________________________________________
block8_sepconv2_act (Activation (None, 15, 15, 728)  0           block8_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block8_sepconv2 (SeparableConv2 (None, 15, 15, 728)  536536      block8_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block8_sepconv2_bn (BatchNormal (None, 15, 15, 728)  2912        block8_sepconv2[0][0]            
__________________________________________________________________________________________________
block8_sepconv3_act (Activation (None, 15, 15, 728)  0           block8_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
block8_sepconv3 (SeparableConv2 (None, 15, 15, 728)  536536      block8_sepconv3_act[0][0]        
__________________________________________________________________________________________________
block8_sepconv3_bn (BatchNormal (None, 15, 15, 728)  2912        block8_sepconv3[0][0]            
__________________________________________________________________________________________________
add_7 (Add)                     (None, 15, 15, 728)  0           block8_sepconv3_bn[0][0]         
                                                                 add_6[0][0]                      
__________________________________________________________________________________________________
block9_sepconv1_act (Activation (None, 15, 15, 728)  0           add_7[0][0]                      
__________________________________________________________________________________________________
block9_sepconv1 (SeparableConv2 (None, 15, 15, 728)  536536      block9_sepconv1_act[0][0]        
__________________________________________________________________________________________________
block9_sepconv1_bn (BatchNormal (None, 15, 15, 728)  2912        block9_sepconv1[0][0]            
__________________________________________________________________________________________________
block9_sepconv2_act (Activation (None, 15, 15, 728)  0           block9_sepconv1_bn[0][0]         
__________________________________________________________________________________________________
block9_sepconv2 (SeparableConv2 (None, 15, 15, 728)  536536      block9_sepconv2_act[0][0]        
__________________________________________________________________________________________________
block9_sepconv2_bn (BatchNormal (None, 15, 15, 728)  2912        block9_sepconv2[0][0]            
__________________________________________________________________________________________________
block9_sepconv3_act (Activation (None, 15, 15, 728)  0           block9_sepconv2_bn[0][0]         
__________________________________________________________________________________________________
block9_sepconv3 (SeparableConv2 (None, 15, 15, 728)  536536      block9_sepconv3_act[0][0]        
__________________________________________________________________________________________________
block9_sepconv3_bn (BatchNormal (None, 15, 15, 728)  2912        block9_sepconv3[0][0]            
__________________________________________________________________________________________________
add_8 (Add)                     (None, 15, 15, 728)  0           block9_sepconv3_bn[0][0]         
                                                                 add_7[0][0]                      
__________________________________________________________________________________________________
block10_sepconv1_act (Activatio (None, 15, 15, 728)  0           add_8[0][0]                      
__________________________________________________________________________________________________
block10_sepconv1 (SeparableConv (None, 15, 15, 728)  536536      block10_sepconv1_act[0][0]       
__________________________________________________________________________________________________
block10_sepconv1_bn (BatchNorma (None, 15, 15, 728)  2912        block10_sepconv1[0][0]           
__________________________________________________________________________________________________
block10_sepconv2_act (Activatio (None, 15, 15, 728)  0           block10_sepconv1_bn[0][0]        
__________________________________________________________________________________________________
block10_sepconv2 (SeparableConv (None, 15, 15, 728)  536536      block10_sepconv2_act[0][0]       
__________________________________________________________________________________________________
block10_sepconv2_bn (BatchNorma (None, 15, 15, 728)  2912        block10_sepconv2[0][0]           
__________________________________________________________________________________________________
block10_sepconv3_act (Activatio (None, 15, 15, 728)  0           block10_sepconv2_bn[0][0]        
__________________________________________________________________________________________________
block10_sepconv3 (SeparableConv (None, 15, 15, 728)  536536      block10_sepconv3_act[0][0]       
__________________________________________________________________________________________________
block10_sepconv3_bn (BatchNorma (None, 15, 15, 728)  2912        block10_sepconv3[0][0]           
__________________________________________________________________________________________________
add_9 (Add)                     (None, 15, 15, 728)  0           block10_sepconv3_bn[0][0]        
                                                                 add_8[0][0]                      
__________________________________________________________________________________________________
block11_sepconv1_act (Activatio (None, 15, 15, 728)  0           add_9[0][0]                      
__________________________________________________________________________________________________
block11_sepconv1 (SeparableConv (None, 15, 15, 728)  536536      block11_sepconv1_act[0][0]       
__________________________________________________________________________________________________
block11_sepconv1_bn (BatchNorma (None, 15, 15, 728)  2912        block11_sepconv1[0][0]           
__________________________________________________________________________________________________
block11_sepconv2_act (Activatio (None, 15, 15, 728)  0           block11_sepconv1_bn[0][0]        
__________________________________________________________________________________________________
block11_sepconv2 (SeparableConv (None, 15, 15, 728)  536536      block11_sepconv2_act[0][0]       
__________________________________________________________________________________________________
block11_sepconv2_bn (BatchNorma (None, 15, 15, 728)  2912        block11_sepconv2[0][0]           
__________________________________________________________________________________________________
block11_sepconv3_act (Activatio (None, 15, 15, 728)  0           block11_sepconv2_bn[0][0]        
__________________________________________________________________________________________________
block11_sepconv3 (SeparableConv (None, 15, 15, 728)  536536      block11_sepconv3_act[0][0]       
__________________________________________________________________________________________________
block11_sepconv3_bn (BatchNorma (None, 15, 15, 728)  2912        block11_sepconv3[0][0]           
__________________________________________________________________________________________________
add_10 (Add)                    (None, 15, 15, 728)  0           block11_sepconv3_bn[0][0]        
                                                                 add_9[0][0]                      
__________________________________________________________________________________________________
block12_sepconv1_act (Activatio (None, 15, 15, 728)  0           add_10[0][0]                     
__________________________________________________________________________________________________
block12_sepconv1 (SeparableConv (None, 15, 15, 728)  536536      block12_sepconv1_act[0][0]       
__________________________________________________________________________________________________
block12_sepconv1_bn (BatchNorma (None, 15, 15, 728)  2912        block12_sepconv1[0][0]           
__________________________________________________________________________________________________
block12_sepconv2_act (Activatio (None, 15, 15, 728)  0           block12_sepconv1_bn[0][0]        
__________________________________________________________________________________________________
block12_sepconv2 (SeparableConv (None, 15, 15, 728)  536536      block12_sepconv2_act[0][0]       
__________________________________________________________________________________________________
block12_sepconv2_bn (BatchNorma (None, 15, 15, 728)  2912        block12_sepconv2[0][0]           
__________________________________________________________________________________________________
block12_sepconv3_act (Activatio (None, 15, 15, 728)  0           block12_sepconv2_bn[0][0]        
__________________________________________________________________________________________________
block12_sepconv3 (SeparableConv (None, 15, 15, 728)  536536      block12_sepconv3_act[0][0]       
__________________________________________________________________________________________________
block12_sepconv3_bn (BatchNorma (None, 15, 15, 728)  2912        block12_sepconv3[0][0]           
__________________________________________________________________________________________________
add_11 (Add)                    (None, 15, 15, 728)  0           block12_sepconv3_bn[0][0]        
                                                                 add_10[0][0]                     
__________________________________________________________________________________________________
block13_sepconv1_act (Activatio (None, 15, 15, 728)  0           add_11[0][0]                     
__________________________________________________________________________________________________
block13_sepconv1 (SeparableConv (None, 15, 15, 728)  536536      block13_sepconv1_act[0][0]       
__________________________________________________________________________________________________
block13_sepconv1_bn (BatchNorma (None, 15, 15, 728)  2912        block13_sepconv1[0][0]           
__________________________________________________________________________________________________
block13_sepconv2_act (Activatio (None, 15, 15, 728)  0           block13_sepconv1_bn[0][0]        
__________________________________________________________________________________________________
block13_sepconv2 (SeparableConv (None, 15, 15, 1024) 752024      block13_sepconv2_act[0][0]       
__________________________________________________________________________________________________
block13_sepconv2_bn (BatchNorma (None, 15, 15, 1024) 4096        block13_sepconv2[0][0]           
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 8, 8, 1024)   745472      add_11[0][0]                     
__________________________________________________________________________________________________
block13_pool (MaxPooling2D)     (None, 8, 8, 1024)   0           block13_sepconv2_bn[0][0]        
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 8, 8, 1024)   4096        conv2d_4[0][0]                   
__________________________________________________________________________________________________
add_12 (Add)                    (None, 8, 8, 1024)   0           block13_pool[0][0]               
                                                                 batch_normalization_4[0][0]      
__________________________________________________________________________________________________
block14_sepconv1 (SeparableConv (None, 8, 8, 1536)   1582080     add_12[0][0]                     
__________________________________________________________________________________________________
block14_sepconv1_bn (BatchNorma (None, 8, 8, 1536)   6144        block14_sepconv1[0][0]           
__________________________________________________________________________________________________
block14_sepconv1_act (Activatio (None, 8, 8, 1536)   0           block14_sepconv1_bn[0][0]        
__________________________________________________________________________________________________
block14_sepconv2 (SeparableConv (None, 8, 8, 2048)   3159552     block14_sepconv1_act[0][0]       
__________________________________________________________________________________________________
block14_sepconv2_bn (BatchNorma (None, 8, 8, 2048)   8192        block14_sepconv2[0][0]           
__________________________________________________________________________________________________
block14_sepconv2_act (Activatio (None, 8, 8, 2048)   0           block14_sepconv2_bn[0][0]        
==================================================================================================
Total params: 20,861,480
Trainable params: 20,806,952
Non-trainable params: 54,528
__________________________________________________________________________________________________

Build the model: the Xception convolutional base topped with a small dense classifier head.


In [4]:
def build_model():
    # Stack the Xception convolutional base with a small dense classifier head.
    model = models.Sequential()
    model.add(conv_base)
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation='relu'))
    model.add(layers.Dropout(0.2))  # light regularization on the wide dense layer
    model.add(layers.Dense(32, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))  # binary output: dog vs. cat
    # Low learning rate so later fine-tuning does not wreck the pre-trained weights.
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizers.RMSprop(lr=2e-5),
                  metrics=['acc'])
    return model
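Given the base model's final feature map of (8, 8, 2048) shown in the summary above, the parameter count of the added dense head can be checked by hand. This is a quick sanity calculation, not part of the notebook:

```python
# Flattened feature vector from the (8, 8, 2048) Xception output.
flat = 8 * 8 * 2048                  # 131,072 features

# Dense layers: weights = inputs * units, plus one bias per unit.
dense_256 = flat * 256 + 256         # 33,554,688 parameters
dense_32 = 256 * 32 + 32             # 8,224 parameters
dense_1 = 32 * 1 + 1                 # 33 parameters

head_params = dense_256 + dense_32 + dense_1
print(head_params)                   # 33,562,945 trainable parameters in the head
```

The Dropout layer adds no parameters, so the head's size is dominated by the first dense layer after the Flatten.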

Pre-train the added dense layers. Set workers to a reasonable number for the CPU; on an 8-core, 16-thread Ryzen 7 we could go higher, but 10 proved sufficient. Note that this is set up to run Keras / TensorFlow on a GPU.


In [10]:
with tf.device('/gpu:0'):
    np.random.seed(seed)
    model = build_model()
    print('Pre-train dense layers')
    history = model.fit_generator(train_generator,
                                  steps_per_epoch=160,
                                  epochs=8,
                                  validation_data=validation_generator,
                                  validation_steps=30,
                                  verbose=1,
                                  workers=10)


Pre-train dense layers
Epoch 1/8
160/160 [==============================] - 62s 386ms/step - loss: 0.0678 - acc: 0.9801 - val_loss: 0.0628 - val_acc: 0.9800
Epoch 2/8
160/160 [==============================] - 54s 338ms/step - loss: 0.0427 - acc: 0.9835 - val_loss: 0.0679 - val_acc: 0.9790
Epoch 3/8
160/160 [==============================] - 54s 335ms/step - loss: 0.0379 - acc: 0.9849 - val_loss: 0.0758 - val_acc: 0.9820
Epoch 4/8
160/160 [==============================] - 55s 343ms/step - loss: 0.0423 - acc: 0.9844 - val_loss: 0.0639 - val_acc: 0.9817
Epoch 5/8
160/160 [==============================] - 54s 338ms/step - loss: 0.0356 - acc: 0.9863 - val_loss: 0.0687 - val_acc: 0.9803
Epoch 6/8
160/160 [==============================] - 54s 335ms/step - loss: 0.0328 - acc: 0.9884 - val_loss: 0.0751 - val_acc: 0.9813
Epoch 7/8
160/160 [==============================] - 54s 336ms/step - loss: 0.0313 - acc: 0.9886 - val_loss: 0.1024 - val_acc: 0.9767
Epoch 8/8
160/160 [==============================] - 54s 336ms/step - loss: 0.0358 - acc: 0.9881 - val_loss: 0.0693 - val_acc: 0.9793

Unfreeze the last few layers of the base model (block 13 onward) while keeping the earlier layers frozen, preserving most of the pre-trained weights.


In [11]:
conv_base.trainable = True

# Freeze everything before block13; unfreeze block13 and later.
set_trainable = False
for layer in conv_base.layers:
    if 'block13' in layer.name:
        set_trainable = True
    layer.trainable = set_trainable
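The unfreeze sweep above can be sanity-checked on plain layer names without touching Keras. A toy sketch, using a few representative names from the model summary:

```python
# Simulate the freeze/unfreeze sweep over a few representative layer names.
layer_names = ['block12_sepconv1', 'block13_sepconv1', 'block13_pool',
               'block14_sepconv1', 'block14_sepconv2']

trainable_flags = {}
set_trainable = False
for name in layer_names:
    if 'block13' in name:
        set_trainable = True
    trainable_flags[name] = set_trainable

print(trainable_flags)
# Everything before block13 stays frozen; block13 and later will train.
```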

Train the model, now fine-tuning both the dense layers and the last few layers of the base Xception model.


In [12]:
with tf.device('/gpu:0'):
    print('Train Model')
    np.random.seed(seed)
    model = build_model()
    history = model.fit_generator(train_generator,
                                  steps_per_epoch=320,
                                  epochs=20,
                                  validation_data=validation_generator,
                                  validation_steps=60,
                                  verbose=1,
                                  initial_epoch=8,
                                  workers=10)


Train Model
Epoch 9/20
320/320 [==============================] - 103s 321ms/step - loss: 0.0509 - acc: 0.9833 - val_loss: 0.0548 - val_acc: 0.9837
Epoch 10/20
320/320 [==============================] - 95s 296ms/step - loss: 0.0372 - acc: 0.9869 - val_loss: 0.0683 - val_acc: 0.9827
Epoch 11/20
320/320 [==============================] - 95s 298ms/step - loss: 0.0359 - acc: 0.9870 - val_loss: 0.0933 - val_acc: 0.9787
Epoch 12/20
320/320 [==============================] - 95s 296ms/step - loss: 0.0281 - acc: 0.9901 - val_loss: 0.0882 - val_acc: 0.9807
Epoch 13/20
320/320 [==============================] - 95s 297ms/step - loss: 0.0281 - acc: 0.9891 - val_loss: 0.0965 - val_acc: 0.9777
Epoch 14/20
320/320 [==============================] - 95s 297ms/step - loss: 0.0261 - acc: 0.9908 - val_loss: 0.0904 - val_acc: 0.9790
Epoch 15/20
320/320 [==============================] - 95s 296ms/step - loss: 0.0261 - acc: 0.9909 - val_loss: 0.1010 - val_acc: 0.9793
Epoch 16/20
320/320 [==============================] - 95s 297ms/step - loss: 0.0267 - acc: 0.9912 - val_loss: 0.1087 - val_acc: 0.9760
Epoch 17/20
320/320 [==============================] - 95s 298ms/step - loss: 0.0234 - acc: 0.9925 - val_loss: 0.1247 - val_acc: 0.9793
Epoch 18/20
320/320 [==============================] - 95s 297ms/step - loss: 0.0229 - acc: 0.9924 - val_loss: 0.1149 - val_acc: 0.9760
Epoch 19/20
320/320 [==============================] - 96s 299ms/step - loss: 0.0236 - acc: 0.9927 - val_loss: 0.1167 - val_acc: 0.9757
Epoch 20/20
320/320 [==============================] - 95s 297ms/step - loss: 0.0232 - acc: 0.9923 - val_loss: 0.0929 - val_acc: 0.9820
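The TODO in the introduction mentions plotting history to look for overfitting; even without a plot, the logged losses above already tell the story. A quick numeric check, with values copied from the fine-tuning log (`history.history` would supply them directly):

```python
# Losses per fine-tuning epoch (epochs 9-20, copied from the log above).
val_loss = [0.0548, 0.0683, 0.0933, 0.0882, 0.0965, 0.0904,
            0.1010, 0.1087, 0.1247, 0.1149, 0.1167, 0.0929]
train_loss = [0.0509, 0.0372, 0.0359, 0.0281, 0.0281, 0.0261,
              0.0261, 0.0267, 0.0234, 0.0229, 0.0236, 0.0232]

best_epoch = 9 + val_loss.index(min(val_loss))  # epoch numbering starts at 9
print(best_epoch)  # 9: validation loss bottoms out immediately
```

Validation loss is lowest at the first fine-tuning epoch and generally drifts upward while training loss keeps falling, which is the classic overfitting signature; early stopping or a model checkpoint on `val_loss` would likely help here.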

Score the model on the two previously unseen test sets.


In [13]:
with tf.device('/gpu:0'):
    scores = model.evaluate_generator(test_generator, workers=8)
    print('#1 Loss, Accuracy: ', scores)
    scores = model.evaluate_generator(test2_generator, workers=8)
    print('#2 Loss, Accuracy: ', scores)


#1 Loss, Accuracy:  [0.07068402187154087, 0.9886666715145112]
#2 Loss, Accuracy:  [0.05071193921592491, 0.9903333365917206]
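The headline 98.9% figure is the mean of the two test-set accuracies. The introduction also mentions wanting a standard deviation; with only two sets this is a very rough spread estimate, but the calculation is straightforward (values copied from the printed scores):

```python
import numpy as np

# Accuracies from the two evaluate_generator calls above.
accs = np.array([0.9886666715145112, 0.9903333365917206])

print(accs.mean())  # ~0.9895, i.e. 98.9% when rounded
print(accs.std())   # spread between the two sets (n=2, so treat with caution)
```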