MNIST Convolutional Neural Network

We are going to build a convolutional neural network to classify handwritten digits from the MNIST dataset. It will be a simple CNN with two convolutional layers followed by two dense layers (one hidden, one output). Later we will study more accurate sequential models.

For this notebook we are going to use Keras with the TensorFlow backend.


In [1]:
import tensorflow as tf
# We don't strictly need to import TensorFlow here, since Keras handles it
# as its backend, but we do so in order to print the version we are using.

In [2]:
tf.__version__


Out[2]:
'0.12.1'

We are using TensorFlow-GPU 0.12.1 on Python 3.5.2, running on Windows 10 with CUDA 8.0. We have three machines with the same environment and three different GPUs, with 384, 1024 and 1664 CUDA cores respectively.
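
If you want to verify that TensorFlow actually sees the GPU, a quick check (assuming device_lib is available in this TensorFlow build) is:

from tensorflow.python.client import device_lib
print([d.name for d in device_lib.list_local_devices()])  # a '/gpu:0' entry confirms GPU support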

Imports


In [3]:
from IPython.display import Image

from util import Util
u = Util()

import numpy as np
# Explicit random seed for reproducibility
np.random.seed(1337)


Using TensorFlow backend.

In [4]:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.utils import np_utils
from keras import backend as K

In [5]:
from keras.datasets import mnist

Definitions


In [6]:
batch_size = 512
nb_classes = 10
nb_epoch = 25
# path of the model graph
model_image_path = 'images/model_01_MNIST.png'

In [7]:
# input image dimensions
img_rows, img_cols = 28, 28
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
pool_size = (2, 2)
# convolution kernel size
kernel_size = (3, 3)
# dense layer size
dense_layer_size = 128

Data load


In [8]:
# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
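
It can be useful to peek at what load_data returns before any preprocessing: 28x28 grayscale images as uint8 arrays, with plain integer labels. For example:

print(X_train.shape, X_train.dtype)  # (60000, 28, 28) uint8
print(y_train.shape, y_train[:4])    # (60000,) [5 0 4 1]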

In [9]:
u.plot_images(X_train[0:9], y_train[0:9])



In [10]:
# add an explicit channel dimension, honoring the backend's image ordering:
# 'th' (Theano) expects channels first, 'tf' (TensorFlow) channels last
if K.image_dim_ordering() == 'th':
    X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
    X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
    X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

In [11]:
# cast to float and normalize pixel values from [0, 255] to [0, 1]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')


X_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples

In [12]:
# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
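
to_categorical turns each integer label into a one-hot vector of length nb_classes. For instance, the first training label is a 5, so its row has a single 1 in position 5:

print(y_train[0])  # 5
print(Y_train[0])  # [ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]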

Model definition


In [13]:
model = Sequential()

model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=input_shape, name='convolution_1_' + str(nb_filters) + '_filters'))
model.add(Activation('relu', name='activation_1_relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], name='convolution_2_' + str(nb_filters) + '_filters'))
model.add(Activation('relu', name='activation_2_relu'))
model.add(MaxPooling2D(pool_size=pool_size, name='max_pooling_1_' + str(pool_size) + '_pool_size'))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(dense_layer_size, name='fully_connected_1_' + str(dense_layer_size) + '_neurons'))
model.add(Activation('relu', name='activation_3_relu'))
model.add(Dropout(0.4))
model.add(Dense(nb_classes, name='output_' + str(nb_classes) + '_neurons'))
model.add(Activation('softmax', name='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

Image(u.maybe_save_network(model, model_image_path), width=300)


Out[13]:
[model architecture diagram]

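The diagram is rendered by the custom u.maybe_save_network helper (Keras' graph plotting typically needs pydot and graphviz installed). If it fails to render, model.summary() gives a plain-text overview of the same architecture; as a sanity check, the flattened layer should have 12 * 12 * 32 = 4608 units, since two valid 3x3 convolutions shrink 28x28 to 24x24 and 2x2 pooling halves that to 12x12.

model.summary()  # prints each layer's name, output shape and parameter count
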
Training and evaluation


In [14]:
history = model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
                    verbose=1, validation_data=(X_test, Y_test))

score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])

u.plot_history(history)
u.plot_history(history, metric='loss', loc='upper left')


Train on 60000 samples, validate on 10000 samples
Epoch 1/25
60000/60000 [==============================] - 9s - loss: 0.6352 - acc: 0.8063 - val_loss: 0.2043 - val_acc: 0.9390
Epoch 2/25
60000/60000 [==============================] - 7s - loss: 0.2021 - acc: 0.9399 - val_loss: 0.1150 - val_acc: 0.9641
Epoch 3/25
60000/60000 [==============================] - 7s - loss: 0.1392 - acc: 0.9587 - val_loss: 0.0813 - val_acc: 0.9736
Epoch 4/25
60000/60000 [==============================] - 7s - loss: 0.1112 - acc: 0.9668 - val_loss: 0.0617 - val_acc: 0.9808
Epoch 5/25
60000/60000 [==============================] - 7s - loss: 0.0935 - acc: 0.9718 - val_loss: 0.0558 - val_acc: 0.9828
Epoch 6/25
60000/60000 [==============================] - 7s - loss: 0.0823 - acc: 0.9756 - val_loss: 0.0491 - val_acc: 0.9841
Epoch 7/25
60000/60000 [==============================] - 7s - loss: 0.0737 - acc: 0.9777 - val_loss: 0.0424 - val_acc: 0.9863
Epoch 8/25
60000/60000 [==============================] - 7s - loss: 0.0682 - acc: 0.9791 - val_loss: 0.0444 - val_acc: 0.9857
Epoch 9/25
60000/60000 [==============================] - 7s - loss: 0.0623 - acc: 0.9813 - val_loss: 0.0367 - val_acc: 0.9872
Epoch 10/25
60000/60000 [==============================] - 7s - loss: 0.0583 - acc: 0.9822 - val_loss: 0.0353 - val_acc: 0.9895
Epoch 11/25
60000/60000 [==============================] - 7s - loss: 0.0552 - acc: 0.9833 - val_loss: 0.0327 - val_acc: 0.9892
Epoch 12/25
60000/60000 [==============================] - 7s - loss: 0.0503 - acc: 0.9849 - val_loss: 0.0329 - val_acc: 0.9888
Epoch 13/25
60000/60000 [==============================] - 7s - loss: 0.0478 - acc: 0.9851 - val_loss: 0.0304 - val_acc: 0.9893
Epoch 14/25
60000/60000 [==============================] - 7s - loss: 0.0449 - acc: 0.9863 - val_loss: 0.0292 - val_acc: 0.9900
Epoch 15/25
60000/60000 [==============================] - 7s - loss: 0.0418 - acc: 0.9870 - val_loss: 0.0319 - val_acc: 0.9899
Epoch 16/25
60000/60000 [==============================] - 7s - loss: 0.0419 - acc: 0.9872 - val_loss: 0.0276 - val_acc: 0.9905
Epoch 17/25
60000/60000 [==============================] - 7s - loss: 0.0396 - acc: 0.9881 - val_loss: 0.0287 - val_acc: 0.9899
Epoch 18/25
60000/60000 [==============================] - 7s - loss: 0.0368 - acc: 0.9884 - val_loss: 0.0273 - val_acc: 0.9907
Epoch 19/25
60000/60000 [==============================] - 7s - loss: 0.0357 - acc: 0.9889 - val_loss: 0.0278 - val_acc: 0.9910
Epoch 20/25
60000/60000 [==============================] - 7s - loss: 0.0346 - acc: 0.9892 - val_loss: 0.0271 - val_acc: 0.9908
Epoch 21/25
60000/60000 [==============================] - 7s - loss: 0.0337 - acc: 0.9890 - val_loss: 0.0259 - val_acc: 0.9913
Epoch 22/25
60000/60000 [==============================] - 7s - loss: 0.0315 - acc: 0.9901 - val_loss: 0.0250 - val_acc: 0.9922
Epoch 23/25
60000/60000 [==============================] - 7s - loss: 0.0320 - acc: 0.9899 - val_loss: 0.0313 - val_acc: 0.9899
Epoch 24/25
60000/60000 [==============================] - 7s - loss: 0.0290 - acc: 0.9906 - val_loss: 0.0257 - val_acc: 0.9914
Epoch 25/25
60000/60000 [==============================] - 7s - loss: 0.0290 - acc: 0.9905 - val_loss: 0.0260 - val_acc: 0.9915
Test score: 0.0260191444306
Test accuracy: 0.9915
dict_keys(['acc', 'loss', 'val_loss', 'val_acc'])
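
Since training takes a few minutes, it is worth persisting the trained model before experimenting further. A minimal sketch using the standard Keras API (the file names here are just our choice; saving weights to HDF5 requires the h5py package):

with open('model_01_MNIST.json', 'w') as f:
    f.write(model.to_json())             # serialize the architecture
model.save_weights('model_01_MNIST.h5')  # serialize the learned weights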

Inspecting the result


In [15]:
# The predict_classes function outputs the highest probability class
# according to the trained classifier for each input example.
predicted_classes = model.predict_classes(X_test)

# Check which items we got right / wrong
correct_indices = np.nonzero(predicted_classes == y_test)[0]
incorrect_indices = np.nonzero(predicted_classes != y_test)[0]


 9920/10000 [============================>.] - ETA: 0s
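
As a sanity check, the accuracy recomputed from these predictions should match the value reported by model.evaluate above:

print(np.mean(predicted_classes == y_test))  # expect ~0.9915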

In [16]:
u.plot_confusion_matrix(y_test, nb_classes, predicted_classes)


[[ 975    0    0    0    0    1    1    1    2    0]
 [   0 1133    1    0    0    0    1    0    0    0]
 [   2    2 1023    0    0    0    0    5    0    0]
 [   0    0    1 1003    0    3    0    1    1    1]
 [   0    0    1    0  977    0    0    0    0    4]
 [   1    0    0    3    0  887    1    0    0    0]
 [   6    2    0    0    2    4  944    0    0    0]
 [   0    1    3    1    0    0    0 1022    1    0]
 [   3    0    2    1    0    1    0    2  962    3]
 [   2    3    0    1    5    3    0    3    3  989]]
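
If you want the raw matrix without the plotting helper, it can be rebuilt directly with NumPy; rows are the true digits, columns the predicted ones (a minimal sketch):

cm = np.zeros((nb_classes, nb_classes), dtype=int)
for true, pred in zip(y_test, predicted_classes):
    cm[true, pred] += 1
print(cm)  # should reproduce the matrix above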

Examples of correct predictions


In [17]:
u.plot_images(X_test[correct_indices[:9]], y_test[correct_indices[:9]], 
              predicted_classes[correct_indices[:9]])


Examples of incorrect predictions


In [18]:
u.plot_images(X_test[incorrect_indices[:9]], y_test[incorrect_indices[:9]], 
              predicted_classes[incorrect_indices[:9]])
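
For the misclassified digits it is often informative to look at the full softmax output rather than just the arg-max; a close second-highest probability usually means the digit was genuinely ambiguous:

probabilities = model.predict(X_test[incorrect_indices[:9]])
print(np.round(probabilities, 2))  # one row of 10 class probabilities per example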


Results

After 25 epochs of about 7 seconds each (on our mid-range GPU), the model reaches over 99.1% accuracy on the test set. The best reported result on MNIST is around 0.21% error, so there is still room for improvement.
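
One common way to close part of that gap is data augmentation. A sketch using Keras' ImageDataGenerator (the parameters are illustrative, not tuned, and this cell was not run here):

from keras.preprocessing.image import ImageDataGenerator

# random small rotations and shifts make the classifier more robust
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1)
model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                    samples_per_epoch=len(X_train), nb_epoch=nb_epoch,
                    validation_data=(X_test, Y_test))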