CNN for MNIST digits classification

CNN with 3 Conv2D layers and a Dense(10) output for MNIST digits classification

  • First 2 layers - Conv2D-ReLU-MaxPool
  • 3rd layer - Conv2D-ReLU, followed by Flatten and Dropout
  • 4th layer - Dense(10)
  • Output activation - softmax
  • Optimizer - Adam

~99.2% test accuracy in 20 epochs


In [2]:
''' CNN MNIST digits classification
'''

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.utils import to_categorical, plot_model
from keras.datasets import mnist

# load mnist dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# compute the number of labels
num_labels = len(np.unique(y_train))

# convert to one-hot vector
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
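# e.g. label 5 becomes [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]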

# input image dimensions
image_size = x_train.shape[1]
# resize and normalize
x_train = np.reshape(x_train, [-1, image_size, image_size, 1])
x_test = np.reshape(x_test, [-1, image_size, image_size, 1])
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255
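
# at this point the data are 4D tensors as expected by Conv2D:
# x_train (60000, 28, 28, 1), x_test (10000, 28, 28, 1)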

# network parameters
# image is processed as is (square grayscale)
input_shape = (image_size, image_size, 1)
batch_size = 128
kernel_size = 3
pool_size = 2
filters = 64
dropout = 0.2
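
# with 'same' padding, each 2x2 max-pooling halves the feature maps:
# 28x28 -> 14x14 -> 7x7 (see the model summary below)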

# model is a stack of CNN-ReLU-MaxPooling
model = Sequential()
model.add(Conv2D(filters=filters,
                 kernel_size=kernel_size,
                 activation='relu',
                 padding='same',
                 input_shape=input_shape))
model.add(MaxPooling2D(pool_size))
model.add(Conv2D(filters=filters,
                 kernel_size=kernel_size,
                 padding='same',
                 activation='relu'))
model.add(MaxPooling2D(pool_size))
model.add(Conv2D(filters=filters,
                 kernel_size=kernel_size,
                 padding='same',
                 activation='relu'))
model.add(Flatten())
# dropout added as regularizer
model.add(Dropout(dropout))
# output layer is a 10-dim vector of class probabilities
model.add(Dense(num_labels))
model.add(Activation('softmax'))
model.summary()
#plot_model(model, to_file='cnn-mnist.png', show_shapes=True)

# loss function for one-hot vector
# use of adam optimizer
# accuracy is good metric for classification tasks
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# train the network
model.fit(x_train, y_train, epochs=20, batch_size=batch_size)

loss, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print("\nTest accuracy: %.1f%%" % (100.0 * acc))


Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 28, 28, 64)        640       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 64)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 14, 14, 64)        36928     
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 7, 7, 64)          0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 7, 7, 64)          36928     
_________________________________________________________________
flatten_2 (Flatten)          (None, 3136)              0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 3136)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                31370     
_________________________________________________________________
activation_2 (Activation)    (None, 10)                0         
=================================================================
Total params: 105,866
Trainable params: 105,866
Non-trainable params: 0
_________________________________________________________________
Epoch 1/20
60000/60000 [==============================] - 42s 697us/step - loss: 0.2128 - acc: 0.9352
Epoch 2/20
60000/60000 [==============================] - 42s 702us/step - loss: 0.0538 - acc: 0.9837
Epoch 3/20
60000/60000 [==============================] - 41s 688us/step - loss: 0.0385 - acc: 0.9878
Epoch 4/20
60000/60000 [==============================] - 41s 689us/step - loss: 0.0304 - acc: 0.9903
Epoch 5/20
60000/60000 [==============================] - 41s 689us/step - loss: 0.0258 - acc: 0.9919
Epoch 6/20
60000/60000 [==============================] - 41s 690us/step - loss: 0.0207 - acc: 0.9932
Epoch 7/20
60000/60000 [==============================] - 41s 686us/step - loss: 0.0182 - acc: 0.9942
Epoch 8/20
60000/60000 [==============================] - 41s 689us/step - loss: 0.0157 - acc: 0.9950
Epoch 9/20
60000/60000 [==============================] - 41s 686us/step - loss: 0.0134 - acc: 0.9957
Epoch 10/20
60000/60000 [==============================] - 41s 685us/step - loss: 0.0122 - acc: 0.9961
Epoch 11/20
60000/60000 [==============================] - 41s 685us/step - loss: 0.0119 - acc: 0.9961
Epoch 12/20
60000/60000 [==============================] - 41s 683us/step - loss: 0.0095 - acc: 0.9969
Epoch 13/20
60000/60000 [==============================] - 41s 687us/step - loss: 0.0089 - acc: 0.9969
Epoch 14/20
60000/60000 [==============================] - 42s 698us/step - loss: 0.0086 - acc: 0.9972
Epoch 15/20
60000/60000 [==============================] - 45s 744us/step - loss: 0.0070 - acc: 0.9975
Epoch 16/20
60000/60000 [==============================] - 41s 691us/step - loss: 0.0075 - acc: 0.9974
Epoch 17/20
60000/60000 [==============================] - 41s 688us/step - loss: 0.0062 - acc: 0.9977
Epoch 18/20
60000/60000 [==============================] - 41s 691us/step - loss: 0.0065 - acc: 0.9976
Epoch 19/20
60000/60000 [==============================] - 42s 693us/step - loss: 0.0047 - acc: 0.9986
Epoch 20/20
60000/60000 [==============================] - 41s 690us/step - loss: 0.0056 - acc: 0.9981
10000/10000 [==============================] - 1s 144us/step

Test accuracy: 99.2%
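
As a quick sanity check, a minimal inference sketch (assuming the trained model and the preprocessed x_test and y_test from the cell above are still in memory): model.predict returns the softmax probabilities, and np.argmax recovers the predicted digit.

In [ ]:
# minimal inference sketch: classify the first test digit
# assumes `model`, `x_test`, `y_test` from the training cell above
probs = model.predict(x_test[:1])      # shape (1, 10), softmax output
print("predicted digit:", np.argmax(probs[0]))
print("true digit:", np.argmax(y_test[0]))  # y_test is one-hot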
