In this notebook, we are going to explore a deep neural network to classify MNIST dataset. We have picked Keras, which is high level wrapper over Theano/TensorFlow for this purpose.
We are using Theano backend for this particular exercise. *note: TensorFlow is also supported (as an alternative to Theano), but we stick with Theano to keep it simple. The main difference is that you'll need to reshape the data slightly differently before feeding it to your network.
In [1]:
from __future__ import print_function
import keras
# For MNIST dataset
from keras.datasets import mnist
# Keras model module
from keras.models import Sequential
# Keras core layers
from keras.layers import Dense, Dropout, Flatten
# Keras CNN Layers
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
from keras.utils import np_utils
# %matplotlib inline
import matplotlib.pyplot as plt
In [2]:
# Load MNSIT data
from keras.datasets import mnist
# Load pre-shuffled MNIST data into train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
We can look at the shape of the dataset:
In [3]:
print(x_train.shape)
# (60000, 28, 28)
num_classes = 10
In [4]:
# Plot a few example images
plt.subplot(221)
plt.imshow(x_train[0], cmap=plt.get_cmap('gray'))
plt.subplot(222)
plt.imshow(x_train[1], cmap=plt.get_cmap('gray'))
plt.subplot(223)
plt.imshow(x_train[2], cmap=plt.get_cmap('gray'))
plt.subplot(224)
plt.imshow(x_train[3], cmap=plt.get_cmap('gray'))
# show the plot
plt.show()
When using the Theano backend, you must explicitly declare a dimension for the depth of the input image. For example, a full-color image with all 3 RGB channels will have a depth of 3.
Our MNIST images only have a depth of 1, but we must explicitly declare that.
In other words, we want to transform our dataset to (n, depth, width, height).
Here's how we can do that easily:
In [5]:
# input image dimensions
img_rows, img_cols = 28, 28
# Reshape input data
if K.image_data_format() == 'channels_first':
x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
input_shape = (1, img_rows, img_cols)
else:
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)
In [6]:
# To confirm, we can print X_train's dimensions again:
print (x_train.shape)
The final preprocessing step for the input data is to convert our data type to float32 and normalize our data values to the range [0, 1].
In [7]:
# Convert data types and normalize values
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
In [8]:
print(y_train.shape)
That may be problematic. We should have 10 different classes, one for each digit, but it looks like we only have a 1-dimensional array. Let's take a look at the labels for the first 10 training samples:
In [9]:
print(y_train[:10])
In [10]:
# Convert 1-dimensional class arrays to 10-dimensional class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
In [11]:
print(y_train.shape)
In [12]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
activation='relu',
input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
In [13]:
# Check model output shape
print(model.output_shape)
In [14]:
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adadelta(),
metrics=['accuracy'])
In [15]:
batch_size = 128
epochs = 2
In [33]:
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
verbose=1, validation_data=(x_test, y_test))
Out[33]:
In [34]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
In [ ]: