In [1]:
import keras
print(f"Keras Version: {keras.__version__}")
import tensorflow as tf
print(f"Tensorflow Version {tf.__version__}")
Keras is a high-level wrapper (API) for Tensorflow and Theano which aims to make them easier to use. Tensorflow gets quite verbose and there is a lot of detail to handle; Keras tries to abstract that away behind sane defaults, while still allowing you to tinker with the tensors where wanted.
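To give a quick flavour of that difference, here's a single dense layer written both ways. This is just a minimal sketch assuming the TF 1.x-style API that this Keras version sits on top of; it isn't used below:
In [ ]:
# raw Tensorflow: you manage the placeholders, weights and matmul yourself
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, W) + b
# Keras: the same layer with sane defaults in one line
dense = keras.layers.Dense(10, input_shape=(784,))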
To get a feel for Keras, I'm seeing how it goes with MNIST.
Keras already has some datasets included, so I'm using the ever-popular MNIST:
MNIST database of handwritten digits
Dataset of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images.
In [15]:
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Checking the data:
In [3]:
f"Shapes x_train: {x_train.shape}, y_train: {y_train.shape}, x_test: {x_test.shape}, y_test: {y_test.shape}"
Out[3]:
The train and test images are 28x28 pixels. A super simple dense NN would need each image reshaped into a 1D vector; the CNN we build below instead keeps the 2D structure and only needs an extra channel dimension added.
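For reference, a quick numpy-only sketch of those two shapes (just illustrating the reshapes; the real preprocessing happens further down):
In [ ]:
# flattened: each 28x28 image becomes one 784-long vector
print(x_train.reshape(x_train.shape[0], 28 * 28).shape)    # (60000, 784)
# kept 2D with a channel axis added, which is what the CNN below uses
print(x_train.reshape(x_train.shape[0], 28, 28, 1).shape)  # (60000, 28, 28, 1)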
Now, it's always a good idea to eyeball the data, so here goes:
In [4]:
# min to max values in x_train
x_train.min(), x_train.max()
Out[4]:
In [5]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
fig, axes = plt.subplots(3, 3, figsize=(5, 5))
for i, ax in enumerate(axes.flatten()):
    ax.imshow(x_train[i])
    ax.set_title(f"Class {y_train[i]}")
    ax.set_xticks([])
    ax.set_yticks([])
In [8]:
y_train[:9]
Out[8]:
We still need to normalize the data: the pixel values run from 0 to 255, so dividing by 255 scales them to the [0, 1] range:
In [16]:
X_train = x_train.astype('float32') / 255
X_test = x_test.astype('float32') / 255
X_train.min(), X_train.max()
Out[16]:
Keras expects images to have a depth (channel) dimension. For colour images this holds the separate colour channels, e.g. the R, G and B values; the MNIST data is greyscale and so has no explicit depth, but Keras still needs one specified, so we assign it a depth of 1. We know each image is 28x28, so below we are just adding that 1:
In [17]:
print(X_train.shape)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
print(X_train.shape)
Moving on to the image labels: these are stored as a simple numpy array, with each entry telling us which digit the corresponding drawing is. Since our NN will spit out a prediction of the likelihood of each digit, it will work better with the y data one-hot encoded.
In [18]:
print("Existing image labels")
print(f"y_train: {y_train[:10]} | y_test: {y_test[:10]}")
from keras.utils import np_utils
Y_train = np_utils.to_categorical(y_train)
Y_test = np_utils.to_categorical(y_test)
print(f"Y_Train encoded: {Y_train[0]}")
print(f"Y_test encoded: {Y_test[0]}")
In [35]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.callbacks import EarlyStopping
# stop training once val_loss hasn't improved for 2 epochs in a row
early_stopping = EarlyStopping(monitor='val_loss', patience=2, verbose=1)
model = Sequential()
model.add(Conv2D(16,(2,2), input_shape=(28,28,1), activation='relu'))
model.add(Dropout(0.05))
model.add(Conv2D(32, (2, 2), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.05))
# the weights from the Conv2D layer have to be made 1D for the Dense layer
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.05))
model.add(Dense(10))
model.add(Activation('softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# we can either use part of the training set as validation data or provide a validation set
history = model.fit(X_train, Y_train, epochs=5, batch_size=128, shuffle=True,
                    validation_split=0.05, callbacks=[early_stopping])
#model.fit(X_train, Y_train, epochs=10, batch_size=128, shuffle=True, validation_data=(X_test,Y_test))
In [41]:
model.evaluate(X_test, Y_test, batch_size=256)
Out[41]:
and voilà, this pretty simple CNN gets over 98% accuracy!
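To see what that looks like per image, we can peek at the softmax output for a few test samples (model.predict returns the 10 class probabilities for each image):
In [ ]:
probs = model.predict(X_test[:5])
for i, p in enumerate(probs):
    # argmax over the 10 softmax outputs gives the predicted digit
    print(f"true: {y_test[i]}  predicted: {p.argmax()}  confidence: {p.max():.3f}")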
In [37]:
model.summary()
In [40]:
fig, axs = plt.subplots(1,2,figsize=(15,5))
acc = axs[0]
acc.plot(history.history['val_acc'])
acc.plot(history.history['acc'])
acc.legend(['val_acc', 'acc'])
acc.set_title('Model Accuracy')
acc.set_ylabel('Accuracy')
acc.set_xlabel('Epoch')
loss = axs[1]
loss.plot(history.history['val_loss'])
loss.plot(history.history['loss'])
loss.legend(['val_loss', 'loss'])
loss.set_title('Model Loss')
loss.set_ylabel('Loss')
loss.set_xlabel('Epoch')
plt.show();