Keras exists to make coding deep neural networks simpler. To demonstrate just how easy it is, you’re going to use Keras to build a convolutional neural network in a few dozen lines of code.
You’ll be connecting the concepts from the previous lessons to the methods that Keras provides.
In [1]:
from keras.datasets import cifar10
# Load the CIFAR-10 training and test data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
print('Training and Test data downloaded.')
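As a quick sanity check: CIFAR-10 contains 50,000 training images and 10,000 test images, each a 32x32 RGB array. A minimal sketch to verify the shapes of what load_data() returned:
# Sanity check: X_train should be (50000, 32, 32, 3) and y_train (50000, 1);
# X_test should be (10000, 32, 32, 3) and y_test (10000, 1)
print('X_train:', X_train.shape, 'y_train:', y_train.shape)
print('X_test:', X_test.shape, 'y_test:', y_test.shape)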
Here are the steps you'll take to build the network:
1. Load the training data.
2. Preprocess the data (shuffle, normalize, and one-hot encode it).
3. Build a feedforward neural network and train it.
4. Add convolutional, pooling, and dropout layers and retrain.
5. Evaluate the final model on the test data.
Keep an eye on the network's accuracy over time. Once the accuracy clears the thresholds in the tests below, you can be confident that you've built and trained an effective model.
In [2]:
import pickle
import numpy as np
import math
# Workaround for a compatibility issue between this Keras version and TensorFlow
import tensorflow as tf
tf.python.control_flow_ops = tf
print('Modules loaded.')
Hint: You can use the scikit-learn shuffle function to shuffle the data.
In [3]:
# TODO: Shuffle the data
from sklearn.utils import shuffle
X_train, y_train = shuffle(X_train, y_train)
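If you want the shuffle to be reproducible from run to run, shuffle also accepts a random_state argument. A sketch (the seed value 0 is an arbitrary choice):
# Optional: seed the shuffle so reruns produce the same ordering
X_train, y_train = shuffle(X_train, y_train, random_state=0)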
Hint: You solved this in TensorFlow lab Problem 1.
In [4]:
# TODO: Normalize the data features to the variable X_normalized
def normalize(image_data):
    """Min-max scale pixel values from [0, 255] to [-0.5, 0.5]."""
    a = -0.5
    b = 0.5
    color_min = 0.0
    color_max = 255.0
    return a + ((image_data - color_min) * (b - a)) / (color_max - color_min)
X_normalized = normalize(X_train)
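As a quick spot-check of the formula: a pixel value of 0 should map to -0.5, 127.5 to 0, and 255 to 0.5. A minimal sketch:
# Verify the min-max scaling on known pixel values
import numpy as np
sample = np.array([0.0, 127.5, 255.0])
print(normalize(sample))  # expected: [-0.5  0.   0.5]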
Hint: You can use the scikit-learn LabelBinarizer function to one-hot encode the labels.
In [5]:
# TODO: One Hot encode the labels to the variable y_one_hot
from sklearn.preprocessing import LabelBinarizer
label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_train)
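For example, with the ten CIFAR-10 classes, the label 3 becomes a length-10 vector with a 1 at index 3. A quick sketch using the binarizer fitted above:
# Spot-check the encoding: label 3 -> one row [0 0 0 1 0 0 0 0 0 0]
print(label_binarizer.transform([3]))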
from keras.models import Sequential
# Create the Sequential model
model = Sequential()
The keras.models.Sequential class is a wrapper for the neural network model. Just like many of the class models in scikit-learn, it provides common functions like fit(), evaluate(), and compile(). We'll cover these functions as we get to them. Let's start looking at the layers of the model.
A Keras layer is just like a neural network layer. It can be fully connected, max pool, activation, etc. You can add a layer to the model using the model's add() function. For example, a simple model would look like this:
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten
# Create the Sequential model
model = Sequential()
# 1st Layer - Add a flatten layer
model.add(Flatten(input_shape=(32, 32, 3)))
# 2nd Layer - Add a fully connected layer
model.add(Dense(100))
# 3rd Layer - Add a ReLU activation layer
model.add(Activation('relu'))
# 4th Layer - Add a fully connected layer
model.add(Dense(60))
# 5th Layer - Add a ReLU activation layer
model.add(Activation('relu'))
Keras will automatically infer the shape of all layers after the first layer. This means you only have to set the input dimensions for the first layer.
The first layer from above, model.add(Flatten(input_shape=(32, 32, 3))), sets the input dimensions to (32, 32, 3) and the output dimensions to (3072), since 32 * 32 * 3 = 3072. The second layer takes in the output of the first layer and sets its output dimensions to (100). This chain of passing output to the next layer continues until the last layer, which is the output of the model.
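You can confirm the inferred shapes yourself: in this version of Keras each layer exposes an output_shape property, and model.summary() prints the whole chain. A sketch against the example model above (None is the batch dimension):
# Print the shape Keras inferred for each layer;
# the Flatten layer should report (None, 3072) and the first Dense layer (None, 100)
for layer in model.layers:
    print(layer.name, layer.output_shape)
model.summary()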
Build a multi-layer feedforward neural network to classify the CIFAR-10 images.
Set the first layer to a Flatten layer with the input_shape set to (32, 32, 3).
Set the second layer to a Dense layer with an output width of 128.
To get started, review the Keras documentation about models and layers.
The Keras example of a Multi-Layer Perceptron network is similar to what you need to do here. Use that as a guide, but keep in mind that there are a number of differences.
In [6]:
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten
model = Sequential()
# TODO: Build a Multi-layer feedforward neural network with Keras here.
# 1st Layer - Add a flatten layer
model.add(Flatten(input_shape=(32, 32, 3)))
# 2nd Layer - Add a Dense layer
model.add(Dense(128))
# 3rd Layer - Add a ReLU activation layer
model.add(Activation('relu'))
# 4th Layer - Add a fully connected layer
model.add(Dense(10))
# 5th Layer - Add a softmax activation layer
model.add(Activation('softmax'))
In [7]:
# STOP: Do not change the tests below. Your implementation should pass these tests.
from keras.layers.core import Dense, Activation, Flatten
from keras.activations import relu, softmax
def check_layers(layers, true_layers):
    assert len(true_layers) != 0, 'No layers found'
    for layer_i in range(len(layers)):
        assert isinstance(true_layers[layer_i], layers[layer_i]), 'Layer {} is not a {} layer'.format(layer_i + 1, layers[layer_i].__name__)
    assert len(true_layers) == len(layers), '{} layers found, should be {} layers'.format(len(true_layers), len(layers))
check_layers([Flatten, Dense, Activation, Dense, Activation], model.layers)
assert model.layers[0].input_shape == (None, 32, 32, 3), 'First layer input shape is wrong, it should be (32, 32, 3)'
assert model.layers[1].output_shape == (None, 128), 'Second layer output is wrong, it should be (128)'
assert model.layers[2].activation == relu, 'Third layer not a relu activation layer'
assert model.layers[3].output_shape == (None, 10), 'Fourth layer output is wrong, it should be (10)'
assert model.layers[4].activation == softmax, 'Fifth layer not a softmax activation layer'
print('Tests passed.')
You built a multi-layer neural network in Keras; now let's look at training it.
from keras.models import Sequential
from keras.layers.core import Dense, Activation
model = Sequential()
...
# Configures the learning process and metrics
model.compile('sgd', 'mean_squared_error', ['accuracy'])
# Train the model
# History is a record of training loss and metrics
history = model.fit(x_train_data, Y_train_data, batch_size=128, nb_epoch=2, validation_split=0.2)
# Calculate test score
test_score = model.evaluate(x_test_data, Y_test_data)
The code above configures, trains, and tests the model. The line model.compile('sgd', 'mean_squared_error', ['accuracy']) configures the model's optimizer to 'sgd' (stochastic gradient descent), the loss to 'mean_squared_error', and the metric to 'accuracy'.
You can find more optimizers here, loss functions here, and more metrics here.
To train the model, use the fit() function as shown in model.fit(x_train_data, Y_train_data, batch_size=128, nb_epoch=2, validation_split=0.2). The validation_split parameter holds out a percentage of the training data to validate the model. The model can be further tested with the test dataset using the evaluate() function, as shown in the last line.
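Note that compile() also accepts optimizer objects in place of strings, which is how you tune hyperparameters such as the learning rate, and that fit() returns a History object whose history dict records the per-epoch metrics. A sketch (the learning rate shown is an arbitrary choice):
from keras.optimizers import SGD
# Equivalent to the string form, but with an explicit learning rate
model.compile(optimizer=SGD(lr=0.01), loss='mean_squared_error', metrics=['accuracy'])
history = model.fit(x_train_data, Y_train_data, batch_size=128, nb_epoch=2, validation_split=0.2)
# Per-epoch records: 'loss' and 'acc', plus 'val_loss' and 'val_acc' when validating
print(history.history.keys())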
In [8]:
# TODO: Compile and train the model here.
# Configures the learning process and metrics
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
# Train the model
# History is a record of training loss and metrics
history = model.fit(X_normalized, y_one_hot, batch_size=128, nb_epoch=10, validation_split=0.2)
In [9]:
# STOP: Do not change the tests below. Your implementation should pass these tests.
from keras.optimizers import Adam
assert model.loss == 'categorical_crossentropy', 'Not using categorical_crossentropy loss function'
assert isinstance(model.optimizer, Adam), 'Not using adam optimizer'
assert len(history.history['acc']) == 10, 'You\'re using {} epochs when you need to use 10 epochs.'.format(len(history.history['acc']))
assert history.history['acc'][-1] > 0.92, 'The training accuracy was: %.3f. It should be greater than 0.92' % history.history['acc'][-1]
assert history.history['val_acc'][-1] > 0.85, 'The validation accuracy is: %.3f. It should be greater than 0.85' % history.history['val_acc'][-1]
print('Tests passed.')
print('Val accuracy is:', history.history['val_acc'][-1])
Hint 1: The Keras example of a convolutional neural network for MNIST would be a good example to review.
In [10]:
# TODO: Re-construct the network and add a convolutional layer before the flatten layer.
from keras.layers import Convolution2D
nb_filters = 32
kernel_size = [3, 3]
model = Sequential()
# 1st Layer - Add a convolutional layer with 32 filters, a 3x3 kernel, and valid padding
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='valid', input_shape=(32, 32, 3)))
# 2nd Layer - Add a ReLU activation after the convolutional layer
model.add(Activation('relu'))
# 3rd Layer - Add a flatten layer
model.add(Flatten())
# 4th Layer - Add a fully connected layer
model.add(Dense(128))
# 5th Layer - Add a ReLU activation layer
model.add(Activation('relu'))
# 6th Layer - Add a fully connected layer
model.add(Dense(10))
# 7th Layer - Add a softmax activation layer
model.add(Activation('softmax'))
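With valid padding, a 3x3 kernel shrinks each spatial dimension by 2 (32 - 3 + 1 = 30), so the convolutional layer outputs 30x30x32 and the Flatten layer feeds 30 * 30 * 32 = 28,800 values into the 128-wide Dense layer.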
In [11]:
# STOP: Do not change the tests below. Your implementation should pass these tests.
from keras.layers.core import Dense, Activation, Flatten
from keras.layers.convolutional import Convolution2D
check_layers([Convolution2D, Activation, Flatten, Dense, Activation, Dense, Activation], model.layers)
assert model.layers[0].input_shape == (None, 32, 32, 3), 'First layer input shape is wrong, it should be (32, 32, 3)'
assert model.layers[0].nb_filter == 32, 'Wrong number of filters, it should be 32'
assert model.layers[0].nb_col == model.layers[0].nb_row == 3, 'Kernel size is wrong, it should be a 3x3'
assert model.layers[0].border_mode == 'valid', 'Wrong padding, it should be valid'
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
history = model.fit(X_normalized, y_one_hot, batch_size=128, nb_epoch=2, validation_split=0.2)
assert history.history['val_acc'][-1] > 0.91, "The validation accuracy is: %.3f. It should be greater than 0.91" % history.history['val_acc'][-1]
print('Tests passed.')
In [12]:
# TODO: Re-construct the network and add a pooling layer after the convolutional layer.
from keras.layers import MaxPooling2D
pool_size = [2, 2]
model = Sequential()
# 1st Layer - Add a convolutional layer with 32 filters, a 3x3 kernel, and valid padding
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='valid', input_shape=(32, 32, 3)))
# 2nd Layer - Add a 2x2 max pooling layer immediately following the convolutional layer
model.add(MaxPooling2D(pool_size=pool_size))
# 3rd Layer - Add a ReLU activation layer
model.add(Activation('relu'))
# 4th Layer - Add a flatten layer
model.add(Flatten())
# 5th Layer - Add a fully connected layer
model.add(Dense(128))
# 6th Layer - Add a ReLU activation layer
model.add(Activation('relu'))
# 7th Layer - Add a fully connected layer
model.add(Dense(10))
# 8th Layer - Add a softmax activation layer
model.add(Activation('softmax'))
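The 2x2 max pooling halves each spatial dimension, so the 30x30x32 convolution output becomes 15x15x32 and Flatten now passes 15 * 15 * 32 = 7,200 values to the Dense layer, cutting that layer's weights to a quarter of the unpooled version.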
In [13]:
# STOP: Do not change the tests below. Your implementation should pass these tests.
from keras.layers.core import Dense, Activation, Flatten
from keras.layers.convolutional import Convolution2D
from keras.layers.pooling import MaxPooling2D
check_layers([Convolution2D, MaxPooling2D, Activation, Flatten, Dense, Activation, Dense, Activation], model.layers)
assert model.layers[1].pool_size == (2, 2), 'Second layer must be a max pool layer with pool size of 2x2'
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
history = model.fit(X_normalized, y_one_hot, batch_size=128, nb_epoch=2, validation_split=0.2)
assert history.history['val_acc'][-1] > 0.91, "The validation accuracy is: %.3f. It should be greater than 0.91" % history.history['val_acc'][-1]
print('Tests passed.')
In [14]:
# TODO: Re-construct the network and add dropout after the pooling layer.
from keras.layers import Dropout
model = Sequential()
# 1st Layer - Add a convolutional layer with 32 filters, a 3x3 kernel, and valid padding
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode='valid', input_shape=(32, 32, 3)))
# 2nd Layer - Add a 2x2 max pooling layer immediately following the convolutional layer
model.add(MaxPooling2D(pool_size=pool_size))
# 3rd Layer - Add a dropout layer after the pooling layer, with a dropout rate of 50%
model.add(Dropout(0.5))
# 4th Layer - Add a ReLU activation layer
model.add(Activation('relu'))
# 5th Layer - Add a flatten layer
model.add(Flatten())
# 6th Layer - Add a fully connected layer
model.add(Dense(128))
# 7th Layer - Add a ReLU activation layer
model.add(Activation('relu'))
# 8th Layer - Add a fully connected layer
model.add(Dense(10))
# 9th Layer - Add a softmax activation layer
model.add(Activation('softmax'))
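Dropout doesn't change the tensor shape: with a rate of 50%, half of the pooled activations are randomly zeroed on each training pass (the rest are rescaled to compensate), while evaluation and prediction use all activations unchanged.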
In [15]:
# STOP: Do not change the tests below. Your implementation should pass these tests.
from keras.layers.core import Dense, Activation, Flatten, Dropout
from keras.layers.convolutional import Convolution2D
from keras.layers.pooling import MaxPooling2D
check_layers([Convolution2D, MaxPooling2D, Dropout, Activation, Flatten, Dense, Activation, Dense, Activation], model.layers)
assert model.layers[2].p == 0.5, 'Third layer should be a Dropout of 50%'
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
history = model.fit(X_normalized, y_one_hot, batch_size=128, nb_epoch=2, validation_split=0.2)
assert history.history['val_acc'][-1] > 0.91, "The validation accuracy is: %.3f. It should be greater than 0.91" % history.history['val_acc'][-1]
print('Tests passed.')
Congratulations! You've built a neural network with convolutions, pooling, dropout, and fully-connected layers, all in just a few lines of code.
Have fun with the model and see how well you can do! Add more layers, try different padding or regularization, adjust the batch size, or train for more epochs.
What is the best validation accuracy you can achieve?
In [16]:
# TODO: Build a model
model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode='valid',
                        input_shape=(32, 32, 3)))
model.add(Activation('relu'))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(10))
model.add(Activation('softmax'))
# TODO: Compile and train the model
model.compile('adam', 'categorical_crossentropy', ['accuracy'])
history = model.fit(X_normalized, y_one_hot, batch_size=128, nb_epoch=10, validation_split=0.2)
Best Validation Accuracy: 0.7023
Once you've picked out your best model, it's time to test it.
Load up the test data and use the evaluate() method to see how well it does.
Hint 1: The evaluate() method should return an array of numbers. Use the metrics_names property to get the labels.
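For instance, you can print the labels before evaluating so you know which returned score is which. A minimal sketch:
# The labels for the values evaluate() returns, in order
print(model.metrics_names)  # typically ['loss', 'acc']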
In [17]:
# TODO: Preprocess data & one-hot encode the labels
X_test, y_test = shuffle(X_test, y_test)
X_test = normalize(X_test)
label_binarizer = LabelBinarizer()
y_one_hot = label_binarizer.fit_transform(y_test)
# TODO: Evaluate model on test data
score = model.evaluate(X_test, y_one_hot, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])
Test Accuracy: 0.68