In [1]:
import numpy as np
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
We're going to use some examples from https://github.com/fchollet/keras/tree/master/examples. There are tons more and you should check them out! We'll use these examples to learn about some different sorts of layers, and strategies for our activation functions, loss functions, optimizers, etc.
This example is from https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py. We'll learn about some simple convolutional layers!
In [2]:
import keras
from keras.datasets import mnist # load up the training data!
from keras.models import Sequential # our model
from keras.layers import Dense, Dropout, Flatten # layers we've seen
from keras.layers import Conv2D, MaxPooling2D # new layers
from keras import backend as K # see later
Typically it's good practice to specify your hyperparameters together in one place
In [3]:
batch_size = 128
num_classes = 10
epochs = 12
In this case we already know the shape of the input data: 28x28 grayscale images. Let's load it in
In [4]:
img_rows, img_cols = 28, 28
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Keras supports several different backends (we're using TensorFlow).
One subtle difference between them is the expected shape of the input data: some backends put the channel dimension first, others put it last.
The backend module, which we imported as K, lets the code check which format is in use and reshape accordingly.
Good to keep in mind for later if you run into shape-related bugs
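A quick sanity check (an illustrative one-liner, not a cell from the original notebook):

print(K.image_data_format())  # the TensorFlow backend typically reports 'channels_last'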
In [5]:
if K.image_data_format() == 'channels_first':
    # channels-first layout: (samples, channels, rows, cols)
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    # channels-last layout: (samples, rows, cols, channels)
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
As before we'll convert our data to float32 and rescale it from [0, 255] to [0, 1]
In [6]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
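With the TensorFlow backend you should see a shape of (60000, 28, 28, 1): the standard MNIST split of 60,000 training and 10,000 test samples, each a 28x28 image with a single grayscale channel.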
And yet again we're going to one-hot encode our $y$ labels
In [7]:
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
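For example, to_categorical turns the integer label 3 into a one-hot vector of length num_classes:

keras.utils.to_categorical([3], 10)  # -> [[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]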
OK now we're going to define a model with some new layers
In [8]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
The Conv2D and MaxPooling2D layers are new.
Let's think about what they're doing:
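A Conv2D layer slides a set of small learned filters (here 3x3 kernels) across the image and outputs one feature map per filter, so the first layer learns 32 filters and the second 64. With the default 'valid' padding, each 3x3 convolution trims one pixel from every edge. A MaxPooling2D layer then downsamples each feature map by keeping only the maximum value in each non-overlapping 2x2 window, halving the height and width. Here's a minimal numpy sketch of 2x2 max pooling (a hypothetical helper, not part of the Keras example):

import numpy as np

def max_pool_2x2(feature_map):
    # split the map into 2x2 blocks and keep the max of each block
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.arange(16).reshape(4, 4)
print(max_pool_2x2(fm))  # [[ 5  7] [13 15]]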
In [9]:
model.add(Dropout(0.25))  # randomly zero 25% of activations during training to reduce overfitting
model.add(Flatten())  # unroll the pooled feature maps into a single vector
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))  # one probability per digit class
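It's worth tracing the shapes through the whole stack: the 28x28x1 input becomes 26x26x32 after the first convolution, 24x24x64 after the second, and 12x12x64 after the 2x2 max pool; Flatten then produces a vector of 12 * 12 * 64 = 9216 values, which feeds the Dense(128) layer and finally the 10-way softmax.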
In [10]:
SVG(model_to_dot(model).create(prog='dot', format='svg'))
Out[10]:
[SVG diagram of the model's layer graph, rendered inline]
Now we'll compile as before. Here the loss function and optimizer are passed as objects from keras.losses and keras.optimizers rather than as string names -- both ways are fine. We're also using a new optimizer, Adadelta; it's a good idea to look up the different loss functions and optimizers if you have the time.
In [11]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
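For reference, an equivalent compile call using the string shortcuts would look like this (a sketch -- you only need to compile once):

model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])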
Now fit the model
In [12]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
Why was that so much slower than our earlier models? A lot more parameters! We can check with model.summary():
In [13]:
model.summary()
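The summary should confirm a rough hand count: the first convolution has (3*3*1 + 1) * 32 = 320 parameters, the second (3*3*32 + 1) * 64 = 18,496, the Dense(128) layer 9216 * 128 + 128 = 1,179,776, and the output layer 128 * 10 + 10 = 1,290 -- about 1.2 million parameters in total, the vast majority in the first Dense layer.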