Getting started

First we import Keras and, as a quick sanity check, confirm which backend it is using and which versions are installed.


In [1]:
# matplotlib is used for...you guessed it: plotting!
import matplotlib.pyplot as plt

# This next line is a Jupyter directive. It tells Jupyter that we want our plots to show
# up right below the code that creates them.
%matplotlib inline

In [2]:
import tensorflow as tf
from keras import backend, __version__ as keras_version

print("Current backend:\t" + backend.backend())
print("TensorFlow version:\t" + tf.__version__)
print("Using Keras Version:\t" + keras_version)


Current backend:	tensorflow
TensorFlow version:	1.0.1
Using Keras Version:	2.0.2
Using TensorFlow backend.

Initial Setup

We import the needed modules and set up our initial configuration.

Then we download the MNIST dataset via the keras.datasets module, which fetches it from Amazon S3 on first use. We save it to our /notebooks/data/ directory so that the dataset persists between container runs.


In [3]:
'''Trains a simple deep NN on the MNIST dataset.
Gets to ~98% test accuracy after 10 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''

from __future__ import print_function
import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.callbacks import TensorBoard
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils

batch_size = 128
nb_classes = 10
nb_epoch = 10

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data('/notebooks/data/mnist.npz')

X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
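
Since we already imported matplotlib at the top, we could sanity-check the data at this point. This is a quick sketch (not part of the original run): each row was flattened to 784 values, so we reshape it back to 28x28 before displaying.

# Sanity check: display the first training digit.
# Rows were flattened to 784 values, so reshape back to 28x28 first.
plt.imshow(X_train[0].reshape(28, 28), cmap='gray')
plt.title('label: %d' % y_train[0])
plt.show()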

In [4]:
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()
model.add(Dense(512, input_shape=(784,)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(10))
model.add(Activation('softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

history = model.fit(X_train, Y_train,
                    batch_size=batch_size, epochs=nb_epoch,
                    callbacks=[TensorBoard(log_dir='/tensorboard', histogram_freq=1,
                                           write_images=True, write_graph=True)],
                    verbose=1, validation_data=(X_test, Y_test))
score = model.evaluate(X_test, Y_test, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])


60000 train samples
10000 test samples
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 512)               401920    
_________________________________________________________________
activation_1 (Activation)    (None, 512)               0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               262656    
_________________________________________________________________
activation_2 (Activation)    (None, 512)               0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
_________________________________________________________________
activation_3 (Activation)    (None, 10)                0         
=================================================================
Total params: 669,706.0
Trainable params: 669,706.0
Non-trainable params: 0.0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
INFO:tensorflow:Summary name dense_1/kernel:0 is illegal; using dense_1/kernel_0 instead.
INFO:tensorflow:Summary name dense_1/bias:0 is illegal; using dense_1/bias_0 instead.
INFO:tensorflow:Summary name dense_2/kernel:0 is illegal; using dense_2/kernel_0 instead.
INFO:tensorflow:Summary name dense_2/bias:0 is illegal; using dense_2/bias_0 instead.
INFO:tensorflow:Summary name dense_3/kernel:0 is illegal; using dense_3/kernel_0 instead.
INFO:tensorflow:Summary name dense_3/bias:0 is illegal; using dense_3/bias_0 instead.
Epoch 1/10
60000/60000 [==============================] - 8s - loss: 0.2432 - acc: 0.9248 - val_loss: 0.1103 - val_acc: 0.9659
Epoch 2/10
60000/60000 [==============================] - 10s - loss: 0.1011 - acc: 0.9690 - val_loss: 0.0946 - val_acc: 0.9730
Epoch 3/10
60000/60000 [==============================] - 11s - loss: 0.0748 - acc: 0.9776 - val_loss: 0.0814 - val_acc: 0.9758
Epoch 4/10
60000/60000 [==============================] - 10s - loss: 0.0600 - acc: 0.9820 - val_loss: 0.1093 - val_acc: 0.9694
Epoch 5/10
60000/60000 [==============================] - 10s - loss: 0.0510 - acc: 0.9857 - val_loss: 0.0825 - val_acc: 0.9790
Epoch 6/10
60000/60000 [==============================] - 10s - loss: 0.0446 - acc: 0.9869 - val_loss: 0.0891 - val_acc: 0.9789
Epoch 7/10
60000/60000 [==============================] - 11s - loss: 0.0391 - acc: 0.9885 - val_loss: 0.0816 - val_acc: 0.9804
Epoch 8/10
60000/60000 [==============================] - 10s - loss: 0.0355 - acc: 0.9895 - val_loss: 0.0869 - val_acc: 0.9819
Epoch 9/10
60000/60000 [==============================] - 10s - loss: 0.0322 - acc: 0.9905 - val_loss: 0.0935 - val_acc: 0.9811
Epoch 10/10
60000/60000 [==============================] - 10s - loss: 0.0282 - acc: 0.9919 - val_loss: 0.0878 - val_acc: 0.9809
Test score: 0.0877838852293
Test accuracy: 0.9809
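
model.fit returns a History object whose history dict holds the per-epoch metrics printed above. A quick sketch of plotting the loss curves (the 'loss' and 'val_loss' keys are what Keras 2 records when validation_data is passed):

# Plot training vs. validation loss recorded by the History object.
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()

The TensorBoard callback we passed to fit() also wrote these curves (plus weight histograms and the graph) to /tensorboard, so running tensorboard --logdir=/tensorboard gives an interactive view of the same training run.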

Display our model as a graph

Here we can take our model and use Keras' visualization utilities to show a simple graph of our network.


In [5]:
from IPython.display import SVG 
from keras.utils.vis_utils import model_to_dot 
SVG(model_to_dot(model).create(prog='dot', format='svg'))


Out[5]:
[SVG output: the network graph, dense_1_input → dense_1 → activation_1 → dropout_1 → dense_2 → activation_2 → dropout_2 → dense_3 → activation_3]
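
The same vis_utils module can also write the graph straight to an image file instead of rendering it inline. A minimal sketch (show_shapes=True adds each layer's output shape to the boxes):

from keras.utils.vis_utils import plot_model

# Write the network graph to a PNG next to the notebook.
plot_model(model, to_file='mnist-example-model.png', show_shapes=True)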

Save our Model

Here we can save our trained model to disk so it can be reloaded later without retraining.


In [6]:
# model.save('mnist-example-model.h5')
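
model.save() writes the architecture, weights, and optimizer state into a single HDF5 file. Loading it back is symmetric; this sketch assumes the save line above was actually uncommented and run:

from keras.models import load_model

# Reload the model and confirm it scores the same as before.
restored = load_model('mnist-example-model.h5')
print(restored.evaluate(X_test, Y_test, verbose=0))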