Model Building for MNIST


In [1]:
from theano.sandbox import cuda
cuda.use('gpu1')


WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10).  Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
 https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

Using gpu device 0: Tesla K80 (CNMeM is disabled, cuDNN 5103)

WARNING (theano.sandbox.cuda): Ignoring call to use(1), GPU number 0 is already in use.

In [2]:
%matplotlib inline
from __future__ import division, print_function
from importlib import reload
import utils; reload(utils)
from utils import *


Using Theano backend.

Setup


In [3]:
batch_size = 64
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(X_train.shape, y_train.shape, X_test.shape, y_test.shape)


Out[3]:
((60000, 28, 28), (60000,), (10000, 28, 28), (10000,))

In [4]:
# MNIST images are greyscale, so the arrays have no colour-channel axis.
# Add a singleton channel dimension so the shape matches the
# (channels, rows, cols) layout Keras expects here.
X_test = np.expand_dims(X_test, 1)
X_train = np.expand_dims(X_train, 1)
X_train.shape


Out[4]:
(60000, 1, 28, 28)
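
Equivalently, the channel axis can be added with a plain reshape (illustrative only; the arrays above are already expanded):

# Same effect as np.expand_dims(..., 1)
X_train = X_train.reshape(-1, 1, 28, 28)
X_test = X_test.reshape(-1, 1, 28, 28)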

In [5]:
y_train[:5]


Out[5]:
array([5, 0, 4, 1, 9], dtype=uint8)

In [6]:
y_train = onehot(y_train)
y_test = onehot(y_test)
y_train[:5]


Out[6]:
array([[ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])
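
The onehot helper comes from utils; Keras's own to_categorical gives the same result when applied to the original integer labels (a sketch, using Keras 1's nb_classes argument):

from keras.utils.np_utils import to_categorical

# Same as utils.onehot: integer class labels -> one-hot rows
to_categorical([5, 0, 4, 1, 9], nb_classes=10)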

Now, let's normalize the inputs to zero mean and unit standard deviation.


In [7]:
mean_px = X_train.mean().astype(np.float32)
std_px = X_train.std().astype(np.float32)

In [8]:
def norm_input(x): return (x-mean_px)/std_px
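
A quick sanity check that the transform behaves as intended (purely illustrative):

# Should print values close to 0.0 and 1.0
print(norm_input(X_train).mean(), norm_input(X_train).std())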

Linear model

Why not just fine-tune an ImageNet model?

Because ImageNet models expect 224 x 224 full-colour images, while MNIST digits are 28 x 28 and greyscale.

So we need to train from scratch.


In [9]:
def get_lin_model():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28,28)),
        Flatten(),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

lm = get_lin_model()


/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_1 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))

In [10]:
gen = image.ImageDataGenerator()
batches = gen.flow(X_train, y_train, batch_size=batch_size)
test_batches = gen.flow(X_test, y_test, batch_size=batch_size)

In [17]:
lm.fit_generator(batches, batches.N, nb_epoch=1, 
                validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/1
60000/60000 [==============================] - 14s - loss: 0.4259 - acc: 0.8735 - val_loss: 0.3073 - val_acc: 0.9142
Out[17]:
<keras.callbacks.History at 0x7faf22bab400>

It's generally best to start with a single epoch at a low learning rate. Adam's default in Keras is 0.001.
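
One caveat: once the model has been compiled and trained, a plain attribute assignment such as lm.optimizer.lr = 0.1 may not reach the already-compiled training function. The reliable way is to update the optimizer's learning-rate variable in place:

from keras import backend as K

# Update the shared variable that the compiled training function reads
K.set_value(lm.optimizer.lr, 0.1)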


In [18]:
# NB: see the note above; K.set_value(lm.optimizer.lr, 0.1) is the
# reliable way to change the learning rate after compilation
lm.optimizer.lr = 0.1
lm.fit_generator(batches, batches.N, nb_epoch=3,
                validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/3
60000/60000 [==============================] - 14s - loss: 0.2987 - acc: 0.9149 - val_loss: 0.2854 - val_acc: 0.9181
Epoch 2/3
60000/60000 [==============================] - 14s - loss: 0.2842 - acc: 0.9201 - val_loss: 0.2820 - val_acc: 0.9192
Epoch 3/3
60000/60000 [==============================] - 13s - loss: 0.2769 - acc: 0.9224 - val_loss: 0.2733 - val_acc: 0.9223
Out[18]:
<keras.callbacks.History at 0x7faf22bab7b8>

Single Dense Layer


In [11]:
def get_fc_model():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28,28)),
        Flatten(),
        # NB: a hidden layer would normally use 'relu'; the 'softmax'
        # here squashes the activations, which is why the loss below
        # starts so high
        Dense(512, activation='softmax'),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

fc = get_fc_model()


/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_2 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))

As before, let's start with one epoch at the default learning rate.


In [12]:
fc.fit_generator(batches, batches.N, nb_epoch=1, 
                validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/1
60000/60000 [==============================] - 14s - loss: 1.5465 - acc: 0.8880 - val_loss: 1.0166 - val_acc: 0.9237
Out[12]:
<keras.callbacks.History at 0x7fb8ef23b9e8>

In [14]:
fc.optimizer.lr=0.01
fc.fit_generator(batches, batches.N, nb_epoch=4, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/4
60000/60000 [==============================] - 13s - loss: 0.2707 - acc: 0.9417 - val_loss: 0.2827 - val_acc: 0.9352
Epoch 2/4
60000/60000 [==============================] - 14s - loss: 0.2521 - acc: 0.9445 - val_loss: 0.2799 - val_acc: 0.9369
Epoch 3/4
60000/60000 [==============================] - 14s - loss: 0.2386 - acc: 0.9460 - val_loss: 0.2612 - val_acc: 0.9384
Epoch 4/4
60000/60000 [==============================] - 14s - loss: 0.2302 - acc: 0.9465 - val_loss: 0.2702 - val_acc: 0.9346
Out[14]:
<keras.callbacks.History at 0x7fb8eee00320>

Basic 'VGG-style' CNN

'VGG-style' here means stacks of 3x3 convolutions followed by max-pooling, doubling the number of filters after each pooling stage.


In [15]:
def get_model():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28, 28)),
        Convolution2D(32,3,3, activation='relu'),
        Convolution2D(32,3,3, activation='relu'),
        MaxPooling2D(),
        Convolution2D(64,3,3, activation='relu'),
        Convolution2D(64,3,3, activation='relu'),
        MaxPooling2D(),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
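
To double-check how the 28x28 input shrinks through the unpadded convolutions and pooling layers, Keras can print the per-layer shapes:

# Inspect per-layer output shapes and parameter counts
get_model().summary()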

In [16]:
model = get_model()
model.fit_generator(batches, batches.N, nb_epoch=1,
                   validation_data=test_batches, nb_val_samples=test_batches.N)


/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_3 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
Epoch 1/1
60000/60000 [==============================] - 21s - loss: 0.1100 - acc: 0.9671 - val_loss: 0.0299 - val_acc: 0.9900
Out[16]:
<keras.callbacks.History at 0x7fb8ed4de748>

In [25]:
model.optimizer.lr=0.1
model.fit_generator(batches, batches.N, nb_epoch=1, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/1
60000/60000 [==============================] - 22s - loss: 0.0361 - acc: 0.9888 - val_loss: 0.0304 - val_acc: 0.9904
Out[25]:
<keras.callbacks.History at 0x7faf22b3d7b8>

In [26]:
model.optimizer.lr=0.01
model.fit_generator(batches, batches.N, nb_epoch=8, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/8
60000/60000 [==============================] - 22s - loss: 0.0232 - acc: 0.9928 - val_loss: 0.0298 - val_acc: 0.9906
Epoch 2/8
60000/60000 [==============================] - 22s - loss: 0.0189 - acc: 0.9938 - val_loss: 0.0332 - val_acc: 0.9901
Epoch 3/8
60000/60000 [==============================] - 22s - loss: 0.0146 - acc: 0.9955 - val_loss: 0.0287 - val_acc: 0.9915
Epoch 4/8
60000/60000 [==============================] - 22s - loss: 0.0137 - acc: 0.9953 - val_loss: 0.0196 - val_acc: 0.9934
Epoch 5/8
60000/60000 [==============================] - 22s - loss: 0.0110 - acc: 0.9963 - val_loss: 0.0349 - val_acc: 0.9917
Epoch 6/8
60000/60000 [==============================] - 22s - loss: 0.0103 - acc: 0.9966 - val_loss: 0.0283 - val_acc: 0.9930
Epoch 7/8
60000/60000 [==============================] - 22s - loss: 0.0086 - acc: 0.9970 - val_loss: 0.0314 - val_acc: 0.9919
Epoch 8/8
60000/60000 [==============================] - 22s - loss: 0.0065 - acc: 0.9982 - val_loss: 0.0287 - val_acc: 0.9936
Out[26]:
<keras.callbacks.History at 0x7faf22b3d4a8>

Data Augmentation


In [17]:
model = get_model()


/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_4 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))

In [19]:
# This time we don't use the default ImageDataGenerator settings:
# add small random rotations, shifts, shears and zooms
gen = image.ImageDataGenerator(rotation_range=8, width_shift_range=0.08, shear_range=0.3,
                               height_shift_range=0.08, zoom_range=0.08)
batches = gen.flow(X_train, y_train, batch_size=batch_size)
# Note: this also augments the validation data; strictly speaking
# test_batches should use a separate, un-augmented generator
test_batches = gen.flow(X_test, y_test, batch_size=batch_size)
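
To eyeball what the augmentation produces, grab one batch and plot a few digits (a quick matplotlib sketch):

import matplotlib.pyplot as plt

aug_imgs, _ = next(batches)
fig, axes = plt.subplots(1, 8, figsize=(12, 2))
for ax, img in zip(axes, aug_imgs[:8]):
    ax.imshow(img[0], cmap='gray')  # each img has shape (1, 28, 28)
    ax.axis('off')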

In [20]:
model.fit_generator(batches, batches.N, nb_epoch=1,
                   validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/1
60000/60000 [==============================] - 23s - loss: 0.1987 - acc: 0.9369 - val_loss: 0.0773 - val_acc: 0.9745
Out[20]:
<keras.callbacks.History at 0x7fb8de319c88>

In [21]:
model.optimizer.lr=0.1
model.fit_generator(batches, batches.N, nb_epoch=4,
                   validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/4
60000/60000 [==============================] - 22s - loss: 0.0706 - acc: 0.9784 - val_loss: 0.0521 - val_acc: 0.9838
Epoch 2/4
60000/60000 [==============================] - 23s - loss: 0.0549 - acc: 0.9836 - val_loss: 0.0391 - val_acc: 0.9852
Epoch 3/4
60000/60000 [==============================] - 23s - loss: 0.0475 - acc: 0.9854 - val_loss: 0.0635 - val_acc: 0.9819
Epoch 4/4
60000/60000 [==============================] - 22s - loss: 0.0438 - acc: 0.9861 - val_loss: 0.0451 - val_acc: 0.9855
Out[21]:
<keras.callbacks.History at 0x7fb8ddfad8d0>

In [22]:
model.optimizer.lr=0.01
model.fit_generator(batches, batches.N, nb_epoch=8, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/8
60000/60000 [==============================] - 23s - loss: 0.0397 - acc: 0.9882 - val_loss: 0.0376 - val_acc: 0.9888
Epoch 2/8
60000/60000 [==============================] - 23s - loss: 0.0365 - acc: 0.9884 - val_loss: 0.0356 - val_acc: 0.9891
Epoch 3/8
60000/60000 [==============================] - 22s - loss: 0.0348 - acc: 0.9892 - val_loss: 0.0405 - val_acc: 0.9877
Epoch 4/8
60000/60000 [==============================] - 22s - loss: 0.0320 - acc: 0.9899 - val_loss: 0.0281 - val_acc: 0.9908
Epoch 5/8
60000/60000 [==============================] - 22s - loss: 0.0304 - acc: 0.9901 - val_loss: 0.0301 - val_acc: 0.9907
Epoch 6/8
60000/60000 [==============================] - 22s - loss: 0.0278 - acc: 0.9911 - val_loss: 0.0321 - val_acc: 0.9897
Epoch 7/8
60000/60000 [==============================] - 22s - loss: 0.0285 - acc: 0.9906 - val_loss: 0.0289 - val_acc: 0.9901
Epoch 8/8
60000/60000 [==============================] - 23s - loss: 0.0271 - acc: 0.9917 - val_loss: 0.0350 - val_acc: 0.9899
Out[22]:
<keras.callbacks.History at 0x7fb8ec05fba8>

Batchnorm + data augmentation

With Theano's (channels, rows, cols) image ordering, axis=1 makes BatchNormalization normalize per channel after the convolutional layers.


In [23]:
def get_model_bn():
    model = Sequential([
            Lambda(norm_input, input_shape=(1,28,28)),
            Convolution2D(32,3,3, activation='relu'),
            BatchNormalization(axis=1),
            Convolution2D(32,3,3, activation='relu'),
            MaxPooling2D(),
            BatchNormalization(axis=1),
            Convolution2D(64,3,3, activation='relu'),
            BatchNormalization(axis=1),
            Convolution2D(64,3,3, activation='relu'),
            MaxPooling2D(),
            Flatten(),
            BatchNormalization(),
            Dense(512, activation='relu'),
            BatchNormalization(),
            Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

In [25]:
model = get_model_bn()
model.fit_generator(batches, batches.N, nb_epoch=1,
                   validation_data=test_batches, nb_val_samples=test_batches.N)


/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_6 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
Epoch 1/1
60000/60000 [==============================] - 33s - loss: 0.1643 - acc: 0.9502 - val_loss: 0.0628 - val_acc: 0.9806
Out[25]:
<keras.callbacks.History at 0x7fb8cedd7a20>

In [26]:
model.optimizer.lr=0.1
model.fit_generator(batches, batches.N, nb_epoch=4, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/4
60000/60000 [==============================] - 33s - loss: 0.0700 - acc: 0.9781 - val_loss: 0.0534 - val_acc: 0.9836
Epoch 2/4
60000/60000 [==============================] - 33s - loss: 0.0604 - acc: 0.9817 - val_loss: 0.0532 - val_acc: 0.9835
Epoch 3/4
60000/60000 [==============================] - 32s - loss: 0.0528 - acc: 0.9835 - val_loss: 0.0387 - val_acc: 0.9876
Epoch 4/4
60000/60000 [==============================] - 32s - loss: 0.0466 - acc: 0.9849 - val_loss: 0.0372 - val_acc: 0.9875
Out[26]:
<keras.callbacks.History at 0x7fb8ce1952b0>

In [27]:
model.optimizer.lr=0.001
model.fit_generator(batches, batches.N, nb_epoch=12, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/12
60000/60000 [==============================] - 32s - loss: 0.0458 - acc: 0.9855 - val_loss: 0.0332 - val_acc: 0.9886
Epoch 2/12
60000/60000 [==============================] - 32s - loss: 0.0408 - acc: 0.9874 - val_loss: 0.0336 - val_acc: 0.9897
Epoch 3/12
60000/60000 [==============================] - 32s - loss: 0.0394 - acc: 0.9874 - val_loss: 0.0324 - val_acc: 0.9897
Epoch 4/12
60000/60000 [==============================] - 32s - loss: 0.0372 - acc: 0.9890 - val_loss: 0.0436 - val_acc: 0.9868
Epoch 5/12
60000/60000 [==============================] - 32s - loss: 0.0344 - acc: 0.9893 - val_loss: 0.0358 - val_acc: 0.9888
Epoch 6/12
60000/60000 [==============================] - 33s - loss: 0.0320 - acc: 0.9899 - val_loss: 0.0420 - val_acc: 0.9861
Epoch 7/12
60000/60000 [==============================] - 32s - loss: 0.0309 - acc: 0.9906 - val_loss: 0.0295 - val_acc: 0.9899
Epoch 8/12
60000/60000 [==============================] - 33s - loss: 0.0309 - acc: 0.9901 - val_loss: 0.0269 - val_acc: 0.9916
Epoch 9/12
60000/60000 [==============================] - 31s - loss: 0.0280 - acc: 0.9911 - val_loss: 0.0326 - val_acc: 0.9905
Epoch 10/12
60000/60000 [==============================] - 32s - loss: 0.0278 - acc: 0.9914 - val_loss: 0.0203 - val_acc: 0.9937
Epoch 11/12
60000/60000 [==============================] - 32s - loss: 0.0279 - acc: 0.9914 - val_loss: 0.0266 - val_acc: 0.9914
Epoch 12/12
60000/60000 [==============================] - 32s - loss: 0.0241 - acc: 0.9922 - val_loss: 0.0294 - val_acc: 0.9915
Out[27]:
<keras.callbacks.History at 0x7fb8cdcf6390>

Batchnorm + dropout + data augmentation

Dropout(0.5) after the dense layer randomly zeroes half of its activations during training, which helps prevent overfitting.


In [28]:
def get_model_bn_do():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28,28)),
        Convolution2D(32,3,3, activation='relu'),
        BatchNormalization(axis=1),
        Convolution2D(32,3,3, activation='relu'),
        MaxPooling2D(),
        BatchNormalization(axis=1),
        Convolution2D(64,3,3, activation='relu'),
        BatchNormalization(axis=1),
        Convolution2D(64,3,3, activation='relu'),
        MaxPooling2D(),
        Flatten(),
        BatchNormalization(),
        Dense(512, activation='relu'),
        BatchNormalization(),
        Dropout(0.5),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

In [29]:
model = get_model_bn_do()


/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_7 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))

In [30]:
model.optimizer.lr=0.01
model.fit_generator(batches, batches.N, nb_epoch=12, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)


Epoch 1/12
60000/60000 [==============================] - 32s - loss: 0.2491 - acc: 0.9341 - val_loss: 0.1253 - val_acc: 0.9624
Epoch 2/12
60000/60000 [==============================] - 33s - loss: 0.1139 - acc: 0.9672 - val_loss: 0.0875 - val_acc: 0.9750
Epoch 3/12
60000/60000 [==============================] - 32s - loss: 0.0986 - acc: 0.9708 - val_loss: 0.0978 - val_acc: 0.9682
Epoch 4/12
60000/60000 [==============================] - 32s - loss: 0.0986 - acc: 0.9718 - val_loss: 0.0581 - val_acc: 0.9825
Epoch 5/12
60000/60000 [==============================] - 33s - loss: 0.0900 - acc: 0.9746 - val_loss: 0.0687 - val_acc: 0.9799
Epoch 6/12
60000/60000 [==============================] - 32s - loss: 0.0933 - acc: 0.9742 - val_loss: 0.0656 - val_acc: 0.9822
Epoch 7/12
60000/60000 [==============================] - 33s - loss: 0.0897 - acc: 0.9758 - val_loss: 0.0466 - val_acc: 0.9862
Epoch 8/12
60000/60000 [==============================] - 32s - loss: 0.0873 - acc: 0.9759 - val_loss: 0.0655 - val_acc: 0.9857
Epoch 9/12
60000/60000 [==============================] - 32s - loss: 0.0863 - acc: 0.9768 - val_loss: 0.0660 - val_acc: 0.9831
Epoch 10/12
60000/60000 [==============================] - 32s - loss: 0.0830 - acc: 0.9782 - val_loss: 0.0462 - val_acc: 0.9873
Epoch 11/12
60000/60000 [==============================] - 33s - loss: 0.0876 - acc: 0.9780 - val_loss: 0.0709 - val_acc: 0.9835
Epoch 12/12
60000/60000 [==============================] - 32s - loss: 0.0800 - acc: 0.9787 - val_loss: 0.0430 - val_acc: 0.9883
Out[30]:
<keras.callbacks.History at 0x7fb8c9da4828>

Ensembling

Ensembling often improves accuracy: train several independently initialized models and combine their predictions.
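
For classification this usually means averaging the models' predicted class probabilities, as the cells below do; a minimal standalone sketch:

import numpy as np

def ensemble_predict(models, X):
    # Stack each model's softmax output, then average across models;
    # the argmax of the mean distribution is the ensemble's prediction.
    preds = np.stack([m.predict(X, batch_size=256) for m in models])
    return preds.mean(axis=0)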


In [33]:
def fit_model():
    model = get_model_bn_do()
    model.fit_generator(batches, batches.N, nb_epoch=1, verbose=0,
                        validation_data=test_batches, nb_val_samples=test_batches.N)
    model.optimizer.lr=0.1
    model.fit_generator(batches, batches.N, nb_epoch=4, verbose=0,
                        validation_data=test_batches, nb_val_samples=test_batches.N)
    model.optimizer.lr=0.01
    model.fit_generator(batches, batches.N, nb_epoch=12, verbose=0,
                        validation_data=test_batches, nb_val_samples=test_batches.N)
    # model.optimizer.lr=0.001
    # model.fit_generator(batches, batches.N, nb_epoch=18, verbose=0,
    #                    validation_data=test_batches, nb_val_samples=test_batches.N)
    return model

In [34]:
# Train six models and collect them in a list
models = [fit_model() for i in range(6)]


/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_9 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_10 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_11 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_12 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_13 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_14 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))

In [35]:
path = 'data/mnist/'
model_path = path + 'models/'

In [37]:
# save_weights writes HDF5 regardless of the '.pkl' extension used here
for i, m in enumerate(models):
    m.save_weights(model_path+'cnn-mnist23-'+str(i)+'.pkl')

In [38]:
# Each evaluate() call returns [loss, accuracy]
evals = np.array([m.evaluate(X_test, y_test, batch_size=256) for m in models])


 9984/10000 [============================>.] - ETA: 0s

In [39]:
evals.mean(axis=0)


Out[39]:
array([ 0.0158,  0.9952])

In [41]:
all_preds = np.stack([m.predict(X_test, batch_size=256) for m in models])
all_preds.shape


Out[41]:
(6, 10000, 10)

In [42]:
# Average the six models' predicted probabilities
avg_preds = all_preds.mean(axis=0)

In [43]:
keras.metrics.categorical_accuracy(y_test, avg_preds).eval()


Out[43]:
array(0.996999979019165, dtype=float32)
