Model Building for MNIST



In [1]:

    
from theano.sandbox import cuda
cuda.use('gpu1')









    



WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10).  Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
 https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

Using gpu device 0: Tesla K80 (CNMeM is disabled, cuDNN 5103)
WARNING (theano.sandbox.cuda): Ignoring call to use(1), GPU number 0 is already in use.



In [2]:

    
%matplotlib inline
from importlib import reload
import utils; reload(utils)
from utils import *
from __future__ import division, print_function









    



Using Theano backend.

Setup



In [3]:

    
batch_size = 64
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
(X_train.shape, y_train.shape, X_test.shape, y_test.shape)









    Out[3]:





((60000, 28, 28), (60000,), (10000, 28, 28), (10000,))



In [4]:

    
# MNIST images are greyscale, so the arrays have no color-channel axis.
# Add a channel axis of size 1 to the X data.
X_test = np.expand_dims(X_test, 1)
X_train = np.expand_dims(X_train, 1)
X_train.shape









    Out[4]:





(60000, 1, 28, 28)



In [5]:

    
y_train[:5]









    Out[5]:





array([5, 0, 4, 1, 9], dtype=uint8)



In [6]:

    
y_train = onehot(y_train)
y_test = onehot(y_test)
y_train[:5]









    Out[6]:





array([[ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])
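The `onehot` helper comes from the `utils` module, whose source is not shown here. As a rough illustration of the transformation, here is a minimal plain-Python sketch, assuming 10 classes; `onehot_sketch` is a hypothetical name, not the actual helper.

```python
def onehot_sketch(labels, num_classes=10):
    """Turn integer class labels into one-hot rows (hypothetical stand-in for utils.onehot)."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[int(label)] = 1.0  # a single 1 at the index of the class
        rows.append(row)
    return rows

# The first five training labels shown above.
encoded = onehot_sketch([5, 0, 4, 1, 9])
```

Each row sums to 1, which is the target format that `categorical_crossentropy` expects.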

Now, let's normalize the inputs by subtracting the mean pixel value and dividing by the standard deviation.



In [7]:

    
mean_px = X_train.mean().astype(np.float32)
std_px = X_train.std().astype(np.float32)



In [8]:

    
def norm_input(x): return (x-mean_px)/std_px
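To see what `norm_input` does, here is a small plain-Python sketch on a handful of made-up pixel values (the values are hypothetical, not taken from MNIST). It uses `statistics.pstdev`, the population standard deviation, which corresponds to NumPy's default `std()`.

```python
import statistics

pixels = [0.0, 0.0, 128.0, 255.0, 255.0, 64.0]  # hypothetical pixel intensities

mean_px = statistics.mean(pixels)
std_px = statistics.pstdev(pixels)  # population std, like NumPy's default

def norm_input(x):
    """Shift to zero mean and scale to unit standard deviation."""
    return (x - mean_px) / std_px

normalized = [norm_input(p) for p in pixels]
```

After this transform the values have mean 0 and standard deviation 1 (up to floating-point error).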

Linear model

Why don't we just fine-tune the ImageNet model?

Because the ImageNet models take 224 x 224 full-color images as input. Here we have 28 x 28 greyscale images.

So we need to start from scratch.



In [9]:

    
def get_lin_model():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28,28)),
        Flatten(),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

lm = get_lin_model()









    



/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_1 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
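The linear model is just a softmax layer on top of the flattened pixels. The sketch below works out its parameter count and shows what softmax does to a few raw scores; the scores are hypothetical and the code is plain Python rather than Keras.

```python
import math

n_inputs = 1 * 28 * 28      # Flatten() turns each (1, 28, 28) image into 784 values
n_classes = 10

# Dense(10) holds one weight per (input, class) pair plus one bias per class.
n_params = n_inputs * n_classes + n_classes

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for three classes
```

So the whole model has 7850 trainable parameters.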



In [10]:

    
gen = image.ImageDataGenerator()
batches = gen.flow(X_train, y_train, batch_size=64)
test_batches = gen.flow(X_test, y_test, batch_size=64)



In [17]:

    
lm.fit_generator(batches, batches.N, nb_epoch=1, 
                validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/1
60000/60000 [==============================] - 14s - loss: 0.4259 - acc: 0.8735 - val_loss: 0.3073 - val_acc: 0.9142






    Out[17]:





<keras.callbacks.History at 0x7faf22bab400>

It's always recommended to start with a single epoch and a low learning rate. The default learning rate for Adam in Keras is 0.001.
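One thing worth checking in your own environment when raising the rate after a first epoch: as far as I understand Keras 1.x, the training function is built on the first call to fit, so rebinding `optimizer.lr` to a Python float afterwards may not change the rate that is actually used, whereas `K.set_value(model.optimizer.lr, 0.1)` updates the backend variable in place. The plain-Python sketch below mimics that difference with hypothetical classes (`SharedValue`, `ToyOptimizer`); it is an analogy, not Keras code.

```python
class SharedValue:
    """Hypothetical stand-in for a backend variable holding the learning rate."""
    def __init__(self, value):
        self.value = value

    def set_value(self, value):
        self.value = value


class ToyOptimizer:
    def __init__(self, lr=0.001):
        self.lr = SharedValue(lr)

    def make_step_function(self):
        lr = self.lr  # the "compiled" step captures this object once
        return lambda grad: -lr.value * grad


opt = ToyOptimizer()
step = opt.make_step_function()
opt.lr = 0.1                  # rebinds the attribute; the captured object is untouched
after_rebind = step(1.0)      # still computed with 0.001

opt = ToyOptimizer()
step = opt.make_step_function()
opt.lr.set_value(0.1)         # updates the captured object in place
after_set_value = step(1.0)   # now computed with 0.1
```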



In [18]:

    
lm.optimizer.lr = 0.1
lm.fit_generator(batches, batches.N, nb_epoch=3,
                validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/3
60000/60000 [==============================] - 14s - loss: 0.2987 - acc: 0.9149 - val_loss: 0.2854 - val_acc: 0.9181
Epoch 2/3
60000/60000 [==============================] - 14s - loss: 0.2842 - acc: 0.9201 - val_loss: 0.2820 - val_acc: 0.9192
Epoch 3/3
60000/60000 [==============================] - 13s - loss: 0.2769 - acc: 0.9224 - val_loss: 0.2733 - val_acc: 0.9223






    Out[18]:





<keras.callbacks.History at 0x7faf22bab7b8>

Single Dense Layer



In [11]:

    
def get_fc_model():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28,28)),
        Flatten(),
        Dense(512, activation='softmax'),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model

fc = get_fc_model()









    



/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_2 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))

As before, let's start with 1 epoch and a default low learning rate.



In [12]:

    
fc.fit_generator(batches, batches.N, nb_epoch=1, 
                validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/1
60000/60000 [==============================] - 14s - loss: 1.5465 - acc: 0.8880 - val_loss: 1.0166 - val_acc: 0.9237






    Out[12]:





<keras.callbacks.History at 0x7fb8ef23b9e8>



In [14]:

    
fc.optimizer.lr=0.01
fc.fit_generator(batches, batches.N, nb_epoch=4, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/4
60000/60000 [==============================] - 13s - loss: 0.2707 - acc: 0.9417 - val_loss: 0.2827 - val_acc: 0.9352
Epoch 2/4
60000/60000 [==============================] - 14s - loss: 0.2521 - acc: 0.9445 - val_loss: 0.2799 - val_acc: 0.9369
Epoch 3/4
60000/60000 [==============================] - 14s - loss: 0.2386 - acc: 0.9460 - val_loss: 0.2612 - val_acc: 0.9384
Epoch 4/4
60000/60000 [==============================] - 14s - loss: 0.2302 - acc: 0.9465 - val_loss: 0.2702 - val_acc: 0.9346






    Out[14]:





<keras.callbacks.History at 0x7fb8eee00320>

Basic 'VGG-style' CNN



In [15]:

    
def get_model():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28, 28)),
        Convolution2D(32,3,3, activation='relu'),
        Convolution2D(32,3,3, activation='relu'),
        MaxPooling2D(),
        Convolution2D(64,3,3, activation='relu'),
        Convolution2D(64,3,3, activation='relu'),
        MaxPooling2D(),
        Flatten(),
        Dense(512, activation='relu'),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
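To get a feel for this architecture, the sketch below traces the feature-map width through the layers. It assumes the Keras 1.x defaults: no padding (`border_mode='valid'`) for `Convolution2D` and 2 x 2 pooling for `MaxPooling2D`. The helper names are made up for the illustration.

```python
def conv_valid(size, kernel=3):
    """Output width of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool(size, factor=2):
    """Output width of non-overlapping max pooling."""
    return size // factor

size = 28
trace = [size]
for layer in ["conv", "conv", "pool", "conv", "conv", "pool"]:
    size = conv_valid(size) if layer == "conv" else pool(size)
    trace.append(size)

# The last convolution has 64 filters, so Flatten() yields 64 * size * size values.
flattened = 64 * size * size
```

Under these assumptions the width goes 28, 26, 24, 12, 10, 8, 4, and the first dense layer sees 1024 inputs.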



In [16]:

    
model = get_model()
model.fit_generator(batches, batches.N, nb_epoch=1,
                   validation_data=test_batches, nb_val_samples=test_batches.N)









    



/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_3 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))






    



Epoch 1/1
60000/60000 [==============================] - 21s - loss: 0.1100 - acc: 0.9671 - val_loss: 0.0299 - val_acc: 0.9900






    Out[16]:





<keras.callbacks.History at 0x7fb8ed4de748>



In [25]:

    
model.optimizer.lr=0.1
model.fit_generator(batches, batches.N, nb_epoch=1, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/1
60000/60000 [==============================] - 22s - loss: 0.0361 - acc: 0.9888 - val_loss: 0.0304 - val_acc: 0.9904






    Out[25]:





<keras.callbacks.History at 0x7faf22b3d7b8>



In [26]:

    
model.optimizer.lr=0.01
model.fit_generator(batches, batches.N, nb_epoch=8, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/8
60000/60000 [==============================] - 22s - loss: 0.0232 - acc: 0.9928 - val_loss: 0.0298 - val_acc: 0.9906
Epoch 2/8
60000/60000 [==============================] - 22s - loss: 0.0189 - acc: 0.9938 - val_loss: 0.0332 - val_acc: 0.9901
Epoch 3/8
60000/60000 [==============================] - 22s - loss: 0.0146 - acc: 0.9955 - val_loss: 0.0287 - val_acc: 0.9915
Epoch 4/8
60000/60000 [==============================] - 22s - loss: 0.0137 - acc: 0.9953 - val_loss: 0.0196 - val_acc: 0.9934
Epoch 5/8
60000/60000 [==============================] - 22s - loss: 0.0110 - acc: 0.9963 - val_loss: 0.0349 - val_acc: 0.9917
Epoch 6/8
60000/60000 [==============================] - 22s - loss: 0.0103 - acc: 0.9966 - val_loss: 0.0283 - val_acc: 0.9930
Epoch 7/8
60000/60000 [==============================] - 22s - loss: 0.0086 - acc: 0.9970 - val_loss: 0.0314 - val_acc: 0.9919
Epoch 8/8
60000/60000 [==============================] - 22s - loss: 0.0065 - acc: 0.9982 - val_loss: 0.0287 - val_acc: 0.9936






    Out[26]:





<keras.callbacks.History at 0x7faf22b3d4a8>

Data Augmentation



In [17]:

    
model = get_model()









    



/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_4 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))



In [19]:

    
# This time we override the default ImageDataGenerator settings to add augmentation
gen = image.ImageDataGenerator(rotation_range=8, width_shift_range=0.08, shear_range=0.3,
                               height_shift_range=0.08, zoom_range=0.08)
batches = gen.flow(X_train, y_train, batch_size=64)
test_batches = gen.flow(X_test, y_test, batch_size=64)
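To make these settings concrete, the sketch below converts them into ranges for a 28 x 28 image. It assumes the usual Keras meaning of the arguments: shifts are fractions of the image size, `rotation_range` is in degrees, and a scalar `zoom_range` of z means a zoom factor in [1 - z, 1 + z]. Shear is left out, and `sample_augmentation` is a hypothetical helper, not part of Keras.

```python
import random

image_size = 28
max_shift_px = 0.08 * image_size           # about 2.24 pixels in either direction
zoom_low, zoom_high = 1 - 0.08, 1 + 0.08   # zoom factor between 0.92 and 1.08

def sample_augmentation(rng):
    """Draw one random set of transform parameters (illustrative only)."""
    return {
        "rotation_deg": rng.uniform(-8, 8),
        "shift_x_px": rng.uniform(-max_shift_px, max_shift_px),
        "shift_y_px": rng.uniform(-max_shift_px, max_shift_px),
        "zoom": rng.uniform(zoom_low, zoom_high),
    }

params = sample_augmentation(random.Random(0))
```

The ranges are deliberately small: a digit shifted or rotated by much more could stop looking like the same digit.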



In [20]:

    
model.fit_generator(batches, batches.N, nb_epoch=1,
                   validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/1
60000/60000 [==============================] - 23s - loss: 0.1987 - acc: 0.9369 - val_loss: 0.0773 - val_acc: 0.9745






    Out[20]:





<keras.callbacks.History at 0x7fb8de319c88>



In [21]:

    
model.optimizer.lr=0.1
model.fit_generator(batches, batches.N, nb_epoch=4,
                   validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/4
60000/60000 [==============================] - 22s - loss: 0.0706 - acc: 0.9784 - val_loss: 0.0521 - val_acc: 0.9838
Epoch 2/4
60000/60000 [==============================] - 23s - loss: 0.0549 - acc: 0.9836 - val_loss: 0.0391 - val_acc: 0.9852
Epoch 3/4
60000/60000 [==============================] - 23s - loss: 0.0475 - acc: 0.9854 - val_loss: 0.0635 - val_acc: 0.9819
Epoch 4/4
60000/60000 [==============================] - 22s - loss: 0.0438 - acc: 0.9861 - val_loss: 0.0451 - val_acc: 0.9855






    Out[21]:





<keras.callbacks.History at 0x7fb8ddfad8d0>



In [22]:

    
model.optimizer.lr=0.01
model.fit_generator(batches, batches.N, nb_epoch=8, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/8
60000/60000 [==============================] - 23s - loss: 0.0397 - acc: 0.9882 - val_loss: 0.0376 - val_acc: 0.9888
Epoch 2/8
60000/60000 [==============================] - 23s - loss: 0.0365 - acc: 0.9884 - val_loss: 0.0356 - val_acc: 0.9891
Epoch 3/8
60000/60000 [==============================] - 22s - loss: 0.0348 - acc: 0.9892 - val_loss: 0.0405 - val_acc: 0.9877
Epoch 4/8
60000/60000 [==============================] - 22s - loss: 0.0320 - acc: 0.9899 - val_loss: 0.0281 - val_acc: 0.9908
Epoch 5/8
60000/60000 [==============================] - 22s - loss: 0.0304 - acc: 0.9901 - val_loss: 0.0301 - val_acc: 0.9907
Epoch 6/8
60000/60000 [==============================] - 22s - loss: 0.0278 - acc: 0.9911 - val_loss: 0.0321 - val_acc: 0.9897
Epoch 7/8
60000/60000 [==============================] - 22s - loss: 0.0285 - acc: 0.9906 - val_loss: 0.0289 - val_acc: 0.9901
Epoch 8/8
60000/60000 [==============================] - 23s - loss: 0.0271 - acc: 0.9917 - val_loss: 0.0350 - val_acc: 0.9899






    Out[22]:





<keras.callbacks.History at 0x7fb8ec05fba8>

Batchnorm + data augmentation



In [23]:

    
def get_model_bn():
    model = Sequential([
            Lambda(norm_input, input_shape=(1,28,28)),
            Convolution2D(32,3,3, activation='relu'),
            BatchNormalization(axis=1),
            Convolution2D(32,3,3, activation='relu'),
            MaxPooling2D(),
            BatchNormalization(axis=1),
            Convolution2D(64,3,3, activation='relu'),
            BatchNormalization(axis=1),
            Convolution2D(64,3,3, activation='relu'),
            MaxPooling2D(),
            Flatten(),
            BatchNormalization(),
            Dense(512, activation='relu'),
            BatchNormalization(),
            Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
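As a reminder of what a batch normalization layer computes at training time, here is a minimal plain-Python sketch for a single feature. The `gamma`, `beta` and `eps` values are illustrative defaults, and the batch is made up. With `axis=1` on channels-first data, the same statistics are computed separately for each channel.

```python
import math

def batchnorm_1d(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a batch, then scale by gamma and shift by beta."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

batch = [2.0, 4.0, 6.0, 8.0]  # hypothetical activations of one unit over a batch of 4
normalized = batchnorm_1d(batch)
```

In the real layer `gamma` and `beta` are learned, and running averages of the mean and variance are used at test time.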



In [25]:

    
model = get_model_bn()
model.fit_generator(batches, batches.N, nb_epoch=1,
                   validation_data=test_batches, nb_val_samples=test_batches.N)









    



/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_6 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))






    



Epoch 1/1
60000/60000 [==============================] - 33s - loss: 0.1643 - acc: 0.9502 - val_loss: 0.0628 - val_acc: 0.9806






    Out[25]:





<keras.callbacks.History at 0x7fb8cedd7a20>



In [26]:

    
model.optimizer.lr=0.1
model.fit_generator(batches, batches.N, nb_epoch=4, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/4
60000/60000 [==============================] - 33s - loss: 0.0700 - acc: 0.9781 - val_loss: 0.0534 - val_acc: 0.9836
Epoch 2/4
60000/60000 [==============================] - 33s - loss: 0.0604 - acc: 0.9817 - val_loss: 0.0532 - val_acc: 0.9835
Epoch 3/4
60000/60000 [==============================] - 32s - loss: 0.0528 - acc: 0.9835 - val_loss: 0.0387 - val_acc: 0.9876
Epoch 4/4
60000/60000 [==============================] - 32s - loss: 0.0466 - acc: 0.9849 - val_loss: 0.0372 - val_acc: 0.9875






    Out[26]:





<keras.callbacks.History at 0x7fb8ce1952b0>



In [27]:

    
model.optimizer.lr=0.001
model.fit_generator(batches, batches.N, nb_epoch=12, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/12
60000/60000 [==============================] - 32s - loss: 0.0458 - acc: 0.9855 - val_loss: 0.0332 - val_acc: 0.9886
Epoch 2/12
60000/60000 [==============================] - 32s - loss: 0.0408 - acc: 0.9874 - val_loss: 0.0336 - val_acc: 0.9897
Epoch 3/12
60000/60000 [==============================] - 32s - loss: 0.0394 - acc: 0.9874 - val_loss: 0.0324 - val_acc: 0.9897
Epoch 4/12
60000/60000 [==============================] - 32s - loss: 0.0372 - acc: 0.9890 - val_loss: 0.0436 - val_acc: 0.9868
Epoch 5/12
60000/60000 [==============================] - 32s - loss: 0.0344 - acc: 0.9893 - val_loss: 0.0358 - val_acc: 0.9888
Epoch 6/12
60000/60000 [==============================] - 33s - loss: 0.0320 - acc: 0.9899 - val_loss: 0.0420 - val_acc: 0.9861
Epoch 7/12
60000/60000 [==============================] - 32s - loss: 0.0309 - acc: 0.9906 - val_loss: 0.0295 - val_acc: 0.9899
Epoch 8/12
60000/60000 [==============================] - 33s - loss: 0.0309 - acc: 0.9901 - val_loss: 0.0269 - val_acc: 0.9916
Epoch 9/12
60000/60000 [==============================] - 31s - loss: 0.0280 - acc: 0.9911 - val_loss: 0.0326 - val_acc: 0.9905
Epoch 10/12
60000/60000 [==============================] - 32s - loss: 0.0278 - acc: 0.9914 - val_loss: 0.0203 - val_acc: 0.9937
Epoch 11/12
60000/60000 [==============================] - 32s - loss: 0.0279 - acc: 0.9914 - val_loss: 0.0266 - val_acc: 0.9914
Epoch 12/12
60000/60000 [==============================] - 32s - loss: 0.0241 - acc: 0.9922 - val_loss: 0.0294 - val_acc: 0.9915






    Out[27]:





<keras.callbacks.History at 0x7fb8cdcf6390>

Batchnorm + dropout + data augmentation



In [28]:

    
def get_model_bn_do():
    model = Sequential([
        Lambda(norm_input, input_shape=(1,28,28)),
        Convolution2D(32,3,3, activation='relu'),
        BatchNormalization(axis=1),
        Convolution2D(32,3,3, activation='relu'),
        MaxPooling2D(),
        BatchNormalization(axis=1),
        Convolution2D(64,3,3, activation='relu'),
        BatchNormalization(axis=1),
        Convolution2D(64,3,3, activation='relu'),
        MaxPooling2D(),
        Flatten(),
        BatchNormalization(),
        Dense(512, activation='relu'),
        BatchNormalization(),
        Dropout(0.5),
        Dense(10, activation='softmax')
        ])
    model.compile(Adam(), loss='categorical_crossentropy', metrics=['accuracy'])
    return model
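The only change from the previous model is `Dropout(0.5)` before the output layer. The sketch below shows the idea in plain Python, assuming the "inverted" variant in which surviving activations are scaled up by 1 / (1 - p) during training so that the expected value is unchanged; the `dropout` function here is an illustration, not the Keras layer.

```python
import random

def dropout(values, p=0.5, rng=None):
    """Zero each value with probability p and rescale the survivors by 1 / (1 - p)."""
    rng = rng or random.Random()
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]

activations = [1.0] * 10000   # hypothetical activations
dropped = dropout(activations, p=0.5, rng=random.Random(42))
mean_after = sum(dropped) / len(dropped)
```

Roughly half of the values become 0 and the rest become 2, so the mean stays close to 1. At test time dropout is switched off.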



In [29]:

    
model = get_model_bn_do()









    



/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_7 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))



In [30]:

    
model.optimizer.lr=0.01
model.fit_generator(batches, batches.N, nb_epoch=12, 
                    validation_data=test_batches, nb_val_samples=test_batches.N)









    



Epoch 1/12
60000/60000 [==============================] - 32s - loss: 0.2491 - acc: 0.9341 - val_loss: 0.1253 - val_acc: 0.9624
Epoch 2/12
60000/60000 [==============================] - 33s - loss: 0.1139 - acc: 0.9672 - val_loss: 0.0875 - val_acc: 0.9750
Epoch 3/12
60000/60000 [==============================] - 32s - loss: 0.0986 - acc: 0.9708 - val_loss: 0.0978 - val_acc: 0.9682
Epoch 4/12
60000/60000 [==============================] - 32s - loss: 0.0986 - acc: 0.9718 - val_loss: 0.0581 - val_acc: 0.9825
Epoch 5/12
60000/60000 [==============================] - 33s - loss: 0.0900 - acc: 0.9746 - val_loss: 0.0687 - val_acc: 0.9799
Epoch 6/12
60000/60000 [==============================] - 32s - loss: 0.0933 - acc: 0.9742 - val_loss: 0.0656 - val_acc: 0.9822
Epoch 7/12
60000/60000 [==============================] - 33s - loss: 0.0897 - acc: 0.9758 - val_loss: 0.0466 - val_acc: 0.9862
Epoch 8/12
60000/60000 [==============================] - 32s - loss: 0.0873 - acc: 0.9759 - val_loss: 0.0655 - val_acc: 0.9857
Epoch 9/12
60000/60000 [==============================] - 32s - loss: 0.0863 - acc: 0.9768 - val_loss: 0.0660 - val_acc: 0.9831
Epoch 10/12
60000/60000 [==============================] - 32s - loss: 0.0830 - acc: 0.9782 - val_loss: 0.0462 - val_acc: 0.9873
Epoch 11/12
60000/60000 [==============================] - 33s - loss: 0.0876 - acc: 0.9780 - val_loss: 0.0709 - val_acc: 0.9835
Epoch 12/12
60000/60000 [==============================] - 32s - loss: 0.0800 - acc: 0.9787 - val_loss: 0.0430 - val_acc: 0.9883






    Out[30]:





<keras.callbacks.History at 0x7fb8c9da4828>

Ensembling

Ensembling often improves accuracy. We train several models independently and average their predictions.
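The averaging step can be sketched in plain Python on made-up numbers. Everything here is hypothetical (three models, two samples, three classes), and labels are plain integers rather than one-hot rows; it only illustrates why the average can be right when an individual model is wrong.

```python
def average_predictions(all_preds):
    """Average class probabilities over models: all_preds[m][i][c] -> avg[i][c]."""
    n_models = len(all_preds)
    n_samples = len(all_preds[0])
    n_classes = len(all_preds[0][0])
    return [[sum(all_preds[m][i][c] for m in range(n_models)) / n_models
             for c in range(n_classes)]
            for i in range(n_samples)]

def accuracy(labels, preds):
    """Fraction of samples whose highest-probability class matches the label."""
    hits = sum(1 for y, p in zip(labels, preds) if p.index(max(p)) == y)
    return hits / len(labels)

# Hypothetical probabilities from three models.
all_preds = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]],
    [[0.7, 0.2, 0.1], [0.2, 0.6, 0.2]],
]
labels = [0, 1]

avg_preds = average_predictions(all_preds)
ensemble_acc = accuracy(labels, avg_preds)
```

In this toy case the second model gets both samples wrong on its own, but the averaged prediction is correct for both.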



In [33]:

    
def fit_model():
    model = get_model_bn_do()
    model.fit_generator(batches, batches.N, nb_epoch=1, verbose=0,
                        validation_data=test_batches, nb_val_samples=test_batches.N)
    model.optimizer.lr=0.1
    model.fit_generator(batches, batches.N, nb_epoch=4, verbose=0,
                        validation_data=test_batches, nb_val_samples=test_batches.N)
    model.optimizer.lr=0.01
    model.fit_generator(batches, batches.N, nb_epoch=12, verbose=0,
                        validation_data=test_batches, nb_val_samples=test_batches.N)
    # model.optimizer.lr=0.001
    # model.fit_generator(batches, batches.N, nb_epoch=18, verbose=0,
    #                    validation_data=test_batches, nb_val_samples=test_batches.N)
    return model



In [34]:

    
# Train six independent models and collect them in a list
models = [fit_model() for i in range(6)]









    



/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_9 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_10 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_11 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_12 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_13 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))
/home/ubuntu/anaconda2/envs/py36/lib/python3.6/site-packages/keras/layers/core.py:577: UserWarning: `output_shape` argument not specified for layer lambda_14 and cannot be automatically inferred with the Theano backend. Defaulting to output shape `(None, 1, 28, 28)` (same as input shape). If the expected output shape is different, specify it via the `output_shape` argument.
  .format(self.name, input_shape))



In [35]:

    
path = 'data/mnist/'
model_path = path + 'models/'



In [37]:

    
for i, m in enumerate(models):
    m.save_weights(model_path+'cnn-mnist23-'+str(i)+'.pkl')



In [38]:

    
evals = np.array([m.evaluate(X_test, y_test, batch_size=256) for m in models])









    



 9984/10000 [============================>.] - ETA: 0s



In [39]:

    
evals.mean(axis=0)









    Out[39]:





array([ 0.0158,  0.9952])



In [41]:

    
all_preds = np.stack([m.predict(X_test, batch_size=256) for m in models])
all_preds.shape









    Out[41]:





(6, 10000, 10)



In [42]:

    
avg_preds = all_preds.mean(axis=0)



In [43]:

    
keras.metrics.categorical_accuracy(y_test, avg_preds).eval()









    Out[43]:





array(0.996999979019165, dtype=float32)



In [ ]: