In [17]:
from IPython.display import Image
from IPython.core.display import HTML

Autoencoders

What are Autoencoders?


In [16]:
PATH = "/Users/raghu/Downloads/"
Image(filename = PATH + "autoencoder_schema.jpg", width=500, height=500)


Out[16]:

"Autoencoding" is a data compression algorithm where the compression and decompression functions are 1) data-specific, 2) lossy, and 3) learned automatically from examples rather than engineered by a human. Additionally, in almost all contexts where the term "autoencoder" is used, the compression and decompression functions are implemented with neural networks.

1) Autoencoders are data-specific, which means that they will only be able to compress data similar to what they have been trained on. This is different from, say, the MPEG-2 Audio Layer III (MP3) compression algorithm, which only holds assumptions about "sound" in general, but not about specific types of sounds. An autoencoder trained on pictures of faces would do a rather poor job of compressing pictures of trees, because the features it would learn would be face-specific.

2) Autoencoders are lossy, which means that the decompressed outputs will be degraded compared to the original inputs (similar to MP3 or JPEG compression). This differs from lossless arithmetic compression.

3) Autoencoders are learned automatically from data examples, which is a useful property: it means that it is easy to train specialized instances of the algorithm that will perform well on a specific type of input. It doesn't require any new engineering, just appropriate training data.

To build an autoencoder, you need three things: an encoding function, a decoding function, and a distance function that quantifies the information loss between the original data and its decompressed reconstruction (i.e. a "loss" function). The encoder and decoder will be chosen to be parametric functions (typically neural networks) that are differentiable with respect to the distance function, so the parameters of the encoding/decoding functions can be optimized to minimize the reconstruction loss using stochastic gradient descent. It's simple! And you don't even need to understand any of these words to start using autoencoders in practice.
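In symbols: if f is the encoder and g is the decoder, training adjusts their parameters to minimize L(x, g(f(x))) over the training examples x, where L is the chosen reconstruction loss.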

Are they good at data compression?

Usually, not really. In picture compression for instance, it is pretty difficult to train an autoencoder that does a better job than a basic algorithm like JPEG, and typically the only way it can be achieved is by restricting yourself to a very specific type of picture (e.g. one for which JPEG does not do a good job). The fact that autoencoders are data-specific makes them generally impractical for real-world data compression problems: you can only use them on data that is similar to what they were trained on, and making them more general thus requires lots of training data.

What are applications of autoencoders?

They are rarely used in practical applications. In 2012 they briefly found an application in greedy layer-wise pretraining for deep convolutional neural networks, but this quickly fell out of fashion as we started realizing that better random weight initialization schemes were sufficient for training deep networks from scratch. In 2014, batch normalization started allowing for even deeper networks, and from late 2015 we could train arbitrarily deep networks from scratch using residual learning.

Today two interesting practical applications of autoencoders are data denoising (which we feature later in this post), and dimensionality reduction for data visualization. With appropriate dimensionality and sparsity constraints, autoencoders can learn data projections that are more interesting than PCA or other basic techniques.

For 2D visualization specifically, t-SNE (pronounced "tee-snee") is probably the best algorithm around, but it typically requires relatively low-dimensional data. So a good strategy for visualizing similarity relationships in high-dimensional data is to start by using an autoencoder to compress your data into a low-dimensional space (e.g. 32 dimensional), then use t-SNE for mapping the compressed data to a 2D plane.
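As a concrete sketch of that two-stage strategy (assuming scikit-learn is available and an encoder model like the one we train below):

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# compress 784-dim test images to 32-dim codes, then map the codes to 2D
codes = encoder.predict(x_test)                       # shape (10000, 32)
codes_2d = TSNE(n_components=2).fit_transform(codes)  # shape (10000, 2)

plt.scatter(codes_2d[:, 0], codes_2d[:, 1], s=2)
plt.show()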

Let's build the simplest autoencoder


In [76]:
from keras.layers import Input, Dense
from keras.models import Model

# this is the size of our encoded representations
encoding_dim = 32  # 32 floats -> compression of factor 24.5, assuming the input is 784 floats

# this is our input placeholder
input_img = Input(shape=(784,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu')(input_img)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(784, activation='sigmoid')(encoded)

# this model maps an input to its reconstruction
autoencoder = Model(input_img, decoded)
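As a quick sanity check, autoencoder.summary() should report 784*32 + 32 = 25,120 parameters for the encoding layer and 32*784 + 784 = 25,872 for the decoding layer (weights plus biases):

autoencoder.summary()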

Let's also create a separate encoder model:


In [77]:
# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)

As well as the decoder model:


In [78]:
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(encoding_dim,))
# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))
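Note that grabbing just the last layer works only because our decoder is a single Dense layer. For a multi-layer decoder (like the deep autoencoder later in this post), you would chain the trailing layers instead; a sketch:

# hypothetical sketch for a 3-layer decoder: chain the last three layers
deco = autoencoder.layers[-3](encoded_input)
deco = autoencoder.layers[-2](deco)
decoder = Model(encoded_input, autoencoder.layers[-1](deco))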

Now let's train our autoencoder to reconstruct MNIST digits.

First, we'll configure our model to use a per-pixel binary crossentropy loss and the Adagrad optimizer:


In [79]:
autoencoder.compile(optimizer='adagrad', loss='binary_crossentropy')
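Binary crossentropy here treats each of the 784 pixel values as an independent target in [0, 1]: for target t and prediction p, the per-pixel loss is -t*log(p) - (1-t)*log(1-p), averaged over all pixels.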

Let's prepare our input data. We're using MNIST digits, and we're discarding the labels (since we're only interested in encoding/decoding the input images).


In [80]:
from keras.datasets import mnist
import numpy as np
(x_train, _), (x_test, _) = mnist.load_data()

We will normalize all values between 0 and 1 and we will flatten the 28x28 images into vectors of size 784.


In [81]:
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)


(60000, 784)
(10000, 784)

Now let's train our autoencoder for 100 epochs:


In [67]:
autoencoder.fit(x_train, x_train,
                epochs=100,
                batch_size=32,
                shuffle=True,
                validation_data=(x_test, x_test))


Train on 60000 samples, validate on 10000 samples
Epoch 1/100
60000/60000 [==============================] - 13s 223us/step - loss: 0.1603 - val_loss: 0.1345
Epoch 2/100
60000/60000 [==============================] - 13s 210us/step - loss: 0.1281 - val_loss: 0.1207
Epoch 3/100
60000/60000 [==============================] - 13s 214us/step - loss: 0.1184 - val_loss: 0.1141
...
Epoch 100/100
60000/60000 [==============================] - 12s 208us/step - loss: 0.0932 - val_loss: 0.0920
Out[67]:
<keras.callbacks.History at 0x1a6863fbf98>

After 100 epochs, the autoencoder reaches a stable train/test loss of about 0.09. We can try to visualize the reconstructed inputs and the encoded representations using Matplotlib.


In [82]:
# encode and decode some digits
# note that we take them from the *test* set
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)

In [69]:
# use Matplotlib 
import matplotlib.pyplot as plt

n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()


Here's what we get. The top row is the original digits, and the bottom row is the reconstructed digits. We are losing quite a bit of detail with this basic approach.

Adding a sparsity constraint on the encoded representations

In the previous example, the representations were only constrained by the size of the hidden layer (32). In such a situation, what typically happens is that the hidden layer learns an approximation of PCA (principal component analysis). Another way to constrain the representations to be compact is to add a sparsity constraint on the activity of the hidden representations, so that fewer units "fire" at a given time. In Keras, this can be done by adding an activity_regularizer to our Dense layer:


In [70]:
from keras import regularizers

encoding_dim = 32

input_img = Input(shape=(784,))

# add a Dense layer with a L1 activity regularizer
encoded = Dense(encoding_dim, activation='relu',
                activity_regularizer=regularizers.l1(10e-5))(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)

autoencoder = Model(input_img, decoded)

Let's train this model for 250 epochs (with the added regularization the model is less likely to overfit, so it can be trained longer).


In [72]:
autoencoder.compile(optimizer='adagrad', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,
                epochs=250,
                batch_size=25,
                shuffle=True,
                validation_data=(x_test, x_test))


Train on 60000 samples, validate on 10000 samples
Epoch 1/250
60000/60000 [==============================] - 15s 249us/step - loss: 0.1960 - val_loss: 0.1943
Epoch 2/250
60000/60000 [==============================] - 14s 233us/step - loss: 0.1952 - val_loss: 0.1939
Epoch 3/250
60000/60000 [==============================] - 14s 231us/step - loss: 0.1948 - val_loss: 0.1936
...
Epoch 250/250
60000/60000 [==============================] - 13s 221us/step - loss: 0.1887 - val_loss: 0.1876
Out[72]:
<keras.callbacks.History at 0x1a688dadeb8>

The model ends with a train loss of 0.1887 and a test loss of 0.1876. The difference between the two is mostly due to the regularization term being added to the loss during training (worth about 0.001 in this run).

Here's a visualization of our new results:


In [73]:
# rebuild the encoder/decoder from the *sparse* autoencoder
# (the `encoder`/`decoder` above still point at the first model)
encoder = Model(input_img, encoded)
decoder = Model(encoded_input, autoencoder.layers[-1](encoded_input))
# encode and decode some digits; note that we take them from the *test* set
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
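A quick way to see the regularizer at work is to compare the average activation of the codes: the sparse model should produce codes with a markedly lower mean than the unregularized one (in the original Keras blog post this came out around 3.3 with the L1 penalty versus roughly 7.3 without it):

print(encoded_imgs.mean())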

In [74]:
# use Matplotlib 
import matplotlib.pyplot as plt

n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()


Deep autoencoder

We do not have to limit ourselves to a single layer as encoder or decoder; we could instead use a stack of layers, such as:


In [94]:
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
encoded = Dense(32, activation='relu')(encoded)

decoded = Dense(64, activation='relu')(encoded)
decoded = Dense(128, activation='relu')(decoded)
decoded = Dense(784, activation='sigmoid')(decoded)

Let's try this:


In [96]:
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

autoencoder.fit(x_train, x_train,
                epochs=100,
                batch_size=100,
                shuffle=True,
                validation_data=(x_test, x_test))


Train on 60000 samples, validate on 10000 samples
Epoch 1/100
60000/60000 [==============================] - 14s 233us/step - loss: 0.2519 - val_loss: 0.2283
Epoch 2/100
60000/60000 [==============================] - 12s 193us/step - loss: 0.2081 - val_loss: 0.1880
Epoch 3/100
60000/60000 [==============================] - 12s 195us/step - loss: 0.1807 - val_loss: 0.1729
...
Epoch 100/100
60000/60000 [==============================] - 11s 190us/step - loss: 0.0885 - val_loss: 0.0886
Out[96]:
<keras.callbacks.History at 0x1a68d759fd0>

After 100 epochs, it reaches a train and test loss of ~0.0885, a bit better than our previous models.
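If you also want the 32-dimensional codes from this deep model, build an encoder the same way as before; a sketch (encoded above still refers to the 32-unit layer):

deep_encoder = Model(input_img, encoded)
deep_codes = deep_encoder.predict(x_test)
print(deep_codes.shape)  # (10000, 32)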

