This is Lesson 8 in the Deep Learning track.
At the end of this lesson, you will understand and know how to use stride lengths to make your model faster and reduce memory consumption, as well as dropout to combat overfitting.
Both of these techniques are especially useful in large models.


In [1]:
from IPython.display import YouTubeVideo
YouTubeVideo('fwNLf4t7MR8', width=800, height=450)


Out[1]:

In [2]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.python import keras
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense, Flatten, Conv2D, Dropout


/Users/benjamingrove/.pyenv/versions/3.6.1/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

In [3]:
# Prep the data:

img_rows, img_cols = 28, 28
num_classes = 10

def data_prep(raw):
    out_y = keras.utils.to_categorical(raw.label, num_classes)
    
    num_images = raw.shape[0]
    x_as_array = raw.values[:,1:]
    x_shaped_array = x_as_array.reshape(num_images, img_rows, img_cols, 1)
    out_x = x_shaped_array / 255
    return out_x, out_y

train_size = 30000
train_file = 'inputs/digit_recognizer/train.csv'
raw_data = pd.read_csv(train_file)

In [4]:
# Build the model:

x, y = data_prep(raw_data)
model = Sequential()
model.add(Conv2D(30, kernel_size=(3,3),
                strides=2,
                activation='relu',
                input_shape=(img_rows, img_cols, 1)))
model.add(Dropout(0.5))
model.add(Conv2D(30, kernel_size=(3,3), strides=2, activation='relu'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

In [5]:
# Compile and fit the model:
model.compile(loss=keras.losses.categorical_crossentropy,
             optimizer='adam',
             metrics=['accuracy'])

model.fit(x, y,
         batch_size=128,
         epochs=2,
         validation_split=0.2)


Train on 33600 samples, validate on 8400 samples
Epoch 1/2
33600/33600 [==============================] - 5s 147us/step - loss: 0.6069 - acc: 0.8124 - val_loss: 0.2114 - val_acc: 0.9410
Epoch 2/2
33600/33600 [==============================] - 5s 149us/step - loss: 0.2587 - acc: 0.9207 - val_loss: 0.1292 - val_acc: 0.9614
Out[5]:
<tensorflow.python.keras._impl.keras.callbacks.History at 0x1235a1f28>

You've built a model to identify clothing types in the MNIST for Fashion dataset.
Now you will make your model bigger, specify larger stride lengths and apply dropout.
These changes will make your model faster and more accurate. This is the last step in the Deep Learning Track.


In [6]:
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.python import keras

img_rows, img_cols = 28, 28
num_classes = 10

def prep_data(raw, train_size, val_size):
    y = raw[:, 0]
    out_y = keras.utils.to_categorical(y, num_classes)
    x = raw[:,1:]
    num_images = raw.shape[0]
    out_x = x.reshape(num_images, img_rows, img_cols, 1)
    out_x = out_x / 255
    return out_x, out_y

fashion_file = 'inputs/fashionmnist/train.csv'
fashion_data = np.loadtxt(fashion_file, skiprows=1, delimiter=',')
x, y = prep_data(fashion_data, train_size=50000, val_size=5000)

Sample Model Code


In [7]:
fashion_model = Sequential()
fashion_model.add(Conv2D(12, kernel_size=(3, 3), strides=2, activation='relu',
                        input_shape=(img_rows, img_cols, 1)))
fashion_model.add(Conv2D(12, (3, 3), strides=2, activation='relu'))
fashion_model.add(Flatten())
fashion_model.add(Dense(128, activation='relu'))
fashion_model.add(Dense(num_classes, activation='softmax'))

fashion_model.compile(loss=keras.losses.categorical_crossentropy,
                     optimizer='adam',
                     metrics=['accuracy'])
batch_size = 128
epochs = 3
fashion_model.fit(x, y,
                 batch_size=batch_size,
                 epochs=epochs,
                 validation_split = 0.2)


Train on 48000 samples, validate on 12000 samples
Epoch 1/3
48000/48000 [==============================] - 3s 71us/step - loss: 0.6757 - acc: 0.7641 - val_loss: 0.4884 - val_acc: 0.8271
Epoch 2/3
48000/48000 [==============================] - 3s 64us/step - loss: 0.4438 - acc: 0.8421 - val_loss: 0.4369 - val_acc: 0.8437
Epoch 3/3
48000/48000 [==============================] - 3s 65us/step - loss: 0.3985 - acc: 0.8569 - val_loss: 0.3939 - val_acc: 0.8597
Out[7]:
<tensorflow.python.keras._impl.keras.callbacks.History at 0x12359c978>

Adding Strides

Your turn!
Specify and fit a model much like the one above, but specify a stride length of 2 for each convolutional layer.
Call your new model fashion_model_1.


In [8]:
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense, Flatten, Conv2D, Dropout

fashion_model_1 = Sequential()
fashion_model_1.add(Conv2D(12, kernel_size=(3, 3), strides=2,
                          activation='relu',
                          input_shape=(img_rows, img_cols, 1)))
fashion_model_1.add(Conv2D(12, (3, 3), strides=2, activation='relu'))
fashion_model_1.add(Flatten())
fashion_model_1.add(Dense(128, activation='relu'))
fashion_model_1.add(Dense(num_classes, activation='softmax'))

fashion_model_1.compile(loss=keras.losses.categorical_crossentropy,
                        optimizer='adam',
                        metrics=['accuracy'])

fashion_model_1.fit(x, y,
                  batch_size=128,
                  epochs=2,
                  validation_split=0.2)


Train on 48000 samples, validate on 12000 samples
Epoch 1/2
48000/48000 [==============================] - 3s 72us/step - loss: 0.6849 - acc: 0.7616 - val_loss: 0.5161 - val_acc: 0.8168
Epoch 2/2
48000/48000 [==============================] - 3s 64us/step - loss: 0.4500 - acc: 0.8408 - val_loss: 0.4548 - val_acc: 0.8378
Out[8]:
<tensorflow.python.keras._impl.keras.callbacks.History at 0x133c27f28>

Make Model Larger

You should have noticed that fashion_model_1 trained pretty quickly.
This makes it reasonable to make the model larger.
Specify a new model called fashion_model_2 that is identical to fashion_model_1, except:

  1. Add an additional Conv2D layer immediately before the Flatten layer. Make it similar to the Conv2D layers you already have, except don't set the stride length in this new layer (we have already shrunk the representation enough with the existing layers).
  2. Change the number of filters in each convolutional layer to 24.

After specifying fashion_model_2, compile and fit it.


In [9]:
fashion_model_2 = Sequential()
fashion_model_2.add(Conv2D(24, kernel_size=(3, 3), strides=2,
                        activation='relu',
                        input_shape=(img_rows, img_cols, 1)))
fashion_model_2.add(Conv2D(24, (3, 3), strides=2, activation='relu'))
fashion_model_2.add(Conv2D(24, (3, 3), activation='relu'))
fashion_model_2.add(Flatten())
fashion_model_2.add(Dense(num_classes, activation='softmax'))

fashion_model_2.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])
fashion_model_2.fit(x, y, batch_size=128, epochs=2, validation_split=0.2)


Train on 48000 samples, validate on 12000 samples
Epoch 1/2
48000/48000 [==============================] - 4s 89us/step - loss: 0.7279 - acc: 0.7392 - val_loss: 0.5026 - val_acc: 0.8220
Epoch 2/2
48000/48000 [==============================] - 4s 81us/step - loss: 0.4584 - acc: 0.8343 - val_loss: 0.4581 - val_acc: 0.8347
Out[9]:
<tensorflow.python.keras._impl.keras.callbacks.History at 0x133c49b00>

Add Dropout

Specify fashion_model_3, which is identical to fashion_model_2 except that it adds dropout immediately after each convolutional layer; dropout is added three times in total.
Compile and fit this model, and then compare the different models' performance on the validation data.


In [12]:
fashion_model_3 = Sequential()
fashion_model_3.add(Conv2D(24, kernel_size=(3, 3), strides=2, activation='relu', input_shape=(img_rows, img_cols, 1)))
fashion_model_3.add(Dropout(0.5))
fashion_model_3.add(Conv2D(24, (3, 3), strides=2, activation='relu'))
fashion_model_3.add(Dropout(0.5))
fashion_model_3.add(Conv2D(24, (3, 3), activation='relu'))
fashion_model_3.add(Dropout(0.5))
fashion_model_3.add(Flatten())
fashion_model_3.add(Dense(128, activation='relu'))
fashion_model_3.add(Dense(num_classes, activation='softmax'))

fashion_model_3.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])

fashion_model.fit(x, y, batch_size=3, epochs=2, validation_split=0.2)


Train on 48000 samples, validate on 12000 samples
Epoch 1/2
48000/48000 [==============================] - 55s 1ms/step - loss: 0.4129 - acc: 0.8471 - val_loss: 0.4037 - val_acc: 0.8476
Epoch 2/2
48000/48000 [==============================] - 55s 1ms/step - loss: 0.3285 - acc: 0.8778 - val_loss: 0.3241 - val_acc: 0.8828
Out[12]:
<tensorflow.python.keras._impl.keras.callbacks.History at 0x146ee5898>