This is Lesson 8 in the Deep Learning track.
By the end of this lesson, you will know how to use stride lengths to make your model faster and reduce its memory consumption, and how to use dropout to combat overfitting.
Both of these techniques are especially useful in large models.
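To see why a larger stride makes the model faster and lighter, it can help to compute the output size of a convolution by hand. The sketch below assumes no padding, which is the default (padding='valid') for Keras Conv2D layers. The helper name conv_output_size is ours and is not part of Keras.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Output length along one spatial dimension for an unpadded convolution."""
    return (input_size - kernel_size) // stride + 1

# A 28x28 image passed through a 3x3 kernel at two different strides
for stride in (1, 2):
    size = conv_output_size(28, 3, stride)
    print(f"stride={stride}: {size} x {size} output")
```

With a stride of 2 the output has roughly half as many rows and columns, so every later layer has about a quarter as many positions to process.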
In [1]:
from IPython.display import YouTubeVideo
YouTubeVideo('fwNLf4t7MR8', width=800, height=450)
Out[1]:
In [2]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, Dropout
In [3]:
# Prep the data:
img_rows, img_cols = 28, 28
num_classes = 10
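def data_prep(raw):
    # One-hot encode the digit labels
    out_y = keras.utils.to_categorical(raw.label, num_classes)
    num_images = raw.shape[0]
    # Drop the first column (the label), then reshape each row to a 28x28x1 image
    x_as_array = raw.values[:,1:]
    x_shaped_array = x_as_array.reshape(num_images, img_rows, img_cols, 1)
    # Scale pixel values to the range [0, 1]
    out_x = x_shaped_array / 255
    return out_x, out_y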
train_size = 30000
train_file = 'inputs/digit_recognizer/train.csv'
raw_data = pd.read_csv(train_file)
In [4]:
# Build the model:
x, y = data_prep(raw_data)
model = Sequential()
model.add(Conv2D(30, kernel_size=(3,3),
                 strides=2,
                 activation='relu',
                 input_shape=(img_rows, img_cols, 1)))
model.add(Dropout(0.5))
model.add(Conv2D(30, kernel_size=(3,3), strides=2, activation='relu'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
In [5]:
# Compile and fit the model:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer='adam',
              metrics=['accuracy'])
model.fit(x, y,
          batch_size=128,
          epochs=2,
          validation_split=0.2)
Out[5]:
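The Dropout(0.5) layers above randomly zero half of their inputs during training. As a rough illustration of the idea, here is a minimal pure-Python sketch of "inverted" dropout, in which the surviving values are scaled up by 1 / (1 - rate) so that their expected sum is unchanged. The function name and the toy list of activations are assumptions made for this example. The Keras layer works on tensors and is only active during training.

```python
import random

def dropout(activations, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)           # seeded so the run is repeatable
activations = [1.0] * 10         # toy activations, not real model output
print(dropout(activations, rate=0.5, rng=rng))
```

Each run with a different seed drops a different subset of values, which is what discourages the network from relying on any single activation.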
You've built a model to identify clothing types in the Fashion-MNIST dataset.
Now you will make your model bigger, specify larger stride lengths, and apply dropout.
These changes should make your model faster to train and can improve its accuracy on the validation data.
This is the last step in the Deep Learning Track.
In [6]:
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras
img_rows, img_cols = 28, 28
num_classes = 10
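def prep_data(raw, train_size, val_size):
    # Note: train_size and val_size are accepted but not used in this function
    y = raw[:, 0]
    # One-hot encode the clothing labels
    out_y = keras.utils.to_categorical(y, num_classes)
    x = raw[:,1:]
    num_images = raw.shape[0]
    # Reshape each row to a 28x28x1 image and scale pixels to [0, 1]
    out_x = x.reshape(num_images, img_rows, img_cols, 1)
    out_x = out_x / 255
    return out_x, out_y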
fashion_file = 'inputs/fashionmnist/train.csv'
fashion_data = np.loadtxt(fashion_file, skiprows=1, delimiter=',')
x, y = prep_data(fashion_data, train_size=50000, val_size=5000)
In [7]:
fashion_model = Sequential()
# Baseline model: default stride of 1, for comparison with the strided models below
fashion_model.add(Conv2D(12, kernel_size=(3, 3), activation='relu',
                         input_shape=(img_rows, img_cols, 1)))
fashion_model.add(Conv2D(12, (3, 3), activation='relu'))
fashion_model.add(Flatten())
fashion_model.add(Dense(128, activation='relu'))
fashion_model.add(Dense(num_classes, activation='softmax'))
fashion_model.compile(loss=keras.losses.categorical_crossentropy,
                      optimizer='adam',
                      metrics=['accuracy'])
batch_size = 128
epochs = 3
fashion_model.fit(x, y,
                  batch_size=batch_size,
                  epochs=epochs,
                  validation_split=0.2)
Out[7]:
Your turn!
Specify and fit a model much like the one above, but with a stride length of 2 for each convolutional layer.
Call your new model fashion_model_1.
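Much of the saving from a stride of 2 shows up in the first Dense layer, because the flattened convolutional output is far smaller. The following back-of-the-envelope sketch assumes unpadded 3x3 convolutions with 12 filters on 28x28 images, followed by Dense(128), and compares this with the default stride of 1. The helper names are ours, not Keras functions.

```python
def conv_output_size(input_size, kernel_size, stride=1):
    """Output length along one spatial dimension for an unpadded convolution."""
    return (input_size - kernel_size) // stride + 1

def dense_params_after_two_convs(stride, image_size=28, kernel_size=3,
                                 filters=12, dense_units=128):
    """Return (flattened size, parameter count of the first Dense layer)."""
    size = image_size
    for _ in range(2):                       # two stacked Conv2D layers
        size = conv_output_size(size, kernel_size, stride)
    flattened = size * size * filters        # what Flatten() hands to Dense
    return flattened, flattened * dense_units + dense_units  # weights + biases

for stride in (1, 2):
    flattened, params = dense_params_after_two_convs(stride)
    print(f"stride={stride}: flattened={flattened}, dense params={params}")
```

Under these assumptions, the Dense layer after the stride-2 convolutions has about one sixteenth as many weights as the stride-1 version.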
In [8]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, Dropout
fashion_model_1 = Sequential()
fashion_model_1.add(Conv2D(12, kernel_size=(3, 3), strides=2,
                           activation='relu',
                           input_shape=(img_rows, img_cols, 1)))
fashion_model_1.add(Conv2D(12, (3, 3), strides=2, activation='relu'))
fashion_model_1.add(Flatten())
fashion_model_1.add(Dense(128, activation='relu'))
fashion_model_1.add(Dense(num_classes, activation='softmax'))
fashion_model_1.compile(loss=keras.losses.categorical_crossentropy,
                        optimizer='adam',
                        metrics=['accuracy'])
fashion_model_1.fit(x, y,
                    batch_size=128,
                    epochs=2,
                    validation_split=0.2)
Out[8]:
You should have noticed that fashion_model_1 trained pretty quickly.
This makes it reasonable to make the model larger.
Specify a new model called fashion_model_2 that is identical to fashion_model_1, except:
Add a Conv2D layer immediately before the Flatten layer. Make it similar to the Conv2D layers you already have, except don't set the stride length in this new layer (we have already shrunk the representation enough with the existing layers).
Use 24 filters in each convolutional layer.
After specifying fashion_model_2, compile and fit it.
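Making the convolutional layers wider and adding a third one increases the number of weights the model has to learn. As a rough count, assuming 3x3 kernels and one bias per filter, each Conv2D layer has (3 * 3 * input channels + 1) * filters parameters. The stride does not change this number. The helper below is illustrative only and is not a Keras function.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus biases for one 2D convolutional layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def total_conv_params(filter_counts, kernel_size=3, input_channels=1):
    """Sum the parameters of a stack of Conv2D layers."""
    total = 0
    in_channels = input_channels
    for filters in filter_counts:
        total += conv2d_params(kernel_size, in_channels, filters)
        in_channels = filters    # this layer's output feeds the next layer
    return total

print("two layers of 12 filters:  ", total_conv_params([12, 12]))
print("three layers of 24 filters:", total_conv_params([24, 24, 24]))
```

Even the larger stack stays small in absolute terms, which is one reason the strided model can afford the extra layer.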
In [9]:
fashion_model_2 = Sequential()
fashion_model_2.add(Conv2D(24, kernel_size=(3, 3), strides=2,
activation='relu',
input_shape=(img_rows, img_cols, 1)))
fashion_model_2.add(Conv2D(24, (3, 3), strides=2, activation='relu'))
fashion_model_2.add(Conv2D(24, (3, 3), activation='relu'))
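fashion_model_2.add(Flatten())
# Keep the same Dense(128) layer as fashion_model_1 so only the conv layers differ
fashion_model_2.add(Dense(128, activation='relu'))
fashion_model_2.add(Dense(num_classes, activation='softmax'))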
fashion_model_2.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])
fashion_model_2.fit(x, y, batch_size=128, epochs=2, validation_split=0.2)
Out[9]:
Specify fashion_model_3, which is identical to fashion_model_2 except that it adds dropout immediately after each convolutional layer; dropout is added three times in total.
Compile and fit this model, and then compare the different models' performance on the validation data.
In [12]:
fashion_model_3 = Sequential()
fashion_model_3.add(Conv2D(24, kernel_size=(3, 3), strides=2, activation='relu', input_shape=(img_rows, img_cols, 1)))
fashion_model_3.add(Dropout(0.5))
fashion_model_3.add(Conv2D(24, (3, 3), strides=2, activation='relu'))
fashion_model_3.add(Dropout(0.5))
fashion_model_3.add(Conv2D(24, (3, 3), activation='relu'))
fashion_model_3.add(Dropout(0.5))
fashion_model_3.add(Flatten())
fashion_model_3.add(Dense(128, activation='relu'))
fashion_model_3.add(Dense(num_classes, activation='softmax'))
fashion_model_3.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])
# Fit fashion_model_3 (not the baseline), using the same batch size as the other models
fashion_model_3.fit(x, y, batch_size=128, epochs=2, validation_split=0.2)
Out[12]:
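One way to compare the models on the validation data is to keep the object returned by each fit call and read the last validation accuracy from its history dictionary. The sketch below assumes you have collected those dictionaries yourself, which the cells above do not do. The key is 'val_accuracy' in recent Keras versions and 'val_acc' in older ones. The numbers in the example are placeholders for illustration, not results from this lesson.

```python
def final_val_accuracy(history_dict):
    """Return the last-epoch validation accuracy from a Keras-style history dict."""
    for key in ("val_accuracy", "val_acc"):   # newer and older Keras key names
        if key in history_dict:
            return history_dict[key][-1]
    raise KeyError("no validation accuracy found in history")

def rank_models(histories):
    """Sort model names by final validation accuracy, best first."""
    scores = {name: final_val_accuracy(h) for name, h in histories.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Placeholder values only: NOT measured results.
example_histories = {
    "model_a": {"val_accuracy": [0.50, 0.60]},
    "model_b": {"val_acc": [0.55, 0.70]},
}
for name, score in rank_models(example_histories):
    print(name, score)
```

In practice you would build the dictionary from something like history = fashion_model_3.fit(...) followed by history.history, once for each model.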