In this notebook, I want to continue working with the model from experiment 1. The model was able to learn the steering angles for the three hand-picked images, but the question is whether it can learn to actually steer the car in the simulator's autonomous mode. Given the discussion about recovery in the project material, it is unlikely that the provided sample training data is enough to teach the model to drive, but doing a test with that data gives at least a baseline to work from.
Here is the overall plan: rebuild the model from experiment 1, read in all of the sample training data, train the model on it, check how the predicted steering angles compare to the recorded ones, and iterate on the architecture, saving the model for testing in the simulator whenever training looks promising.
Here are some utility functions.
In [1]:
import os
from PIL import Image

def get_record_and_image(index):
    # Return the driving log row at the given index together with its center camera image.
    record = df.iloc[index]
    path = os.path.join('data', record.center)
    return record, Image.open(path)

def layer_info(model):
    # Print the name and the input/output shapes of every layer in the model.
    for n, layer in enumerate(model.layers, 1):
        print('Layer {:2} {:16} input shape {} output shape {}'.format(
            n, layer.name, layer.input_shape, layer.output_shape))
In [2]:
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
model = Sequential()
model.add(Convolution2D(6, 5, 5, border_mode='valid', subsample=(5, 5), input_shape=(80, 160, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(16, 5, 5, border_mode='valid', subsample=(2, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(120))
model.add(Activation('relu'))
model.add(Dense(84))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('tanh'))
layer_info(model)
In [3]:
import numpy as np
import pandas as pd
df = pd.read_csv('data/driving_log.csv')
Now I need to create the actual training data, X_train and y_train. I will just read all the images and store them as NumPy arrays in X_train. Similarly, I read the corresponding steering angles and store them in y_train.
Note: I ended up scaling the images down to half size to conserve memory and speed up training. This was also mentioned in the project cheat sheet (https://carnd-forums.udacity.com/questions/26214464/behavioral-cloning-cheatsheet).
In [4]:
from tqdm import tqdm

X_train = []
y_train = []
for i in tqdm(range(len(df))):
    record, image = get_record_and_image(i)
    # Downscale to half size to conserve memory, then store as a NumPy array.
    resized = image.resize((image.width // 2, image.height // 2))
    X_train.append(np.array(resized))
    image.close()
    y_train.append(record['steering'])
Some preprocessing: normalize the images and convert y_train to a NumPy array, because that is what Keras's fit() expects. This step takes some time and also consumes a lot of memory; downscaling the images above helps.
In [59]:
# Convert the list of images to one big array and scale pixel values to [-0.5, 0.5].
X_train = np.array(X_train)
X_min = np.min(X_train)
X_max = np.max(X_train)
X_normalized = (X_train - X_min) / (X_max - X_min) - 0.5
y_train = np.array(y_train)
Here I use all of the sample training data: 8036 images and their steering angles. Instead of using the training data generator as in experiment 1, I just give the whole training set to model.fit and let it split the data into training and validation sets. After training, I save the model so it can be loaded into the simulator for testing if the training seems to proceed well.
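For comparison, the generator approach would look roughly like the sketch below. This is hypothetical and not used in this notebook; the batch_generator name and the batch size are my own choices, and the commented-out fit_generator call only illustrates how it would be wired up.

def batch_generator(batch_size=64):
    # Yield batches of half-size, normalized images so the whole
    # data set never needs to sit in memory at once.
    while True:
        for start in range(0, len(df), batch_size):
            images, angles = [], []
            for i in range(start, min(start + batch_size, len(df))):
                record, image = get_record_and_image(i)
                image = image.resize((image.width // 2, image.height // 2))
                images.append(np.array(image) / 255.0 - 0.5)  # roughly the same scaling as above
                angles.append(record['steering'])
            yield np.array(images), np.array(angles)

# model.fit_generator(batch_generator(), samples_per_epoch=len(df), nb_epoch=10)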
In [30]:
import keras.optimizers

def train(model, nb_epoch=10, learning_rate=0.001):
    # Compile with MSE loss and the Adam optimizer, train with a 20% validation split,
    # and save the result so it can be loaded into the simulator.
    adam = keras.optimizers.Adam(lr=learning_rate)
    model.compile(loss='mse', optimizer=adam)
    model.fit(X_normalized, y_train, validation_split=0.2, nb_epoch=nb_epoch, verbose=2)
    model.save('model.h5')
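Once model.h5 is saved, the simulator-side script needs to load it and apply the same preprocessing to each incoming camera frame. A minimal, hypothetical sketch of that inference step (the predict_steering helper is my own; the actual drive script may differ):

from keras.models import load_model

inference_model = load_model('model.h5')

def predict_steering(image):
    # Same preprocessing as in training: half-size resize and min-max scaling.
    image = image.resize((image.width // 2, image.height // 2))
    x = np.array(image, dtype=np.float32)
    x = (x - X_min) / (X_max - X_min) - 0.5
    x = np.expand_dims(x, axis=0)
    return float(inference_model.predict(x)[0][0])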
In [8]:
train(model)
The validation error does not get much lower after epoch 4 or so, whereas the training error keeps falling. This indicates overfitting and poor generalization.
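One way to curb this, not used here, would be to stop training when the validation loss stops improving. Keras provides an EarlyStopping callback for that; a hypothetical variant of train() might look like this (the function name and patience value are my own choices):

from keras.callbacks import EarlyStopping

def train_with_early_stopping(model, nb_epoch=10, learning_rate=0.001):
    adam = keras.optimizers.Adam(lr=learning_rate)
    model.compile(loss='mse', optimizer=adam)
    # Stop if the validation loss has not improved for two consecutive epochs.
    early_stop = EarlyStopping(monitor='val_loss', patience=2, verbose=1)
    model.fit(X_normalized, y_train, validation_split=0.2,
              nb_epoch=nb_epoch, verbose=2, callbacks=[early_stop])
    model.save('model.h5')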
Let's do a bit of random sampling of the predicted steering angles to get a feel for how well they match the actual angles.
In [9]:
from random import randrange

def sample_predictions(model):
    # Compare the model's prediction to the recorded steering angle for ten random frames.
    for i in range(10):
        index = randrange(len(df))
        X = np.expand_dims(X_normalized[index], axis=0)
        y = y_train[index]
        print('Actual steering angle {} model prediction {}'.format(y, model.predict(X)[0][0]))

sample_predictions(model)
The sample predictions do not look very good. Some tweaks to the model are in order.
So what could be done to the model to improve it? Basically there are three different approaches for changing the model:
1. Regularize the current model, for example by adding dropout layers.
2. Increase the model's capacity, for example by enlarging layers or removing pooling.
3. Switch to a different, proven architecture, such as the one in the NVIDIA paper.
Before going for options 2 or 3, let's consider option 1, as it is more conservative than the others. A simple way to try to improve generalization is to add dropout layers, which force the model to learn redundant representations. Let's try that.
In [11]:
from keras.layers import Dropout
model_2 = Sequential()
model_2.add(Convolution2D(6, 5, 5, border_mode='valid', subsample=(5, 5), input_shape=(80, 160, 3)))
model_2.add(Dropout(0.5))
model_2.add(Activation('relu'))
model_2.add(MaxPooling2D(pool_size=(2, 2)))
model_2.add(Convolution2D(16, 5, 5, border_mode='valid', subsample=(2, 2)))
model_2.add(Dropout(0.5))
model_2.add(Activation('relu'))
model_2.add(MaxPooling2D(pool_size=(2, 2)))
model_2.add(Flatten())
model_2.add(Dense(120))
model_2.add(Activation('relu'))
model_2.add(Dense(84))
model_2.add(Activation('relu'))
model_2.add(Dense(1))
model_2.add(Activation('tanh'))
layer_info(model_2)
In [12]:
train(model_2)
sample_predictions(model_2)
The performance is even poorer now, so the model is probably not complex enough to learn the given data set. I could increase the layer dimensions directly, but there is another way: remove the pooling layers. Pooling downsamples the feature maps, which shrinks the flattened output feeding the dense layers and thus the number of weights in the model; removing it increases the model's capacity correspondingly. Let's strip the pooling layers and see what happens.
In [13]:
model_3 = Sequential()
model_3.add(Convolution2D(6, 5, 5, border_mode='valid', subsample=(5, 5), input_shape=(80, 160, 3)))
model_3.add(Dropout(0.5))
model_3.add(Activation('relu'))
#model_3.add(MaxPooling2D(pool_size=(2, 2)))
model_3.add(Convolution2D(16, 5, 5, border_mode='valid'))
model_3.add(Dropout(0.5))
model_3.add(Activation('relu'))
#model_3.add(MaxPooling2D(pool_size=(2, 2)))
model_3.add(Flatten())
model_3.add(Dense(120))
model_3.add(Activation('relu'))
model_3.add(Dense(84))
model_3.add(Activation('relu'))
model_3.add(Dense(1))
model_3.add(Activation('tanh'))
layer_info(model_3)
In [14]:
train(model_3, 20)
sample_predictions(model_3)
A bit better, but even after 20 epochs not much of an improvement. I begin to suspect that I need to increase the model's complexity quite a bit. At this point I will try to replicate the architecture from the NVIDIA paper (http://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf) and see what kind of difference it makes.
In [15]:
model_4 = Sequential()
model_4.add(Convolution2D(24, 5, 5, border_mode='valid', subsample=(2, 2), input_shape=(80, 160, 3)))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(36, 5, 5, border_mode='valid', subsample=(2, 2)))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(48, 5, 5, border_mode='valid', subsample=(2, 2)))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(64, 3, 3, border_mode='valid'))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(64, 3, 3, border_mode='valid'))
model_4.add(Activation('relu'))
model_4.add(Flatten())
model_4.add(Dense(100))
model_4.add(Activation('relu'))
model_4.add(Dense(50))
model_4.add(Activation('relu'))
model_4.add(Dense(10))
model_4.add(Activation('relu'))
model_4.add(Dense(1))
model_4.add(Activation('tanh'))
layer_info(model_4)
In [16]:
train(model_4)
sample_predictions(model_4)
In [17]:
model_4 = Sequential()
model_4.add(Convolution2D(24, 5, 5, border_mode='valid', subsample=(2, 2), input_shape=(80, 160, 3)))
model_4.add(Activation('relu'))
model_4.add(Dropout(0.5))
model_4.add(Convolution2D(36, 5, 5, border_mode='valid', subsample=(2, 2)))
model_4.add(Activation('relu'))
model_4.add(Dropout(0.5))
model_4.add(Convolution2D(48, 5, 5, border_mode='valid', subsample=(2, 2)))
model_4.add(Activation('relu'))
model_4.add(Dropout(0.5))
model_4.add(Convolution2D(64, 3, 3, border_mode='valid'))
model_4.add(Activation('relu'))
model_4.add(Dropout(0.5))
model_4.add(Convolution2D(64, 3, 3, border_mode='valid'))
model_4.add(Activation('relu'))
model_4.add(Dropout(0.5))
model_4.add(Flatten())
model_4.add(Dense(100))
model_4.add(Activation('relu'))
model_4.add(Dense(50))
model_4.add(Activation('relu'))
model_4.add(Dense(10))
model_4.add(Activation('relu'))
model_4.add(Dense(1))
model_4.add(Activation('tanh'))
layer_info(model_4)
In [18]:
train(model_4)
sample_predictions(model_4)
In [39]:
model_4 = Sequential()
model_4.add(Convolution2D(24, 5, 5, border_mode='valid', subsample=(2, 2), input_shape=(80, 160, 3)))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(36, 5, 5, border_mode='valid', subsample=(2, 2)))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(48, 5, 5, border_mode='valid', subsample=(2, 2)))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(64, 3, 3, border_mode='valid'))
model_4.add(Activation('relu'))
model_4.add(Convolution2D(64, 3, 3, border_mode='valid'))
model_4.add(Activation('relu'))
model_4.add(Flatten())
model_4.add(Dense(100))
model_4.add(Dropout(0.5))
model_4.add(Activation('relu'))
model_4.add(Dense(50))
model_4.add(Activation('relu'))
model_4.add(Dense(10))
model_4.add(Activation('relu'))
model_4.add(Dense(1))
layer_info(model_4)
In [40]:
train(model_4, 50, learning_rate=0.001)
sample_predictions(model_4)
In [46]:
sample_predictions(model_4)