In this notebook, I want to experiment with the problem using the provided sample driving data. The aim is to create a working solution that can predict the correct steering angle using just three training examples. With such a small training set, a properly working model should overfit quickly, which would indicate that the model is OK to use in further experiments. The idea for this approach came from Paul Heraty's cheatsheet (https://carnd-forums.udacity.com/questions/26214464/behavioral-cloning-cheatsheet).
Specifically, I want to pick three example images (a hard left turn, a hard right turn and driving straight), train the model on just those, and check that it learns to predict their steering angles perfectly.
In [9]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
df = pd.read_csv('data/driving_log.csv')
print(df.describe())
df['steering'].hist(bins=100)
plt.title('Histogram of steering angle (100 bins)')
Out[9]:
Seems that we are mostly steering straight here.
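To quantify that (a quick sketch; the 0.01 threshold is an arbitrary choice of mine), let's count the fraction of near-zero steering angles:
In [ ]:
# Fraction of records whose steering angle is essentially zero
# (the 0.01 threshold is an arbitrary choice)
print('{:.1%} of the records steer (almost) straight'.format(
    (df['steering'].abs() < 0.01).mean()))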
Now I need to pick three images from the sample driving data that correspond to steering left, right and straight. If the model can predict the three different steering angles correctly, it must have learned to tell the images apart, so this set should be enough.
Let's first get the indices of records where steering angle is hard left (< -0.5).
In [2]:
df[df['steering'] < -0.5].index
Out[2]:
By trial and error, I ended up picking index 4341 where the image matches the left turn nicely.
In [3]:
import os
from PIL import Image
def get_record_and_image(index):
    record = df.iloc[index]
    path = os.path.join('data', record.center)
    return record, Image.open(path)
left_record, left_image = get_record_and_image(4341)
print('Steering angle {}'.format(left_record.steering))
plt.imshow(left_image)
Out[3]:
Let's do the same with the hard right turn (steering angle > 0.5).
In [4]:
df[df['steering'] > 0.5].index
Out[4]:
Again, after peeking at a few of the images, index 3357 looks fine.
In [5]:
right_record, right_image = get_record_and_image(3357)
print('Steering angle {}'.format(right_record.steering))
plt.imshow(right_image)
Out[5]:
Now I need to pick a record for driving straight. There should be plenty of choices to pick from, so some random exploration of the choices is probably the best way to find one that looks OK.
In [6]:
# I used this code to pick random images until I found one I liked
#index = df[(df['steering'] > -0.1) & (df['steering'] < 0.1)].sample(n=1).iloc[0].name
#print('Index', index)
straight_record, straight_image = get_record_and_image(796)
plt.imshow(straight_image)
Out[6]:
Having selected the training examples, I next need to create a DNN model to train. I first thought about starting with the Nvidia pipeline (http://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf), but I started to wonder how a somewhat simpler network like LeNet would perform. The input is likely less complex here than in the Nvidia self-driving car case, with the constant lighting, road color, etc.
So, let's set up a modified version of LeNet that takes the 320x160 images as input and outputs a single number between -1 and 1. Because the image resolution is so much higher than in the original LeNet, it probably makes sense to use striding in the convolution layers to reduce the dimensionality. Let's try a (5, 5) stride for the first convolution layer, followed by max pooling. To get the [-1, 1] range for the output, the tanh activation function seems to be a good choice.
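With 'valid' padding, each convolution layer produces floor((input - kernel) / stride) + 1 outputs per spatial dimension. So the 160x320 input should shrink to 32x64 after the first convolution, to 16x32 after the first pooling, to 6x14 after the second convolution and to 3x7 after the second pooling, leaving 16 * 3 * 7 = 336 features for the first fully connected layer.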
In [145]:
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
model = Sequential()
model.add(Convolution2D(6, 5, 5, border_mode='valid', subsample=(5, 5), input_shape=(160, 320, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(16, 5, 5, border_mode='valid', subsample=(2, 2)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(120))
model.add(Activation('relu'))
model.add(Dense(84))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('tanh'))
Let's check the dimensions of the network layers.
In [146]:
for n, layer in enumerate(model.layers, 1):
    print('Layer {:2} {:16} input shape {} output shape {}'.format(n, layer.name, layer.input_shape, layer.output_shape))
Now I need to massage the images and corresponding steering angles into a form that is usable in model training.
In [149]:
# Stack the three examples into one array and min-max normalize to [-0.5, 0.5]
X_train = np.array([np.array(image) for image in [left_image, right_image, straight_image]], dtype=np.float32)
X_min = X_train.min()
X_max = X_train.max()
X_normalized = (X_train - X_min) / (X_max - X_min) - 0.5
y_train = np.array([record['steering'] for record in [left_record, right_record, straight_record]])
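A quick shape check (just a sanity-check sketch) should show a batch of three 160x320 RGB images and three labels:
In [ ]:
# Expect (3, 160, 320, 3) for the images and (3,) for the labels
print(X_normalized.shape, y_train.shape)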
In [150]:
from random import randrange
def generator():
    while 1:
        i = randrange(3)
        # Create a one item batch by taking a slice
        yield X_normalized[i:i+1], y_train[i:i+1]
model.compile('adam', 'mse')
history = model.fit_generator(generator(), samples_per_epoch=1000, validation_data=(X_normalized, y_train), nb_epoch=10, verbose=2)
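The returned history object records the losses per epoch, so they can also be plotted; a sketch (the 'loss' and 'val_loss' keys are the Keras defaults when validation data is provided):
In [ ]:
# Plot training and validation loss per epoch
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('MSE loss')
plt.legend()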
Training hits zero validation loss after epoch 5, i.e., it should have learned the data perfectly. Let's see how well the model predicts.
In [151]:
for X, y in zip(X_normalized, y_train):
    # model.predict expects a batch, so add a leading batch dimension
    print('Actual steering angle {} model prediction {}'.format(y, model.predict(X[np.newaxis])[0][0]))
The predictions are perfect as they should be. The model seems to be able to learn at least some relevant features from the images and is therefore suitable for further testing.
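To reuse the model later, e.g. with the simulator, it can be persisted as architecture plus weights; a sketch (the file names model.json and model.h5 are my assumptions):
In [ ]:
# Save the architecture as JSON and the weights as HDF5
with open('model.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('model.h5')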