In [1]:
from IPython.display import display, Math, Latex

In [2]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import collections
import cv2  # cv2 needs to be imported before TensorFlow!
import tensorflow as tf
from tensorflow.contrib.layers import flatten
import random
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle
from six.moves import map

%matplotlib inline

Step 0: Load The Data


In [3]:
# Load pickled data
import pickle

training_file = 'traffic-signs-data/train.p'
testing_file = 'traffic-signs-data/test.p'
class_names = pd.read_csv('traffic-signs-data/signnames.csv')

with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
    
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']

Data Summary and Exploration

Below I show a sample for each class in the dataset. There are 43 unique classes. Some classes (particularly the speed limit signs) are overrepresented in the dataset. We might need to revisit this class imbalance later to improve the accuracy, though it should be somewhat benign under the assumption that the training data and the test data come from the same distribution - i.e., the class imbalance persists in the test data.

In all, the training data and test data consist of 39209 and 12630 images, respectively. The images are of size 32x32.


In [4]:
n_train = y_train.shape[0]
n_test = y_test.shape[0]
image_shape = X_train.shape[1:]
NB_CLASSES = len(set(y_train))

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", NB_CLASSES)


Number of training examples = 39209
Number of testing examples = 12630
Image data shape = (32, 32, 3)
Number of classes = 43

In [5]:
#################
#  Parameters   #
#################
# A better approach here would be to pass the params as a dictionary into the model to iterate faster on the hyperparameters

base_image_path = "traffic-signs-data/"
IMAGE_SIZE = 32
# number of epochs
NB_EPOCHS = 100
NB_TRAIN_SAMPLES = 50000
# batch size
BATCH_SIZE = 128
checkpoint_name = "model.ckpt"
# filter size
FILTER_SIZE = 5
#number of conv filters
NB_FILTERS = 32
#input channels
INPUT_CHANNELS = 3
# number of hidden units in the fully connected layer
NB_HIDDEN = 128
# Learning Rate
LR = 0.001
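
As the comment at the top of this cell suggests, collecting the hyperparameters in a dictionary would make it easier to sweep configurations. A minimal sketch of that alternative (the names are illustrative; the code below keeps the module-level constants):

params = {
    'image_size': 32,
    'nb_epochs': 100,
    'batch_size': 128,
    'filter_size': 5,
    'nb_filters': 32,
    'nb_hidden': 128,
    'lr': 0.001,
}
# a model builder could then accept this dict, e.g. convNet(x, one_hot_y, **params),
# after its signature is adjusted accordingly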

In [6]:
y = pd.DataFrame(y_train,columns=['label'])
y['value'] = y.index
ix = np.array(y.groupby('label').first())
ix = np.squeeze(ix)
X_sample = X_train[ix,:,:,:]
im_grid = X_sample.shape[0] // 2
plt.figure(figsize=(300,200))
for i in range(NB_CLASSES):
    plt.subplot(im_grid,4, i+1)
    plt.axis('off')
    plt.imshow(X_sample[i], aspect='auto')



In [7]:
_y_dist = collections.Counter(y_train)
y_dist = pd.DataFrame.from_dict(_y_dist, orient='index').reset_index()
y_dist.columns = ['ClassId','Count']
y_dist = y_dist.merge(class_names, on='ClassId').drop('ClassId',axis=1)
y_dist.index = y_dist.SignName
plt.figure(figsize=(200,100))
y_dist.plot.bar()


Out[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f89ec574518>
<matplotlib.figure.Figure at 0x7f89ec574630>

In [8]:
y_classes =  y_dist.index.str.lower().tolist()
y_classes


Out[8]:
['speed limit (20km/h)',
 'speed limit (30km/h)',
 'speed limit (50km/h)',
 'speed limit (60km/h)',
 'speed limit (70km/h)',
 'speed limit (80km/h)',
 'end of speed limit (80km/h)',
 'speed limit (100km/h)',
 'speed limit (120km/h)',
 'no passing',
 'no passing for vehicles over 3.5 metric tons',
 'right-of-way at the next intersection',
 'priority road',
 'yield',
 'stop',
 'no vehicles',
 'vehicles over 3.5 metric tons prohibited',
 'no entry',
 'general caution',
 'dangerous curve to the left',
 'dangerous curve to the right',
 'double curve',
 'bumpy road',
 'slippery road',
 'road narrows on the right',
 'road work',
 'traffic signals',
 'pedestrians',
 'children crossing',
 'bicycles crossing',
 'beware of ice/snow',
 'wild animals crossing',
 'end of all speed and passing limits',
 'turn right ahead',
 'turn left ahead',
 'ahead only',
 'go straight or right',
 'go straight or left',
 'keep right',
 'keep left',
 'roundabout mandatory',
 'end of no passing',
 'end of no passing by vehicles over 3.5 metric tons']

Preprocessing

The pixels are scaled down to the [0, 1] range. Centering and scaling the input data helps to alleviate dead or saturated neuron issues. I did not reduce the images to grayscale because it turns out that keeping the original three channels helps with the accuracy. This makes sense given that certain traffic signs have distinct colors associated with them.
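
For reference, the grayscale variant I tried and rejected would amount to a conversion like this (a sketch; a single-channel input would also require setting INPUT_CHANNELS = 1):

gray = cv2.cvtColor(X_train[0], cv2.COLOR_RGB2GRAY)  # (32, 32, 3) uint8 -> (32, 32) uint8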

The idea behind real-time data augmentation is to make the model more resistant to overfitting. By applying small distortions to the dataset, we implicitly force the model to memorize less and generalize better.

What type of augmentations can we use for this problem? Upon inspecting the image classes and fitting the model on images generated through combinations of transformations, it turns out that flipping the images upside-down or left-to-right is not such a great idea: when you flip a 'go straight or right' sign, it actually becomes a 'go straight or left' sign. Smoothing the images did not help either, possibly because the images are very small to begin with.

I have found that randomly rotating the image, confined to a certain degree range, turns out to be a helpful augmentation technique. Distorting the brightness of the image seemed to help slightly with validation accuracy as well.

The code below, ImageGenerator, acts as a generator by scaling and applying random distortions to the input image. This in turn allows us to run batches that are not confined to the actual number of training examples in the dataset.

The data is split into training and validation, using an 80-20 split.


In [9]:
# Image Generation. Inspired by Keras image generation logic.
class ImageGenerator():
    
    def __init__(self, batch_size, x, y):
        self.x = x
        self.y = y
        self.batch_index = 0
        self.batch_size = batch_size
        self.shift_degree = 22.5
        self.N = len(x)
        self.shape = (32, 32)
        # pass the boolean explicitly; the original `shuffle` here accidentally
        # referred to sklearn's shuffle function (truthy, so it behaved like True)
        self.index_generator = self._flow_index(self.N, batch_size, shuffle=True)
    
    # generator that maintains the batching state across epochs
    def _flow_index(self, N, batch_size, shuffle=True):
        while 1:
            if self.batch_index == 0:
                index_array = np.arange(N)
                if shuffle:
                    # reshuffle once per epoch, not on every batch
                    index_array = np.random.permutation(N)
                
            current_index = (self.batch_index * batch_size) % N
            if N >= current_index + batch_size:
                current_batch_size = batch_size
                self.batch_index += 1
            else:
                current_batch_size = N - current_index
                self.batch_index = 0
            yield (index_array[current_index: current_index + current_batch_size],
               current_index, current_batch_size) 
            
    def __iter__(self):
        return self
        
    def __next__(self):
        ix_array, current_index, current_batch_size = next(self.index_generator)
        batch_x = np.zeros(tuple([current_batch_size] + list(self.x.shape)[1:]))
        for i, j in enumerate(ix_array):
            x = self.x[j]
            x = self.transform(x)
            x = self.mean_substract(x)
            batch_x[i] = x
        batch_y = self.y[ix_array]
        return batch_x, batch_y
            
    def transform(self,x):
        #x = self.random_flip(x)
        #x = self.random_gaussian_blur(x)
        x = self.random_brightness(x)
        x = self.random_rotate(x)
        return x
                  
    def random_flip(self, x):
        if np.random.random() < 0.5:
            x = np.flipud(x)
        if np.random.random() < 0.5:
            x = np.fliplr(x)
        return x
    
    def random_rotate(self, x):
        # Rotation by an angle theta is applied via the affine transformation matrix M.
        # The angle is drawn uniformly from [-shift_degree, +shift_degree].
        random_degree = np.random.uniform(low=-1, high=1) * self.shift_degree
        M = cv2.getRotationMatrix2D((self.shape[0]/2,self.shape[1]/2),random_degree,1)
        return cv2.warpAffine(x,M,self.shape)
        
    def random_gaussian_blur(self, img, kernel_size=3):
        if np.random.random() < 0.5:
            return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)
        else:
            return img
    
    def random_brightness(self, img):
        image1 = cv2.cvtColor(img, cv2.COLOR_RGB2HSV)
        random_bright = np.random.uniform(low=0.7, high=1.5)
        # clip to avoid uint8 wrap-around when brightening
        image1[:,:,2] = np.clip(image1[:,:,2] * random_bright, 0, 255)
        image1 = cv2.cvtColor(image1, cv2.COLOR_HSV2RGB)
        return image1
    
    def mean_substract(self, img):
        # scale the features to the [0, 1] range
        # (an alternative would be zero-centering: img - np.mean(img))
        return (img / 255.).astype(np.float32)
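
As a quick sanity check (an added aside, not part of the original run), the generator can be smoke-tested like this:

gen_check = ImageGenerator(BATCH_SIZE, X_train, y_train)
batch_x, batch_y = next(gen_check)
assert batch_x.shape == (BATCH_SIZE, 32, 32, 3)
assert 0.0 <= batch_x.min() and batch_x.max() <= 1.0  # pixels scaled to [0, 1]
assert len(batch_y) == BATCH_SIZE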

For completeness, I give examples of each transformation the generator is capable of.


In [10]:
# Image flipping
_ix = 990
x1 = np.fliplr(X_train[_ix])
x2 = np.flipud(X_train[_ix])
plt.subplot(1, 3, 1)
plt.imshow(X_train[_ix], aspect='auto')
plt.subplot(1, 3, 2)
plt.imshow(x1, aspect='auto')
plt.subplot(1, 3, 3)
plt.imshow(x2, aspect='auto')


Out[10]:
<matplotlib.image.AxesImage at 0x7f89a1f6c518>

In [11]:
# Denoising: apply a Gaussian filter to smooth out the image
x1 = cv2.GaussianBlur(X_train[_ix], (3, 3), 0)
plt.subplot(1, 2, 1)
plt.imshow(X_train[_ix], aspect='auto')
plt.subplot(1, 2, 2)
plt.imshow(x1, aspect='auto')


Out[11]:
<matplotlib.image.AxesImage at 0x7f89a3bd1278>

In [12]:
# Random Brightness adjustment
_ix = 399
image1 = cv2.cvtColor(X_train[_ix],cv2.COLOR_RGB2HSV)
random_bright = np.random.uniform(low=0, high=2)
print(random_bright)
image1[:,:,2] = image1[:,:,2] * random_bright
image1 = cv2.cvtColor(image1,cv2.COLOR_HSV2RGB)
               
plt.subplot(1, 2, 1)
plt.imshow(X_train[_ix], aspect='auto')
plt.subplot(1, 2, 2)
plt.imshow(image1, aspect='auto')


1.401862695777989
Out[12]:
<matplotlib.image.AxesImage at 0x7f89a3b26f60>

In [13]:
#Randomly Rotate Slightly
random_degree = np.random.uniform(low=-1, high=1) * 22.5
M = cv2.getRotationMatrix2D((32/2,32/2),random_degree,1)
image1 = cv2.warpAffine(X_train[_ix],M,(32,32))

plt.subplot(1, 2, 1)
plt.imshow(X_train[_ix], aspect='auto')
plt.subplot(1, 2, 2)
plt.imshow(image1, aspect='auto')


Out[13]:
<matplotlib.image.AxesImage at 0x7f89a3ac8a90>

I split the data into training and validation, using an 80-20 split.


In [14]:
train_test_split_ratio = 0.2
# Shuffle and split into validation / training
ix = list(range(n_train))
random.shuffle(ix)
X_train = X_train[ix,:]
y_train = y_train[ix]
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size= train_test_split_ratio , random_state=0)
print("the size of the training and validation data: {}, {}".format(X_train.shape[0], X_val.shape[0]))
NB_TRAIN_DATA = len(y_train)


the size of the training and validation data: 31367, 7842

I make sure that I normalize the image data for the validation and test sets.


In [15]:
def mean_substract(img):
    # scale the features to the [0, 1] range (alternative: img - np.mean(img))
    return (img / 255.).astype(np.float32)
    
X_val_normalized = mean_substract(X_val)
X_test_normalized = mean_substract(X_test)
X_val_normalized.shape


Out[15]:
(7842, 32, 32, 3)

Model Architecture

I built a CNN with two 5x5 convolutional layers followed by two fully connected layers. AdamOptimizer combines the logic of the AdaGrad optimizer with momentum. In short, AdaGrad keeps track of the historical sum of squares of each gradient, which is used to normalize the immediate gradient update. Momentum, in turn, allows a velocity to build up along gradient surfaces - very much like in physics.
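
To make that intuition concrete, here is a schematic NumPy sketch of a single (bias-corrected) Adam step - an illustration, not code used by this notebook:

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * g        # momentum: running mean of gradients
    v = beta2 * v + (1 - beta2) * g ** 2   # AdaGrad-like: running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)           # bias corrections for the first few steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # update normalized by gradient history
    return w, m, v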

The batch size is set to 128. A larger batch size is more stable but takes longer to run. I iterated over the synthetic dataset for 100 epochs, which seems adequate to reach a decent error/accuracy rate. I could have trained the model for more epochs in combination with an early-stopping mechanism. For each epoch, the data generator produces 50,000 randomly distorted images. The initial learning rate is set to 0.001. The hyperparameters are mostly chosen by trial and error. One needs to be careful about peeking too much at the validation data, however: the dirtier the validation set gets, the less meaningful it is as a proxy for out-of-sample accuracy.
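
A minimal sketch of the early-stopping mechanism alluded to above (illustrative; it was not used in the run below). Given the list of per-epoch validation accuracies, it returns the epoch at which training would halt:

def early_stop_epoch(val_accuracies, patience=10):
    # stop once validation accuracy has failed to improve
    # for `patience` consecutive epochs
    best, wait = -1.0, 0
    for i, acc in enumerate(val_accuracies):
        if acc > best:
            best, wait = acc, 0
        else:
            wait += 1
            if wait >= patience:
                return i
    return len(val_accuracies) - 1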

Weights used in the convolutional layers were initialized using a truncated normal distribution with a standard deviation of 0.1. Bias weights were either initialized to zeros or ones. Weights for the fully connected layers were initialized also using a truncated normal distribution with a standard deviation of 0.1.

Image classes were transformed into one-hot encodings. A reduced-mean cross-entropy loss function was fed the logits from the last fully connected layer. I experimented with a couple of regularization techniques: an L2 penalty, batch normalization, and dropout. The latter two did not offer any improvements over the L2 norm on the fully connected layer, so I stuck with the L2 norm.

Layer           | Description
----------------|------------------------------------------------------
Input           | 32x32x3 RGB image - normalized and randomly perturbed
Convolution     | 5x5, 32 filters, same padding, 1x1 stride
RELU            |
Max Pooling     | 2x2 stride, 2x2 kernel
Convolution     | 5x5, 32 filters, same padding, 1x1 stride
RELU            |
Max Pooling     | 2x2 stride, 2x2 kernel
Fully Connected | 128 hidden units, RELU activation
Fully Connected | Softmax layer, 43 classes
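
One sanity check on the table (an added note): two rounds of 2x2 max pooling shrink the 32x32 input to 8x8, so the first fully connected layer sees 8 * 8 * 32 = 2048 inputs. This is the IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * NB_FILTERS term in the code below:

assert (IMAGE_SIZE // 4) * (IMAGE_SIZE // 4) * NB_FILTERS == 2048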

In [16]:
#Reset the graph
tf.reset_default_graph()
graph = tf.get_default_graph()
assert [] == graph.get_operations()

In [17]:
#  Models.
def convNet(data, one_hot_y, loss_beta = 0.01):
    # Variables.
    # For saving and optimizer. 
    global_step = tf.Variable(0, name="global_step")
    
    # conv weights
    layer1_weights = tf.Variable(tf.truncated_normal(
      [FILTER_SIZE, FILTER_SIZE, INPUT_CHANNELS, NB_FILTERS], stddev=0.1), name="layer1_weights")
    
    layer1_biases = tf.Variable(tf.zeros([NB_FILTERS]), name="layer1_biases")
    layer2_weights = tf.Variable(tf.truncated_normal(
      [FILTER_SIZE, FILTER_SIZE, NB_FILTERS, NB_FILTERS], stddev=0.1), name="layer2_weights")
    
    layer2_biases = tf.Variable(tf.constant(1.0, shape=[NB_FILTERS]), name="layer2_biases")

    # fully connected layers
    layer3_weights = tf.Variable(tf.truncated_normal(
      [IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * NB_FILTERS, NB_HIDDEN], stddev=0.1), name="layer3_weights")
    l2_layer3 = loss_beta * tf.nn.l2_loss(layer3_weights)
    layer3_biases = tf.Variable(tf.constant(1.0, shape=[NB_HIDDEN]), name="layer3_biases")
    layer4_weights = tf.Variable(tf.truncated_normal(
      [NB_HIDDEN, NB_CLASSES], stddev=0.1), name="layer4_weights")
    #l2_layer4 = loss_beta * tf.nn.l2_loss(layer4_weights)
    layer4_biases = tf.Variable(tf.constant(1.0, shape=[NB_CLASSES]), name="layer4_biases")
    
    # 1st Convolution
    conv = tf.nn.conv2d(data, layer1_weights, [1, 1, 1, 1], padding='SAME')
    # Relu activation of conv
    hidden = tf.nn.relu(conv + layer1_biases)
    # Max pooling of activation
    hidden_pooled = tf.nn.max_pool(hidden, ksize = [1,2,2,1], strides = [1,2,2,1], padding='SAME')
    #print(hidden_pooled.get_shape())
    # 2nd Convolution
    conv = tf.nn.conv2d(hidden_pooled, layer2_weights, [1, 1, 1, 1], padding='SAME')
    # Relu activation of conv
    hidden = tf.nn.relu(conv + layer2_biases)
    # Max pooling
    hidden_pooled = tf.nn.max_pool(hidden, ksize = [1,2,2,1], strides = [1,2,2,1], padding='SAME')
    #print(hidden_pooled.get_shape())
    # Flatten
    shape = hidden_pooled.get_shape().as_list()
    reshape = tf.reshape(hidden_pooled, [-1, shape[1] * shape[2] * shape[3]])
    # 1st fully connected layer with relu activation
    #print(reshape.get_shape())
    hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
    # 2nd fully connected layer
    logits = tf.matmul(hidden, layer4_weights) + layer4_biases
    
    # loss
    # named arguments are required by the TF 1.x API
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=one_hot_y, logits=logits)
    loss_operation = tf.reduce_mean(cross_entropy)
    loss_operation += l2_layer3
    #loss_operation += l2_layer4
    
    # accuracy
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
    accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    
    return logits, loss_operation, correct_prediction, accuracy_operation

print('done')


done

In [18]:
# Placeholders.
tf.reset_default_graph()
global_step = tf.Variable(0, name="global_step")
x = tf.placeholder(
    tf.float32, shape=(None, IMAGE_SIZE, IMAGE_SIZE, INPUT_CHANNELS),name='x')
y = tf.placeholder(tf.int32, shape=(None),name='y')
one_hot_y = tf.one_hot(y, NB_CLASSES)

In [19]:
# Set the graph computations.
logits, loss_operation, correct_prediction, accuracy_operation  = convNet(x, one_hot_y)
# cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, one_hot_y)
# loss_operation = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate = LR)
training_operation = optimizer.minimize(loss_operation)

Setting up the final ops and evaluation function for validation accuracy.


In [20]:
saver = tf.train.Saver()

def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0, num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples

print('done')


done

In [21]:
gr = tf.get_default_graph()
print("Number of ops in TF Graph is {}".format(len(gr.get_operations())))


Number of ops in TF Graph is 457

This is where we finally train our model.


In [22]:
#Initiate the generator
gen = ImageGenerator(BATCH_SIZE, X_train, y_train)

In [23]:
model_path = base_image_path + checkpoint_name

In [24]:
tr_error_rate = []
vl_accuracy = []
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())    
    print("Training...")
    print()
    for i in range(NB_EPOCHS):
        print("Epoch {} ...".format(i+1))
        for offset in range(0, NB_TRAIN_SAMPLES, BATCH_SIZE):
            batch_x, batch_y = next(gen)
            _, l = sess.run([training_operation, loss_operation] , feed_dict={x: batch_x, y: batch_y})
        tr_error_rate.append(l)
        validation_accuracy = evaluate(X_val_normalized, y_val)
        vl_accuracy.append(validation_accuracy)
        print("Training Loss = {:.3f}".format(l))
        print("Validation Accuracy = {:.3f}".format(validation_accuracy))
        print()
        if (i % 10 == 0) and (i > 0):
            print("Saving the temp chkpt..")
            # Save the variables to disk.
            save_path = saver.save(sess, model_path)
            print("Model saved in file: %s" % model_path)
            print()
    # Save the variables to disk.
    save_path = saver.save(sess, model_path)
    print("Final model saved in file: %s" % model_path)
print('done')


Training...

Epoch 1 ...
Training Loss = 3.416
Validation Accuracy = 0.607

Epoch 2 ...
Training Loss = 1.563
Validation Accuracy = 0.857

Epoch 3 ...
Training Loss = 0.834
Validation Accuracy = 0.930

Epoch 4 ...
Training Loss = 0.518
Validation Accuracy = 0.945

Epoch 5 ...
Training Loss = 0.527
Validation Accuracy = 0.960

Epoch 6 ...
Training Loss = 0.484
Validation Accuracy = 0.966

Epoch 7 ...
Training Loss = 0.381
Validation Accuracy = 0.972

Epoch 8 ...
Training Loss = 0.312
Validation Accuracy = 0.973

Epoch 9 ...
Training Loss = 0.472
Validation Accuracy = 0.969

Epoch 10 ...
Training Loss = 0.319
Validation Accuracy = 0.977

Epoch 11 ...
Training Loss = 0.340
Validation Accuracy = 0.981

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 12 ...
Training Loss = 0.506
Validation Accuracy = 0.982

Epoch 13 ...
Training Loss = 0.261
Validation Accuracy = 0.985

Epoch 14 ...
Training Loss = 0.272
Validation Accuracy = 0.982

Epoch 15 ...
Training Loss = 0.258
Validation Accuracy = 0.984

Epoch 16 ...
Training Loss = 0.277
Validation Accuracy = 0.988

Epoch 17 ...
Training Loss = 0.272
Validation Accuracy = 0.984

Epoch 18 ...
Training Loss = 0.265
Validation Accuracy = 0.988

Epoch 19 ...
Training Loss = 0.366
Validation Accuracy = 0.980

Epoch 20 ...
Training Loss = 0.225
Validation Accuracy = 0.988

Epoch 21 ...
Training Loss = 0.295
Validation Accuracy = 0.988

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 22 ...
Training Loss = 0.214
Validation Accuracy = 0.985

Epoch 23 ...
Training Loss = 0.175
Validation Accuracy = 0.984

Epoch 24 ...
Training Loss = 0.215
Validation Accuracy = 0.988

Epoch 25 ...
Training Loss = 0.173
Validation Accuracy = 0.986

Epoch 26 ...
Training Loss = 0.228
Validation Accuracy = 0.989

Epoch 27 ...
Training Loss = 0.224
Validation Accuracy = 0.989

Epoch 28 ...
Training Loss = 0.254
Validation Accuracy = 0.989

Epoch 29 ...
Training Loss = 0.275
Validation Accuracy = 0.981

Epoch 30 ...
Training Loss = 0.217
Validation Accuracy = 0.988

Epoch 31 ...
Training Loss = 0.277
Validation Accuracy = 0.989

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 32 ...
Training Loss = 0.170
Validation Accuracy = 0.990

Epoch 33 ...
Training Loss = 0.207
Validation Accuracy = 0.979

Epoch 34 ...
Training Loss = 0.201
Validation Accuracy = 0.988

Epoch 35 ...
Training Loss = 0.306
Validation Accuracy = 0.987

Epoch 36 ...
Training Loss = 0.251
Validation Accuracy = 0.991

Epoch 37 ...
Training Loss = 0.168
Validation Accuracy = 0.990

Epoch 38 ...
Training Loss = 0.194
Validation Accuracy = 0.986

Epoch 39 ...
Training Loss = 0.199
Validation Accuracy = 0.989

Epoch 40 ...
Training Loss = 0.199
Validation Accuracy = 0.988

Epoch 41 ...
Training Loss = 0.262
Validation Accuracy = 0.991

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 42 ...
Training Loss = 0.171
Validation Accuracy = 0.993

Epoch 43 ...
Training Loss = 0.201
Validation Accuracy = 0.993

Epoch 44 ...
Training Loss = 0.157
Validation Accuracy = 0.990

Epoch 45 ...
Training Loss = 0.198
Validation Accuracy = 0.992

Epoch 46 ...
Training Loss = 0.158
Validation Accuracy = 0.986

Epoch 47 ...
Training Loss = 0.207
Validation Accuracy = 0.994

Epoch 48 ...
Training Loss = 0.183
Validation Accuracy = 0.990

Epoch 49 ...
Training Loss = 0.141
Validation Accuracy = 0.991

Epoch 50 ...
Training Loss = 0.161
Validation Accuracy = 0.991

Epoch 51 ...
Training Loss = 0.135
Validation Accuracy = 0.989

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 52 ...
Training Loss = 0.169
Validation Accuracy = 0.988

Epoch 53 ...
Training Loss = 0.192
Validation Accuracy = 0.990

Epoch 54 ...
Training Loss = 0.267
Validation Accuracy = 0.992

Epoch 55 ...
Training Loss = 0.167
Validation Accuracy = 0.993

Epoch 56 ...
Training Loss = 0.123
Validation Accuracy = 0.990

Epoch 57 ...
Training Loss = 0.210
Validation Accuracy = 0.992

Epoch 58 ...
Training Loss = 0.237
Validation Accuracy = 0.992

Epoch 59 ...
Training Loss = 0.145
Validation Accuracy = 0.992

Epoch 60 ...
Training Loss = 0.178
Validation Accuracy = 0.984

Epoch 61 ...
Training Loss = 0.146
Validation Accuracy = 0.985

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 62 ...
Training Loss = 0.203
Validation Accuracy = 0.991

Epoch 63 ...
Training Loss = 0.157
Validation Accuracy = 0.992

Epoch 64 ...
Training Loss = 0.225
Validation Accuracy = 0.993

Epoch 65 ...
Training Loss = 0.144
Validation Accuracy = 0.993

Epoch 66 ...
Training Loss = 0.130
Validation Accuracy = 0.992

Epoch 67 ...
Training Loss = 0.178
Validation Accuracy = 0.991

Epoch 68 ...
Training Loss = 0.279
Validation Accuracy = 0.987

Epoch 69 ...
Training Loss = 0.148
Validation Accuracy = 0.993

Epoch 70 ...
Training Loss = 0.195
Validation Accuracy = 0.992

Epoch 71 ...
Training Loss = 0.156
Validation Accuracy = 0.991

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 72 ...
Training Loss = 0.207
Validation Accuracy = 0.993

Epoch 73 ...
Training Loss = 0.216
Validation Accuracy = 0.991

Epoch 74 ...
Training Loss = 0.174
Validation Accuracy = 0.990

Epoch 75 ...
Training Loss = 0.152
Validation Accuracy = 0.991

Epoch 76 ...
Training Loss = 0.121
Validation Accuracy = 0.993

Epoch 77 ...
Training Loss = 0.172
Validation Accuracy = 0.994

Epoch 78 ...
Training Loss = 0.135
Validation Accuracy = 0.991

Epoch 79 ...
Training Loss = 0.154
Validation Accuracy = 0.993

Epoch 80 ...
Training Loss = 0.187
Validation Accuracy = 0.992

Epoch 81 ...
Training Loss = 0.143
Validation Accuracy = 0.993

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 82 ...
Training Loss = 0.155
Validation Accuracy = 0.992

Epoch 83 ...
Training Loss = 0.181
Validation Accuracy = 0.994

Epoch 84 ...
Training Loss = 0.159
Validation Accuracy = 0.993

Epoch 85 ...
Training Loss = 0.147
Validation Accuracy = 0.992

Epoch 86 ...
Training Loss = 0.162
Validation Accuracy = 0.993

Epoch 87 ...
Training Loss = 0.139
Validation Accuracy = 0.992

Epoch 88 ...
Training Loss = 0.110
Validation Accuracy = 0.991

Epoch 89 ...
Training Loss = 0.159
Validation Accuracy = 0.992

Epoch 90 ...
Training Loss = 0.105
Validation Accuracy = 0.994

Epoch 91 ...
Training Loss = 0.171
Validation Accuracy = 0.993

Saving the temp chkpt..
Model saved in file: traffic-signs-data/model.ckpt

Epoch 92 ...
Training Loss = 0.175
Validation Accuracy = 0.993

Epoch 93 ...
Training Loss = 0.171
Validation Accuracy = 0.991

Epoch 94 ...
Training Loss = 0.128
Validation Accuracy = 0.993

Epoch 95 ...
Training Loss = 0.125
Validation Accuracy = 0.993

Epoch 96 ...
Training Loss = 0.147
Validation Accuracy = 0.994

Epoch 97 ...
Training Loss = 0.118
Validation Accuracy = 0.992

Epoch 98 ...
Training Loss = 0.135
Validation Accuracy = 0.990

Epoch 99 ...
Training Loss = 0.118
Validation Accuracy = 0.992

Epoch 100 ...
Training Loss = 0.141
Validation Accuracy = 0.991

Final model saved in file: traffic-signs-data/model.ckpt
done

In [25]:
# Training error and validation accuracy plot
df = pd.DataFrame(np.array(tr_error_rate),columns=['training_error'])
df2 = pd.DataFrame(np.array(vl_accuracy),columns=['validation_accuracy'])
df.plot()
df2.plot()


Out[25]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f89a3da89b0>

Next, I want to see whether there are classes where my trained network does poorly. This is a bit of an iterative process; the reader only gets to see the final result of many iterations over the hyperparameter space.


In [27]:
def predict(X_data):
    num_examples = len(X_data)
    predictions = []
    sess = tf.get_default_session()
    for offset in range(0, num_examples, BATCH_SIZE):
        batch_x = X_data[offset:offset+BATCH_SIZE]
        _predictions = sess.run(logits, feed_dict={x: batch_x})
        predictions.extend(_predictions)
    return np.array(predictions)

In [28]:
print(model_path)
#Test Predictions
with tf.Session() as sess:
    saver.restore(sess, model_path)
    print("Model Loaded...")
    predictions = predict(X_test_normalized)
print('done')

#Create a lookup for labels
class_names = class_names.drop('ClassId',axis=1)
lookup = class_names.to_dict()['SignName']

val_preds = [(lookup[np.argmax(p,0)], lookup[y]) for p,y in zip(predictions, y_test)]

val_preds = pd.DataFrame(val_preds,columns=['pred','truth'])
confusion_matrix = val_preds.groupby(['pred','truth']).size()
confusion_matrix = confusion_matrix.reset_index()
confusion_matrix.columns = val_preds.columns.tolist() + ['cnt']
confusion_matrix = confusion_matrix.pivot(index='pred', columns='truth', values='cnt').fillna(0)


traffic-signs-data/model.ckpt
Model Loaded...
done
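
For what it's worth, scikit-learn can build the same confusion matrix in a single call (equivalent up to a transpose, assuming all 43 classes occur in y_test):

from sklearn.metrics import confusion_matrix as sk_confusion
cm = sk_confusion(y_test, predictions.argmax(axis=1))  # rows = truth, columns = prediction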

In [29]:
nb_truth = np.array(confusion_matrix.sum(axis=0)).T
correct_pred = np.array(confusion_matrix).diagonal().T
false_pred = (nb_truth - correct_pred).T
false_pred.shape, correct_pred.shape, nb_truth.shape


Out[29]:
((43,), (43,), (43,))

In [30]:
preds = pd.DataFrame({'ground_truth' : nb_truth, 'correct_pred': correct_pred, 'false_pred': false_pred})
preds.index = confusion_matrix.sum(axis=0).index
preds['accuracy'] = preds['correct_pred'] / preds['ground_truth']
preds.sort_values('accuracy')


Out[30]:
correct_pred false_pred ground_truth accuracy
truth
Beware of ice/snow 94.0 56.0 150.0 0.626667
Double curve 58.0 32.0 90.0 0.644444
Pedestrians 42.0 18.0 60.0 0.700000
Bicycles crossing 76.0 14.0 90.0 0.844444
Speed limit (80km/h) 535.0 95.0 630.0 0.849206
End of no passing 51.0 9.0 60.0 0.850000
End of speed limit (80km/h) 128.0 22.0 150.0 0.853333
Bumpy road 104.0 16.0 120.0 0.866667
Road narrows on the right 79.0 11.0 90.0 0.877778
Slippery road 132.0 18.0 150.0 0.880000
Road work 424.0 56.0 480.0 0.883333
Speed limit (20km/h) 55.0 5.0 60.0 0.916667
Traffic signals 167.0 13.0 180.0 0.927778
General caution 363.0 27.0 390.0 0.930769
End of no passing by vehicles over 3.5 metric tons 84.0 6.0 90.0 0.933333
Speed limit (50km/h) 706.0 44.0 750.0 0.941333
Children crossing 142.0 8.0 150.0 0.946667
Speed limit (70km/h) 628.0 32.0 660.0 0.951515
Vehicles over 3.5 metric tons prohibited 144.0 6.0 150.0 0.960000
Speed limit (100km/h) 432.0 18.0 450.0 0.960000
Right-of-way at the next intersection 405.0 15.0 420.0 0.964286
Speed limit (60km/h) 437.0 13.0 450.0 0.971111
No passing for vehicles over 3.5 metric tons 641.0 19.0 660.0 0.971212
No entry 350.0 10.0 360.0 0.972222
Priority road 672.0 18.0 690.0 0.973913
Go straight or right 117.0 3.0 120.0 0.975000
Keep right 673.0 17.0 690.0 0.975362
Keep left 88.0 2.0 90.0 0.977778
Roundabout mandatory 88.0 2.0 90.0 0.977778
No vehicles 207.0 3.0 210.0 0.985714
Turn right ahead 207.0 3.0 210.0 0.985714
Speed limit (30km/h) 711.0 9.0 720.0 0.987500
Turn left ahead 119.0 1.0 120.0 0.991667
Yield 716.0 4.0 720.0 0.994444
Ahead only 388.0 2.0 390.0 0.994872
No passing 478.0 2.0 480.0 0.995833
Wild animals crossing 269.0 1.0 270.0 0.996296
Stop 270.0 0.0 270.0 1.000000
Go straight or left 60.0 0.0 60.0 1.000000
End of all speed and passing limits 60.0 0.0 60.0 1.000000
Dangerous curve to the right 90.0 0.0 90.0 1.000000
Dangerous curve to the left 60.0 0.0 60.0 1.000000
Speed limit (120km/h) 450.0 0.0 450.0 1.000000

In [31]:
# accuracy by class 
preds.drop(['ground_truth','accuracy'],axis=1).plot.bar(stacked=True)


Out[31]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f89a3f36eb8>

In [47]:
# Accuracy by class size
preds.drop(['correct_pred','false_pred'],axis=1).plot(kind='scatter',x='ground_truth',y='accuracy')


Out[47]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f89a41a0e80>

Finally, after we are satisfied with all the tweaking, I take the test data out of the box and give it a whirl.


In [32]:
#Test Accuracy
with tf.Session() as sess:
    saver.restore(sess, model_path)
    print("Model Loaded...")
    test_accuracy = evaluate(X_test_normalized, y_test)
    print("Test Accuracy for Test Data = {:.3f}".format(test_accuracy))


Model Loaded...
Test Accuracy for Test Data = 0.950

The validation set accuracy is 0.987, whereas the test set accuracy turned out to be 0.950.

After trying several architectures, including LeNet, I settled on a plain-vanilla ConvNet architecture because it led to a higher accuracy rate out of the box. The hyperparameter space is very large, even for a simple model such as this.

Iterating over different combinations, the tricks that helped the most were (i) scaling the data, (ii) augmenting the data with random rotations, and (iii) the L2 regularizer.

You can see above that the per-class accuracy varies. There are 11 classes where the accuracy rate on the test data stays below 90%, including Beware of ice/snow, Double curve, and Pedestrians - the outliers in the lower left corner of the scatter plot. Further iterations could alleviate this issue by fixing the class imbalance. Secondly, the model seems to confuse Speed limit (80km/h) with End of speed limit (80km/h), as both classes have lower accuracy rates.
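
One simple remedy for the class imbalance, sketched below (it was not applied in this notebook), is to oversample rare classes when drawing training examples:

class_counts = np.bincount(y_train, minlength=NB_CLASSES)
sample_p = 1.0 / class_counts[y_train]   # rare classes get proportionally higher weight
sample_p /= sample_p.sum()
balanced_ix = np.random.choice(len(y_train), size=len(y_train), p=sample_p)
X_balanced, y_balanced = X_train[balanced_ix], y_train[balanced_ix]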

Third, the fact that the validation accuracy is higher than the test accuracy leads me to believe that the model does not generalize as well as it should. Tweaking the regularization parameter and/or adding dropout or normalization layers might alleviate the issue.
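
For instance, dropout on the first fully connected layer would be a small change inside convNet (TF 1.x API; keep_prob would be fed as e.g. 0.5 during training and 1.0 during evaluation). A sketch with a stand-in placeholder for the layer-3 activation:

keep_prob = tf.placeholder(tf.float32, name='keep_prob')
fc_act = tf.placeholder(tf.float32, shape=(None, NB_HIDDEN))  # stand-in for the layer-3 ReLU output
fc_dropped = tf.nn.dropout(fc_act, keep_prob)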

Finally, I could try more powerful models. A model with more capacity might be able to learn features that are beyond this little ConvNet's reach.


Test a Model on New Images

Now we apply the image classifier to images that I have found in the wild. The accuracy rate turns out to be a dismal 25%. This is a cautionary tale that the model does not necessarily generalize well when released into the wild. I suspect that the model has a hard time recognizing objects in images that are not head-on or that have other objects in the background. The model also seems to be fooled by shapes - a Speed limit (30km/h) sign is mistaken for a Roundabout mandatory sign, both of which have a lot of circular shapes in common. There are some blatant errors that need a more in-depth analysis. For example, the model predicts that the Stop sign is a Turn left ahead sign.


In [180]:
import os
ground_truth = {'speed_limit_120.jpg': 8,
                'Schilderwald_in_Passau.jpg': 1,
                '186819557.jpg': 13,
                '532135995.jpg': 12,
                'slippery_road.jpg': 23,
                'stop.jpg': 14,
                '19.jpg' : 31,
                'speed_limit_100.jpg': 7,
    }
image_names = os.listdir("street_signs/")

processed_images = []
class_labels = []

for image_name in image_names:
    if image_name in ground_truth:
        image = cv2.imread("street_signs/" + image_name)
        # NB: cv2.imread returns BGR while the pickled training images are RGB;
        # a cv2.cvtColor(image, cv2.COLOR_BGR2RGB) here may well improve the results below
        class_labels.append(ground_truth[image_name])
        # preprocessing step: first resize to 32x32, then normalize
        resized = cv2.resize(image, (32, 32))
        wi = mean_substract(resized)
        processed_images.append(wi)
print('*'*50)
print(class_labels)


**************************************************
[13, 31, 12, 1, 23, 7, 8, 14]

In [181]:
# predict images in the wild
pred = []
with tf.Session() as sess:
    saver.restore(sess, model_path)
    print("Model Loaded...")
    print("Prediction for {} images".format(len(processed_images)))
    _pred = predict(processed_images)


Model Loaded...
Prediction for 8 images

In [182]:
_pred.shape
wild_preds = [(lookup[np.argmax(p,0)], lookup[y]) for p,y in zip(_pred, class_labels)]
wild_preds


Out[182]:
[('Yield', 'Yield'),
 ('Speed limit (30km/h)', 'Wild animals crossing'),
 ('Roundabout mandatory', 'Priority road'),
 ('Roundabout mandatory', 'Speed limit (30km/h)'),
 ('Go straight or left', 'Slippery road'),
 ('Speed limit (100km/h)', 'Speed limit (100km/h)'),
 ('Speed limit (100km/h)', 'Speed limit (120km/h)'),
 ('Turn left ahead', 'Stop')]

In [177]:
_v = np.exp(_pred).sum(axis=1)
softmax_probs = np.round(np.exp(_pred) / _v[:,None],2)
pred_wild = pd.DataFrame(softmax_probs, columns = class_names.SignName, index = class_labels)
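
One caveat on the cell above (an added note): exponentiating raw logits can overflow. The standard numerically stable variant shifts each row by its maximum first:

def stable_softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # largest logit becomes 0
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)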

In [178]:
for i, (lbl, pred) in enumerate(pred_wild.iterrows()):    
    _pred = pred.sort_values(ascending=False).head(5)
    print(i)
    plt.imshow(processed_images[i])
    plt.show()
    print("Label: {}\n".format(lookup[lbl]))
    print("Top 5 predictions:\n")
    print(_pred)
    print('='*10)


0
Label: Yield

Top 5 predictions:

SignName
Yield                                                 0.99
Priority road                                         0.01
End of no passing by vehicles over 3.5 metric tons    0.00
No passing for vehicles over 3.5 metric tons          0.00
General caution                                       0.00
Name: 13, dtype: float32
==========
1
Label: Wild animals crossing

Top 5 predictions:

SignName
Speed limit (30km/h)     0.65
Double curve             0.23
Roundabout mandatory     0.11
Wild animals crossing    0.01
Speed limit (50km/h)     0.00
Name: 31, dtype: float32
==========
2
Label: Priority road

Top 5 predictions:

SignName
Roundabout mandatory                                  1.0
End of no passing by vehicles over 3.5 metric tons    0.0
No passing for vehicles over 3.5 metric tons          0.0
General caution                                       0.0
No entry                                              0.0
Name: 12, dtype: float32
==========
3
Label: Speed limit (30km/h)

Top 5 predictions:

SignName
Roundabout mandatory                                  0.97
Keep right                                            0.03
Go straight or left                                   0.01
End of no passing by vehicles over 3.5 metric tons    0.00
No passing                                            0.00
Name: 1, dtype: float32
==========
4
Label: Slippery road

Top 5 predictions:

SignName
Go straight or left                                   0.89
Road narrows on the right                             0.06
Ahead only                                            0.04
End of no passing by vehicles over 3.5 metric tons    0.00
No passing for vehicles over 3.5 metric tons          0.00
Name: 23, dtype: float32
==========
5
Label: Speed limit (100km/h)

Top 5 predictions:

SignName
Speed limit (100km/h)                                 0.82
Speed limit (30km/h)                                  0.17
Speed limit (60km/h)                                  0.01
End of no passing by vehicles over 3.5 metric tons    0.00
Right-of-way at the next intersection                 0.00
Name: 7, dtype: float32
==========
6
Label: Speed limit (120km/h)

Top 5 predictions:

SignName
Speed limit (100km/h)                                 1.0
End of no passing by vehicles over 3.5 metric tons    0.0
No passing for vehicles over 3.5 metric tons          0.0
General caution                                       0.0
No entry                                              0.0
Name: 8, dtype: float32
==========
7
Label: Stop

Top 5 predictions:

SignName
Turn left ahead                                       1.0
End of no passing by vehicles over 3.5 metric tons    0.0
No passing for vehicles over 3.5 metric tons          0.0
General caution                                       0.0
No entry                                              0.0
Name: 14, dtype: float32
==========
