Visualizing attention on a self-driving car

So far we have seen many examples of attention and activation maximization on Dense layers that output a probability distribution. What if we have a regression model instead?

In this example, we will use a pretrained self-driving toy car model that predicts the steering angle. The model is borrowed from https://github.com/experiencor/self-driving-toy-car. Here is the model in action.

Let's load the model and its weights, and make a prediction.


In [1]:
import numpy as np

from matplotlib import pyplot as plt
%matplotlib inline

from model import build_model, FRAME_W, FRAME_H
from keras.preprocessing.image import img_to_array
from vis.utils import utils

model = build_model()
model.load_weights('weights.hdf5')
img = utils.load_img('images/left.png', target_size=(FRAME_H, FRAME_W))
plt.imshow(img)

# Convert to BGR, create input with batch_size: 1.
bgr_img = utils.bgr2rgb(img)
img_input = np.expand_dims(img_to_array(bgr_img), axis=0)
pred = model.predict(img_input)[0][0]
print('Predicted {}'.format(pred))


Using TensorFlow backend.
Predicted -0.9328667521476746

Looks good. The negative value is indicative of left steering.
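Since the network has a single scalar output, interpreting a prediction is just a matter of its sign. Here is a tiny helper to make that explicit; the sign convention (negative = left, positive = right) comes from this example, while the "roughly straight" tolerance is a made-up value:

def steering_direction(angle, straight_tol=0.05):
    # negative = left, positive = right (this model's convention);
    # straight_tol is a hypothetical tolerance for "roughly straight".
    if angle < -straight_tol:
        return 'left'
    if angle > straight_tol:
        return 'right'
    return 'straight'

print(steering_direction(pred))  # 'left'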

Attention

By default, visualize_saliency and visualize_cam use positive gradients, which show what parts of the image increase the output value. This makes sense for categorical outputs. For regression cases, it is more interesting to see what parts of the image cause the output to:

  • Increase
  • Decrease
  • Maintain

the current predicted value. This is where grad_modifiers shine.

  • To visualize a decrease, we need to consider the negative gradients that indicate it. To treat them as positive values (as the visualization expects), we negate the gradients. This is easily done with grad_modifier='negate'.
  • To visualize what is responsible for the current output, we need to highlight small gradients (either positive or negative). This can be done with a grad modifier that computes grads = np.abs(1. / grads) to magnify small positive or negative values; the built-in grad_modifier='small_values' does exactly this (see the sketch after this list).
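For reference, here is a minimal sketch of what these modifiers boil down to. If I remember the keras-vis API correctly, grad_modifier also accepts a callable that takes the gradient array and returns the modified one (which is roughly how the built-in 'negate' and 'small_values' modifiers are implemented), but treat that as an assumption and prefer the string names if in doubt.

import numpy as np

def negate(grads):
    # Flip the sign so gradients that decrease the output show up as positive.
    return -grads

def small_values(grads, eps=1e-7):
    # Invert magnitudes so near-zero gradients (the parts keeping the output
    # where it is) dominate the visualization; eps avoids division by zero.
    return np.abs(1. / (grads + eps))

# e.g. visualize_saliency(model, layer_idx=-1, filter_indices=0,
#                         seed_input=bgr_img, grad_modifier=small_values)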

Let's use this knowledge to visualize the parts of the image that cause the car to increase, decrease, or maintain the predicted steering angle.


In [2]:
import matplotlib.cm as cm
from vis.visualization import visualize_saliency, overlay

titles = ['right steering', 'left steering', 'maintain steering']
modifiers = [None, 'negate', 'small_values']
for i, modifier in enumerate(modifiers):
    heatmap = visualize_saliency(model, layer_idx=-1, filter_indices=0, 
                                 seed_input=bgr_img, grad_modifier=modifier)
    plt.figure()
    plt.title(titles[i])
    # Overlay is used to alpha blend heatmap onto img.
    jet_heatmap = np.uint8(cm.jet(heatmap)[..., :3] * 255)
    plt.imshow(overlay(img, jet_heatmap, alpha=0.7))


That was anticlimactic. We know that vanilla saliency can be noisy, so let's try grad-CAM.


In [3]:
from vis.visualization import visualize_cam

for i, modifier in enumerate(modifiers):
    heatmap = visualize_cam(model, layer_idx=-1, filter_indices=0, 
                            seed_input=bgr_img, grad_modifier=modifier)
    plt.figure()
    plt.title(titles[i])
    # Overlay is used to alpha blend heatmap onto img.
    jet_heatmap = np.uint8(cm.jet(heatmap)[..., :3] * 255)
    plt.imshow(overlay(img, jet_heatmap, alpha=0.7))


This makes sense. For right steering, the left part of the lane contributes the most. My guess is that the network attends to the fact that the lane curves left, so changing it to curve right would increase the steering angle.

The maintain steering visualization shows that the current decision is mostly due to the object in the right corner. This is undesirable behavior that visualizations like these can help you uncover.

The left steering case is intuitive as well. Interestingly, the objects in the room on the far right also provide a cue to turn left. This means that, even without the lane marker, the network will probably turn away from obstacles. Let's put this hypothesis to the test: using my awesome photo editing skills, I will remove the lane marker.

Let's see the predicted output first.


In [4]:
img = utils.load_img('images/blank.png', target_size=(FRAME_H, FRAME_W))
plt.imshow(img)

# Convert to BGR, create input with batch_size: 1.
bgr_img = utils.bgr2rgb(img)
img_input = np.expand_dims(img_to_array(bgr_img), axis=0)
img_input.shape

pred = model.predict(img_input)[0][0]
print('Predicted {}'.format(pred))


Predicted -0.9986222982406616

As predicted, it has a left steering output. Let's find out if those objects have anything to do with it.


In [5]:
# Use grad_modifier='small_values' to see what is responsible for maintaining the current prediction.
heatmap = visualize_cam(model, layer_idx=-1, filter_indices=0, 
                        seed_input=bgr_img, grad_modifier='small_values')
jet_heatmap = np.uint8(cm.jet(heatmap)[..., :3] * 255)
plt.imshow(overlay(img, jet_heatmap, alpha=0.7))


Out[5]:
<matplotlib.image.AxesImage at 0x7f9a168441d0>

As hypothesized, those objects indeed cause the toy car to go left.

This single example already provides some good insight into network behavior. If your toy self-driving car makes a mistake, you can use that example to glean why, and add counterexamples to the training set to correct the behavior.
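As a closing illustration, here is a hypothetical sketch of that workflow; the file layout and label format below are made up and not part of the original project:

import csv
import shutil

def add_counter_example(image_path, corrected_angle, data_dir='training_data'):
    # Copy the problematic frame into the (hypothetical) training folder and
    # append the corrected steering label to a CSV file.
    dest = shutil.copy(image_path, data_dir)
    with open('{}/labels.csv'.format(data_dir), 'a') as f:
        csv.writer(f).writerow([dest, corrected_angle])

# e.g. the blank-lane frame should probably steer straight:
# add_counter_example('images/blank.png', 0.0)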