So far we have seen many examples of attention and activation maximization on Dense layers that output a probability distribution. What if we have a regression model instead?
In this example, we will use a pretrained self-driving toy car model that predicts the steering angle. The model is borrowed from https://github.com/experiencor/self-driving-toy-car. Here is the model in action.
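For orientation, build_model (imported below from the repo's model.py) returns a Keras regression network: a convolutional feature extractor topped with a single linear unit that outputs the steering angle. The exact architecture is defined in the linked repo; the sketch below is only an assumed illustration of that general shape, not the actual model.
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

def sketch_steering_model(frame_h, frame_w):
    # Hypothetical stand-in for build_model(): a tiny convnet ending in a
    # single linear unit, i.e. a real-valued steering angle rather than a
    # probability distribution.
    model = Sequential()
    model.add(Conv2D(16, (3, 3), activation='relu',
                     input_shape=(frame_h, frame_w, 3)))
    model.add(Conv2D(32, (3, 3), strides=(2, 2), activation='relu'))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='linear'))
    model.compile(optimizer='adam', loss='mse')
    return model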
Let's load the model and weights, and make a prediction.
In [1]:
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline
from model import build_model, FRAME_W, FRAME_H
from keras.preprocessing.image import img_to_array
from vis.utils import utils
model = build_model()
model.load_weights('weights.hdf5')
img = utils.load_img('images/left.png', target_size=(FRAME_H, FRAME_W))
plt.imshow(img)
# Convert to BGR, create input with batch_size: 1.
bgr_img = utils.bgr2rgb(img)
img_input = np.expand_dims(img_to_array(bgr_img), axis=0)
pred = model.predict(img_input)[0][0]
print('Predicted {}'.format(pred))
Looks good. The negative value is indicative of left steering.
By default, visualize_saliency and visualize_cam use positive gradients, which show what parts of the image increase the output value. This makes sense for categorical outputs. For regression cases, it is more interesting to see what parts of the image cause the output to:
- increase: this is the default behavior with positive gradients.
- decrease: use grad_modifier='negate'. Gradients point in the direction that increases the output, so negating them highlights the parts of the image that decrease it.
- stay close to the current predicted value: this translates to looking for small gradients, i.e., grads = np.abs(1. / grads) to magnify small positive or negative values. Alternatively, we can use grad_modifier='small_values', which does the same thing.
This is where grad_modifiers shine. Let's use this knowledge to visualize the parts of the image that cause the car to increase, decrease, or maintain the predicted steering.
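Before we do, a side note: if none of the built-in strings fit, keras-vis appears to also accept a plain callable for grad_modifier (a function that takes the gradient array and returns a modified one). The snippet below is a sketch under that assumption and is not run in this notebook; it hand-rolls the 'maintain' case, equivalent in spirit to grad_modifier='small_values'.
import numpy as np

def maintain_modifier(grads):
    # Small gradients mean the output barely changes; invert and take the
    # absolute value so they map to large saliency values.
    return np.abs(1. / (grads + 1e-8))  # epsilon guards against division by zero

# Assuming callables are accepted by this keras-vis version:
# heatmap = visualize_saliency(model, layer_idx=-1, filter_indices=0,
#                              seed_input=bgr_img,
#                              grad_modifier=maintain_modifier)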
In [2]:
import matplotlib.cm as cm
from vis.visualization import visualize_saliency, overlay
titles = ['right steering', 'left steering', 'maintain steering']
modifiers = [None, 'negate', 'small_values']
for i, modifier in enumerate(modifiers):
    heatmap = visualize_saliency(model, layer_idx=-1, filter_indices=0,
                                 seed_input=bgr_img, grad_modifier=modifier)
    plt.figure()
    plt.title(titles[i])
    # Overlay is used to alpha blend heatmap onto img.
    jet_heatmap = np.uint8(cm.jet(heatmap)[..., :3] * 255)
    plt.imshow(overlay(img, jet_heatmap, alpha=0.7))
That was anticlimactic. We know that vanilla saliency can be noisy. One common way to tame the noise is guided backprop, sketched below; another is grad-CAM, which is what we try next.
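The sketch below assumes the backprop_modifier argument of visualize_saliency is available in this keras-vis version; it is not run here.
# Guided backprop zeroes out negative gradients at ReLUs during the backward
# pass, which usually produces much crisper saliency maps.
guided_heatmap = visualize_saliency(model, layer_idx=-1, filter_indices=0,
                                    seed_input=bgr_img,
                                    backprop_modifier='guided',
                                    grad_modifier='small_values')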
In [3]:
from vis.visualization import visualize_cam
for i, modifier in enumerate(modifiers):
    heatmap = visualize_cam(model, layer_idx=-1, filter_indices=0,
                            seed_input=bgr_img, grad_modifier=modifier)
    plt.figure()
    plt.title(titles[i])
    # Overlay is used to alpha blend heatmap onto img.
    jet_heatmap = np.uint8(cm.jet(heatmap)[..., :3] * 255)
    plt.imshow(overlay(img, jet_heatmap, alpha=0.7))
This makes sense. For the right steering case, the left lane marking contributes the most: my guess is that the network attends to the fact that it curves left, so changing it to curve right would increase the steering angle.
The maintain steering visualization shows that the current decision is mostly due to the object in the right corner. This is undesirable behavior, and visualizations like these can help uncover it.
The left steering case is intuitive as well. Interestingly, the objects in the room on the far right also give the network a cue to turn left. This suggests that, even without the lane marker, the network will probably turn away from obstacles. Let's put this hypothesis to the test: using my awesome photo editing skills, I will remove the lane marker.
Let's see the predicted output first.
In [4]:
img = utils.load_img('images/blank.png', target_size=(FRAME_H, FRAME_W))
plt.imshow(img)
# Convert to BGR, create input with batch_size: 1.
bgr_img = utils.bgr2rgb(img)
img_input = np.expand_dims(img_to_array(bgr_img), axis=0)
pred = model.predict(img_input)[0][0]
print('Predicted {}'.format(pred))
As predicted, it outputs left steering. Let's find out whether those objects have anything to do with it.
In [5]:
# Use grad_modifier='small_values' to see what is responsible for maintaining the current prediction.
heatmap = visualize_cam(model, layer_idx=-1, filter_indices=0,
                        seed_input=bgr_img, grad_modifier='small_values')
jet_heatmap = np.uint8(cm.jet(heatmap)[..., :3] * 255)
plt.imshow(overlay(img, jet_heatmap, alpha=0.7))
As hypothesized, those objects indeed cause the toy car to go left.
This single example already provides pretty good insight into network behavior. If the toy self-driving car makes a mistake, we can use that example to understand why and add counterexamples to the training set to correct the behavior.
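For instance, one cheap way to generate such counterexamples (an assumption about the training setup, not something taken from the repo) is to mirror recorded frames and negate their steering angles:
import numpy as np

def augment_with_flips(images, angles):
    # Hypothetical helper: flip each (N, H, W, C) frame left-right and negate
    # its steering angle, doubling the dataset with mirrored counterexamples.
    flipped = images[:, :, ::-1, :]
    return (np.concatenate([images, flipped], axis=0),
            np.concatenate([angles, -angles], axis=0))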