Thanks to these notes for much of the material in this hands-on session.
In [1]:
from IPython.display import IFrame
IFrame('http://cs231n.github.io//assets/conv-demo/index.html', 850, 700)
Out[1]:
The demo above serves as a visual guide to the parameters we are working with in a convolutional NN.
Let's get familiar with how a convolutional layer works, and how it reduces the number of required weights compared with a fully connected layer, by working through an example.
Suppose we have a 32x32x3 input image with 10 3x3 filters of stride 1, pad 1.
Q1 How many neurons are in this layer?
Q2 How many weights would be required for a fully connected layer with this many neurons?
Q3 How many weights are required for each neuron in our convolutional layer? How many weights total for the layer, considering weight sharing?
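If you want to check your answers to Q1-Q3, the arithmetic can be done in a few lines. This is a quick sketch (it does print the answers, so try the questions by hand first); the variable names follow the conventions used in the code below.

```python
# Parameter counts for the example above:
# 32x32x3 input, 10 filters of 3x3, stride 1, pad 1.
H = W = 32
C = 3                       # input height, width, channels
F, HH, WW = 10, 3, 3        # number of filters and filter size
stride, pad = 1, 1

# Spatial size of the output volume.
H_out = 1 + (H + 2 * pad - HH) // stride   # 32
W_out = 1 + (W + 2 * pad - WW) // stride   # 32

neurons = H_out * W_out * F                # Q1: one neuron per output location per filter
fc_weights = neurons * (H * W * C)         # Q2: every neuron connected to every input
conv_weights_per_neuron = HH * WW * C      # Q3: each neuron sees only its receptive field
conv_weights_total = F * conv_weights_per_neuron  # one weight set per filter (weight sharing)

print(neurons, fc_weights, conv_weights_per_neuron, conv_weights_total)
# 10240 31457280 27 270
```

Weight sharing takes the layer from over 31 million weights down to 270 (plus 10 biases).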
In [3]:
import numpy as np

def conv_forward_naive(x, w, b, conv_param):
    """
    A naive implementation of the forward pass for a convolutional layer.

    The input consists of N data points, each with C channels, height H and
    width W. We convolve each input with F different filters, where each
    filter spans all C channels and has height HH and width WW.

    Input:
    - x: Input data of shape (N, C, H, W)
    - w: Filter weights of shape (F, C, HH, WW)
    - b: Biases, of shape (F,)
    - conv_param: A dictionary with the following keys:
      - 'stride': The number of pixels between adjacent receptive fields in
        the horizontal and vertical directions.
      - 'pad': The number of pixels that will be used to zero-pad the input.

    Returns a tuple of:
    - out: Output data, of shape (N, F, H', W') where H' and W' are given by
      H' = 1 + (H + 2 * pad - HH) / stride
      W' = 1 + (W + 2 * pad - WW) / stride
    - cache: (x, w, b, conv_param)
    """
    (N, C, H, W) = x.shape
    (F, _, HH, WW) = w.shape
    stride = conv_param['stride']
    pad = conv_param['pad']
    H_prime = 1 + (H + 2 * pad - HH) // stride
    W_prime = 1 + (W + 2 * pad - WW) // stride
    out = np.zeros((N, F, H_prime, W_prime))
    # your code goes here!
    # hint: you can use the function np.pad for padding
    # end of your code
    cache = (x, w, b, conv_param)
    return out, cache
#
# Test bed
#
def rel_error(x, y):
    """ returns relative error """
    return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

x_shape = (2, 3, 4, 4)
w_shape = (3, 3, 4, 4)
x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)
w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)
b = np.linspace(-0.1, 0.2, num=3)
conv_param = {'stride': 2, 'pad': 1}
out, _ = conv_forward_naive(x, w, b, conv_param)
correct_out = np.array([[[[-0.08759809, -0.10987781],
                          [-0.18387192, -0.2109216]],
                         [[0.21027089, 0.21661097],
                          [0.22847626, 0.23004637]],
                         [[0.50813986, 0.54309974],
                          [0.64082444, 0.67101435]]],
                        [[[-0.98053589, -1.03143541],
                          [-1.19128892, -1.24695841]],
                         [[0.69108355, 0.66880383],
                          [0.59480972, 0.56776003]],
                         [[2.36270298, 2.36904306],
                          [2.38090835, 2.38247847]]]])
# Compare your output to the solution. The difference should be within 1e-8
# (and print as 0.0000).
print('Testing conv_forward_naive')
print('difference: {:.4f}'.format(rel_error(out, correct_out)))
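If you get stuck on the "your code goes here!" section, below is one possible completion: a deliberately naive quadruple loop over images, filters, and output positions. This is a sketch, not the only (or fastest) way to do it; the name `conv_forward_naive_solution` is ours, used here so it does not clash with your own attempt.

```python
import numpy as np

def conv_forward_naive_solution(x, w, b, conv_param):
    """One possible completion of conv_forward_naive: explicit loops
    over images, filters, and output positions."""
    (N, C, H, W) = x.shape
    (F, _, HH, WW) = w.shape
    stride, pad = conv_param['stride'], conv_param['pad']
    H_prime = 1 + (H + 2 * pad - HH) // stride
    W_prime = 1 + (W + 2 * pad - WW) // stride
    out = np.zeros((N, F, H_prime, W_prime))
    # Zero-pad only the two spatial dimensions, as the hint suggests.
    x_pad = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)),
                   mode='constant')
    for n in range(N):                    # each image
        for f in range(F):                # each filter
            for i in range(H_prime):      # each output row
                for j in range(W_prime):  # each output column
                    # Receptive field: all C channels, HH x WW window.
                    window = x_pad[n, :,
                                   i * stride:i * stride + HH,
                                   j * stride:j * stride + WW]
                    out[n, f, i, j] = np.sum(window * w[f]) + b[f]
    cache = (x, w, b, conv_param)
    return out, cache
```

Pasting the loop body into the marked section of `conv_forward_naive` should make the test bed print a difference of 0.0000.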