Convolutional Networks

So far we have worked with deep fully-connected networks, using them to explore different optimization strategies and network architectures. Fully-connected networks are a good testbed for experimentation because they are very computationally efficient, but in practice all state-of-the-art results use convolutional networks instead.

First you will implement several layer types that are used in convolutional networks. You will then use these layers to train a convolutional network on the CIFAR-10 dataset.


In [1]:
# As usual, a bit of setup

import numpy as np
import matplotlib.pyplot as plt
from cs231n.classifiers.cnn import *
from cs231n.data_utils import get_CIFAR10_data
from cs231n.gradient_check import eval_numerical_gradient_array, eval_numerical_gradient
from cs231n.layers import *
from cs231n.fast_layers import *
from cs231n.solver import Solver

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# for auto-reloading external modules
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2
def rel_error(x, y):
  """ returns relative error """
  return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))

In [2]:
# Load the (preprocessed) CIFAR10 data.

data = get_CIFAR10_data()
for k, v in data.iteritems():
  print '%s: ' % k, v.shape


X_val:  (1000, 3, 32, 32)
X_train:  (49000, 3, 32, 32)
X_test:  (1000, 3, 32, 32)
y_val:  (1000,)
y_train:  (49000,)
y_test:  (1000,)

Convolution: Naive forward pass

The core of a convolutional network is the convolution operation. In the file cs231n/layers.py, implement the forward pass for the convolution layer in the function conv_forward_naive.

You don't have to worry too much about efficiency at this point; just write the code in whatever way you find most clear.
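If you are unsure where to start, below is a minimal sketch of one possible naive implementation: explicit loops over a zero-padded input. The function and variable names here are only illustrative; your version in cs231n/layers.py may be organized differently.

def conv_forward_naive_sketch(x, w, b, conv_param):
  # x: (N, C, H, W), w: (F, C, HH, WW), b: (F,)
  stride, pad = conv_param['stride'], conv_param['pad']
  N, C, H, W = x.shape
  F, _, HH, WW = w.shape
  H_out = 1 + (H + 2 * pad - HH) / stride
  W_out = 1 + (W + 2 * pad - WW) / stride
  x_pad = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode='constant')
  out = np.zeros((N, F, H_out, W_out))
  for n in xrange(N):               # each image
    for f in xrange(F):             # each filter
      for i in xrange(H_out):       # each output row
        for j in xrange(W_out):     # each output column
          window = x_pad[n, :, i*stride:i*stride+HH, j*stride:j*stride+WW]
          out[n, f, i, j] = np.sum(window * w[f]) + b[f]
  cache = (x, w, b, conv_param)
  return out, cache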

You can test your implementation by running the following:


In [39]:
x_shape = (2, 3, 4, 4)
w_shape = (3, 3, 4, 4)
x = np.linspace(-0.1, 0.5, num=np.prod(x_shape)).reshape(x_shape)
w = np.linspace(-0.2, 0.3, num=np.prod(w_shape)).reshape(w_shape)
b = np.linspace(-0.1, 0.2, num=3)

conv_param = {'stride': 2, 'pad': 1}
out, _ = conv_forward_naive(x, w, b, conv_param)
correct_out = np.array([[[[[-0.08759809, -0.10987781],
                           [-0.18387192, -0.2109216 ]],
                          [[ 0.21027089,  0.21661097],
                           [ 0.22847626,  0.23004637]],
                          [[ 0.50813986,  0.54309974],
                           [ 0.64082444,  0.67101435]]],
                         [[[-0.98053589, -1.03143541],
                           [-1.19128892, -1.24695841]],
                          [[ 0.69108355,  0.66880383],
                           [ 0.59480972,  0.56776003]],
                          [[ 2.36270298,  2.36904306],
                           [ 2.38090835,  2.38247847]]]]])

# Compare your output to ours; difference should be around 1e-8
print 'Testing conv_forward_naive'
print 'difference: ', rel_error(out, correct_out)


Testing conv_forward_naive
difference:  2.21214764175e-08

Aside: Image processing via convolutions

As a fun way to both check your implementation and gain a better understanding of the type of operation that convolutional layers can perform, we will set up an input containing two images and manually set up filters that perform common image processing operations (grayscale conversion and edge detection). The convolution forward pass will apply these operations to each of the input images. We can then visualize the results as a sanity check.


In [40]:
from scipy.misc import imread, imresize

kitten, puppy = imread('kitten.jpg'), imread('puppy.jpg')
# kitten is wide, and puppy is already square
d = kitten.shape[1] - kitten.shape[0]
kitten_cropped = kitten[:, d/2:-d/2, :]

img_size = 200   # Make this smaller if it runs too slow
x = np.zeros((2, 3, img_size, img_size))
x[0, :, :, :] = imresize(puppy, (img_size, img_size)).transpose((2, 0, 1))
x[1, :, :, :] = imresize(kitten_cropped, (img_size, img_size)).transpose((2, 0, 1))

# Set up convolutional weights holding 2 filters, each 3x3
w = np.zeros((2, 3, 3, 3))

# The first filter converts the image to grayscale.
# Set up the red, green, and blue channels of the filter.
w[0, 0, :, :] = [[0, 0, 0], [0, 0.3, 0], [0, 0, 0]]
w[0, 1, :, :] = [[0, 0, 0], [0, 0.6, 0], [0, 0, 0]]
w[0, 2, :, :] = [[0, 0, 0], [0, 0.1, 0], [0, 0, 0]]

# Second filter detects horizontal edges in the blue channel.
w[1, 2, :, :] = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]

# Vector of biases. We don't need any bias for the grayscale
# filter, but for the edge detection filter we want to add 128
# to each output so that nothing is negative.
b = np.array([0, 128])

# Compute the result of convolving each input in x with each filter in w,
# offsetting by b, and storing the results in out.
out, _ = conv_forward_naive(x, w, b, {'stride': 1, 'pad': 1})

def imshow_noax(img, normalize=True):
    """ Tiny helper to show images as uint8 and remove axis labels """
    if normalize:
        img_max, img_min = np.max(img), np.min(img)
        img = 255.0 * (img - img_min) / (img_max - img_min)
    plt.imshow(img.astype('uint8'))
    plt.gca().axis('off')

# Show the original images and the results of the conv operation
plt.subplot(2, 3, 1)
imshow_noax(puppy, normalize=False)
plt.title('Original image')
plt.subplot(2, 3, 2)
imshow_noax(out[0, 0])
plt.title('Grayscale')
plt.subplot(2, 3, 3)
imshow_noax(out[0, 1])
plt.title('Edges')
plt.subplot(2, 3, 4)
imshow_noax(kitten_cropped, normalize=False)
plt.subplot(2, 3, 5)
imshow_noax(out[1, 0])
plt.subplot(2, 3, 6)
imshow_noax(out[1, 1])
plt.show()


Convolution: Naive backward pass

Implement the backward pass for the convolution operation in the function conv_backward_naive in the file cs231n/layers.py. Again, you don't need to worry too much about computational efficiency.
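As a rough guide, the backward pass can reuse the same loop structure as the forward pass, accumulating into dw and into a padded dx buffer. This is only an illustrative sketch, not the required implementation:

def conv_backward_naive_sketch(dout, cache):
  # dout: (N, F, H_out, W_out); cache comes from the naive forward pass
  x, w, b, conv_param = cache
  stride, pad = conv_param['stride'], conv_param['pad']
  N, C, H, W = x.shape
  F, _, HH, WW = w.shape
  _, _, H_out, W_out = dout.shape
  x_pad = np.pad(x, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode='constant')
  dx_pad = np.zeros_like(x_pad)
  dw = np.zeros_like(w)
  db = dout.sum(axis=(0, 2, 3))     # the bias touches every output location
  for n in xrange(N):
    for f in xrange(F):
      for i in xrange(H_out):
        for j in xrange(W_out):
          window = x_pad[n, :, i*stride:i*stride+HH, j*stride:j*stride+WW]
          dw[f] += window * dout[n, f, i, j]
          dx_pad[n, :, i*stride:i*stride+HH, j*stride:j*stride+WW] += w[f] * dout[n, f, i, j]
  dx = dx_pad[:, :, pad:pad+H, pad:pad+W]  # strip the padding
  return dx, dw, db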

When you are done, run the following to check your backward pass with a numeric gradient check.


In [41]:
x = np.random.randn(4, 3, 5, 5)
w = np.random.randn(2, 3, 3, 3)
b = np.random.randn(2,)
dout = np.random.randn(4, 2, 5, 5)
conv_param = {'stride': 1, 'pad': 1}

dx_num = eval_numerical_gradient_array(lambda x: conv_forward_naive(x, w, b, conv_param)[0], x, dout)
dw_num = eval_numerical_gradient_array(lambda w: conv_forward_naive(x, w, b, conv_param)[0], w, dout)
db_num = eval_numerical_gradient_array(lambda b: conv_forward_naive(x, w, b, conv_param)[0], b, dout)

out, cache = conv_forward_naive(x, w, b, conv_param)
dx, dw, db = conv_backward_naive(dout, cache)

# Your errors should be around 1e-9
print 'Testing conv_backward_naive function'
print 'dx error: ', rel_error(dx, dx_num)
print 'dw error: ', rel_error(dw, dw_num)
print 'db error: ', rel_error(db, db_num)


Testing conv_backward_naive function
dx error:  6.18357316555e-09
dw error:  3.34701514672e-10
db error:  1.20552063176e-11

Max pooling: Naive forward

Implement the forward pass for the max-pooling operation in the function max_pool_forward_naive in the file cs231n/layers.py. Again, don't worry too much about computational efficiency.
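One possible naive implementation simply loops over output positions and takes the max over each pooling window. An illustrative sketch (names are not the required ones):

def max_pool_forward_naive_sketch(x, pool_param):
  # x: (N, C, H, W)
  ph, pw = pool_param['pool_height'], pool_param['pool_width']
  stride = pool_param['stride']
  N, C, H, W = x.shape
  H_out = 1 + (H - ph) / stride
  W_out = 1 + (W - pw) / stride
  out = np.zeros((N, C, H_out, W_out))
  for i in xrange(H_out):
    for j in xrange(W_out):
      window = x[:, :, i*stride:i*stride+ph, j*stride:j*stride+pw]
      out[:, :, i, j] = window.max(axis=(2, 3))   # max over each pooling window
  cache = (x, pool_param)
  return out, cache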

Check your implementation by running the following:


In [42]:
x_shape = (2, 3, 4, 4)
x = np.linspace(-0.3, 0.4, num=np.prod(x_shape)).reshape(x_shape)
pool_param = {'pool_width': 2, 'pool_height': 2, 'stride': 2}

out, _ = max_pool_forward_naive(x, pool_param)
out_fast, cache_fast = max_pool_forward_fast(x, pool_param)
correct_out = np.array([[[[-0.26315789, -0.24842105],
                          [-0.20421053, -0.18947368]],
                         [[-0.14526316, -0.13052632],
                          [-0.08631579, -0.07157895]],
                         [[-0.02736842, -0.01263158],
                          [ 0.03157895,  0.04631579]]],
                        [[[ 0.09052632,  0.10526316],
                          [ 0.14947368,  0.16421053]],
                         [[ 0.20842105,  0.22315789],
                          [ 0.26736842,  0.28210526]],
                         [[ 0.32631579,  0.34105263],
                          [ 0.38526316,  0.4       ]]]])

# Compare your output with ours. Difference should be around 1e-8.
print 'Testing max_pool_forward_naive function:'
print 'difference: ', rel_error(out, correct_out)
print 'difference: ', rel_error(out_fast, correct_out)


Testing max_pool_forward_naive function:
difference:  4.16666651573e-08
difference:  4.16666651573e-08

Max pooling: Naive backward

Implement the backward pass for the max-pooling operation in the function max_pool_backward_naive in the file cs231n/layers.py. You don't need to worry about computational efficiency.
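A straightforward approach is to recompute each pooling window and route the upstream gradient to the position that attained the max. The sketch below is only illustrative (ties send gradient to every maximal entry):

def max_pool_backward_naive_sketch(dout, cache):
  # dout: (N, C, H_out, W_out); only the argmax of each window receives gradient
  x, pool_param = cache
  ph, pw = pool_param['pool_height'], pool_param['pool_width']
  stride = pool_param['stride']
  N, C, H_out, W_out = dout.shape
  dx = np.zeros_like(x)
  for n in xrange(N):
    for c in xrange(C):
      for i in xrange(H_out):
        for j in xrange(W_out):
          window = x[n, c, i*stride:i*stride+ph, j*stride:j*stride+pw]
          mask = (window == window.max())          # route gradient to the max entry
          dx[n, c, i*stride:i*stride+ph, j*stride:j*stride+pw] += mask * dout[n, c, i, j]
  return dx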

Check your implementation with numeric gradient checking by running the following:


In [3]:
from cs231n.fast_layers import max_pool_forward_fast, max_pool_backward_fast

x = np.random.randn(3, 2, 8, 8)
dout = np.random.randn(3, 2, 4, 4)
pool_param = {'pool_height': 2, 'pool_width': 2, 'stride': 2}

dx_num = eval_numerical_gradient_array(lambda x: max_pool_forward_naive(x, pool_param)[0], x, dout)

out, cache = max_pool_forward_naive(x, pool_param)
dx = max_pool_backward_naive(dout, cache)

# Your error should be around 1e-12
print 'Testing max_pool_backward_naive function:'
print 'dx error: ', rel_error(dx, dx_num)


Testing max_pool_backward_naive function:
dx error:  3.27562446905e-12

Fast layers

Making convolution and pooling layers fast can be challenging. To spare you the pain, we've provided fast implementations of the forward and backward passes for convolution and pooling layers in the file cs231n/fast_layers.py.

The fast convolution implementation depends on a Cython extension; to compile it you need to run the following from the cs231n directory:

python setup.py build_ext --inplace

The API for the fast versions of the convolution and pooling layers is exactly the same as the naive versions that you implemented above: the forward pass receives data, weights, and parameters and produces outputs and a cache object; the backward pass receives upstream derivatives and the cache object and produces gradients with respect to the data and weights.

NOTE: The fast implementation for pooling will only perform optimally if the pooling regions are non-overlapping and tile the input. If these conditions are not met then the fast pooling implementation will not be much faster than the naive implementation.

You can compare the performance of the naive and fast versions of these layers by running the following:


In [4]:
from cs231n.fast_layers import conv_forward_fast, conv_backward_fast
from time import time

x = np.random.randn(100, 3, 31, 31)
w = np.random.randn(25, 3, 3, 3)
b = np.random.randn(25,)
dout = np.random.randn(100, 25, 16, 16)
conv_param = {'stride': 2, 'pad': 1}

t0 = time()
out_naive, cache_naive = conv_forward_naive(x, w, b, conv_param)
t1 = time()
out_fast, cache_fast = conv_forward_fast(x, w, b, conv_param)
t2 = time()

print 'Testing conv_forward_fast:'
print 'Naive: %fs' % (t1 - t0)
print 'Fast: %fs' % (t2 - t1)
print 'Speedup: %fx' % ((t1 - t0) / (t2 - t1))
print 'Difference: ', rel_error(out_naive, out_fast)

t0 = time()
dx_naive, dw_naive, db_naive = conv_backward_naive(dout, cache_naive)
t1 = time()
dx_fast, dw_fast, db_fast = conv_backward_fast(dout, cache_fast)
t2 = time()

print '\nTesting conv_backward_fast:'
print 'Naive: %fs' % (t1 - t0)
print 'Fast: %fs' % (t2 - t1)
print 'Speedup: %fx' % ((t1 - t0) / (t2 - t1))
print 'dx difference: ', rel_error(dx_naive, dx_fast)
print 'dw difference: ', rel_error(dw_naive, dw_fast)
print 'db difference: ', rel_error(db_naive, db_fast)


Testing conv_forward_fast:
Naive: 8.853287s
Fast: 0.260367s
Speedup: 34.003087x
Difference:  3.30489815622e-11

Testing conv_backward_fast:
Naive: 4.729306s
Fast: 0.042266s
Speedup: 111.893517x
dx difference:  9.14822596925e-12
dw difference:  3.29378295164e-13
db difference:  0.0

In [5]:
from cs231n.fast_layers import max_pool_forward_fast, max_pool_backward_fast

x = np.random.randn(100, 3, 32, 32)
dout = np.random.randn(100, 3, 16, 16)
pool_param = {'pool_height': 2, 'pool_width': 2, 'stride': 2}

t0 = time()
out_naive, cache_naive = max_pool_forward_naive(x, pool_param)
t1 = time()
out_fast, cache_fast = max_pool_forward_fast(x, pool_param)
t2 = time()

print 'Testing pool_forward_fast:'
print 'Naive: %fs' % (t1 - t0)
print 'fast: %fs' % (t2 - t1)
print 'speedup: %fx' % ((t1 - t0) / (t2 - t1))
print 'difference: ', rel_error(out_naive, out_fast)

t0 = time()
dx_naive = max_pool_backward_naive(dout, cache_naive)
t1 = time()
dx_fast = max_pool_backward_fast(dout, cache_fast)
t2 = time()

print '\nTesting pool_backward_fast:'
print 'Naive: %fs' % (t1 - t0)
print 'fast: %fs' % (t2 - t1)
print 'speedup: %fx' % ((t1 - t0) / (t2 - t1))
print 'dx difference: ', rel_error(dx_naive, dx_fast)


Testing pool_forward_fast:
Naive: 0.207427s
fast: 0.001938s
speedup: 107.038878x
difference:  0.0

Testing pool_backward_fast:
Naive: 0.287399s
fast: 0.010548s
speedup: 27.247101x
dx difference:  0.0

Convolutional "sandwich" layers

Previously we introduced the concept of "sandwich" layers that combine multiple operations into commonly used patterns. In the file cs231n/layer_utils.py you will find sandwich layers that implement a few commonly used patterns for convolutional networks.
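For reference, a sandwich layer simply chains the primitive forward passes, stashes all of their caches, and unwinds them in reverse order on the backward pass. The sketch below mirrors the conv-relu-pool pattern used in layer_utils.py; the _sketch suffix is only there to avoid shadowing the provided functions.

def conv_relu_pool_forward_sketch(x, w, b, conv_param, pool_param):
  a, conv_cache = conv_forward_fast(x, w, b, conv_param)
  s, relu_cache = relu_forward(a)
  out, pool_cache = max_pool_forward_fast(s, pool_param)
  return out, (conv_cache, relu_cache, pool_cache)

def conv_relu_pool_backward_sketch(dout, cache):
  conv_cache, relu_cache, pool_cache = cache
  ds = max_pool_backward_fast(dout, pool_cache)   # unwind in reverse order
  da = relu_backward(ds, relu_cache)
  dx, dw, db = conv_backward_fast(da, conv_cache)
  return dx, dw, db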


In [100]:
from cs231n.layer_utils import conv_relu_pool_forward, conv_relu_pool_backward

x = np.random.randn(2, 3, 16, 16)
w = np.random.randn(3, 3, 3, 3)
b = np.random.randn(3,)
dout = np.random.randn(2, 3, 8, 8)
conv_param = {'stride': 1, 'pad': 1}
pool_param = {'pool_height': 2, 'pool_width': 2, 'stride': 2}

out, cache = conv_relu_pool_forward(x, w, b, conv_param, pool_param)
dx, dw, db = conv_relu_pool_backward(dout, cache)

dx_num = eval_numerical_gradient_array(lambda x: conv_relu_pool_forward(x, w, b, conv_param, pool_param)[0], x, dout)
dw_num = eval_numerical_gradient_array(lambda w: conv_relu_pool_forward(x, w, b, conv_param, pool_param)[0], w, dout)
db_num = eval_numerical_gradient_array(lambda b: conv_relu_pool_forward(x, w, b, conv_param, pool_param)[0], b, dout)

print 'Testing conv_relu_pool'
print 'dx error: ', rel_error(dx_num, dx)
print 'dw error: ', rel_error(dw_num, dw)
print 'db error: ', rel_error(db_num, db)


Testing conv_relu_pool
dx error:  8.47933221262e-09
dw error:  1.2276167305e-09
db error:  1.1823712336e-11

In [124]:
from cs231n.layer_utils import conv_relu_forward, conv_relu_backward

x = np.random.randn(2, 3, 8, 8)
w = np.random.randn(3, 3, 3, 3)
b = np.random.randn(3,)
dout = np.random.randn(2, 3, 8, 8)
conv_param = {'stride': 1, 'pad': 1}

out, cache = conv_relu_forward(x, w, b, conv_param)
dx, dw, db = conv_relu_backward(dout, cache)

dx_num = eval_numerical_gradient_array(lambda x: conv_relu_forward(x, w, b, conv_param)[0], x, dout)
dw_num = eval_numerical_gradient_array(lambda w: conv_relu_forward(x, w, b, conv_param)[0], w, dout)
db_num = eval_numerical_gradient_array(lambda b: conv_relu_forward(x, w, b, conv_param)[0], b, dout)

print 'Testing conv_relu:'
print 'dx error: ', rel_error(dx_num, dx)
print 'dw error: ', rel_error(dw_num, dw)
print 'db error: ', rel_error(db_num, db)


Testing conv_relu:
dx error:  2.02241062283e-08
dw error:  1.83198106362e-09
db error:  1.84045082078e-11

Three-layer ConvNet

Now that you have implemented all the necessary layers, we can put them together into a simple convolutional network.
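Conceptually, the loss function of this network chains a conv-relu-pool sandwich, an affine-relu sandwich, and a final affine layer, followed by a softmax loss plus L2 regularization on the weights. The sketch below is only an illustrative guide, assuming the affine_*, affine_relu_*, softmax_loss, and conv_relu_pool_* helpers from earlier parts of the assignment; the real implementation belongs in cs231n/classifiers/cnn.py.

def three_layer_convnet_loss_sketch(X, y, params, reg, conv_param, pool_param):
  W1, b1 = params['W1'], params['b1']
  W2, b2 = params['W2'], params['b2']
  W3, b3 = params['W3'], params['b3']

  # Forward pass: conv - relu - 2x2 max pool - affine - relu - affine
  out1, cache1 = conv_relu_pool_forward(X, W1, b1, conv_param, pool_param)
  out2, cache2 = affine_relu_forward(out1, W2, b2)
  scores, cache3 = affine_forward(out2, W3, b3)

  # Softmax loss plus L2 regularization
  loss, dscores = softmax_loss(scores, y)
  loss += 0.5 * reg * (np.sum(W1 * W1) + np.sum(W2 * W2) + np.sum(W3 * W3))

  # Backward pass, unwinding the layers in reverse order
  grads = {}
  dout2, grads['W3'], grads['b3'] = affine_backward(dscores, cache3)
  dout1, grads['W2'], grads['b2'] = affine_relu_backward(dout2, cache2)
  _, grads['W1'], grads['b1'] = conv_relu_pool_backward(dout1, cache1)
  for k in ('W1', 'W2', 'W3'):
    grads[k] += reg * params[k]      # regularization gradient
  return loss, grads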

Open the file cs231n/cnn.py and complete the implementation of the ThreeLayerConvNet class. Run the following cells to help you debug:

Sanity check loss

After you build a new network, one of the first things you should do is sanity check the loss. When we use the softmax loss, we expect the loss for random weights (and no regularization) to be about log(C) for C classes; for CIFAR-10 with C = 10 that is log(10) ≈ 2.3. When we add regularization this should go up.


In [8]:
model = ThreeLayerConvNet()

N = 50
X = np.random.randn(N, 3, 32, 32)
y = np.random.randint(10, size=N)

loss, grads = model.loss(X, y)
print 'Initial loss (no regularization): ', loss

model.reg = 0.5
loss, grads = model.loss(X, y)
print 'Initial loss (with regularization): ', loss


(50, 1)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-8-eee75a41560a> in <module>()
      5 y = np.random.randint(10, size=N)
      6 
----> 7 loss, grads = model.loss(X, y)
      8 print 'Initial loss (no regularization): ', loss
      9 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/classifiers/cnn.pyc in loss(self, X, y)
    111     # for self.params[k]. Don't forget to add L2 regularization!               #
    112     ############################################################################
--> 113     loss,dout=softmax_loss_temp(scores,y)
    114 
    115     loss += 0.5*self.reg*np.sum(W1*W1)+0.5*self.reg*np.sum(W2*W2)+0.5*self.reg*np.sum(W3*W3)

/home/zengliang/winter1516_assignment2/assignment2/cs231n/layers.py in softmax_loss_temp(x, y, alpha, belta)
    769   print x_temp.shape
    770   for i in range(c):
--> 771       x_temp[:,i] = x_temp[:,i]+0.5*belta*np.sum(x[:,c*i+1:c*(i+1)],axis=1)
    772   print x_temp.shape
    773   probs = np.exp(x_temp - np.max(x_temp, axis=1, keepdims=True))

IndexError: index 1 is out of bounds for axis 1 with size 1

Gradient check

After the loss looks reasonable, use numeric gradient checking to make sure that your backward pass is correct. When you use numeric gradient checking you should use a small amount of artificial data and a small number of neurons at each layer.


In [126]:
num_inputs = 2
input_dim = (3, 16, 16)
reg = 0.0
num_classes = 10
X = np.random.randn(num_inputs, *input_dim)
y = np.random.randint(num_classes, size=num_inputs)

model = ThreeLayerConvNet(num_filters=3, filter_size=3,
                          input_dim=input_dim, hidden_dim=7,
                          dtype=np.float64)
loss, grads = model.loss(X, y)
for param_name in sorted(grads):
    f = lambda _: model.loss(X, y)[0]
    param_grad_num = eval_numerical_gradient(f, model.params[param_name], verbose=False, h=1e-6)
    e = rel_error(param_grad_num, grads[param_name])
    print '%s max relative error: %e' % (param_name, rel_error(param_grad_num, grads[param_name]))


W1 max relative error: 1.000000e+00
W2 max relative error: 5.838236e-01
W3 max relative error: 4.950137e-01
b1 max relative error: 6.060299e-01
b2 max relative error: 5.800478e-01
b3 max relative error: 4.492255e-01

Overfit small data

A nice trick is to train your model with just a few training samples. You should be able to overfit small datasets, which will result in very high training accuracy and comparatively low validation accuracy.


In [66]:
num_train = 100
small_data = {
  'X_train': data['X_train'][:num_train],
  'y_train': data['y_train'][:num_train],
  'X_val': data['X_val'],
  'y_val': data['y_val'],
}

model = ThreeLayerConvNet(weight_scale=1e-2)

solver = Solver(model, small_data,
                num_epochs=15, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=1)
solver.train()


(Iteration 1 / 30) loss: 2.365631
(Epoch 0 / 15) train acc: 0.210000; val_acc: 0.111000
(Iteration 2 / 30) loss: 3.408930
(Epoch 1 / 15) train acc: 0.230000; val_acc: 0.091000
(Iteration 3 / 30) loss: 2.568033
(Iteration 4 / 30) loss: 2.898103
(Epoch 2 / 15) train acc: 0.230000; val_acc: 0.178000
(Iteration 5 / 30) loss: 2.265252
(Iteration 6 / 30) loss: 1.817820
(Epoch 3 / 15) train acc: 0.470000; val_acc: 0.136000
(Iteration 7 / 30) loss: 1.647687
(Iteration 8 / 30) loss: 1.776858
(Epoch 4 / 15) train acc: 0.550000; val_acc: 0.152000
(Iteration 9 / 30) loss: 1.828471
(Iteration 10 / 30) loss: 1.220002
(Epoch 5 / 15) train acc: 0.640000; val_acc: 0.202000
(Iteration 11 / 30) loss: 1.244590
(Iteration 12 / 30) loss: 1.126374
(Epoch 6 / 15) train acc: 0.740000; val_acc: 0.218000
(Iteration 13 / 30) loss: 1.007658
(Iteration 14 / 30) loss: 0.861593
(Epoch 7 / 15) train acc: 0.690000; val_acc: 0.202000
(Iteration 15 / 30) loss: 1.003764
(Iteration 16 / 30) loss: 0.729228
(Epoch 8 / 15) train acc: 0.850000; val_acc: 0.193000
(Iteration 17 / 30) loss: 0.570097
(Iteration 18 / 30) loss: 0.705321
(Epoch 9 / 15) train acc: 0.810000; val_acc: 0.167000
(Iteration 19 / 30) loss: 0.752162
(Iteration 20 / 30) loss: 0.663235
(Epoch 10 / 15) train acc: 0.890000; val_acc: 0.198000
(Iteration 21 / 30) loss: 0.330915
(Iteration 22 / 30) loss: 0.438326
(Epoch 11 / 15) train acc: 0.810000; val_acc: 0.203000
(Iteration 23 / 30) loss: 0.749956
(Iteration 24 / 30) loss: 0.629249
(Epoch 12 / 15) train acc: 0.930000; val_acc: 0.241000
(Iteration 25 / 30) loss: 0.221407
(Iteration 26 / 30) loss: 0.200581
(Epoch 13 / 15) train acc: 0.890000; val_acc: 0.194000
(Iteration 27 / 30) loss: 0.346046
(Iteration 28 / 30) loss: 0.321695
(Epoch 14 / 15) train acc: 0.880000; val_acc: 0.208000
(Iteration 29 / 30) loss: 0.384454
(Iteration 30 / 30) loss: 0.275492
(Epoch 15 / 15) train acc: 0.940000; val_acc: 0.193000

Plotting the loss, training accuracy, and validation accuracy should show clear overfitting:


In [67]:
plt.subplot(2, 1, 1)
plt.plot(solver.loss_history, 'o')
plt.xlabel('iteration')
plt.ylabel('loss')

plt.subplot(2, 1, 2)
plt.plot(solver.train_acc_history, '-o')
plt.plot(solver.val_acc_history, '-o')
plt.legend(['train', 'val'], loc='upper left')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.show()


Train the net

By training the three-layer convolutional network for one epoch, you should achieve greater than 40% accuracy on the training set:


In [68]:
model = ThreeLayerConvNet(weight_scale=0.001, hidden_dim=500, reg=0.001)

solver = Solver(model, data,
                num_epochs=1, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=20)
solver.train()


(Iteration 1 / 980) loss: 2.304574
(Epoch 0 / 1) train acc: 0.101000; val_acc: 0.119000
(Iteration 21 / 980) loss: 1.962605
(Iteration 41 / 980) loss: 1.940497
(Iteration 61 / 980) loss: 1.937721
(Iteration 81 / 980) loss: 1.967191
(Iteration 101 / 980) loss: 2.037985
(Iteration 121 / 980) loss: 1.850920
(Iteration 141 / 980) loss: 1.873033
(Iteration 161 / 980) loss: 1.782839
(Iteration 181 / 980) loss: 1.727118
(Iteration 201 / 980) loss: 1.671961
(Iteration 221 / 980) loss: 1.679592
(Iteration 241 / 980) loss: 1.823214
(Iteration 261 / 980) loss: 2.030962
(Iteration 281 / 980) loss: 1.680479
(Iteration 301 / 980) loss: 2.102454
(Iteration 321 / 980) loss: 1.599381
(Iteration 341 / 980) loss: 1.850792
(Iteration 361 / 980) loss: 1.717346
(Iteration 381 / 980) loss: 1.626259
(Iteration 401 / 980) loss: 1.722622
(Iteration 421 / 980) loss: 1.662781
(Iteration 441 / 980) loss: 1.953107
(Iteration 461 / 980) loss: 1.320160
(Iteration 481 / 980) loss: 1.522696
(Iteration 501 / 980) loss: 1.410285
(Iteration 521 / 980) loss: 1.625847
(Iteration 541 / 980) loss: 1.488752
(Iteration 561 / 980) loss: 1.817768
(Iteration 581 / 980) loss: 1.888829
(Iteration 601 / 980) loss: 1.666338
(Iteration 621 / 980) loss: 1.824948
(Iteration 641 / 980) loss: 1.871745
(Iteration 661 / 980) loss: 1.781277
(Iteration 681 / 980) loss: 1.915411
(Iteration 701 / 980) loss: 1.296078
(Iteration 721 / 980) loss: 1.562524
(Iteration 741 / 980) loss: 1.517117
(Iteration 761 / 980) loss: 1.457624
(Iteration 781 / 980) loss: 1.582438
(Iteration 801 / 980) loss: 1.605301
(Iteration 821 / 980) loss: 1.498177
(Iteration 841 / 980) loss: 1.542092
(Iteration 861 / 980) loss: 1.420068
(Iteration 881 / 980) loss: 1.598874
(Iteration 901 / 980) loss: 1.582207
(Iteration 921 / 980) loss: 1.584232
(Iteration 941 / 980) loss: 1.711081
(Iteration 961 / 980) loss: 1.515051
(Epoch 1 / 1) train acc: 0.488000; val_acc: 0.498000

Visualize Filters

You can visualize the first-layer convolutional filters from the trained network by running the following:


In [69]:
from cs231n.vis_utils import visualize_grid

grid = visualize_grid(model.params['W1'].transpose(0, 2, 3, 1))
plt.imshow(grid.astype('uint8'))
plt.axis('off')
plt.gcf().set_size_inches(5, 5)
plt.show()


Spatial Batch Normalization

We already saw that batch normalization is a very useful technique for training deep fully-connected networks. Batch normalization can also be used for convolutional networks, but we need to tweak it a bit; the modification will be called "spatial batch normalization."

Normally batch-normalization accepts inputs of shape (N, D) and produces outputs of shape (N, D), where we normalize across the minibatch dimension N. For data coming from convolutional layers, batch normalization needs to accept inputs of shape (N, C, H, W) and produce outputs of shape (N, C, H, W) where the N dimension gives the minibatch size and the (H, W) dimensions give the spatial size of the feature map.

If the feature map was produced using convolutions, then we expect the statistics of each feature channel to be relatively consistent both between different images and different locations within the same image. Therefore spatial batch normalization computes a mean and variance for each of the C feature channels by computing statistics over both the minibatch dimension N and the spatial dimensions H and W.
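One convenient way to implement this is to fold the spatial dimensions into the batch dimension and reuse the vanilla batch normalization functions you wrote earlier. The sketch below is only a guide and assumes working batchnorm_forward and batchnorm_backward implementations in cs231n/layers.py.

def spatial_batchnorm_forward_sketch(x, gamma, beta, bn_param):
  # Treat every (n, h, w) position as one sample of a C-dimensional feature,
  # then reuse the vanilla batch normalization from the previous notebook.
  N, C, H, W = x.shape
  x_flat = x.transpose(0, 2, 3, 1).reshape(-1, C)           # (N*H*W, C)
  out_flat, cache = batchnorm_forward(x_flat, gamma, beta, bn_param)
  out = out_flat.reshape(N, H, W, C).transpose(0, 3, 1, 2)  # back to (N, C, H, W)
  return out, cache

def spatial_batchnorm_backward_sketch(dout, cache):
  N, C, H, W = dout.shape
  dout_flat = dout.transpose(0, 2, 3, 1).reshape(-1, C)
  dx_flat, dgamma, dbeta = batchnorm_backward(dout_flat, cache)
  dx = dx_flat.reshape(N, H, W, C).transpose(0, 3, 1, 2)
  return dx, dgamma, dbeta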

Spatial batch normalization: forward

In the file cs231n/layers.py, implement the forward pass for spatial batch normalization in the function spatial_batchnorm_forward. Check your implementation by running the following:


In [71]:
# Check the training-time forward pass by checking means and variances
# of features both before and after spatial batch normalization

N, C, H, W = 2, 3, 4, 5
x = 4 * np.random.randn(N, C, H, W) + 10

print 'Before spatial batch normalization:'
print '  Shape: ', x.shape
print '  Means: ', x.mean(axis=(0, 2, 3))
print '  Stds: ', x.std(axis=(0, 2, 3))

# Means should be close to zero and stds close to one
gamma, beta = np.ones(C), np.zeros(C)
bn_param = {'mode': 'train'}
out, _ = spatial_batchnorm_forward(x, gamma, beta, bn_param)
print 'After spatial batch normalization:'
print '  Shape: ', out.shape
print '  Means: ', out.mean(axis=(0, 2, 3))
print '  Stds: ', out.std(axis=(0, 2, 3))

# Means should be close to beta and stds close to gamma
gamma, beta = np.asarray([3, 4, 5]), np.asarray([6, 7, 8])
out, _ = spatial_batchnorm_forward(x, gamma, beta, bn_param)
print 'After spatial batch normalization (nontrivial gamma, beta):'
print '  Shape: ', out.shape
print '  Means: ', out.mean(axis=(0, 2, 3))
print '  Stds: ', out.std(axis=(0, 2, 3))


Before spatial batch normalization:
  Shape:  (2, 3, 4, 5)
  Means:  [  9.48901739  10.33215284   9.18778842]
  Stds:  [ 4.4931751   4.0332237   3.75811403]
After spatial batch normalization:
  Shape:  (2, 3, 4, 5)
  Means:  [ -8.32667268e-17   2.94209102e-16  -5.46784840e-16]
  Stds:  [ 0.99999975  0.99999969  0.99999965]
After spatial batch normalization (nontrivial gamma, beta):
  Shape:  (2, 3, 4, 5)
  Means:  [ 6.  7.  8.]
  Stds:  [ 2.99999926  3.99999877  4.99999823]

In [72]:
# Check the test-time forward pass by running the training-time
# forward pass many times to warm up the running averages, and then
# checking the means and variances of activations after a test-time
# forward pass.

N, C, H, W = 10, 4, 11, 12

bn_param = {'mode': 'train'}
gamma = np.ones(C)
beta = np.zeros(C)
for t in xrange(50):
  x = 2.3 * np.random.randn(N, C, H, W) + 13
  spatial_batchnorm_forward(x, gamma, beta, bn_param)
bn_param['mode'] = 'test'
x = 2.3 * np.random.randn(N, C, H, W) + 13
a_norm, _ = spatial_batchnorm_forward(x, gamma, beta, bn_param)

# Means should be close to zero and stds close to one, but will be
# noisier than training-time forward passes.
print 'After spatial batch normalization (test-time):'
print '  means: ', a_norm.mean(axis=(0, 2, 3))
print '  stds: ', a_norm.std(axis=(0, 2, 3))


After spatial batch normalization (test-time):
  means:  [ 0.04444914  0.02411446  0.02738975  0.03575366]
  stds:  [ 1.01213143  0.99598667  1.0235327   1.0025881 ]

Spatial batch normalization: backward

In the file cs231n/layers.py, implement the backward pass for spatial batch normalization in the function spatial_batchnorm_backward. Run the following to check your implementation using a numeric gradient check:


In [75]:
N, C, H, W = 2, 3, 4, 5
x = 5 * np.random.randn(N, C, H, W) + 12
gamma = np.random.randn(C)
beta = np.random.randn(C)
dout = np.random.randn(N, C, H, W)

bn_param = {'mode': 'train'}
fx = lambda x: spatial_batchnorm_forward(x, gamma, beta, bn_param)[0]
fg = lambda a: spatial_batchnorm_forward(x, gamma, beta, bn_param)[0]
fb = lambda b: spatial_batchnorm_forward(x, gamma, beta, bn_param)[0]

dx_num = eval_numerical_gradient_array(fx, x, dout)
da_num = eval_numerical_gradient_array(fg, gamma, dout)
db_num = eval_numerical_gradient_array(fb, beta, dout)

_, cache = spatial_batchnorm_forward(x, gamma, beta, bn_param)
dx, dgamma, dbeta = spatial_batchnorm_backward(dout, cache)
print 'dx error: ', rel_error(dx_num, dx)
print 'dgamma error: ', rel_error(da_num, dgamma)
print 'dbeta error: ', rel_error(db_num, dbeta)


dx error:  2.45314899605e-09
dgamma error:  8.04703257157e-12
dbeta error:  1.80437500938e-11

Experiment!

Experiment and try to get the best performance that you can on CIFAR-10 using a ConvNet. Here are some ideas to get you started:

Things you should try:

  • Filter size: Above we used 7x7; this makes pretty pictures but smaller filters may be more efficient
  • Number of filters: Above we used 32 filters. Do more or fewer do better?
  • Batch normalization: Try adding spatial batch normalization after convolution layers and vanilla batch normalization after affine layers. Do your networks train faster? (A minimal sandwich sketch appears after this list.)
  • Network architecture: The network above has three layers of trainable parameters. Can you do better with a deeper network? You can implement alternative architectures in the file cs231n/classifiers/convnet.py. Some good architectures to try include:
    • [conv-relu-pool]xN - conv - relu - [affine]xM - [softmax or SVM]
    • [conv-relu-pool]xN - [affine]xM - [softmax or SVM]
    • [conv-relu-conv-relu-pool]xN - [affine]xM - [softmax or SVM]
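For instance, the batch normalization idea above can be prototyped as one more sandwich layer that inserts spatial batch normalization between the convolution and the ReLU. The helper names below are illustrative only (the experiments later in this notebook use a similar conv_batchnorm_relu_pool_forward helper):

def conv_bn_relu_pool_forward(x, w, b, gamma, beta, conv_param, pool_param, bn_param):
  a, conv_cache = conv_forward_fast(x, w, b, conv_param)
  an, bn_cache = spatial_batchnorm_forward(a, gamma, beta, bn_param)
  r, relu_cache = relu_forward(an)
  out, pool_cache = max_pool_forward_fast(r, pool_param)
  return out, (conv_cache, bn_cache, relu_cache, pool_cache)

def conv_bn_relu_pool_backward(dout, cache):
  conv_cache, bn_cache, relu_cache, pool_cache = cache
  dr = max_pool_backward_fast(dout, pool_cache)
  dan = relu_backward(dr, relu_cache)
  da, dgamma, dbeta = spatial_batchnorm_backward(dan, bn_cache)
  dx, dw, db = conv_backward_fast(da, conv_cache)
  return dx, dw, db, dgamma, dbeta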

Tips for training

For each network architecture that you try, you should tune the learning rate and regularization strength. When doing this there are a couple important things to keep in mind:

  • If the parameters are working well, you should see improvement within a few hundred iterations
  • Remember the coarse-to-fine approach for hyperparameter tuning: start by testing a large range of hyperparameters for just a few training iterations to find the combinations of parameters that are working at all (see the sketch after this list).
  • Once you have found some sets of parameters that seem to work, search more finely around these parameters. You may need to train for more epochs.
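One simple way to run the coarse stage is a small random search over log-spaced learning rates and regularization strengths, each trained very briefly. The sketch below is only illustrative and uses ThreeLayerConvNet as a stand-in for whatever model you are actually tuning:

# Coarse random search sketch; swap in your own model class and ranges.
best_val, best_model = -1, None
for _ in xrange(10):
  lr = 10 ** np.random.uniform(-4, -2)      # sample learning rate on a log scale
  reg = 10 ** np.random.uniform(-5, -2)     # sample regularization strength on a log scale
  model = ThreeLayerConvNet(weight_scale=1e-2, reg=reg)
  solver = Solver(model, data,
                  num_epochs=1, batch_size=50,
                  update_rule='adam',
                  optim_config={'learning_rate': lr},
                  verbose=False)
  solver.train()
  val_acc = max(solver.val_acc_history)
  print 'lr %.2e reg %.2e -> val_acc %.3f' % (lr, reg, val_acc)
  if val_acc > best_val:
    best_val, best_model = val_acc, model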

Going above and beyond

If you are feeling adventurous there are many other features you can implement to try and improve your performance. You are not required to implement any of these; however they would be good things to try for extra credit.

  • Alternative update steps: For the assignment we implemented SGD+momentum, RMSprop, and Adam; you could try alternatives like AdaGrad or AdaDelta.
  • Alternative activation functions such as leaky ReLU, parametric ReLU, or MaxOut.
  • Model ensembles
  • Data augmentation

If you do decide to implement something extra, clearly describe it in the "Extra Credit Description" cell below.

What we expect

At the very least, you should be able to train a ConvNet that gets at least 65% accuracy on the validation set. This is just a lower bound - if you are careful it should be possible to get accuracies much higher than that! Extra credit points will be awarded for particularly high-scoring models or unique approaches.

You should use the space below to experiment and train your network. The final cell in this notebook should contain the training, validation, and test set accuracies for your final trained network. In this notebook you should also write an explanation of what you did, any additional features that you implemented, and any visualizations or graphs that you make in the process of training and evaluating your network.

Have fun and happy training!


In [9]:
# Train a really good model on CIFAR-10
from cs231n.classifiers.mycnn import ConvNet
model = ConvNet([100,100],[64,64,128,128,256,256])

N = 50
X = np.random.randn(N, 3, 32, 32)
y = np.random.randint(10, size=N)

loss, grads = model.loss(X, y)
print 'Initial loss (no regularization): ', loss

model.reg = 0.5
loss, grads = model.loss(X, y)
print 'Initial loss (with regularization): ', loss


Initial loss (no regularization):  2.52023956299
Initial loss (with regularization):  2958.5117099

In [10]:
N = 50
X = np.random.randn(N, 3, 32, 32)
y = np.random.randint(10, size=N)

model = ConvNet([100,100],[64,64,128,128,256,256])
loss, grads = model.loss(X, y)
for param_name in sorted(grads):
    f = lambda _: model.loss(X, y)[0]
    param_grad_num = eval_numerical_gradient(f, model.params[param_name], verbose=False, h=1e-6)
    e = rel_error(param_grad_num, grads[param_name])
    print '%s max relative error: %e' % (param_name, rel_error(param_grad_num, grads[param_name]))


---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-10-5d9d9d3407ea> in <module>()
      7 for param_name in sorted(grads):
      8     f = lambda _: model.loss(X, y)[0]
----> 9     param_grad_num = eval_numerical_gradient(f, model.params[param_name], verbose=False, h=1e-6)
     10     e = rel_error(param_grad_num, grads[param_name])
     11     print '%s max relative error: %e' % (param_name, rel_error(param_grad_num, grads[param_name]))

/home/zengliang/winter1516_assignment2/assignment2/cs231n/gradient_check.pyc in eval_numerical_gradient(f, x, verbose, h)
     19     oldval = x[ix]
     20     x[ix] = oldval + h # increment by h
---> 21     fxph = f(x) # evalute f(x + h)
     22     x[ix] = oldval - h
     23     fxmh = f(x) # evaluate f(x - h)

<ipython-input-10-5d9d9d3407ea> in <lambda>(_)
      6 loss, grads = model.loss(X, y)
      7 for param_name in sorted(grads):
----> 8     f = lambda _: model.loss(X, y)[0]
      9     param_grad_num = eval_numerical_gradient(f, model.params[param_name], verbose=False, h=1e-6)
     10     e = rel_error(param_grad_num, grads[param_name])

/home/zengliang/winter1516_assignment2/assignment2/cs231n/classifiers/mycnn.py in loss(self, X, y)
     94       score_temp=X
     95       for i in xrange(self.conv_num):
---> 96           out,cache['layer'+str(i+1)]=conv_batchnorm_relu_pool_forward(score_temp,self.params['W'+str(i+1)],self.params['b'+str(i+1)],self.params['gamma'+str(i+1)],self.params['beta'+str(i+1)],conv_param,pool_param,self.bn_params[i])
     97           score_temp=out
     98 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/classifiers/mycnn.py in conv_batchnorm_relu_pool_forward(x, w, b, gamma, beta, conv_param, pool_param, bn_param)
    161 
    162 def conv_batchnorm_relu_pool_forward(x, w, b,gamma, beta, conv_param, pool_param,bn_param):
--> 163     a,conv_cache = conv_forward_fast(x,w,b,conv_param)
    164     a2,bn_cache=spatial_batchnorm_forward(a,gamma,beta,bn_param)
    165     a3,relu_cache=relu_forward(a2)

/home/zengliang/winter1516_assignment2/assignment2/cs231n/fast_layers.pyc in conv_forward_strides(x, w, b, conv_param)
     69 
     70   # Now all our convolutions are a big matrix multiply
---> 71   res = w.reshape(F, -1).dot(x_cols) + b.reshape(-1, 1)
     72 
     73   # Reshape the output

KeyboardInterrupt: 

In [23]:
from cs231n.classifiers.cnn import ThreeLayerConvNet
num_train = 100
small_data = {
  'X_train': data['X_train'][:num_train],
  'y_train': data['y_train'][:num_train],
  'X_val': data['X_val'],
  'y_val': data['y_val'],
}

model = ThreeLayerConvNet()

solver = Solver(model, small_data,
                num_epochs=15, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=1)
solver.train()


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-b131f967610c> in <module>()
     17                 },
     18                 verbose=True, print_every=1)
---> 19 solver.train()

/home/zengliang/winter1516_assignment2/assignment2/cs231n/solver.pyc in train(self)
    226 
    227     for t in xrange(num_iterations):
--> 228       self._step()
    229 
    230       # Maybe print training loss

/home/zengliang/winter1516_assignment2/assignment2/cs231n/solver.pyc in _step(self)
    163 
    164     # Compute loss and gradient
--> 165     loss, grads = self.model.loss(X_batch, y_batch)
    166     self.loss_history.append(loss)
    167 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/classifiers/cnn.py in loss(self, X, y)
     94     # variable.                                                                #
     95     ############################################################################
---> 96     out1, cache_conv_relu_pool=conv_relu_pool_forward(X, W1, b1, conv_param, pool_param)
     97     out2,cache_affine_relu=affine_relu_forward(out1,W2,b2)
     98     scores,cache_affine=affine_forward(out2,W3,b3)

/home/zengliang/winter1516_assignment2/assignment2/cs231n/layer_utils.pyc in conv_relu_pool_forward(x, w, b, conv_param, pool_param)
     87   - cache: Object to give to the backward pass
     88   """
---> 89   a, conv_cache = conv_forward_fast(x, w, b, conv_param)
     90   s, relu_cache = relu_forward(a)
     91   out, pool_cache = max_pool_forward_fast(s, pool_param)

/home/zengliang/winter1516_assignment2/assignment2/cs231n/fast_layers.pyc in conv_forward_strides(x, w, b, conv_param)
     69 
     70   # Now all our convolutions are a big matrix multiply
---> 71   res = w.reshape(F, -1).dot(x_cols) + b.reshape(-1, 1)
     72 
     73   # Reshape the output

ValueError: shapes (32,147) and (49,115200) not aligned: 147 (dim 1) != 49 (dim 0)

In [12]:
plt.subplot(2, 1, 1)
plt.plot(solver.loss_history, 'o')
plt.xlabel('iteration')
plt.ylabel('loss')

plt.subplot(2, 1, 2)
plt.plot(solver.train_acc_history, '-o')
plt.plot(solver.val_acc_history, '-o')
plt.legend(['train', 'val'], loc='upper left')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.show()



In [3]:
from cs231n.classifiers.mycnn import ConvNet
model = ConvNet([100,100],[32,32,64,64,128,128])

solver = Solver(model, data,
                num_epochs=5, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()


(Iteration 1 / 4900) loss: 2.423632
(Epoch 0 / 5) train acc: 0.098000; val_acc: 0.109000
(Iteration 501 / 4900) loss: 1.509718
(Epoch 1 / 5) train acc: 0.614000; val_acc: 0.616000
(Iteration 1001 / 4900) loss: 1.086694
(Iteration 1501 / 4900) loss: 1.015397
(Epoch 2 / 5) train acc: 0.670000; val_acc: 0.677000
(Iteration 2001 / 4900) loss: 1.333757
(Iteration 2501 / 4900) loss: 0.844296
(Epoch 3 / 5) train acc: 0.730000; val_acc: 0.717000
(Iteration 3001 / 4900) loss: 0.728297
(Iteration 3501 / 4900) loss: 0.530599
(Epoch 4 / 5) train acc: 0.746000; val_acc: 0.722000
(Iteration 4001 / 4900) loss: 0.812512
(Iteration 4501 / 4900) loss: 0.614442
(Epoch 5 / 5) train acc: 0.782000; val_acc: 0.744000

In [4]:
from cs231n.classifiers.mycnn import ConvNet
model = ConvNet([100,100],[64,64,128,128,256,256])

solver = Solver(model, data,
                num_epochs=5, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()


(Iteration 1 / 4900) loss: 2.604150
(Epoch 0 / 5) train acc: 0.116000; val_acc: 0.129000
(Iteration 501 / 4900) loss: 1.254849
(Epoch 1 / 5) train acc: 0.690000; val_acc: 0.661000
(Iteration 1001 / 4900) loss: 1.064097
(Iteration 1501 / 4900) loss: 0.946639
(Epoch 2 / 5) train acc: 0.734000; val_acc: 0.707000
(Iteration 2001 / 4900) loss: 0.850200
(Iteration 2501 / 4900) loss: 0.730295
(Epoch 3 / 5) train acc: 0.762000; val_acc: 0.724000
(Iteration 3001 / 4900) loss: 1.038651
(Iteration 3501 / 4900) loss: 0.704184
(Epoch 4 / 5) train acc: 0.802000; val_acc: 0.744000
(Iteration 4001 / 4900) loss: 0.713150
(Iteration 4501 / 4900) loss: 0.532252
(Epoch 5 / 5) train acc: 0.849000; val_acc: 0.757000

In [5]:
from cs231n.classifiers.mycnn import ConvNet
model = ConvNet([200,200],[32,32,64,64,128,128,])

solver = Solver(model, data,
                num_epochs=5, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()


(Iteration 1 / 4900) loss: 2.411086
(Epoch 0 / 5) train acc: 0.126000; val_acc: 0.118000
(Iteration 501 / 4900) loss: 1.235233
(Epoch 1 / 5) train acc: 0.624000; val_acc: 0.606000
(Iteration 1001 / 4900) loss: 1.240939
(Iteration 1501 / 4900) loss: 0.984925
(Epoch 2 / 5) train acc: 0.710000; val_acc: 0.662000
(Iteration 2001 / 4900) loss: 0.862167
(Iteration 2501 / 4900) loss: 1.131189
(Epoch 3 / 5) train acc: 0.733000; val_acc: 0.706000
(Iteration 3001 / 4900) loss: 0.722889
(Iteration 3501 / 4900) loss: 1.192593
(Epoch 4 / 5) train acc: 0.787000; val_acc: 0.719000
(Iteration 4001 / 4900) loss: 0.945739
(Iteration 4501 / 4900) loss: 0.834625
(Epoch 5 / 5) train acc: 0.790000; val_acc: 0.722000

In [6]:
from cs231n.classifiers.mycnn import ConvNet
model = ConvNet([200,200],[64,64,128,128,256,256])

solver = Solver(model, data,
                num_epochs=5, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()


(Iteration 1 / 4900) loss: 2.583209
(Epoch 0 / 5) train acc: 0.094000; val_acc: 0.089000
(Iteration 501 / 4900) loss: 1.070304
(Epoch 1 / 5) train acc: 0.656000; val_acc: 0.667000
(Iteration 1001 / 4900) loss: 1.045767
(Iteration 1501 / 4900) loss: 1.192539
(Epoch 2 / 5) train acc: 0.763000; val_acc: 0.704000
(Iteration 2001 / 4900) loss: 0.882805
(Iteration 2501 / 4900) loss: 0.894327
(Epoch 3 / 5) train acc: 0.763000; val_acc: 0.703000
(Iteration 3001 / 4900) loss: 0.702577
(Iteration 3501 / 4900) loss: 0.900516
(Epoch 4 / 5) train acc: 0.816000; val_acc: 0.740000
(Iteration 4001 / 4900) loss: 0.598736
(Iteration 4501 / 4900) loss: 0.492377
(Epoch 5 / 5) train acc: 0.841000; val_acc: 0.752000

In [5]:
from cs231n.classifiers.mycnn1 import ConvNet1
model = ConvNet1([100,100],[64,64,128,128,256,256],dropout=0.75)

solver = Solver(model, data,
                num_epochs=5, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()


(Iteration 1 / 4900) loss: 2.549612
(Epoch 0 / 5) train acc: 0.092000; val_acc: 0.113000
(Iteration 501 / 4900) loss: 1.260734
(Epoch 1 / 5) train acc: 0.583000; val_acc: 0.611000
(Iteration 1001 / 4900) loss: 1.209661
(Iteration 1501 / 4900) loss: 1.112903
(Epoch 2 / 5) train acc: 0.741000; val_acc: 0.699000
(Iteration 2001 / 4900) loss: 1.090693
(Iteration 2501 / 4900) loss: 0.770919
(Epoch 3 / 5) train acc: 0.764000; val_acc: 0.728000
(Iteration 3001 / 4900) loss: 0.712355
(Iteration 3501 / 4900) loss: 0.717223
(Epoch 4 / 5) train acc: 0.816000; val_acc: 0.730000
(Iteration 4001 / 4900) loss: 0.833548
(Iteration 4501 / 4900) loss: 0.677000
(Epoch 5 / 5) train acc: 0.840000; val_acc: 0.744000

In [4]:
from cs231n.features import color_histogram_hsv, hog_feature
from cs231n.features import *
print data['X_train'].shape
X_train=data['X_train'].transpose(0,2, 3, 1).copy()
X_val=data['X_val'].transpose(0,2, 3, 1).copy()
X_test=data['X_test'].transpose(0,2, 3, 1).copy()
print X_train.shape

num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)

# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat

# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat

# Preprocessing: Add a bias dimension
#X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
#X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
#X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])
print X_train_feats.shape

data['X_train']=X_train_feats.transpose(0, 3, 1,2).copy()
data['X_val']=X_val_feats.transpose(0,3,1,2).copy()
data['X_test']=X_test_feats.transpose(0,3,1,2).copy()


(49000, 3, 32, 32)
(49000, 32, 32, 3)
Done extracting features for 1000 / 49000 images
Done extracting features for 2000 / 49000 images
Done extracting features for 3000 / 49000 images
Done extracting features for 4000 / 49000 images
Done extracting features for 5000 / 49000 images
Done extracting features for 6000 / 49000 images
Done extracting features for 7000 / 49000 images
Done extracting features for 8000 / 49000 images
Done extracting features for 9000 / 49000 images
Done extracting features for 10000 / 49000 images
Done extracting features for 11000 / 49000 images
Done extracting features for 12000 / 49000 images
Done extracting features for 13000 / 49000 images
Done extracting features for 14000 / 49000 images
Done extracting features for 15000 / 49000 images
Done extracting features for 16000 / 49000 images
Done extracting features for 17000 / 49000 images
Done extracting features for 18000 / 49000 images
Done extracting features for 19000 / 49000 images
Done extracting features for 20000 / 49000 images
Done extracting features for 21000 / 49000 images
Done extracting features for 22000 / 49000 images
Done extracting features for 23000 / 49000 images
Done extracting features for 24000 / 49000 images
Done extracting features for 25000 / 49000 images
Done extracting features for 26000 / 49000 images
Done extracting features for 27000 / 49000 images
Done extracting features for 28000 / 49000 images
Done extracting features for 29000 / 49000 images
Done extracting features for 30000 / 49000 images
Done extracting features for 31000 / 49000 images
Done extracting features for 32000 / 49000 images
Done extracting features for 33000 / 49000 images
Done extracting features for 34000 / 49000 images
Done extracting features for 35000 / 49000 images
Done extracting features for 36000 / 49000 images
Done extracting features for 37000 / 49000 images
Done extracting features for 38000 / 49000 images
Done extracting features for 39000 / 49000 images
Done extracting features for 40000 / 49000 images
Done extracting features for 41000 / 49000 images
Done extracting features for 42000 / 49000 images
Done extracting features for 43000 / 49000 images
Done extracting features for 44000 / 49000 images
Done extracting features for 45000 / 49000 images
Done extracting features for 46000 / 49000 images
Done extracting features for 47000 / 49000 images
Done extracting features for 48000 / 49000 images
(49000, 154)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-24350a1c55a6> in <module>()
     32 print X_train_feats.shape
     33 
---> 34 data['X_train']=X_train_feats.transpose(0, 3, 1,2).copy()
     35 data['X_val']=X_val_feats.transpose(0,3,1,2).copy()
     36 data['X_test']=X_test_feats.transpose(0,3,1,2).copy()

ValueError: axes don't match array

In [7]:
from cs231n.vis_utils import visualize_grid

grid = visualize_grid(model.params['W1'].transpose(0, 2, 3, 1))
plt.imshow(grid.astype('uint8'))
plt.axis('off')
plt.gcf().set_size_inches(5, 5)
plt.show()
grid = visualize_grid(model.params['W2'].transpose(0, 2, 3, 1))
plt.imshow(grid.astype('uint8'))
plt.axis('off')
plt.gcf().set_size_inches(5, 5)
plt.show()
grid = visualize_grid(model.params['W3'].transpose(0, 2, 3, 1))
plt.imshow(grid.astype('uint8'))
plt.axis('off')
plt.gcf().set_size_inches(5, 5)
plt.show()


---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-7-a4aec55e10c6> in <module>()
      1 from cs231n.vis_utils import visualize_grid
      2 
----> 3 grid = visualize_grid(model.params['W1'].transpose(0, 2, 3, 1))
      4 plt.imshow(grid.astype('uint8'))
      5 plt.axis('off')

NameError: name 'model' is not defined

In [7]:
X_train=data['X_train']
X_train=X_train.transpose(0, 2,3,1)
y_train=data['y_train']
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()



In [8]:
from cs231n.features1 import  hist
import cv2
X_train=data['X_train']
X_val=data['X_val']

#X_train=X_train.transpose(0, 3,1,2)
print data['X_train'][1,1,:,:]
#print cv2.equalizeHist(data['X_train'][1,1,:,:].astype('uint8'))
#a=(X_train[1,1,:,:].astype('uint8'))
#cv2.equalizeHist(a)
#mean_image = np.mean(X_train, axis=0)
#X_train += mean_image
#X_val += mean_image
#print X_train.shape
#print mean_image.shape
#print X_val.shape
data['X_train']=hist(X_train)
data['X_val']=hist(X_val)
print data['X_train'][1,1,:,:]
#mean_image = np.mean(X_train, axis=0)
#data['X_train'] -= mean_image
#data['X_val'] -= mean_image
#print data['X_train'][1,1,:,:]


[[ 41.01826531   1.65195918 -32.14328571 ..., -41.17532653 -45.5802449
  -54.36534694]
 [ 24.78316327  18.63691837 -10.08104082 ..., -36.03577551 -54.456
  -61.29195918]
 [ 20.41510204  12.57302041 -18.98687755 ..., -51.88912245 -63.47059184
  -64.49834694]
 ..., 
 [ 41.06446939  30.28759184  37.21993878 ..., -89.04928571 -70.77730612
  -41.8947551 ]
 [ 27.63295918  27.44293878  36.84238776 ..., -31.38504082 -10.67434694
   -4.31057143]
 [ 21.04759184  22.35055102  30.24808163 ...,   7.26855102   8.37042857
    7.13804082]]
[[  55.    2.  218. ...,  200.  192.  180.]
 [  33.   25.  245. ...,  210.  180.  169.]
 [  27.   20.  233. ...,  183.  166.  164.]
 ..., 
 [  55.   43.   50. ...,  142.  159.  200.]
 [  37.   37.   49. ...,  219.  245.  252.]
 [  28.   30.   43. ...,   13.   14.   13.]]

In [9]:
X_train=data['X_train']
X_train=X_train.transpose(0, 2,3,1)
y_train=data['y_train']
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()



In [10]:
print data['X_train'][1,1,:,:]


[[  55.    2.  218. ...,  200.  192.  180.]
 [  33.   25.  245. ...,  210.  180.  169.]
 [  27.   20.  233. ...,  183.  166.  164.]
 ..., 
 [  55.   43.   50. ...,  142.  159.  200.]
 [  37.   37.   49. ...,  219.  245.  252.]
 [  28.   30.   43. ...,   13.   14.   13.]]

In [ ]:


In [11]:
from cs231n.classifiers.mycnn import ConvNet
model = ConvNet([100,100],[64,64,128,128,256,256])

solver = Solver(model, data,
                num_epochs=5, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()


(Iteration 1 / 4900) loss: 2.560496
(Epoch 0 / 5) train acc: 0.099000; val_acc: 0.091000
(Iteration 501 / 4900) loss: 1.596029
(Epoch 1 / 5) train acc: 0.516000; val_acc: 0.491000
(Iteration 1001 / 4900) loss: 1.523716
(Iteration 1501 / 4900) loss: 1.463184
(Epoch 2 / 5) train acc: 0.614000; val_acc: 0.563000
(Iteration 2001 / 4900) loss: 1.191132
(Iteration 2501 / 4900) loss: 1.009065
(Epoch 3 / 5) train acc: 0.676000; val_acc: 0.577000
(Iteration 3001 / 4900) loss: 0.991285
(Iteration 3501 / 4900) loss: 0.992061
(Epoch 4 / 5) train acc: 0.740000; val_acc: 0.584000
(Iteration 4001 / 4900) loss: 0.796722
---------------------------------------------------------------------------
KeyboardInterrupt                         Traceback (most recent call last)
<ipython-input-11-9d74479cdcec> in <module>()
      9                 },
     10                 verbose=True, print_every=500)
---> 11 solver.train()

/home/zengliang/winter1516_assignment2/assignment2/cs231n/solver.pyc in train(self)
    226 
    227     for t in xrange(num_iterations):
--> 228       self._step()
    229 
    230       # Maybe print training loss

/home/zengliang/winter1516_assignment2/assignment2/cs231n/solver.pyc in _step(self)
    163 
    164     # Compute loss and gradient
--> 165     loss, grads = self.model.loss(X_batch, y_batch)
    166     self.loss_history.append(loss)
    167 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/classifiers/mycnn.pyc in loss(self, X, y)
    125           grads['W'+str(index)] += self.reg*grads['W'+str(index)]
    126 
--> 127       dout,grads['W'+str(self.conv_num+1)],grads['b'+str(self.conv_num+1)]=conv_relu_backward(dscore, cache['layer'+str(self.conv_num+1)])
    128       grads['W'+str(self.conv_num+1)] +=self.reg*grads['W'+str(self.conv_num+1)]
    129 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/layer_utils.pyc in conv_relu_backward(dout, cache)
     70   conv_cache, relu_cache = cache
     71   da = relu_backward(dout, relu_cache)
---> 72   dx, dw, db = conv_backward_fast(da, conv_cache)
     73   return dx, dw, db
     74 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/fast_layers.pyc in conv_backward_strides(dout, cache)
     95 
     96   dout_reshaped = dout.transpose(1, 0, 2, 3).reshape(F, -1)
---> 97   dw = dout_reshaped.dot(x_cols.T).reshape(w.shape)
     98 
     99   dx_cols = w.reshape(F, -1).T.dot(dout_reshaped)

KeyboardInterrupt: 

In [ ]:
from cs231n.features1 import  hist
import cv2
X_train=data['X_train']
X_val=data['X_val']
X_train=X_train.transpose(0,2,3,1)
X_val=X_val.transpose(0,2,3,1)

mean_image = np.mean(X_train, axis=0)
X_train += mean_image
X_val += mean_image

X_train=X_train.transpose(0,3,1,2)
X_val=X_val.transpose(0,3,1,2)
X_train=hist(X_train)
X_val=hist(X_val)

X_train=X_train.transpose(0,2,3,1)
X_val=X_val.transpose(0,2,3,1)

mean_image = np.mean(X_train, axis=0)
X_train -= mean_image
X_val -= mean_image

data['X_train'] = X_train.transpose(0, 3, 1, 2).copy()
data['X_val'] = X_val.transpose(0, 3, 1, 2).copy()

In [ ]:
from cs231n.classifiers.mycnn import ConvNet
model = ConvNet([100,100],[64,64,128,128,256,256])

solver = Solver(model, data,
                num_epochs=5, batch_size=50,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()

In [9]:
import h5py

In [11]:
f = h5py.File('/home/zengliang/x_train.hdf5','r')
x_train = f['x_train'][:]
f.close()

f = h5py.File('/home/zengliang/y_train.hdf5','r')
y_train = f['y_train'][:]
f.close()

f = h5py.File('/home/zengliang/x_val.hdf5','r')
x_val = f['x_val'][:]
f.close()

f = h5py.File('/home/zengliang/y_val.hdf5','r')
y_val = f['y_val'][:]
f.close()

In [13]:
print x_train.shape,y_train.shape


(28709, 1, 48, 48) (28709, 1)

In [12]:
data = {
  'X_train': x_train,
  'y_train': y_train,
  'X_val': x_val,
  'y_val': y_val,
}

In [17]:
from cs231n.classifiers.mycnn1 import ConvNet1
model = ConvNet1([1024,512],[32,64,64,128],input_dim=(1,48,48),num_classes=7,dropout=0.25)

solver = Solver(model, data,
                num_epochs=50, batch_size=128,
                update_rule='adam',
                optim_config={
                  'learning_rate': 1e-3,
                },
                verbose=True, print_every=500)
solver.train()


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-8912b55f5080> in <module>()
      9                 },
     10                 verbose=True, print_every=500)
---> 11 solver.train()

/home/zengliang/winter1516_assignment2/assignment2/cs231n/solver.pyc in train(self)
    226 
    227     for t in xrange(num_iterations):
--> 228       self._step()
    229 
    230       # Maybe print training loss

/home/zengliang/winter1516_assignment2/assignment2/cs231n/solver.pyc in _step(self)
    163 
    164     # Compute loss and gradient
--> 165     loss, grads = self.model.loss(X_batch, y_batch)
    166     self.loss_history.append(loss)
    167 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/classifiers/mycnn1.py in loss(self, X, y)
     98       score_temp=X
     99       for i in xrange(self.conv_num):
--> 100           out,cache['layer'+str(i+1)]=conv_batchnorm_relu_pool_forward(score_temp,self.params['W'+str(i+1)],self.params['b'+str(i+1)],self.params['gamma'+str(i+1)],self.params['beta'+str(i+1)],conv_param,pool_param,self.bn_params[i])
    101           score_temp=out
    102 

/home/zengliang/winter1516_assignment2/assignment2/cs231n/layer_utils.pyc in conv_batchnorm_relu_pool_forward(x, w, b, gamma, beta, conv_param, pool_param, bn_param)
    119 
    120 def conv_batchnorm_relu_pool_forward(x, w, b,gamma, beta, conv_param, pool_param,bn_param):
--> 121     a,conv_cache = conv_forward_fast(x,w,b,conv_param)
    122     a2,bn_cache=spatial_batchnorm_forward(a,gamma,beta,bn_param)
    123     a3,relu_cache=relu_forward(a2)

/home/zengliang/winter1516_assignment2/assignment2/cs231n/fast_layers.pyc in conv_forward_strides(x, w, b, conv_param)
     69 
     70   # Now all our convolutions are a big matrix multiply
---> 71   res = w.reshape(F, -1).dot(x_cols) + b.reshape(-1, 1)
     72 
     73   # Reshape the output

ValueError: shapes (32,27) and (9,294912) not aligned: 27 (dim 1) != 9 (dim 0)

In [ ]: