Deep learning for computer vision

got no lasagne?

Install the bleeding edge version from here: http://lasagne.readthedocs.org/en/latest/user/installation.html

Main task

This week, we shall focus on the image recognition problem on the CIFAR-10 dataset

  • 60k images of shape 3x32x32
  • 10 different classes: planes, dogs, cats, trucks, etc.

In [ ]:
import numpy as np
from cifar import load_cifar10
X_train,y_train,X_val,y_val,X_test,y_test = load_cifar10("cifar_data")

class_names = np.array(['airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck'])

print X_train.shape,y_train.shape

In [ ]:
import matplotlib.pyplot as plt
%matplotlib inline

plt.figure(figsize=[12,10])
for i in range(12):
    plt.subplot(3,4,i+1)
    plt.xlabel(class_names[y_train[i]])
    plt.imshow(np.transpose(X_train[i],[1,2,0]))

lasagne

  • Lasagne is a library for building and training neural networks
  • it's a low-level library that integrates almost seamlessly with Theano

In [ ]:
import lasagne
import theano
import theano.tensor as T

input_X = T.tensor4("X")

#input dimensions (None means "arbitrary")
input_shape = [None,3,32,32]

target_y = T.vector("target Y integer",dtype='int32')

Defining network architecture


In [ ]:
#Input layer (auxiliary)
input_layer = lasagne.layers.InputLayer(shape = input_shape,input_var=input_X)

#fully connected layer that takes the input layer and applies 100 neurons to it
# nonlinearity here is sigmoid, as in logistic regression
# you can give a name to each layer (optional)
dense_1 = lasagne.layers.DenseLayer(input_layer,num_units=100,
                                   nonlinearity = lasagne.nonlinearities.sigmoid,
                                   name = "hidden_dense_layer")

#fully connected output layer that takes dense_1 as input and has 10 neurons (one per class)
#We use softmax nonlinearity to make the probabilities add up to 1
dense_output = lasagne.layers.DenseLayer(dense_1,num_units = 10,
                                        nonlinearity = lasagne.nonlinearities.softmax,
                                        name='output')

In [ ]:
#network prediction (theano-transformation)
y_predicted = lasagne.layers.get_output(dense_output)

In [ ]:
#all network weights (shared variables)
all_weights = lasagne.layers.get_all_params(dense_output,trainable=True)
print all_weights

Then you could simply

  • define the loss function manually
  • compute the error gradient over all weights
  • define the updates (this manual route is sketched below, for reference)
  • But that's a whole lot of work and life's short
    • not to mention life's too short to wait for SGD to converge
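For reference, the manual route would look roughly like this. This is only a sketch, reusing y_predicted and all_weights from the cells above; the Lasagne builtins in the next cell do the same job.

In [ ]:
#manual alternative, for illustration only
#mean categorical crossentropy, written by hand
loss_manual = T.nnet.categorical_crossentropy(y_predicted, target_y).mean()

#gradients of the loss w.r.t. every trainable weight
grads = T.grad(loss_manual, all_weights)

#plain SGD updates: w := w - learning_rate * dL/dw
learning_rate = 0.01
updates_manual = [(w, w - learning_rate * g) for w, g in zip(all_weights, grads)]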

Instead, we shall use Lasagne builtins


In [ ]:
#Mean categorical crossentropy as a loss function - similar to logistic loss but for multiclass targets
loss = lasagne.objectives.categorical_crossentropy(y_predicted,target_y).mean()

#prediction accuracy (WITH dropout)
accuracy = lasagne.objectives.categorical_accuracy(y_predicted,target_y).mean()

#This function computes gradient AND composes weight updates just like you did earlier
updates_sgd = lasagne.updates.sgd(loss, all_weights,learning_rate=0.01)

In [ ]:
#function that computes loss and updates weights
train_fun = theano.function([input_X,target_y],[loss,accuracy],updates= updates_sgd)

In [ ]:
#deterministic prediction (without dropout)
y_predicted_det = lasagne.layers.get_output(dense_output,deterministic=True)

#prediction accuracy (without dropout)
accuracy_det = lasagne.objectives.categorical_accuracy(y_predicted_det,target_y).mean()

#function that just computes accuracy, without dropout/noise -- for evaluation purposes
accuracy_fun = theano.function([input_X,target_y],accuracy_det)

That's all, now let's train it!

  • We have a lot of data, so minibatch SGD is the recommended way to go
  • So let's implement a function that splits the training sample into minibatches

In [ ]:
# An auxiliary function that returns mini-batches for neural network training

#Parameters
# X - a tensor of images with shape (many, 3, 32, 32), e.g. X_train
# y - a vector of answers for the corresponding images, e.g. y_train
# batch_size - a single number - the intended size of each batch

#What you need to implement
# 1) Shuffle the data
# - Shuffle X and y the same way, so as not to break the correspondence between X_i and y_i
# 2) Split the data into minibatches of batch_size
# - If the data size is not a multiple of batch_size, make the last batch smaller.
# 3) Return a list (or an iterator) of pairs
# - (a subset of images, the answers from y for that subset)
# (a possible reference sketch is also given below, after the hint)
def iterate_minibatches(X, y, batchsize):
    
    <return an iterable of (X_batch, y_batch) - batches of images and answers for them>
    
        
        
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
#
# You feel lost and wish you stayed home tonight?
# Go search for a similar function at
# https://github.com/Lasagne/Lasagne/blob/master/examples/mnist.py
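If you are still stuck, here is one possible sketch of such a function, assuming X and y are numpy arrays of equal length. It is not the only correct answer.

In [ ]:
def iterate_minibatches(X, y, batchsize):
    #shuffle X and y together so that X_i still matches y_i
    indices = np.random.permutation(len(X))
    #cut the shuffled data into consecutive chunks of batchsize
    #(the last chunk may be smaller if len(X) is not a multiple of batchsize)
    for start in range(0, len(X), batchsize):
        batch_idx = indices[start:start + batchsize]
        yield X[batch_idx], y[batch_idx]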

Training loop


In [ ]:
import time

num_epochs = 100 #number of full passes over the data
            
batch_size = 50 #number of samples processed at each function call

for epoch in range(num_epochs):
    # In each epoch, we do a full pass over the training data:
    train_err = 0
    train_acc = 0
    train_batches = 0
    start_time = time.time()
    for batch in iterate_minibatches(X_train, y_train,batch_size):
        inputs, targets = batch
        train_err_batch, train_acc_batch= train_fun(inputs, targets)
        train_err += train_err_batch
        train_acc += train_acc_batch
        train_batches += 1

    # And a full pass over the validation data:
    val_acc = 0
    val_batches = 0
    for batch in iterate_minibatches(X_val, y_val, batch_size):
        inputs, targets = batch
        val_acc += accuracy_fun(inputs, targets)
        val_batches += 1

    
    # Then we print the results for this epoch:
    print("Epoch {} of {} took {:.3f}s".format(
        epoch + 1, num_epochs, time.time() - start_time))

    print("  training loss (in-iteration):\t\t{:.6f}".format(train_err / train_batches))
    print("  train accuracy:\t\t{:.2f} %".format(
        train_acc / train_batches * 100))
    print("  validation accuracy:\t\t{:.2f} %".format(
        val_acc / val_batches * 100))

In [ ]:
test_acc = 0
test_batches = 0
for batch in iterate_minibatches(X_test, y_test, 500):
    inputs, targets = batch
    acc = accuracy_fun(inputs, targets)
    test_acc += acc
    test_batches += 1
print("Final results:")
print("  test accuracy:\t\t{:.2f} %".format(
    test_acc / test_batches * 100))

if test_acc / test_batches * 100 > 95:
    print "Double-check, than consider applying for NIPS'17. SRSly."
elif test_acc / test_batches * 100 > 90:
    print "U'r freakin' amazin'!"
elif test_acc / test_batches * 100 > 80:
    print "Achievement unlocked: 110lvl Warlock!"
elif test_acc / test_batches * 100 > 70:
    print "Achievement unlocked: 80lvl Warlock!"
elif test_acc / test_batches * 100 > 50:
    print "Achievement unlocked: 60lvl Warlock!"
else:
    print "We need more magic!"

First step

Let's create a mini-convolutional network with roughly the following architecture:

  • Input layer
  • 3x3 convolution with 10 filters and ReLU activation
  • 3x3 pooling (or set the previous convolution's stride to 3)
  • Dense layer with 100 neurons and ReLU activation
  • 10% dropout
  • Output dense layer.

Train it with the Adam optimizer with default parameters (one possible way to wire this up is sketched below).
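A minimal sketch of such a network in Lasagne, assuming input_X and target_y from the cells above; the layer sizes follow the bullets, so treat it as a starting point rather than the reference solution:

In [ ]:
#input layer
net = lasagne.layers.InputLayer(shape=(None,3,32,32), input_var=input_X)
#3x3 convolution with 10 filters and ReLU
net = lasagne.layers.Conv2DLayer(net, num_filters=10, filter_size=(3,3),
                                 nonlinearity=lasagne.nonlinearities.rectify)
#3x3 max-pooling
net = lasagne.layers.MaxPool2DLayer(net, pool_size=(3,3))
#dense layer with 100 neurons and ReLU
net = lasagne.layers.DenseLayer(net, num_units=100,
                                nonlinearity=lasagne.nonlinearities.rectify)
#10% dropout
net = lasagne.layers.DropoutLayer(net, p=0.1)
#output layer: 10 classes with softmax
dense_output = lasagne.layers.DenseLayer(net, num_units=10,
                                         nonlinearity=lasagne.nonlinearities.softmax)

#Adam with default parameters instead of plain SGD
y_predicted = lasagne.layers.get_output(dense_output)
loss = lasagne.objectives.categorical_crossentropy(y_predicted, target_y).mean()
all_weights = lasagne.layers.get_all_params(dense_output, trainable=True)
updates = lasagne.updates.adam(loss, all_weights)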

Second step

  • Add batch_norm (with default params) between the convolution and the pooling

Re-train the network with the same optimizer (a one-line snippet is shown below).
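In Lasagne this amounts to wrapping the convolution in lasagne.layers.batch_norm. A sketch: replace the Conv2DLayer line from the sketch above with this.

In [ ]:
#batch normalization (default parameters) between the convolution and the pooling
net = lasagne.layers.batch_norm(
    lasagne.layers.Conv2DLayer(net, num_filters=10, filter_size=(3,3),
                               nonlinearity=lasagne.nonlinearities.rectify))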

Quest For A Better Network

(please at least skim it)

  • The ultimate quest is to create a network that has as high accuracy as you can push it.
  • There is a mini-report at the end that you will have to fill in. We recommend reading it first and filling it while you iterate.

Grading

  • starting at zero points
  • +2 for describing your iteration path in a report below.
  • +2 for building a network that gets above 20% accuracy
  • +1 for beating each of these milestones on TEST dataset:
    • 50% (5 total)
    • 60% (6 total)
    • 65% (7 total)
    • 70% (8 total)
    • 75% (9 total)
    • 80% (10 total)

Bonus points

Common ways to get bonus points are:

  • Get higher score, obviously.
  • Anything special about your NN. For example "A super-small/fast NN that gets 80%" gets a bonus.
  • Any detailed analysis of the results. (saliency maps, whatever)

Restrictions

  • Please do NOT use pre-trained networks for this assignment until you reach 80%.
    • In other words, the base milestones must be beaten without pre-trained nets (and such a network must be included in your e-mail submission). After that, you can use whatever you want.
  • You can use the validation data for training, but you can't do anything with the test data apart from running the evaluation procedure.

Tips on what can be done:

  • Network size

    • MOAR neurons,
    • MOAR layers, (lasagne docs)

    • Nonlinearities in the hidden layers

      • tanh, relu, leaky relu, etc
    • Larger networks may take more epochs to train, so don't discard your net just because it didn't beat the baseline in 5 epochs.

    • Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn!

  • Convolution layers

    • they are a must unless you have any super-ideas
    • network = lasagne.layers.Conv2DLayer(prev_layer, num_filters=n_neurons, filter_size=(filter_width, filter_height), nonlinearity=some_nonlinearity)
    • Warning! Training convolutional networks can take a long time without a GPU. That's okay.

      • If you are CPU-only, we still recommend that you try a simple convolutional architecture
      • a perfect option is to set it up to run overnight and check it in the morning.
      • Make reasonable layer size estimates. A 128-neuron first convolution is likely overkill.
      • To trade some accuracy for a big cut in computation time, try the stride parameter. A stride=2 convolution should take roughly 1/4 the time of the default (stride=1) one.
    • Plenty of other layers and architectures

  • Early Stopping

    • Training for 100 epochs regardless of anything is probably a bad idea.
    • Some networks converge over 5 epochs, others over 500.
    • A way to go: stop when the validation score has not improved for 10 epochs (a tiny helper along these lines is sketched after this list)
  • Faster optimization
    • rmsprop, nesterov_momentum, adam, adagrad and so on.
      • Converge faster and sometimes reach better optima
      • It might make sense to tweak learning rate/momentum, other learning parameters, batch size and number of epochs
    • BatchNormalization (lasagne.layers.batch_norm) FTW!
  • Regularize to prevent overfitting
  • Data augmentation - getting a 5x larger dataset for free is a great deal
    • Zoom in + crop = shift
    • Rotate + zoom (to remove the black stripes)
    • any other perturbations
    • Add noise (easiest: GaussianNoiseLayer)
    • A simple way to do that (if you have PIL/Image):
      • from scipy.misc import imrotate,imresize
      • and a bit of slicing
    • Stay realistic. There's usually no point in flipping dogs upside down, as that is not the way you usually see them. (A minimal numpy-only augmentation helper is sketched right after this list.)
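To make the early-stopping rule concrete, here is a tiny hypothetical helper; the name should_stop and the patience value are assumptions, not part of the assignment code:

In [ ]:
import numpy as np

#hypothetical helper: stop when the best validation score is `patience` epochs old
def should_stop(val_history, patience=10):
    best_epoch = int(np.argmax(val_history))
    return len(val_history) - 1 - best_epoch >= patience

And a minimal numpy-only augmentation sketch: random horizontal flips plus random crops from a padded image. augment_batch is a hypothetical helper; apply it to training batches only.

In [ ]:
import numpy as np

def augment_batch(X_batch, pad=4):
    #random horizontal flips (vertical flips would be unrealistic for CIFAR-10)
    X_aug = np.copy(X_batch)
    flip = np.random.rand(len(X_aug)) < 0.5
    X_aug[flip] = X_aug[flip, :, :, ::-1]
    #random crops from a zero-padded image = small random shifts
    h, w = X_aug.shape[2], X_aug.shape[3]
    padded = np.pad(X_aug, [(0,0),(0,0),(pad,pad),(pad,pad)], mode='constant')
    out = np.empty_like(X_aug)
    for i in range(len(X_aug)):
        dx, dy = np.random.randint(0, 2*pad + 1, size=2)
        out[i] = padded[i, :, dy:dy+h, dx:dx+w]
    return out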

Below is a template for your solution; you can use it as-is, or throw it away and write things your own way.


In [ ]:
import numpy as np
from cifar import load_cifar10
X_train,y_train,X_val,y_val,X_test,y_test = load_cifar10("cifar_data")

class_names = np.array(['airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck'])

print X_train.shape,y_train.shape

In [ ]:
import lasagne
import theano
import theano.tensor as T

input_X = T.tensor4("X")

#input dimensions (None means "arbitrary" and only works for the first axis: the sample dimension)
input_shape = [None,3,32,32]

target_y = T.vector("target Y integer",dtype='int32')

In [ ]:
#Input layer (auxiliary)
input_layer = lasagne.layers.InputLayer(shape = input_shape,input_var=input_X)

<student.code_neural_network_architecture()>

dense_output = <your network output>

In [ ]:
# Network predictions (theano-transformation)
y_predicted = lasagne.layers.get_output(dense_output)

In [ ]:
#All weights (shared variables)
# the "trainable" flag means: do not return auxiliary parameters like the batch mean (for batch normalization)
all_weights = lasagne.layers.get_all_params(dense_output,trainable=True)
print all_weights

In [ ]:
#loss function
loss = <loss function>

#<optionally add regularization>

#accuracy with dropout/noise
accuracy = lasagne.objectives.categorical_accuracy(y_predicted,target_y).mean()

#weight updates
updates = <try different update methods>

In [ ]:
#A function that accepts X and y, returns the loss and accuracy, and performs weight updates
train_fun = theano.function([input_X,target_y],[loss,accuracy],updates=updates)

In [ ]:
#deterministic prediction (without dropout)
y_predicted_det = lasagne.layers.get_output(dense_output,deterministic=True)

#prediction accuracy (without dropout)
accuracy_det = lasagne.objectives.categorical_accuracy(y_predicted_det,target_y).mean()

#function that just computes accuracy, without dropout/noise -- for evaluation purposes
accuracy_fun = theano.function([input_X,target_y],accuracy_det)

In [ ]:
import time

#training iterations

num_epochs = <how many times to iterate over the entire training set>

batch_size = <how many samples are processed at a single function call>

for epoch in range(num_epochs):
    # In each epoch, we do a full pass over the training data:
    train_err = 0
    train_acc = 0
    train_batches = 0
    start_time = time.time()
    for batch in iterate_minibatches(X_train, y_train,batch_size):
        inputs, targets = batch
        train_err_batch, train_acc_batch= train_fun(inputs, targets)
        train_err += train_err_batch
        train_acc += train_acc_batch
        train_batches += 1

    # And a full pass over the validation data:
    val_acc = 0
    val_batches = 0
    for batch in iterate_minibatches(X_val, y_val, batch_size):
        inputs, targets = batch
        val_acc += accuracy_fun(inputs, targets)
        val_batches += 1

    
    # Then we print the results for this epoch:
    print("Epoch {} of {} took {:.3f}s".format(
        epoch + 1, num_epochs, time.time() - start_time))

    print("  training loss (in-iteration):\t\t{:.6f}".format(train_err / train_batches))
    print("  train accuracy:\t\t{:.2f} %".format(
        train_acc / train_batches * 100))
    print("  validation accuracy:\t\t{:.2f} %".format(
        val_acc / val_batches * 100))

In [ ]:
test_acc = 0
test_batches = 0
for batch in iterate_minibatches(X_test, y_test, 500):
    inputs, targets = batch
    acc = accuracy_fun(inputs, targets)
    test_acc += acc
    test_batches += 1
print("Final results:")
print("  test accuracy:\t\t{:.2f} %".format(
    test_acc / test_batches * 100))

if test_acc / test_batches * 100 > 80:
    print "Achievement unlocked: 80lvl Warlock!"
else:
    print "We need more magic!"

Report

All creative approaches are highly welcome, but at the very least it would be great to mention

  • the idea;
  • brief history of tweaks and improvements;
  • what is the final architecture and why?
  • what is the training method and, again, why?
  • Any regularizations and other techniques applied and their effects;

There is no need to write strict mathematical proofs (unless you want to).

  • "I tried this, this and this, and the second one turned out to be better. And i just didn't like the name of that one" - OK, but can be better
  • "I have analized these and these articles|sources|blog posts, tried that and that to adapt them to my problem and the conclusions are such and such" - the ideal one
  • "I took that code that demo without understanding it, but i'll never confess that and instead i'll make up some pseudoscientific explaination" - not_ok

Hi, my name is ___ ___, and here's my story

A long time ago, in a galaxy far, far away, when it was still more than an hour before the deadline, I got an idea:

I'm gonna build a neural network that:
  • a brief text on what the original idea was
  • and why it was so

How could I be so naive?!

One day, with no signs of warning,

this thing has finally converged, and

  • some explanation of what the results were,
  • what worked and what didn't,
  • most importantly - what next steps were taken, if any,
  • and what their respective outcomes were.

Finally, after ___ iterations and ___ mugs of [tea/coffee],

  • what the final architecture was,
  • as well as the training method and tricks.

That, having wasted __ [minutes, hours or days] of my life training, got

  • accuracy on training: __
  • accuracy on validation: __
  • accuracy on test: __

[an optional afterword and mortal curses on assignment authors]


In [ ]: