Discover the power of Deep Learning

In this tutorial, you'll discover how to use "deep learning" (DL) to classify handwritten digits, ranging from 0 to 9. The dataset is quite famous: it's called 'MNIST' (http://yann.lecun.com/exdb/mnist/). It was put together by a French guy who is very famous in the DL community, Yann LeCun; he is now both head of the Facebook AI research program and holds a position at New York University (you may want to search and find out exactly which one :p ).

I invite you to explore what MNIST really looks like (class distribution, pixel distribution...).
Luckily for you, I managed to be organised this time, and you may find this notebook useful.

Remember logistic regression? I also happen to have a notebook about it here. It's all done with Keras and might help you.

Let's load the data


In [ ]:
import keras
from keras import models
from keras import layers
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# One-hot encode the labels and flatten the images into 784-dimensional vectors
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)
x_train = x_train.reshape(-1, 28 * 28)
x_test = x_test.reshape(-1, 28 * 28)
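
If you want a quick peek before (or instead of) opening that notebook, here is a minimal exploration sketch; it only assumes the cell above has been run.

In [ ]:
# A quick, optional look at the data
import numpy as np

print("x_train shape:", x_train.shape)   # (60000, 784) after the reshape
print("x_test shape:", x_test.shape)     # (10000, 784)

# Class distribution: y_train is now one-hot encoded, so summing each column
# gives the number of examples per digit
print("examples per class:", y_train.sum(axis=0).astype(int))

# Pixel distribution: raw MNIST pixels are integers in [0, 255]
print("pixel min/max:", x_train.min(), x_train.max())
print("pixel mean/std:", x_train.mean(), x_train.std())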

WTF are we going to do in this notebook?

OK, if you've made it to this line, I expect you to know what MNIST is and what its associated classification task is. If you didn't get the task, you can google something like: "what is the classification task of MNIST".

Perfect. So, we want to classify MNIST. To do so, we'll use a neural network!! The neural net will be as follows:

  • It takes as input a batch of shape (32, 28 * 28)
  • Then has 3 fully connected layers (also called 'Dense' layers) of 128 units each, with ReLU activations.
  • And finishes with a 10-dimensional dense layer whose output should be interpreted as probabilities (i.e. it sums to one)

In [ ]:
# Model definition: add the layers described above
model = models.Sequential()
# model.add(layers.

# Optimizer and loss definition
# Use the SGD optimizer from keras.optimizers and discover its parameters
# sgd = keras.optimizers.
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',  # Understand this... What is the logistic-regression version of this loss?
              metrics=['accuracy'])
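
If you get stuck, here is one possible completion that matches the description above; it is just a sketch (the default SGD parameters are only a starting point, not the unique answer).

In [ ]:
# One possible completion (a sketch, not the only valid solution)
model = models.Sequential()
model.add(layers.Dense(128, activation='relu', input_shape=(28 * 28,)))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))  # 10 classes, outputs sum to one

sgd = keras.optimizers.SGD()  # default parameters, to be tuned later
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])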

TRAIN !!

Fit the model you've created to the data...


In [ ]:
# Use this function to fit the data (hold out some validation data; 20% shall do)
# model.fit(
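
For reference, a minimal call could look like this; the batch size and epoch count are just sensible defaults, not prescriptions.

In [ ]:
# A possible call (sketch): 20% of the training data is held out for validation
model.fit(x_train, y_train, batch_size=32, epochs=5, verbose=1, validation_split=0.2)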

OK... That's bad!

We've reached about 10% accuracy, which is pure random guessing; that's very bad! In fact, I can tell you that your classifier isn't converging, not even to random!! At random, your loss should be $-\ln(1/10) = \ln(10) \approx 2.3$. From experience, I can tell you that your gradient step is too large...

You can try changing the lr parameter in keras.optimizers.SGD (yes, try it!). What happens?
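
For instance, something like this (the value 0.001 is just one guess among many to try):

In [ ]:
# Try a (much) smaller step size -- lr is the learning-rate argument of SGD here
sgd = keras.optimizers.SGD(lr=0.001)
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=5, verbose=1, validation_split=0.2)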

The LR wasn't the real culprit...

Gradient descent algorithms really expect reasonably normalized inputs... Yet our dataset is poorly scaled (raw pixels range from 0 to 255):


In [ ]:
# Find the mean and variance of x_train

In [ ]:
# Normalize the dataset such that mean=0 and std=1
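
One way to do both steps, as a sketch: compute the statistics on the training set only, then apply the same transformation to both splits (the small epsilon only avoids dividing by zero for pixels that are always 0).

In [ ]:
# A possible normalization (sketch): per-pixel standardization
import numpy as np

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

mean = x_train.mean(axis=0)
std = x_train.std(axis=0)

x_train = (x_train - mean) / (std + 1e-7)
x_test = (x_test - mean) / (std + 1e-7)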

Let's re-train...


In [ ]:
# Re-compile with SGD + momentum and train again, this time on the normalized data
sgd = keras.optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=5, verbose=1, validation_split=0.3)

YEAH !!!

Try to beat this!!

  • you may use something other than SGD
  • you may add weight regularization to the layers
  • you may try dropout
  • you may use batch normalization

In [ ]:
# Might wanna do this here...
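
For inspiration, here is one possible variation combining a few of the ideas above (the Adam optimizer, batch normalization and dropout); treat it as a sketch to tweak, not as the "right" answer.

In [ ]:
# One possible variation (sketch): Adam + batch normalization + dropout
model = models.Sequential()
model.add(layers.Dense(128, activation='relu', input_shape=(28 * 28,)))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.3))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.Dropout(0.3))
model.add(layers.Dense(10, activation='softmax'))

model.compile(optimizer=keras.optimizers.Adam(),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=5, verbose=1, validation_split=0.3)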

Metrics time

Quantify the quality of the model using precision, recall and the R² score (if they make sense). You can use sklearn.metrics.


In [ ]:
# Might wanna do this here...
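
As a starting point, here is a sketch using sklearn.metrics. Precision and recall make sense for this task; R² is a regression metric, so it is arguably not meaningful here and is left out.

In [ ]:
# A possible evaluation sketch using scikit-learn
from sklearn.metrics import classification_report

# Go back from one-hot vectors / predicted probabilities to class indices
y_true = y_test.argmax(axis=1)
y_pred = model.predict(x_test).argmax(axis=1)

# Per-class precision, recall and F1
print(classification_report(y_true, y_pred))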