In this tutorial, you'll discover how to use "deep learning" (DL) to classify digits, ranging from 0 to 9. The dataset is quite famous: it's called 'MNIST' http://yann.lecun.com/exdb/mnist/. A French guy put it together, very famous in the DL community, called Yann LeCun; he's now both head of the Facebook AI research program and head of something at New York University (you may want to search and pull the answer :p ).
I invite you to discover what MNIST truly is (class distribution, pixel distributions...); there is a small sketch for this right after the loading cell below.
Luckily for you, I managed to be organised this time, and you may find this notebook useful.
Remember logistic regression? I also happen to have a notebook about that here. It's all done with Keras and might help you.
In [ ]:
import keras
from keras import models
from keras import layers
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# One-hot encode the labels (digits 0-9 become vectors of length 10)
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)
# Flatten each 28x28 image into a vector of 784 pixels
x_train = x_train.reshape(-1, 28 * 28)
x_test = x_test.reshape(-1, 28 * 28)
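If you want to take up that invitation and look at the class and pixel distributions, here is a minimal sketch. It assumes numpy and matplotlib, which are not part of the original cells.
In [ ]:
import numpy as np
import matplotlib.pyplot as plt

# Class distribution: how many examples of each digit in the training set?
# (y_train is already one-hot encoded, so take the argmax to recover the labels)
labels = y_train.argmax(axis=1)
print(np.bincount(labels))

# Pixel distribution: raw pixel values range from 0 to 255
plt.hist(x_train.ravel(), bins=50)
plt.title("Distribution of pixel values")
plt.show()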
OK, if you've made it to this line, I expect you to know what MNIST is and its associated classification task. If you didn't get the task, you can google something like: "what is the classification task of MNIST".
Perfect. So we want to classify MNIST. To do so, we'll use a neural network!! The neural net will be as follows:
In [ ]:
# Model definition
model = models.Sequential()
# model.add(layers.??)  <- add your layers here (have a look at keras.layers)

# Optimizer definition (the loss itself is given to compile below)
# use the SGD optimizer from keras.optimizers, discover its parameters
# sgd = keras.optimizers.SGD(??)

model.compile(optimizer=sgd,
              loss='categorical_crossentropy',  # Understand this... What is the logistic version of this?
              metrics=['accuracy'])
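If you are stuck on the cell above, here is one possible way to fill it in. The layer sizes and the learning rate are my own guesses, not the "official" answer; feel free to change them.
In [ ]:
# One hidden Dense layer, then a softmax output over the 10 digit classes
model = models.Sequential()
model.add(layers.Dense(128, activation='relu', input_shape=(28 * 28,)))
model.add(layers.Dense(10, activation='softmax'))

# Plain SGD; the learning rate is a guess, you are meant to play with it
sgd = keras.optimizers.SGD(lr=0.01)
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])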
In [ ]:
# use this function to fit the data (use some validation data; 20% shall do)
# model.fit(...)
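A minimal way to call it, assuming you keep 20% of the training set aside for validation:
In [ ]:
# Hold out 20% of the training data for validation
model.fit(x_train, y_train,
          batch_size=32,
          epochs=5,
          validation_split=0.2)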
We've reached 10% accuracy, which is pure chance, it's very bad! Actually, I can tell you that your classifier isn't converging, not even to random!! At random, your loss should be $-\ln(1/10) = \ln(10) \approx 2.3$. From experience, I can tell you that your gradient step is too large...
You can try changing the lr parameter in keras.optimizers.SGD (yes, try this!). What happens?
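For instance (the value below is just an example to try, nothing magical about it):
In [ ]:
# Recompile with a smaller learning rate and fit again
sgd = keras.optimizers.SGD(lr=0.001)
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=5, validation_split=0.2)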
In [ ]:
# Find the mean and variance of x_train
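A possible sketch (remember the raw pixels are integers between 0 and 255):
In [ ]:
print("mean:", x_train.mean())
print("variance:", x_train.var())
print("std:", x_train.std())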
In [ ]:
# Normalize the dataset such that mean=0 and std=1
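One way to do it, assuming you normalize with the training set statistics (and convert to float first, since the raw data is uint8):
In [ ]:
# Convert to float before normalizing, otherwise the uint8 arithmetic overflows
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')

mean = x_train.mean()
std = x_train.std()

# Use the *training* statistics for both sets
x_train = (x_train - mean) / std
x_test = (x_test - mean) / std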
Let's re-train...
In [ ]:
sgd = keras.optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=32, epochs=5, verbose=1, validation_split=0.3)
In [ ]:
# Might wanna do this here...
In [ ]:
# Might wanna do this here...
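If those last two cells are meant for checking the trained model on the held-out test set and looking at a few predictions, a minimal sketch could be:
In [ ]:
# Final check on data the model has never seen
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print("test loss:", test_loss, "test accuracy:", test_acc)

# Compare predicted digits with the true labels for the first few test images
import numpy as np
predictions = model.predict(x_test[:5])
print("predicted:", np.argmax(predictions, axis=1))
print("true:     ", np.argmax(y_test[:5], axis=1))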