Example 1 - Overfitting Sample Data

Here, we will use a simple model to overfit a set of randomly generated data points.

First, we import NumPy to hold the data, and we import Learny McLearnface itself.


In [1]:
import numpy as np
import LearnyMcLearnface as lml

Now, we will create the data to be overfitted. For this example, we will generate 100 data points, each 700-dimensional, and randomly assign each one of 10 classes. We will then attempt to overfit this data with a model and achieve 100% accuracy on the training set.

We will organize the data points into a single NumPy array whose rows are individual data points, along with a separate vector of integers giving the class of each corresponding example.

We initialize the data and its classes:


In [2]:
test_data = np.random.randn(100, 700)
test_classes = np.random.randint(0, 10, 100)
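
As a quick sanity check, we can confirm the layout described above with plain NumPy:

print(test_data.shape)     # (100, 700): one row per example
print(test_classes.shape)  # (100,): one integer label per example
print(test_classes.min(), test_classes.max())  # labels fall in the range 0-9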

Now, in order to feed the data to Learny McLearnface, we wrap it in a data dictionary with the keys the library expects.

(Note that the validation set and training set will be the same in this case, as we are intentionally trying to overfit the training set.)


In [3]:
data = {
    'X_train' : test_data,
    'y_train' : test_classes,
    'X_val' : test_data,
    'y_val' : test_classes
}

Now, we will create our model. We will use a simple fully-connected shallow network, with 500 hidden layer neurons, ReLU activations, and a softmax classifier at the end.

First, we set our initial network options in a dictionary. We will have an input dimension of 700, and we will use the Xavier scheme to initialize our parameters.


In [4]:
opts = {
    'input_dim' : 700,
    'init_scheme' : 'xavier'
}
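
As a rough illustration (not the library's internal code), the Xavier scheme typically draws each weight matrix from a zero-mean distribution scaled by the layer's input size, which keeps activation variance roughly constant across layers:

# Illustrative only: one common form of Xavier initialization, scaling weights
# by 1/sqrt(fan_in) so activation variance stays roughly constant across layers.
def xavier_init(fan_in, fan_out):
    return np.random.randn(fan_in, fan_out) / np.sqrt(fan_in)

W1_example = xavier_init(700, 500)  # hypothetical first-layer weight matrix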

And finally, we build the network itself. With the above description, the layer architecture will be:

(Affine) -> (ReLU) -> (Affine) -> (Softmax)

We create our network object, with 500 hidden layer neurons and 10 output layer neurons (which correspond to our 10 classes).


In [5]:
nn = lml.NeuralNetwork(opts)
nn.add_layer('Affine', {'neurons':500})
nn.add_layer('ReLU', {})
nn.add_layer('Affine', {'neurons':10})
nn.add_layer('SoftmaxLoss', {})
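
To make the architecture concrete, here is a minimal NumPy sketch of the computation this stack performs on a batch X (illustrative only; this is not how Learny McLearnface implements its layers):

def forward_sketch(X, W1, b1, W2, b2):
    # Affine: project the 700-dimensional inputs onto 500 hidden units
    hidden = X.dot(W1) + b1
    # ReLU: zero out negative activations
    hidden = np.maximum(0, hidden)
    # Affine: project the hidden units onto 10 class scores
    scores = hidden.dot(W2) + b2
    # Softmax: convert scores to per-class probabilities (shifted for numerical stability)
    exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)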

Now that our model is created, we must train it. We use the provided Trainer object to accomplish this. First, we must supply training options. These are, once again, provided in a dictionary.

We will use basic stochastic gradient descent with a learning rate of 1 and no regularization, and we will train for 10 epochs.


In [6]:
opts = {
    'update_options' : {'update_rule' : 'sgd', 'learning_rate' : 1},
    'reg_param' : 0,
    'num_epochs' : 10
}
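
With the 'sgd' update rule, each parameter is simply nudged against its gradient by the learning rate at every step. A minimal sketch of the update (not the Trainer's internal code) looks like this:

# Illustrative only: one vanilla SGD step for a parameter w with gradient dw.
def sgd_step(w, dw, learning_rate=1.0):
    return w - learning_rate * dw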

Now we create the trainer object and give it the model, the data, and the options.


In [7]:
trainer = lml.Trainer(nn, data, opts)

We will first use the trainer's accuracy() method to print the accuracy of the model before training. Since the model was randomly initialized and there are 10 classes, we should expect an initial accuracy close to 10%.


In [8]:
accuracy = trainer.accuracy(test_data, test_classes)
print('Initial model accuracy:', accuracy)


Initial model accuracy: 0.14
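
A value near 0.10 is what we would expect from random guessing; a back-of-the-envelope check shows that 0.14 is well within sampling noise for only 100 examples:

# Chance accuracy for 10 balanced classes, and its standard error over 100 examples.
chance = 1 / 10
std_err = np.sqrt(chance * (1 - chance) / 100)
print(chance, std_err)  # 0.1 and roughly 0.03, so 0.14 is within ~1.5 standard errors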

Since we have supplied everything the trainer needs, we simply call its train() function to train the model. This prints a status update at the end of each epoch.


In [9]:
trainer.train()


Epoch 1 of 10 Validation accuracy: 0.33
Epoch 2 of 10 Validation accuracy: 0.5
Epoch 3 of 10 Validation accuracy: 0.75
Epoch 4 of 10 Validation accuracy: 0.92
Epoch 5 of 10 Validation accuracy: 1.0
Epoch 6 of 10 Validation accuracy: 1.0
Epoch 7 of 10 Validation accuracy: 1.0
Epoch 8 of 10 Validation accuracy: 1.0
Epoch 9 of 10 Validation accuracy: 1.0
Epoch 10 of 10 Validation accuracy: 1.0

As you can see, the network overfits the data very easily, reaching a validation accuracy of 100% by epoch 5. For the sake of completeness, we will print the final validation accuracy of the model.


In [10]:
accuracy = trainer.accuracy(test_data, test_classes)
print('Final model accuracy:', accuracy)


Final model accuracy: 1.0
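
This outcome is unsurprising: assuming each Affine layer stores a weight matrix and a bias vector, the network has several hundred thousand free parameters, vastly more than the 100 training points it needs to memorize:

# Parameter count under the assumption of standard Affine layers (weights + biases).
first_affine = 700 * 500 + 500   # 350,500 parameters
second_affine = 500 * 10 + 10    # 5,010 parameters
print(first_affine + second_affine)  # 355,510 parameters vs. 100 training examples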