Deep Learning Lab Session

First Lab Session - 3 Hours

Artificial Neural Networks for Handwritten Digits Recognition

Student 1: Fabio Ellena
Student 2: Lorenzo Canale

The aim of this session is to practice with Artificial Neural Networks. Answers and experiments should be done in groups of one or two students. Each group should fill in and run the appropriate notebook cells.

To generate your final report, use Print to PDF (Ctrl+P). Do not forget to run all your cells before generating the report, and do not forget to include the names of all participants in the group. The lab session should be completed by April 7th, 2017.

Introduction

In this session, you will implement, train and test a Neural Network for the Handwritten Digits Recognition problem [1] with different settings of hyperparameters. You will use the MNIST dataset, which was constructed from scanned document datasets available from the National Institute of Standards and Technology (NIST). Images of digits were taken from a variety of scanned documents, normalized in size and centered.

<img src="Nimages/mnist.png",width="350" height="500" align="center">

Figure 1: MNIST digits examples

This assignment includes a set of pre-written programs to help you understand how to build and train your neural net, and then to test your code and get results:

  1. NeuralNetwork.py
  2. transfer_functions.py
  3. utils.py

Functions defined inside the Python files mentioned above can be imported using the Python command: from filename import *

You will use the following libraries:

  1. numpy: for creating and manipulating arrays.

  2. matplotlib: for making plots.

Section 1 : My First Neural Network

Part 1: Before designing and writing your code, you will first work through a neural network by hand. Consider the neural network shown in Figure 2 below, with two inputs $X=(x_1,x_2)$, one hidden layer and a single output unit $(y)$. The initial weights are set to the (randomly chosen) values shown in the figure. Neurons 6 and 7 are the bias neurons; their values are equal to 1.
The training sample is X = (0.8, 0.2), with target output Y = 0.4.

Assume that the neurons have a sigmoid activation function $f(x)=\frac{1}{1+e^{-x}}$ and that the learning rate is $\mu = 1$.
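(transfer_functions.py already provides the sigmoid used in the cells below; the following is only a minimal sketch of what it and its derivative presumably compute, with the derivative written in terms of the activation $o = f(x)$, as is common in backpropagation code.)

import numpy as np

def sigmoid(x):
    # f(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def dsigmoid(o):
    # derivative expressed via the activation o = f(x): f'(x) = o * (1 - o)
    return o * (1.0 - o)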

<img src="Nimages/NN.png", width="700" height="900">

Figure 2: Neural network

Question 1.1.1: Compute the new values of the weights $w_{i,j}$ after one forward pass and one backward pass. $w_{i,j}$ is the weight of the connection between neuron $i$ and neuron $j$.
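For reference, the computations in the cells below follow the standard gradient-descent update for the squared-error loss; this merely restates what the code computes, with $Y = 0.4$ the target and $o_j$ the output of neuron $j$:

$$E = \frac{1}{2}(y - Y)^2, \qquad \delta_5 = (y - Y)\,o_5(1 - o_5), \qquad \delta_j = \delta_5\, w_{j,5}\, o_j(1 - o_j) \quad (j = 3, 4)$$

$$w_{i,j} \leftarrow w_{i,j} - \mu\,\delta_j\, o_i$$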


In [1]:
import utils as UT
import transfer_functions as TF
import NeuralNetwork as NN
import numpy as np

In [2]:
u3 = 0.8*0.3 + 0.2*0.8 + 1*0.2        # net input of neuron 3: x1*w13 + x2*w23 + 1*w63
u4 = 0.8*(-0.5) + 0.2*0.2 + 1*(-0.4)  # net input of neuron 4: x1*w14 + x2*w24 + 1*w64

o3 = TF.sigmoid(u3)
o4 = TF.sigmoid(u4)
o7 = 1.0   # bias neuron feeding the output neuron
o6 = 1.0   # bias neuron feeding the hidden neurons

u5 = o3*(-0.6) + o4*0.4 + o7*0.5      # net input of neuron 5: o3*w35 + o4*w45 + o7*w75
o5 = TF.sigmoid(u5)
y = o5

print('feed forward:\n')
print('u3=%f' % u3)
print('u4=%f' % u4)
print('u5=%f' % u5)

print('o3=%f' % o3)
print('o4=%f' % o4)
print('o5=%f' % o5)
print('o6=%f' % o6)
print('o7=%f' % o7)

print('output of NN is %f' % o5)


feed forward:

u3=0.600000
u4=-0.760000
u5=0.240065
o3=0.645656
o4=0.318646
o5=0.559730
o6=1.000000
o7=1.000000
output of NN is 0.559730

In [3]:
dEu5 = (y-0.4)*o5*(1-o5)   # delta of output neuron 5: (y - Y) * f'(u5)
w45 = 0.4
w35 = -0.6
w75 = 0.5
dEu3 = dEu5*w35*o3*(1-o3)  # delta of hidden neuron 3
dEu4 = dEu5*w45*o4*(1-o4)  # delta of hidden neuron 4

# update the hidden-to-output weights (learning rate mu = 1)
w45 -= dEu5*o4
w35 -= dEu5*o3
w75 -= dEu5*o7

w13 = 0.3
w14 = -0.5
w23 = 0.8
w24 = 0.2
w63 = 0.2
w64 = -0.4

w13 -= dEu3*0.8
w14 -= dEu4*0.8
w23 -= dEu3*0.2
w24 -= dEu4*0.2
w63 -= dEu3*1
w64 -= dEu4*1

print('back propagation ')
print('w13=%f' % w13)
print('w14=%f' % w14)
print('w23=%f' % w23)
print('w24=%f' % w24)
print('w63=%f' % w63)
print('w64=%f' % w64)
print('w35=%f' % w35)
print('w45=%f' % w45)
print('w75=%f' % w75)

u3 = 0.8*w13 + 0.2*w23 + 1*w63
u4 = 0.8*w14 + 0.2*w24 + 1*w64
o3 = TF.sigmoid(u3)
o4 = TF.sigmoid(u4)
o7 = 1

u5 = o3*w35 + o4*w45 + o7*w75
o5 = TF.sigmoid(u5) 
y=o5
print('output after backpropagation is %f' % y)


back propagation 
w13=0.304323
w14=-0.502735
w23=0.801081
w24=0.199316
w63=0.205403
w64=-0.403418
w35=-0.625415
w45=0.387457
w75=0.460637
output after backpropagation is 0.544511

Your answer goes here :

$w_{1,3}= 0.304323 $

$w_{1,4}= -0.502735 $

$w_{2,3}= 0.801081 $

$w_{2,4}= 0.199316 $

$w_{6,3}= 0.205403 $

$w_{6,4}= -0.403418 $

$w_{3,5}= -0.625415 $

$w_{4,5}= 0.387457 $

$w_{7,5}= 0.460637 $

Part 2: Neural Network Implementation

Please read all source files carefully and make sure you understand the data structures and all functions. You are to complete the missing code. First, define the neural network (using the NeuralNetwork class in the NeuralNetwork.py file) and reinitialise the weights. Then complete the Feed Forward and Back-propagation functions.

Question 1.2.1: Define the neural network corresponding to the one in part 1


In [6]:
from NeuralNetwork import *
#create the network
my_first_net = NeuralNetwork(input_layer_size=2, hidden_layer_size=2, output_layer_size=1)

In [7]:
#Data preparation 
X=[0.8,0.2]
Y=[0.4]
data=[]
data.append(X)
data.append(Y)

#initialize weights
wi=np.array([[0.3,-0.5],[0.8,0.2],[0.2,-0.4]])
wo=np.array([[-0.6],[0.4],[0.5]])
my_first_net.weights_initialisation(wi,wo)
print(my_first_net.W_input_to_hidden)
print(my_first_net.W_hidden_to_output)


[[ 0.3 -0.5]
 [ 0.8  0.2]
 [ 0.2 -0.4]]
[[-0.6]
 [ 0.4]
 [ 0.5]]

Question 1.2.2: Implement the Feed Forward function (feedForward(X) in the NeuralNetwork.py file)


In [8]:
def feedForward(self, inputs):
    # Input activations: the raw inputs plus a constant 1 for the bias neuron
    self.a_input = np.append(inputs, [1])
    # Hidden activations: transfer function of the weighted sums, plus the hidden bias neuron
    self.a_hidden = np.append(self.tf(self.a_input.dot(self.W_input_to_hidden)), [1])
    # Output activations
    self.a_out = self.tf(self.a_hidden.dot(self.W_hidden_to_output))

    return self.a_out

Check your network outputs the expected value (the one you computed in question 1.1)


In [9]:
#test my  Feed Forward function 
Output_activation=my_first_net.feedForward(X)
print("output activation =%.3f" %(Output_activation))


output activation =0.560

Question 1.2.3: Implement the Back-propagation Algorithm (backPropagate(Y) in the NeuralNetwork.py file)


In [10]:
def backPropagate(self, targets):
    # error on the output layer
    self.err_out = self.a_out - targets
    # delta terms for the output and the hidden layer
    delta_out = self.err_out * self.dtf(self.a_out)
    delta_hidden = self.W_hidden_to_output.dot(delta_out) * self.dtf(self.a_hidden)
    # update the hidden-to-output weights
    self.W_hidden_to_output -= self.learning_rate * np.outer(self.a_hidden, delta_out)
    # update the input-to-hidden weights (the bias unit has no incoming weights, hence delta_hidden[:-1])
    self.W_input_to_hidden -= self.learning_rate * np.outer(self.a_input, delta_hidden[:-1])
    # return the squared error
    return np.sum(self.err_out**2) / 2

Check the gradient values and weight updates are correct (similar to the ones you computed in question 1.1)
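As an optional extra check (not part of the lab files), a finite-difference approximation of $\frac{\partial E}{\partial w_{3,5}}$ can be compared with the analytic gradient; with $\mu = 1$ its magnitude should match the change of $w_{3,5}$ computed in question 1.1 (about 0.0254). The helper squared_error below is hypothetical; only the attributes already used above (W_hidden_to_output, feedForward) and the squared-error loss are assumed.

def squared_error(net, inputs, targets):
    # E = 1/2 * sum((output - target)^2), the loss used by backPropagate above
    out = net.feedForward(inputs)
    return np.sum((out - np.array(targets))**2) / 2

eps = 1e-6
w0 = my_first_net.W_hidden_to_output[0, 0]      # this entry is w35 = -0.6

my_first_net.W_hidden_to_output[0, 0] = w0 + eps
loss_plus = squared_error(my_first_net, X, Y)
my_first_net.W_hidden_to_output[0, 0] = w0 - eps
loss_minus = squared_error(my_first_net, X, Y)
my_first_net.W_hidden_to_output[0, 0] = w0      # restore the original weight
my_first_net.feedForward(X)                     # recompute the cached activations

print('numerical dE/dw35 = %f' % ((loss_plus - loss_minus) / (2 * eps)))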


In [11]:
#test my  Back-propagation function
my_first_net.backPropagate(Y)
#Print weights after backpropagation
print('wi_new=', my_first_net.W_input_to_hidden)
print('wo_new=', my_first_net.W_hidden_to_output)


('wi_new=', array([[ 0.30043227, -0.50027347],
       [ 0.80010807,  0.19993163],
       [ 0.20054033, -0.40034184]]))
('wo_new=', array([[-0.60254147],
       [ 0.39874573],
       [ 0.49606375]]))

Your Feed Forward and Back-Propagation implementations are working. Great! Let's tackle a real-world problem.

Section 2 : The MNIST Challenge!

Data Preparation

The MNIST dataset consists of handwritten digit images: it contains 60,000 examples for the training set and 10,000 examples for testing. In this lab session, the official training set of 60,000 examples is split into an actual training set of 50,000 examples and a validation set of 10,000 examples, while the official 10,000 test examples are used for testing. All digit images have been size-normalized and centered in a fixed-size image of 28 x 28 pixels. The images are stored in byte form; you will use the NumPy library to read the data files into NumPy arrays that we will use to train the ANN.

The MNIST dataset is available in the Data folder. To get the training, testing and validation data, run the load_data() function.


In [12]:
from utils import *
training_data, validation_data, test_data=load_data()

print("Training data size: %d" % (len(training_data)))
print("Validation data size: %d" % (len(validation_data)))
print("Test data size: %d" % (len(test_data)))


Loading MNIST data .....
Done.
Training data size: 50000
Validation data size: 10000
Test data size: 10000
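As a quick sanity check (not required by the lab), we can inspect the shape of a single training example; this only assumes that each element of training_data can be indexed as [0] (image) and [1] (label), as the cells below suggest.

print(np.shape(training_data[0][0]))   # expected: a flat 784-pixel vector (28*28)
print(np.shape(training_data[0][1]))   # label encoding as returned by load_data()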

MNIST Dataset Digits Visualisation


In [13]:
ROW = 2
COLUMN = 4
for i in range(ROW * COLUMN):
    # training_data[i][0] is the i-th image, stored as a flat vector of 28x28 pixels
    image = training_data[i][0].reshape(28, 28)   
    plt.subplot(ROW, COLUMN, i+1)          
    plt.imshow(image, cmap='gray')  # cmap='gray' is for black and white picture.
plt.axis('off')  # do not show axis value
plt.tight_layout()   # automatic padding between subplots
plt.show()


Part 1: Creating the Neural Networks

The input layer of the neural network contains neurons encoding the values of the input pixels. The training data for the network consists of many 28 by 28 pixel images of scanned handwritten digits, so the input layer contains 784 = 28×28 neurons. The second layer of the network is a hidden layer; we set the number of neurons in the hidden layer to 30. The output layer contains 10 neurons, one per digit class.

Question 2.1.1: Create the network described above using the NeuralNetwork class


In [14]:
#create the network
from NeuralNetwork import * 
my_mnist_net = NeuralNetwork(784, 30, 10, iterations=30, learning_rate=0.1)

Question 2.1.2: Add the information about the performance of the neural network on the test set at each epoch


In [15]:
test_accuracy=my_mnist_net.predict(test_data)/100
print('Test_Accuracy  %-2.2f' % test_accuracy)


Test_Accuracy  8.79

Question 2.1.3: Train the Neural Network and comment your findings


In [16]:
#train your network 
evaluations = my_mnist_net.train(training_data,validation_data)


Iteration:  1/30[==============] -Error: 0.1655674548  -Training_Accuracy:  88.06  -time: 15.64 
Iteration:  2/30[==============] -Error: 0.0900181135  -Training_Accuracy:  90.54  -time: 33.58 
Iteration:  3/30[==============] -Error: 0.0745799926  -Training_Accuracy:  91.95  -time: 47.70 
Iteration:  4/30[==============] -Error: 0.0662253743  -Training_Accuracy:  92.61  -time: 61.77 
Iteration:  5/30[==============] -Error: 0.0609684985  -Training_Accuracy:  93.30  -time: 75.87 
Iteration:  6/30[==============] -Error: 0.0567575099  -Training_Accuracy:  93.59  -time: 103.45 
Iteration:  7/30[==============] -Error: 0.0538454924  -Training_Accuracy:  94.10  -time: 117.36 
Iteration:  8/30[==============] -Error: 0.0511648403  -Training_Accuracy:  94.11  -time: 134.97 
Iteration:  9/30[==============] -Error: 0.0490640124  -Training_Accuracy:  94.53  -time: 153.27 
Iteration: 10/30[==============] -Error: 0.0470885581  -Training_Accuracy:  94.78  -time: 170.12 
Iteration: 11/30[==============] -Error: 0.0452450698  -Training_Accuracy:  94.95  -time: 186.18 
Iteration: 12/30[==============] -Error: 0.0435106317  -Training_Accuracy:  95.14  -time: 200.71 
Iteration: 13/30[==============] -Error: 0.0423460673  -Training_Accuracy:  95.26  -time: 223.88 
Iteration: 14/30[==============] -Error: 0.0409779653  -Training_Accuracy:  95.46  -time: 242.88 
Iteration: 15/30[==============] -Error: 0.0398718803  -Training_Accuracy:  95.54  -time: 265.24 
Iteration: 16/30[==============] -Error: 0.0387893913  -Training_Accuracy:  95.71  -time: 286.00 
Iteration: 17/30[==============] -Error: 0.0377897016  -Training_Accuracy:  95.75  -time: 301.02 
Iteration: 18/30[==============] -Error: 0.0371246456  -Training_Accuracy:  95.90  -time: 315.02 
Iteration: 19/30[==============] -Error: 0.0362451176  -Training_Accuracy:  96.02  -time: 336.75 
Iteration: 20/30[==============] -Error: 0.0354483124  -Training_Accuracy:  96.12  -time: 362.70 
Iteration: 21/30[==============] -Error: 0.0349168001  -Training_Accuracy:  96.12  -time: 376.76 
Iteration: 22/30[==============] -Error: 0.0342014590  -Training_Accuracy:  96.24  -time: 392.24 
Iteration: 23/30[==============] -Error: 0.0336115216  -Training_Accuracy:  96.31  -time: 406.18 
Iteration: 24/30[==============] -Error: 0.0331044299  -Training_Accuracy:  96.36  -time: 422.90 
Iteration: 25/30[==============] -Error: 0.0325105790  -Training_Accuracy:  96.44  -time: 438.33 
Iteration: 26/30[==============] -Error: 0.0319264355  -Training_Accuracy:  96.53  -time: 455.26 
Iteration: 27/30[==============] -Error: 0.0313326902  -Training_Accuracy:  96.49  -time: 471.43 
Iteration: 28/30[==============] -Error: 0.0309411694  -Training_Accuracy:  96.61  -time: 488.44 
Iteration: 29/30[==============] -Error: 0.0305512165  -Training_Accuracy:  96.63  -time: 505.73 
Iteration: 30/30[==============] -Error: 0.0298976480  -Training_Accuracy:  96.56  -time: 522.75 

In [17]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")



In [79]:
#save your model in Models/ using a distinguishing name for your model (architecture, learning rate, etc...)
my_mnist_net.save("Models/model_" + str(784) + "_" + str(30) + "_" + str(10) + "_" + str(0.1) + "_" + "30.model")
Observations
After each iteration the error decreases, with the largest drops in the first iterations. This looks reasonable.

The training accuracy also increases after each iteration, but at a decreasing rate. We think that, as with the error, improvement is easiest in the first iterations because the initial weights are random.

The validation accuracy follows a pattern similar to the training accuracy, but the improvement is not monotonic: sometimes an iteration yields a worse accuracy, because we optimize the net on the training data while here we are looking at validation data. At the end we obtain an accuracy of 95%, which is a few points below the training accuracy.

We can state that the net is not overfitting.

Question 2.1.4: Guess the digit. Implement and test a Python function that predicts the class of a digit (the folder images_test contains some example images of digits)


In [33]:
#Your implementation goes here
reload(NN)
from scipy import misc

def predict_image(my_mnist_net, img_path, number):    
    img = misc.imread(img_path, mode='L')   # read the image as a greyscale array
    plt.imshow(img, cmap='gray')
    plt.show()

    # MNIST digits are white on a black background: if the image is mostly
    # light (mean above mid-grey), invert it so it matches the training data
    mean = np.mean(img)
    if mean > 255/2:
        img = np.invert(img)

    img = misc.imresize(img, (28, 28))      # resize to the 28x28 input format
    plt.imshow(img, cmap='gray')
    plt.show()

    img = np.reshape(img, (28*28))          # flatten to a 784-pixel vector
    test = (img, number)
    return my_mnist_net.predict2(test)

print('predicted: %d, real number: %d' %predict_image(my_mnist_net,'./Images_test/4.bmp',4))
print('predicted: %d, real number: %d' %predict_image(my_mnist_net,'./Images_test/5.bmp',5))
print('predicted: %d, real number: %d' %predict_image(my_mnist_net,'./Images_test/9.bmp',9))


predicted: 4, real number: 4
predicted: 5, real number: 5
predicted: 9, real number: 9
Observations
At first, only one of the three test images was correctly recognized. The problem was that some digits were black on a white background. We then converted all the images to the same format (black background, white digit) and obtained 3 matches out of 3. To convert the images, we computed the mean colour of the figure and inverted all the colours if the image was predominantly white.

We think this is because the network was trained on white digits over a black background, so a black digit on a white background is not recognized.

Part 2: Change the neural network structure and parameters to optimize performance

Question 2.2.1: Change the learning rate (0.001, 0.1, 1.0 , 10). Train the new neural nets with the original specifications (Part 2.1), for 50 iterations. Plot test accuracy vs iteration for each learning rate on the same graph. Report the maximum test accuracy achieved for each learning rate. Which one achieves the maximum test accuracy?


In [16]:
#Your implementation with a learning rate of 0.001 goes here 
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=0.001, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.4859765042  -Training_Accuracy:  38.64  -time: 13.50 
Iteration:  2/50[==============] -Error: 0.3953032951  -Training_Accuracy:  48.22  -time: 29.14 
Iteration:  3/50[==============] -Error: 0.3636985282  -Training_Accuracy:  54.60  -time: 44.42 
Iteration:  4/50[==============] -Error: 0.3384765839  -Training_Accuracy:  59.06  -time: 62.47 
Iteration:  5/50[==============] -Error: 0.3178134567  -Training_Accuracy:  61.82  -time: 75.17 
Iteration:  6/50[==============] -Error: 0.3006752240  -Training_Accuracy:  64.03  -time: 88.44 
Iteration:  7/50[==============] -Error: 0.2864235242  -Training_Accuracy:  65.95  -time: 103.78 
Iteration:  8/50[==============] -Error: 0.2744133441  -Training_Accuracy:  67.47  -time: 116.48 
Iteration:  9/50[==============] -Error: 0.2641674555  -Training_Accuracy:  68.79  -time: 129.15 
Iteration: 10/50[==============] -Error: 0.2552398978  -Training_Accuracy:  69.84  -time: 141.80 
Iteration: 11/50[==============] -Error: 0.2473561060  -Training_Accuracy:  70.92  -time: 154.75 
Iteration: 12/50[==============] -Error: 0.2402894177  -Training_Accuracy:  71.51  -time: 167.19 
Iteration: 13/50[==============] -Error: 0.2339245901  -Training_Accuracy:  72.43  -time: 179.61 
Iteration: 14/50[==============] -Error: 0.2280662395  -Training_Accuracy:  73.03  -time: 192.01 
Iteration: 15/50[==============] -Error: 0.2226574113  -Training_Accuracy:  73.63  -time: 204.35 
Iteration: 16/50[==============] -Error: 0.2176276304  -Training_Accuracy:  74.25  -time: 216.80 
Iteration: 17/50[==============] -Error: 0.2129535827  -Training_Accuracy:  74.89  -time: 229.26 
Iteration: 18/50[==============] -Error: 0.2085459790  -Training_Accuracy:  75.47  -time: 241.79 
Iteration: 19/50[==============] -Error: 0.2044128483  -Training_Accuracy:  75.95  -time: 254.14 
Iteration: 20/50[==============] -Error: 0.2005037853  -Training_Accuracy:  76.42  -time: 266.52 
Iteration: 21/50[==============] -Error: 0.1968020613  -Training_Accuracy:  76.84  -time: 278.92 
Iteration: 22/50[==============] -Error: 0.1932900286  -Training_Accuracy:  77.32  -time: 291.31 
Iteration: 23/50[==============] -Error: 0.1899605420  -Training_Accuracy:  77.73  -time: 303.68 
Iteration: 24/50[==============] -Error: 0.1867713519  -Training_Accuracy:  78.14  -time: 316.03 
Iteration: 25/50[==============] -Error: 0.1837386212  -Training_Accuracy:  78.47  -time: 328.39 
Iteration: 26/50[==============] -Error: 0.1808208911  -Training_Accuracy:  78.80  -time: 341.77 
Iteration: 27/50[==============] -Error: 0.1780430696  -Training_Accuracy:  79.09  -time: 354.19 
Iteration: 28/50[==============] -Error: 0.1753627811  -Training_Accuracy:  79.40  -time: 366.58 
Iteration: 29/50[==============] -Error: 0.1727976567  -Training_Accuracy:  79.68  -time: 378.93 
Iteration: 30/50[==============] -Error: 0.1703225015  -Training_Accuracy:  80.03  -time: 391.27 
Iteration: 31/50[==============] -Error: 0.1679472942  -Training_Accuracy:  80.30  -time: 403.84 
Iteration: 32/50[==============] -Error: 0.1656523127  -Training_Accuracy:  80.55  -time: 416.32 
Iteration: 33/50[==============] -Error: 0.1634355077  -Training_Accuracy:  80.86  -time: 428.72 
Iteration: 34/50[==============] -Error: 0.1613107872  -Training_Accuracy:  81.11  -time: 441.13 
Iteration: 35/50[==============] -Error: 0.1592500086  -Training_Accuracy:  81.38  -time: 453.57 
Iteration: 36/50[==============] -Error: 0.1572608741  -Training_Accuracy:  81.55  -time: 465.98 
Iteration: 37/50[==============] -Error: 0.1553491151  -Training_Accuracy:  81.74  -time: 478.77 
Iteration: 38/50[==============] -Error: 0.1534848524  -Training_Accuracy:  81.97  -time: 491.37 
Iteration: 39/50[==============] -Error: 0.1516951672  -Training_Accuracy:  82.16  -time: 507.25 
Iteration: 40/50[==============] -Error: 0.1499692587  -Training_Accuracy:  82.36  -time: 519.77 
Iteration: 41/50[==============] -Error: 0.1482857016  -Training_Accuracy:  82.54  -time: 532.82 
Iteration: 42/50[==============] -Error: 0.1466767797  -Training_Accuracy:  82.71  -time: 545.81 
Iteration: 43/50[==============] -Error: 0.1450912319  -Training_Accuracy:  82.92  -time: 558.33 
Iteration: 44/50[==============] -Error: 0.1435799613  -Training_Accuracy:  83.08  -time: 570.79 
Iteration: 45/50[==============] -Error: 0.1421075606  -Training_Accuracy:  83.26  -time: 583.16 
Iteration: 46/50[==============] -Error: 0.1406720303  -Training_Accuracy:  83.42  -time: 595.59 
Iteration: 47/50[==============] -Error: 0.1393006323  -Training_Accuracy:  83.58  -time: 608.01 
Iteration: 48/50[==============] -Error: 0.1379626320  -Training_Accuracy:  83.69  -time: 620.35 
Iteration: 49/50[==============] -Error: 0.1366587311  -Training_Accuracy:  83.85  -time: 632.97 
Iteration: 50/50[==============] -Error: 0.1354017560  -Training_Accuracy:  84.00  -time: 646.72 

In [17]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
With a very low learning rate, the error decreases more slowly.
This means that it may take more time to reach a good result.

The training accuracy also increases after each iteration, but at a lower rate. The resulting curve is very smooth, which means that each update moves the weights only a little along the gradient; this is because the learning rate is very low.
We see the same pattern with the validation accuracy: the curve is smooth even on validation samples.
After 50 iterations the net can still improve; so far we only obtain a validation accuracy of 84%.

In [25]:
#Your implementation with a learning rate of 0.01 goes here 
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=0.01, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.3312215275  -Training_Accuracy:  69.04  -time: 17.18 
Iteration:  2/50[==============] -Error: 0.2223044897  -Training_Accuracy:  76.63  -time: 36.14 
Iteration:  3/50[==============] -Error: 0.1805254103  -Training_Accuracy:  80.04  -time: 53.76 
Iteration:  4/50[==============] -Error: 0.1561428295  -Training_Accuracy:  82.53  -time: 66.49 
Iteration:  5/50[==============] -Error: 0.1397394571  -Training_Accuracy:  84.16  -time: 78.76 
Iteration:  6/50[==============] -Error: 0.1277714313  -Training_Accuracy:  85.40  -time: 91.63 
Iteration:  7/50[==============] -Error: 0.1186125108  -Training_Accuracy:  86.35  -time: 104.13 
Iteration:  8/50[==============] -Error: 0.1114878433  -Training_Accuracy:  87.01  -time: 116.87 
Iteration:  9/50[==============] -Error: 0.1058014769  -Training_Accuracy:  87.72  -time: 130.19 
Iteration: 10/50[==============] -Error: 0.1011376766  -Training_Accuracy:  88.15  -time: 143.63 
Iteration: 11/50[==============] -Error: 0.0972567804  -Training_Accuracy:  88.59  -time: 157.54 
Iteration: 12/50[==============] -Error: 0.0938351605  -Training_Accuracy:  88.87  -time: 173.27 
Iteration: 13/50[==============] -Error: 0.0909427802  -Training_Accuracy:  89.20  -time: 187.11 
Iteration: 14/50[==============] -Error: 0.0884110332  -Training_Accuracy:  89.59  -time: 199.57 
Iteration: 15/50[==============] -Error: 0.0860803706  -Training_Accuracy:  89.74  -time: 211.94 
Iteration: 16/50[==============] -Error: 0.0840633499  -Training_Accuracy:  90.04  -time: 224.53 
Iteration: 17/50[==============] -Error: 0.0822228413  -Training_Accuracy:  90.27  -time: 236.90 
Iteration: 18/50[==============] -Error: 0.0804685390  -Training_Accuracy:  90.43  -time: 249.20 
Iteration: 19/50[==============] -Error: 0.0789105049  -Training_Accuracy:  90.61  -time: 261.58 
Iteration: 20/50[==============] -Error: 0.0774777561  -Training_Accuracy:  90.81  -time: 274.30 
Iteration: 21/50[==============] -Error: 0.0761086088  -Training_Accuracy:  90.97  -time: 286.84 
Iteration: 22/50[==============] -Error: 0.0748737492  -Training_Accuracy:  91.01  -time: 299.50 
Iteration: 23/50[==============] -Error: 0.0736697985  -Training_Accuracy:  91.26  -time: 311.85 
Iteration: 24/50[==============] -Error: 0.0726259262  -Training_Accuracy:  91.36  -time: 324.15 
Iteration: 25/50[==============] -Error: 0.0715718106  -Training_Accuracy:  91.51  -time: 345.58 
Iteration: 26/50[==============] -Error: 0.0705932188  -Training_Accuracy:  91.61  -time: 362.33 
Iteration: 27/50[==============] -Error: 0.0696363476  -Training_Accuracy:  91.73  -time: 378.71 
Iteration: 28/50[==============] -Error: 0.0687497375  -Training_Accuracy:  91.84  -time: 392.51 
Iteration: 29/50[==============] -Error: 0.0678594296  -Training_Accuracy:  91.95  -time: 406.30 
Iteration: 30/50[==============] -Error: 0.0670615860  -Training_Accuracy:  92.08  -time: 420.40 
Iteration: 31/50[==============] -Error: 0.0662298016  -Training_Accuracy:  92.15  -time: 434.21 
Iteration: 32/50[==============] -Error: 0.0655299164  -Training_Accuracy:  92.27  -time: 447.91 
Iteration: 33/50[==============] -Error: 0.0647685798  -Training_Accuracy:  92.33  -time: 461.67 
Iteration: 34/50[==============] -Error: 0.0640649591  -Training_Accuracy:  92.48  -time: 475.58 
Iteration: 35/50[==============] -Error: 0.0633928577  -Training_Accuracy:  92.59  -time: 489.50 
Iteration: 36/50[==============] -Error: 0.0627143623  -Training_Accuracy:  92.68  -time: 503.33 
Iteration: 37/50[==============] -Error: 0.0621506084  -Training_Accuracy:  92.74  -time: 517.11 
Iteration: 38/50[==============] -Error: 0.0615094102  -Training_Accuracy:  92.82  -time: 530.89 
Iteration: 39/50[==============] -Error: 0.0609618525  -Training_Accuracy:  92.85  -time: 545.51 
Iteration: 40/50[==============] -Error: 0.0603609598  -Training_Accuracy:  92.93  -time: 559.82 
Iteration: 41/50[==============] -Error: 0.0598468947  -Training_Accuracy:  93.07  -time: 573.56 
Iteration: 42/50[==============] -Error: 0.0593165329  -Training_Accuracy:  93.15  -time: 587.28 
Iteration: 43/50[==============] -Error: 0.0588194834  -Training_Accuracy:  93.24  -time: 601.12 
Iteration: 44/50[==============] -Error: 0.0583433471  -Training_Accuracy:  93.28  -time: 616.57 
Iteration: 45/50[==============] -Error: 0.0578620272  -Training_Accuracy:  93.33  -time: 630.46 
Iteration: 46/50[==============] -Error: 0.0574041914  -Training_Accuracy:  93.38  -time: 644.25 
Iteration: 47/50[==============] -Error: 0.0569310835  -Training_Accuracy:  93.44  -time: 658.20 
Iteration: 48/50[==============] -Error: 0.0565244516  -Training_Accuracy:  93.47  -time: 672.24 
Iteration: 49/50[==============] -Error: 0.0560839440  -Training_Accuracy:  93.52  -time: 685.96 
Iteration: 50/50[==============] -Error: 0.0556815695  -Training_Accuracy:  93.53  -time: 699.74 

In [26]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
Same as before, but here the learning rate is a little greater and the slope of all the curves is steeper.

After 50 iterations the net can still improve: we obtain a validation accuracy of 94%, and the slope of the curves suggests there is still room for improvement.

In [18]:
#Your implementation with a learning rate of 0.1 goes here 
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=0.1, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1664727551  -Training_Accuracy:  87.28  -time: 13.06 
Iteration:  2/50[==============] -Error: 0.0923832385  -Training_Accuracy:  90.37  -time: 26.06 
Iteration:  3/50[==============] -Error: 0.0758983674  -Training_Accuracy:  91.64  -time: 39.40 
Iteration:  4/50[==============] -Error: 0.0677050211  -Training_Accuracy:  92.69  -time: 52.00 
Iteration:  5/50[==============] -Error: 0.0621352380  -Training_Accuracy:  93.15  -time: 64.55 
Iteration:  6/50[==============] -Error: 0.0583716609  -Training_Accuracy:  93.59  -time: 77.16 
Iteration:  7/50[==============] -Error: 0.0555261435  -Training_Accuracy:  93.81  -time: 90.77 
Iteration:  8/50[==============] -Error: 0.0530157825  -Training_Accuracy:  94.19  -time: 109.55 
Iteration:  9/50[==============] -Error: 0.0508728181  -Training_Accuracy:  94.31  -time: 123.46 
Iteration: 10/50[==============] -Error: 0.0489569816  -Training_Accuracy:  94.40  -time: 136.47 
Iteration: 11/50[==============] -Error: 0.0474224008  -Training_Accuracy:  94.71  -time: 152.73 
Iteration: 12/50[==============] -Error: 0.0461137023  -Training_Accuracy:  94.97  -time: 165.80 
Iteration: 13/50[==============] -Error: 0.0445956145  -Training_Accuracy:  95.20  -time: 183.67 
Iteration: 14/50[==============] -Error: 0.0434070811  -Training_Accuracy:  95.21  -time: 197.58 
Iteration: 15/50[==============] -Error: 0.0423298697  -Training_Accuracy:  95.10  -time: 215.48 
Iteration: 16/50[==============] -Error: 0.0412397751  -Training_Accuracy:  95.53  -time: 228.07 
Iteration: 17/50[==============] -Error: 0.0401892492  -Training_Accuracy:  95.70  -time: 240.60 
Iteration: 18/50[==============] -Error: 0.0395053695  -Training_Accuracy:  95.81  -time: 257.48 
Iteration: 19/50[==============] -Error: 0.0387057297  -Training_Accuracy:  95.79  -time: 278.34 
Iteration: 20/50[==============] -Error: 0.0379411025  -Training_Accuracy:  95.86  -time: 298.13 
Iteration: 21/50[==============] -Error: 0.0372595145  -Training_Accuracy:  95.89  -time: 318.82 
Iteration: 22/50[==============] -Error: 0.0366133739  -Training_Accuracy:  96.03  -time: 344.07 
Iteration: 23/50[==============] -Error: 0.0358590081  -Training_Accuracy:  96.17  -time: 371.43 
Iteration: 24/50[==============] -Error: 0.0354264678  -Training_Accuracy:  96.23  -time: 404.70 
Iteration: 25/50[==============] -Error: 0.0348592548  -Training_Accuracy:  96.26  -time: 438.24 
Iteration: 26/50[==============] -Error: 0.0341021446  -Training_Accuracy:  96.36  -time: 486.06 
Iteration: 27/50[==============] -Error: 0.0336174579  -Training_Accuracy:  96.33  -time: 531.81 
Iteration: 28/50[==============] -Error: 0.0330403969  -Training_Accuracy:  96.37  -time: 581.44 
Iteration: 29/50[==============] -Error: 0.0327453452  -Training_Accuracy:  96.51  -time: 630.24 
Iteration: 30/50[==============] -Error: 0.0321364797  -Training_Accuracy:  96.53  -time: 652.97 
Iteration: 31/50[==============] -Error: 0.0318000802  -Training_Accuracy:  96.43  -time: 665.92 
Iteration: 32/50[==============] -Error: 0.0312824368  -Training_Accuracy:  96.51  -time: 680.31 
Iteration: 33/50[==============] -Error: 0.0308520135  -Training_Accuracy:  96.58  -time: 693.00 
Iteration: 34/50[==============] -Error: 0.0303655778  -Training_Accuracy:  96.63  -time: 705.47 
Iteration: 35/50[==============] -Error: 0.0300803703  -Training_Accuracy:  96.67  -time: 717.99 
Iteration: 36/50[==============] -Error: 0.0296854970  -Training_Accuracy:  96.76  -time: 730.59 
Iteration: 37/50[==============] -Error: 0.0293378231  -Training_Accuracy:  96.79  -time: 743.47 
Iteration: 38/50[==============] -Error: 0.0289111836  -Training_Accuracy:  96.82  -time: 761.73 
Iteration: 39/50[==============] -Error: 0.0285852916  -Training_Accuracy:  96.73  -time: 777.82 
Iteration: 40/50[==============] -Error: 0.0282486288  -Training_Accuracy:  96.87  -time: 791.67 
Iteration: 41/50[==============] -Error: 0.0280247024  -Training_Accuracy:  96.85  -time: 804.89 
Iteration: 42/50[==============] -Error: 0.0277103925  -Training_Accuracy:  96.92  -time: 818.11 
Iteration: 43/50[==============] -Error: 0.0273907233  -Training_Accuracy:  96.87  -time: 830.47 
Iteration: 44/50[==============] -Error: 0.0270130946  -Training_Accuracy:  96.94  -time: 842.71 
Iteration: 45/50[==============] -Error: 0.0269738001  -Training_Accuracy:  96.99  -time: 856.02 
Iteration: 46/50[==============] -Error: 0.0264930371  -Training_Accuracy:  97.02  -time: 868.31 
Iteration: 47/50[==============] -Error: 0.0263697697  -Training_Accuracy:  96.95  -time: 880.64 
Iteration: 48/50[==============] -Error: 0.0259954435  -Training_Accuracy:  97.03  -time: 893.32 
Iteration: 49/50[==============] -Error: 0.0257751541  -Training_Accuracy:  97.03  -time: 908.15 
Iteration: 50/50[==============] -Error: 0.0254204749  -Training_Accuracy:  97.12  -time: 926.98 

In [19]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
With a learning rate of 0.1 we obtain the best results so far.
This is probably close to the highest learning rate we can use: the validation accuracy reaches a plateau where it starts to oscillate around 94.5%. We think this is because the learning rate is high enough to produce a ping-pong effect around the minimum.

In [20]:
#Your implementation with a learning rate of 1.0 goes here 
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=1.0, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1083921067  -Training_Accuracy:  90.82  -time: 16.83 
Iteration:  2/50[==============] -Error: 0.0724475698  -Training_Accuracy:  92.24  -time: 34.20 
Iteration:  3/50[==============] -Error: 0.0660974927  -Training_Accuracy:  93.33  -time: 48.57 
Iteration:  4/50[==============] -Error: 0.0610788945  -Training_Accuracy:  92.75  -time: 62.45 
Iteration:  5/50[==============] -Error: 0.0582589844  -Training_Accuracy:  93.78  -time: 74.99 
Iteration:  6/50[==============] -Error: 0.0556305137  -Training_Accuracy:  93.47  -time: 87.38 
Iteration:  7/50[==============] -Error: 0.0540748004  -Training_Accuracy:  93.94  -time: 103.06 
Iteration:  8/50[==============] -Error: 0.0527171779  -Training_Accuracy:  94.41  -time: 118.57 
Iteration:  9/50[==============] -Error: 0.0494298246  -Training_Accuracy:  94.67  -time: 136.18 
Iteration: 10/50[==============] -Error: 0.0479511801  -Training_Accuracy:  95.02  -time: 152.80 
Iteration: 11/50[==============] -Error: 0.0484407165  -Training_Accuracy:  94.36  -time: 170.09 
Iteration: 12/50[==============] -Error: 0.0469719220  -Training_Accuracy:  95.23  -time: 188.95 
Iteration: 13/50[==============] -Error: 0.0471343520  -Training_Accuracy:  94.56  -time: 202.51 
Iteration: 14/50[==============] -Error: 0.0459254186  -Training_Accuracy:  95.05  -time: 221.91 
Iteration: 15/50[==============] -Error: 0.0450645994  -Training_Accuracy:  95.27  -time: 235.19 
Iteration: 16/50[==============] -Error: 0.0444982113  -Training_Accuracy:  95.31  -time: 248.28 
Iteration: 17/50[==============] -Error: 0.0424898474  -Training_Accuracy:  95.54  -time: 262.13 
Iteration: 18/50[==============] -Error: 0.0422444661  -Training_Accuracy:  95.35  -time: 276.51 
Iteration: 19/50[==============] -Error: 0.0422327892  -Training_Accuracy:  95.48  -time: 289.75 
Iteration: 20/50[==============] -Error: 0.0419653799  -Training_Accuracy:  95.78  -time: 303.97 
Iteration: 21/50[==============] -Error: 0.0408210214  -Training_Accuracy:  95.83  -time: 317.97 
Iteration: 22/50[==============] -Error: 0.0404926198  -Training_Accuracy:  95.62  -time: 331.94 
Iteration: 23/50[==============] -Error: 0.0403151155  -Training_Accuracy:  96.01  -time: 345.65 
Iteration: 24/50[==============] -Error: 0.0406357310  -Training_Accuracy:  95.69  -time: 364.44 
Iteration: 25/50[==============] -Error: 0.0400290718  -Training_Accuracy:  95.58  -time: 377.63 
Iteration: 26/50[==============] -Error: 0.0386833785  -Training_Accuracy:  95.69  -time: 392.71 
Iteration: 27/50[==============] -Error: 0.0385501336  -Training_Accuracy:  95.76  -time: 408.54 
Iteration: 28/50[==============] -Error: 0.0385468061  -Training_Accuracy:  95.87  -time: 442.64 
Iteration: 29/50[==============] -Error: 0.0378276257  -Training_Accuracy:  95.94  -time: 468.98 
Iteration: 30/50[==============] -Error: 0.0381420940  -Training_Accuracy:  95.68  -time: 490.47 
Iteration: 31/50[==============] -Error: 0.0375683399  -Training_Accuracy:  96.35  -time: 504.96 
Iteration: 32/50[==============] -Error: 0.0374014714  -Training_Accuracy:  95.52  -time: 519.59 
Iteration: 33/50[==============] -Error: 0.0371843073  -Training_Accuracy:  96.16  -time: 534.77 
Iteration: 34/50[==============] -Error: 0.0367161710  -Training_Accuracy:  95.82  -time: 550.02 
Iteration: 35/50[==============] -Error: 0.0365581148  -Training_Accuracy:  96.20  -time: 562.55 
Iteration: 36/50[==============] -Error: 0.0360571260  -Training_Accuracy:  96.12  -time: 575.07 
Iteration: 37/50[==============] -Error: 0.0356630606  -Training_Accuracy:  96.27  -time: 588.37 
Iteration: 38/50[==============] -Error: 0.0352771593  -Training_Accuracy:  96.38  -time: 606.69 
Iteration: 39/50[==============] -Error: 0.0351239331  -Training_Accuracy:  96.06  -time: 622.69 
Iteration: 40/50[==============] -Error: 0.0343912518  -Training_Accuracy:  96.38  -time: 637.27 
Iteration: 41/50[==============] -Error: 0.0336619663  -Training_Accuracy:  96.24  -time: 652.93 
Iteration: 42/50[==============] -Error: 0.0340680851  -Training_Accuracy:  96.41  -time: 670.99 
Iteration: 43/50[==============] -Error: 0.0343577991  -Training_Accuracy:  96.40  -time: 685.38 
Iteration: 44/50[==============] -Error: 0.0336136827  -Training_Accuracy:  96.56  -time: 702.04 
Iteration: 45/50[==============] -Error: 0.0334659341  -Training_Accuracy:  96.31  -time: 717.39 
Iteration: 46/50[==============] -Error: 0.0323116953  -Training_Accuracy:  96.85  -time: 732.23 
Iteration: 47/50[==============] -Error: 0.0318071579  -Training_Accuracy:  96.41  -time: 746.76 
Iteration: 48/50[==============] -Error: 0.0328192102  -Training_Accuracy:  96.41  -time: 761.94 
Iteration: 49/50[==============] -Error: 0.0325410007  -Training_Accuracy:  96.78  -time: 775.91 
Iteration: 50/50[==============] -Error: 0.0321697384  -Training_Accuracy:  96.64  -time: 788.93 

In [21]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
With a learning rate of 1.0, we see big fluctuations in both the training accuracy and the validation accuracy. The surprising thing is that in the end we obtain a slightly better result: 96.5% training accuracy and 95.5% validation accuracy.
Maybe the fluctuations are beneficial and help the model reach a better result.

In [22]:
#Your implementation with a learning rate of 10.0 goes here 
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=10.0, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.5184316462  -Training_Accuracy:  9.09  -time: 13.38 
Iteration:  2/50[==============] -Error: 0.5008010753  -Training_Accuracy:  17.87  -time: 26.85 
Iteration:  3/50[==============] -Error: 0.5002451455  -Training_Accuracy:  20.04  -time: 40.47 
Iteration:  4/50[==============] -Error: 0.5000468756  -Training_Accuracy:  10.92  -time: 54.13 
Iteration:  5/50[==============] -Error: 0.5001264602  -Training_Accuracy:  17.32  -time: 67.97 
Iteration:  6/50[==============] -Error: 0.5000491691  -Training_Accuracy:  15.41  -time: 83.02 
Iteration:  7/50[==============] -Error: 0.4999996631  -Training_Accuracy:  12.07  -time: 95.69 
Iteration:  8/50[==============] -Error: 0.5001920765  -Training_Accuracy:  12.64  -time: 108.36 
Iteration:  9/50[==============] -Error: 0.5000303581  -Training_Accuracy:  10.65  -time: 120.85 
Iteration: 10/50[==============] -Error: 0.4997744474  -Training_Accuracy:  13.34  -time: 133.84 
Iteration: 11/50[==============] -Error: 0.5000033406  -Training_Accuracy:  13.05  -time: 147.10 
Iteration: 12/50[==============] -Error: 0.4999999425  -Training_Accuracy:  13.06  -time: 160.33 
Iteration: 13/50[==============] -Error: 0.4999999369  -Training_Accuracy:  13.07  -time: 173.47 
Iteration: 14/50[==============] -Error: 0.4999999295  -Training_Accuracy:  13.06  -time: 186.38 
Iteration: 15/50[==============] -Error: 0.4999999192  -Training_Accuracy:  13.05  -time: 199.28 
Iteration: 16/50[==============] -Error: 0.4999999039  -Training_Accuracy:  13.07  -time: 212.94 
Iteration: 17/50[==============] -Error: 0.4999998775  -Training_Accuracy:  13.09  -time: 226.38 
Iteration: 18/50[==============] -Error: 0.4999998188  -Training_Accuracy:  13.08  -time: 239.76 
Iteration: 19/50[==============] -Error: 0.4999994308  -Training_Accuracy:  13.01  -time: 253.39 
Iteration: 20/50[==============] -Error: 0.5000103777  -Training_Accuracy:  13.07  -time: 268.33 
Iteration: 21/50[==============] -Error: 0.4998590142  -Training_Accuracy:  17.55  -time: 282.37 
Iteration: 22/50[==============] -Error: 0.4996863681  -Training_Accuracy:  10.64  -time: 294.78 
Iteration: 23/50[==============] -Error: 0.4985215511  -Training_Accuracy:  10.37  -time: 307.17 
Iteration: 24/50[==============] -Error: 0.4998431784  -Training_Accuracy:  13.47  -time: 320.93 
Iteration: 25/50[==============] -Error: 0.4990106564  -Training_Accuracy:  9.66  -time: 333.89 
Iteration: 26/50[==============] -Error: 0.4999996275  -Training_Accuracy:  9.64  -time: 346.48 
Iteration: 27/50[==============] -Error: 0.4999017411  -Training_Accuracy:  12.98  -time: 360.08 
Iteration: 28/50[==============] -Error: 0.4987800327  -Training_Accuracy:  14.08  -time: 372.94 
Iteration: 29/50[==============] -Error: 0.4983980442  -Training_Accuracy:  12.42  -time: 387.31 
Iteration: 30/50[==============] -Error: 0.4999999282  -Training_Accuracy:  12.37  -time: 400.91 
Iteration: 31/50[==============] -Error: 0.4999999186  -Training_Accuracy:  12.35  -time: 413.85 
Iteration: 32/50[==============] -Error: 0.4999999053  -Training_Accuracy:  12.34  -time: 426.99 
Iteration: 33/50[==============] -Error: 0.4999998847  -Training_Accuracy:  12.32  -time: 439.55 
Iteration: 34/50[==============] -Error: 0.4999998471  -Training_Accuracy:  12.31  -time: 452.38 
Iteration: 35/50[==============] -Error: 0.4999997420  -Training_Accuracy:  9.73  -time: 465.10 
Iteration: 36/50[==============] -Error: 0.4999997453  -Training_Accuracy:  12.32  -time: 478.44 
Iteration: 37/50[==============] -Error: 0.4999997341  -Training_Accuracy:  12.27  -time: 491.77 
Iteration: 38/50[==============] -Error: 0.5480787713  -Training_Accuracy:  12.30  -time: 504.72 
Iteration: 39/50[==============] -Error: 0.4999992650  -Training_Accuracy:  12.42  -time: 518.20 
Iteration: 40/50[==============] -Error: 0.4999795772  -Training_Accuracy:  11.54  -time: 537.79 
Iteration: 41/50[==============] -Error: 0.4999699592  -Training_Accuracy:  11.53  -time: 555.60 
Iteration: 42/50[==============] -Error: 0.4999699547  -Training_Accuracy:  11.53  -time: 571.56 
Iteration: 43/50[==============] -Error: 0.4999699492  -Training_Accuracy:  11.53  -time: 584.45 
Iteration: 44/50[==============] -Error: 0.4999699421  -Training_Accuracy:  11.52  -time: 598.72 
Iteration: 45/50[==============] -Error: 0.4999699324  -Training_Accuracy:  11.52  -time: 612.07 
Iteration: 46/50[==============] -Error: 0.4999699185  -Training_Accuracy:  11.52  -time: 625.17 
Iteration: 47/50[==============] -Error: 0.4999698961  -Training_Accuracy:  11.51  -time: 638.62 
Iteration: 48/50[==============] -Error: 0.4999698532  -Training_Accuracy:  11.50  -time: 652.28 
Iteration: 49/50[==============] -Error: 0.4999697177  -Training_Accuracy:  9.02  -time: 666.50 
Iteration: 50/50[==============] -Error: 0.4999720537  -Training_Accuracy:  11.54  -time: 679.68 

In [23]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
With a learning rate of 10.0, the training of our net is completely unstable and we obtain very bad results.
Final Observations
We saw that the learning rate is one of the most important parameters because it directly influences the training of our net. It looks like a learning rate between 0.1 and 1.0 is the best choice.

Question 2.2.2: Initialize all weights to 0. Plot the training accuracy curve. Comment on your results.


In [27]:
#Your implementation goes here
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=0.1, iterations=50)
my_mnist_net.weights_initialisation(np.zeros((28*28 + 1, 30)), np.zeros((31,10)))
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.4089646296  -Training_Accuracy:  28.48  -time: 20.27 
Iteration:  2/50[==============] -Error: 0.3715617453  -Training_Accuracy:  31.35  -time: 41.34 
Iteration:  3/50[==============] -Error: 0.3700178290  -Training_Accuracy:  32.74  -time: 57.77 
Iteration:  4/50[==============] -Error: 0.3694228377  -Training_Accuracy:  31.21  -time: 72.02 
Iteration:  5/50[==============] -Error: 0.3689573108  -Training_Accuracy:  31.71  -time: 87.33 
Iteration:  6/50[==============] -Error: 0.3687056744  -Training_Accuracy:  31.88  -time: 100.84 
Iteration:  7/50[==============] -Error: 0.3685375747  -Training_Accuracy:  35.59  -time: 115.71 
Iteration:  8/50[==============] -Error: 0.3683981473  -Training_Accuracy:  32.91  -time: 129.88 
Iteration:  9/50[==============] -Error: 0.3681146199  -Training_Accuracy:  31.48  -time: 143.64 
Iteration: 10/50[==============] -Error: 0.3680963706  -Training_Accuracy:  32.05  -time: 157.42 
Iteration: 11/50[==============] -Error: 0.3680431143  -Training_Accuracy:  34.83  -time: 171.16 
Iteration: 12/50[==============] -Error: 0.3679269589  -Training_Accuracy:  34.73  -time: 186.24 
Iteration: 13/50[==============] -Error: 0.3678165634  -Training_Accuracy:  31.31  -time: 200.00 
Iteration: 14/50[==============] -Error: 0.3677525791  -Training_Accuracy:  34.37  -time: 214.20 
Iteration: 15/50[==============] -Error: 0.3676338624  -Training_Accuracy:  34.23  -time: 228.03 
Iteration: 16/50[==============] -Error: 0.3676051980  -Training_Accuracy:  37.33  -time: 242.33 
Iteration: 17/50[==============] -Error: 0.3675005920  -Training_Accuracy:  36.11  -time: 259.53 
Iteration: 18/50[==============] -Error: 0.3673984545  -Training_Accuracy:  34.69  -time: 273.01 
Iteration: 19/50[==============] -Error: 0.3674986577  -Training_Accuracy:  36.63  -time: 286.33 
Iteration: 20/50[==============] -Error: 0.3673444685  -Training_Accuracy:  32.25  -time: 300.83 
Iteration: 21/50[==============] -Error: 0.3673834882  -Training_Accuracy:  32.91  -time: 321.45 
Iteration: 22/50[==============] -Error: 0.3673542543  -Training_Accuracy:  32.34  -time: 336.66 
Iteration: 23/50[==============] -Error: 0.3671353505  -Training_Accuracy:  35.40  -time: 351.78 
Iteration: 24/50[==============] -Error: 0.3671342845  -Training_Accuracy:  35.50  -time: 368.23 
Iteration: 25/50[==============] -Error: 0.3671141777  -Training_Accuracy:  34.13  -time: 383.38 
Iteration: 26/50[==============] -Error: 0.3671603744  -Training_Accuracy:  32.22  -time: 397.27 
Iteration: 27/50[==============] -Error: 0.3670494312  -Training_Accuracy:  36.18  -time: 411.02 
Iteration: 28/50[==============] -Error: 0.3670479099  -Training_Accuracy:  35.48  -time: 424.89 
Iteration: 29/50[==============] -Error: 0.3669267344  -Training_Accuracy:  35.63  -time: 438.80 
Iteration: 30/50[==============] -Error: 0.3669028457  -Training_Accuracy:  34.89  -time: 462.33 
Iteration: 31/50[==============] -Error: 0.3669202326  -Training_Accuracy:  34.17  -time: 475.65 
Iteration: 32/50[==============] -Error: 0.3668462348  -Training_Accuracy:  33.55  -time: 489.21 
Iteration: 33/50[==============] -Error: 0.3668760913  -Training_Accuracy:  33.87  -time: 502.64 
Iteration: 34/50[==============] -Error: 0.3668266603  -Training_Accuracy:  36.05  -time: 518.21 
Iteration: 35/50[==============] -Error: 0.3667521812  -Training_Accuracy:  36.43  -time: 532.05 
Iteration: 36/50[==============] -Error: 0.3666461558  -Training_Accuracy:  35.95  -time: 545.88 
Iteration: 37/50[==============] -Error: 0.3667597423  -Training_Accuracy:  36.31  -time: 559.64 
Iteration: 38/50[==============] -Error: 0.3666557282  -Training_Accuracy:  37.20  -time: 573.86 
Iteration: 39/50[==============] -Error: 0.3666867326  -Training_Accuracy:  36.27  -time: 587.88 
Iteration: 40/50[==============] -Error: 0.3666563886  -Training_Accuracy:  37.19  -time: 601.79 
Iteration: 41/50[==============] -Error: 0.3665842008  -Training_Accuracy:  35.34  -time: 615.52 
Iteration: 42/50[==============] -Error: 0.3666050737  -Training_Accuracy:  34.17  -time: 629.27 
Iteration: 43/50[==============] -Error: 0.3665736862  -Training_Accuracy:  36.43  -time: 642.97 
Iteration: 44/50[==============] -Error: 0.3665108308  -Training_Accuracy:  34.60  -time: 656.82 
Iteration: 45/50[==============] -Error: 0.3664489757  -Training_Accuracy:  34.85  -time: 670.65 
Iteration: 46/50[==============] -Error: 0.3665705849  -Training_Accuracy:  35.38  -time: 685.21 
Iteration: 47/50[==============] -Error: 0.3663964891  -Training_Accuracy:  36.83  -time: 699.38 
Iteration: 48/50[==============] -Error: 0.3663928370  -Training_Accuracy:  36.98  -time: 713.16 
Iteration: 49/50[==============] -Error: 0.3664281870  -Training_Accuracy:  35.52  -time: 726.94 
Iteration: 50/50[==============] -Error: 0.3663576781  -Training_Accuracy:  35.96  -time: 740.69 

In [28]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Your answer goes here

Observations

When all weights are 0, every hidden neuron receives the same net input (0) and therefore produces the same output, sigmoid(0) = 0.5. As a consequence, all the deltas within a layer are equal, so all the weight updates of a single layer are equal: all the neurons of a layer learn exactly the same thing.

This explains why the error barely decreases after the first few iterations and the accuracy stalls around 35%. It also shows why we need to randomize the initial weights: to break this symmetry.
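To illustrate the symmetry, here is a minimal standalone sketch (not part of the lab files) that mirrors the backPropagate formulas from Section 1 on the small 2-2-1 network: with all-zero weights, every hidden unit gets the same net input, the same output, and therefore the same delta and the same weight update.

import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

x = np.array([0.8, 0.2, 1.0])   # the two inputs plus the bias unit
W_ih = np.zeros((3, 2))         # input -> hidden weights, all zero
W_ho = np.zeros((3, 1))         # hidden (+ bias) -> output weights, all zero

h = sigmoid(x.dot(W_ih))        # both hidden activations are sigmoid(0) = 0.5
h = np.append(h, 1.0)           # append the hidden bias unit
y = sigmoid(h.dot(W_ho))        # the output is also sigmoid(0) = 0.5

delta_out = (y - 0.4) * y * (1 - y)                              # output delta
delta_hidden = W_ho[:-1].dot(delta_out) * h[:-1] * (1 - h[:-1])  # hidden deltas

print(h)              # [0.5 0.5 1. ] : identical hidden activations
print(delta_hidden)   # identical (here zero, since W_ho is zero) hidden deltas -> identical updates within a layer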

Question 2.2.3: Try a different transfer function (such as tanh). The file transfer_functions.py provides the Python implementation of the tanh function and its derivative.


In [30]:
#Your implementation goes here
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=0.1, iterations=50,transfer_function=TF.tanh,d_transfer_function=TF.dtanh)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 1.1954961310  -Training_Accuracy:  66.96  -time: 15.58 
Iteration:  2/50[==============] -Error: 1.0037567352  -Training_Accuracy:  77.14  -time: 28.06 
Iteration:  3/50[==============] -Error: 0.9558486091  -Training_Accuracy:  80.58  -time: 41.97 
Iteration:  4/50[==============] -Error: 0.9341380121  -Training_Accuracy:  71.66  -time: 55.46 
Iteration:  5/50[==============] -Error: 0.9154109693  -Training_Accuracy:  82.31  -time: 67.85 
Iteration:  6/50[==============] -Error: 0.9081055626  -Training_Accuracy:  82.33  -time: 80.33 
Iteration:  7/50[==============] -Error: 0.8908336148  -Training_Accuracy:  87.99  -time: 92.78 
Iteration:  8/50[==============] -Error: 0.8849879806  -Training_Accuracy:  87.20  -time: 105.08 
Iteration:  9/50[==============] -Error: 0.8771960499  -Training_Accuracy:  87.41  -time: 117.93 
Iteration: 10/50[==============] -Error: 0.8668901406  -Training_Accuracy:  85.92  -time: 131.63 
Iteration: 11/50[==============] -Error: 0.8692687816  -Training_Accuracy:  88.81  -time: 146.08 
Iteration: 12/50[==============] -Error: 0.8711757194  -Training_Accuracy:  87.87  -time: 158.58 
Iteration: 13/50[==============] -Error: 0.8595993289  -Training_Accuracy:  80.04  -time: 171.03 
Iteration: 14/50[==============] -Error: 0.8591815624  -Training_Accuracy:  84.12  -time: 183.64 
Iteration: 15/50[==============] -Error: 0.8494153658  -Training_Accuracy:  89.40  -time: 195.91 
Iteration: 16/50[==============] -Error: 0.8582281832  -Training_Accuracy:  88.42  -time: 208.18 
Iteration: 17/50[==============] -Error: 0.8426161794  -Training_Accuracy:  89.97  -time: 220.46 
Iteration: 18/50[==============] -Error: 0.8429654299  -Training_Accuracy:  83.89  -time: 232.73 
Iteration: 19/50[==============] -Error: 0.8543995365  -Training_Accuracy:  89.07  -time: 244.99 
Iteration: 20/50[==============] -Error: 0.8425652451  -Training_Accuracy:  86.91  -time: 257.71 
Iteration: 21/50[==============] -Error: 0.8392057848  -Training_Accuracy:  81.48  -time: 270.55 
Iteration: 22/50[==============] -Error: 0.8463190470  -Training_Accuracy:  87.95  -time: 283.48 
Iteration: 23/50[==============] -Error: 0.8345524724  -Training_Accuracy:  90.08  -time: 295.74 
Iteration: 24/50[==============] -Error: 0.8305792446  -Training_Accuracy:  89.63  -time: 309.07 
Iteration: 25/50[==============] -Error: 0.8289821445  -Training_Accuracy:  87.42  -time: 321.54 
Iteration: 26/50[==============] -Error: 0.8291375753  -Training_Accuracy:  89.91  -time: 333.98 
Iteration: 27/50[==============] -Error: 0.8269616431  -Training_Accuracy:  85.21  -time: 345.04 
Iteration: 28/50[==============] -Error: 0.8178724979  -Training_Accuracy:  90.17  -time: 356.48 
Iteration: 29/50[==============] -Error: 0.8195823161  -Training_Accuracy:  82.42  -time: 367.41 
Iteration: 30/50[==============] -Error: 0.8246619991  -Training_Accuracy:  90.88  -time: 378.33 
Iteration: 31/50[==============] -Error: 0.8219820880  -Training_Accuracy:  88.26  -time: 390.04 
Iteration: 32/50[==============] -Error: 0.8167935143  -Training_Accuracy:  90.36  -time: 400.98 
Iteration: 33/50[==============] -Error: 0.8228723972  -Training_Accuracy:  88.98  -time: 412.82 
Iteration: 34/50[==============] -Error: 0.8128562570  -Training_Accuracy:  90.61  -time: 425.36 
Iteration: 35/50[==============] -Error: 0.8118262910  -Training_Accuracy:  90.37  -time: 437.76 
Iteration: 36/50[==============] -Error: 0.8107208410  -Training_Accuracy:  89.70  -time: 455.29 
Iteration: 37/50[==============] -Error: 0.8141660241  -Training_Accuracy:  90.73  -time: 467.82 
Iteration: 38/50[==============] -Error: 0.8074261112  -Training_Accuracy:  83.27  -time: 479.71 
Iteration: 39/50[==============] -Error: 0.7960289170  -Training_Accuracy:  90.56  -time: 491.14 
Iteration: 40/50[==============] -Error: 0.7971170254  -Training_Accuracy:  90.69  -time: 502.69 
Iteration: 41/50[==============] -Error: 0.7984789855  -Training_Accuracy:  90.21  -time: 514.38 
Iteration: 42/50[==============] -Error: 0.8097630958  -Training_Accuracy:  89.97  -time: 528.44 
Iteration: 43/50[==============] -Error: 0.8068317007  -Training_Accuracy:  85.78  -time: 540.61 
Iteration: 44/50[==============] -Error: 0.7902285056  -Training_Accuracy:  90.18  -time: 551.80 
Iteration: 45/50[==============] -Error: 0.7882359408  -Training_Accuracy:  88.42  -time: 562.72 
Iteration: 46/50[==============] -Error: 0.7817277347  -Training_Accuracy:  91.06  -time: 574.49 
Iteration: 47/50[==============] -Error: 0.7906428053  -Training_Accuracy:  88.36  -time: 588.71 
Iteration: 48/50[==============] -Error: 0.7831658696  -Training_Accuracy:  90.31  -time: 600.16 
Iteration: 49/50[==============] -Error: 0.7848202288  -Training_Accuracy:  89.64  -time: 611.14 
Iteration: 50/50[==============] -Error: 0.7811713016  -Training_Accuracy:  90.20  -time: 622.05 

In [31]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Your answer goes here

Observations
The tanh transfer function performs worse than the sigmoid: about 97% training accuracy for the sigmoid vs 90% for the tanh. With tanh we also see many oscillations; maybe we should change the learning rate (the next cell tries a lower one).

In [52]:
# Same architecture (30 hidden neurons) with the tanh transfer function and a lower learning rate (0.01)
my_mnist_net = NeuralNetwork(28*28, 30, 10, learning_rate=0.01, iterations=50,
                             transfer_function=TF.tanh, d_transfer_function=TF.dtanh)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.3688623088  -Training_Accuracy:  58.68  -time: 14.24 
Iteration:  2/50[==============] -Error: 0.3138715414  -Training_Accuracy:  62.85  -time: 25.36 
Iteration:  3/50[==============] -Error: 0.2791992110  -Training_Accuracy:  72.26  -time: 37.31 
Iteration:  4/50[==============] -Error: 0.2514132469  -Training_Accuracy:  71.97  -time: 48.14 
Iteration:  5/50[==============] -Error: 0.2281110686  -Training_Accuracy:  75.71  -time: 59.64 
Iteration:  6/50[==============] -Error: 0.2067592982  -Training_Accuracy:  78.47  -time: 72.88 
Iteration:  7/50[==============] -Error: 0.1915024334  -Training_Accuracy:  80.72  -time: 84.51 
Iteration:  8/50[==============] -Error: 0.1805226967  -Training_Accuracy:  81.45  -time: 95.43 
Iteration:  9/50[==============] -Error: 0.1724811449  -Training_Accuracy:  82.84  -time: 107.42 
Iteration: 10/50[==============] -Error: 0.1652117053  -Training_Accuracy:  83.38  -time: 119.46 
Iteration: 11/50[==============] -Error: 0.1592613815  -Training_Accuracy:  83.81  -time: 132.67 
Iteration: 12/50[==============] -Error: 0.1543760613  -Training_Accuracy:  84.80  -time: 145.43 
Iteration: 13/50[==============] -Error: 0.1499186945  -Training_Accuracy:  84.30  -time: 156.64 
Iteration: 14/50[==============] -Error: 0.1456487970  -Training_Accuracy:  85.87  -time: 167.66 
Iteration: 15/50[==============] -Error: 0.1418746070  -Training_Accuracy:  86.11  -time: 179.86 
Iteration: 16/50[==============] -Error: 0.1377997628  -Training_Accuracy:  86.53  -time: 198.76 
Iteration: 17/50[==============] -Error: 0.1345087944  -Training_Accuracy:  86.68  -time: 211.53 
Iteration: 18/50[==============] -Error: 0.1315807343  -Training_Accuracy:  87.12  -time: 225.03 
Iteration: 19/50[==============] -Error: 0.1289805021  -Training_Accuracy:  87.29  -time: 237.30 
Iteration: 20/50[==============] -Error: 0.1265501479  -Training_Accuracy:  87.70  -time: 251.38 
Iteration: 21/50[==============] -Error: 0.1245476574  -Training_Accuracy:  86.99  -time: 262.80 
Iteration: 22/50[==============] -Error: 0.1228307661  -Training_Accuracy:  87.88  -time: 274.20 
Iteration: 23/50[==============] -Error: 0.1209206747  -Training_Accuracy:  87.49  -time: 285.09 
Iteration: 24/50[==============] -Error: 0.1192018103  -Training_Accuracy:  88.28  -time: 297.27 
Iteration: 25/50[==============] -Error: 0.1175417015  -Training_Accuracy:  87.89  -time: 308.20 
Iteration: 26/50[==============] -Error: 0.1161410121  -Training_Accuracy:  88.19  -time: 320.02 
Iteration: 27/50[==============] -Error: 0.1145777969  -Training_Accuracy:  89.07  -time: 331.60 
Iteration: 28/50[==============] -Error: 0.1133749994  -Training_Accuracy:  88.77  -time: 344.41 
Iteration: 29/50[==============] -Error: 0.1120794425  -Training_Accuracy:  88.93  -time: 362.51 
Iteration: 30/50[==============] -Error: 0.1109536377  -Training_Accuracy:  88.88  -time: 380.32 
Iteration: 31/50[==============] -Error: 0.1099188701  -Training_Accuracy:  89.10  -time: 391.23 
Iteration: 32/50[==============] -Error: 0.1090542673  -Training_Accuracy:  89.40  -time: 402.08 
Iteration: 33/50[==============] -Error: 0.1083190221  -Training_Accuracy:  88.79  -time: 413.13 
Iteration: 34/50[==============] -Error: 0.1073983899  -Training_Accuracy:  88.92  -time: 424.75 
Iteration: 35/50[==============] -Error: 0.1062430138  -Training_Accuracy:  89.18  -time: 436.47 
Iteration: 36/50[==============] -Error: 0.1057597703  -Training_Accuracy:  89.49  -time: 448.34 
Iteration: 37/50[==============] -Error: 0.1049280819  -Training_Accuracy:  89.40  -time: 460.49 
Iteration: 38/50[==============] -Error: 0.1043805017  -Training_Accuracy:  89.38  -time: 472.48 
Iteration: 39/50[==============] -Error: 0.1036378496  -Training_Accuracy:  89.96  -time: 483.59 
Iteration: 40/50[==============] -Error: 0.1032039618  -Training_Accuracy:  89.55  -time: 494.39 
Iteration: 41/50[==============] -Error: 0.1023699524  -Training_Accuracy:  89.84  -time: 505.23 
Iteration: 42/50[==============] -Error: 0.1019585680  -Training_Accuracy:  90.09  -time: 516.22 
Iteration: 43/50[==============] -Error: 0.1014303830  -Training_Accuracy:  89.78  -time: 527.14 
Iteration: 44/50[==============] -Error: 0.1008407188  -Training_Accuracy:  89.53  -time: 537.91 
Iteration: 45/50[==============] -Error: 0.1003164049  -Training_Accuracy:  90.00  -time: 549.33 
Iteration: 46/50[==============] -Error: 0.0999731426  -Training_Accuracy:  90.10  -time: 561.06 
Iteration: 47/50[==============] -Error: 0.0993271022  -Training_Accuracy:  89.80  -time: 572.59 
Iteration: 48/50[==============] -Error: 0.0988768329  -Training_Accuracy:  90.17  -time: 584.94 
Iteration: 49/50[==============] -Error: 0.0984162544  -Training_Accuracy:  89.97  -time: 596.77 
Iteration: 50/50[==============] -Error: 0.0980911896  -Training_Accuracy:  90.10  -time: 608.56 

In [53]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
Even after lowering the learning rate, the accuracy remains rather poor.
This suggests that the transfer function should be chosen carefully according to the problem.
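
A possible reason (an assumption on our part, not verified here) is that the 0/1 one-hot targets match the sigmoid output range but not the tanh range, since the targets are not rescaled to [-1, 1]. A quick check of the two ranges:

import numpy as np

x = np.linspace(-5, 5, 11)
sig = 1.0 / (1.0 + np.exp(-x))
print(np.round(sig, 3))         # sigmoid values stay in (0, 1), like the 0/1 targets
print(np.round(np.tanh(x), 3))  # tanh values span (-1, 1); targets are not rescaled here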

Question 2.2.4 : Add more neurons in the hidden layer (try with 100, 200, 300). Plot the curve representing the validation accuracy versus the number of neurons in the hidden layer. (Choose and justify other hyper-parameters)
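
A possible way to obtain the requested curve (validation accuracy versus hidden layer size) is sketched below; it reuses the NeuralNetwork class and UT.plot_curve as in the cells that follow, and assumes that train() returns the error, training-accuracy and validation-accuracy histories and that plot_curve accepts arbitrary x values:

hidden_sizes = [100, 200, 300]
final_val_acc = []
for h in hidden_sizes:
    net = NeuralNetwork(28*28, h, 10, learning_rate=0.1, iterations=50)
    evaluations = net.train(training_data, validation_data)
    final_val_acc.append(evaluations[2][-1])  # validation accuracy after the last iteration

# One point per hidden layer size
UT.plot_curve(hidden_sizes, final_val_acc, "Validation_Accuracy")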


In [32]:
# 100 neurons in the hidden layer, learning rate 0.1, 50 iterations
my_mnist_net = NeuralNetwork(28*28, 100, 10, learning_rate=0.1, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1268500114  -Training_Accuracy:  89.84  -time: 30.49 
Iteration:  2/50[==============] -Error: 0.0794810963  -Training_Accuracy:  92.37  -time: 67.60 
Iteration:  3/50[==============] -Error: 0.0652931397  -Training_Accuracy:  93.64  -time: 95.17 
Iteration:  4/50[==============] -Error: 0.0565248980  -Training_Accuracy:  94.40  -time: 126.00 
Iteration:  5/50[==============] -Error: 0.0504918769  -Training_Accuracy:  94.98  -time: 152.72 
Iteration:  6/50[==============] -Error: 0.0457521597  -Training_Accuracy:  95.30  -time: 184.14 
Iteration:  7/50[==============] -Error: 0.0421894590  -Training_Accuracy:  95.70  -time: 208.53 
Iteration:  8/50[==============] -Error: 0.0391551011  -Training_Accuracy:  96.04  -time: 232.52 
Iteration:  9/50[==============] -Error: 0.0364598872  -Training_Accuracy:  96.27  -time: 266.76 
Iteration: 10/50[==============] -Error: 0.0342296890  -Training_Accuracy:  96.58  -time: 292.31 
Iteration: 11/50[==============] -Error: 0.0322722108  -Training_Accuracy:  96.80  -time: 316.08 
Iteration: 12/50[==============] -Error: 0.0303530713  -Training_Accuracy:  96.94  -time: 339.86 
Iteration: 13/50[==============] -Error: 0.0288913171  -Training_Accuracy:  97.08  -time: 363.81 
Iteration: 14/50[==============] -Error: 0.0274066154  -Training_Accuracy:  97.22  -time: 387.53 
Iteration: 15/50[==============] -Error: 0.0260874583  -Training_Accuracy:  97.30  -time: 412.01 
Iteration: 16/50[==============] -Error: 0.0248719406  -Training_Accuracy:  97.40  -time: 435.79 
Iteration: 17/50[==============] -Error: 0.0237814110  -Training_Accuracy:  97.51  -time: 459.63 
Iteration: 18/50[==============] -Error: 0.0228439962  -Training_Accuracy:  97.60  -time: 483.31 
Iteration: 19/50[==============] -Error: 0.0218611524  -Training_Accuracy:  97.68  -time: 507.04 
Iteration: 20/50[==============] -Error: 0.0210272205  -Training_Accuracy:  97.73  -time: 531.70 
Iteration: 21/50[==============] -Error: 0.0202463405  -Training_Accuracy:  97.82  -time: 557.71 
Iteration: 22/50[==============] -Error: 0.0195229707  -Training_Accuracy:  97.86  -time: 583.25 
Iteration: 23/50[==============] -Error: 0.0188170678  -Training_Accuracy:  97.92  -time: 609.96 
Iteration: 24/50[==============] -Error: 0.0182172369  -Training_Accuracy:  97.95  -time: 635.61 
Iteration: 25/50[==============] -Error: 0.0176485720  -Training_Accuracy:  98.00  -time: 659.55 
Iteration: 26/50[==============] -Error: 0.0170122861  -Training_Accuracy:  98.08  -time: 682.94 
Iteration: 27/50[==============] -Error: 0.0165431676  -Training_Accuracy:  98.09  -time: 706.47 
Iteration: 28/50[==============] -Error: 0.0160012191  -Training_Accuracy:  98.11  -time: 730.73 
Iteration: 29/50[==============] -Error: 0.0155507189  -Training_Accuracy:  98.13  -time: 754.35 
Iteration: 30/50[==============] -Error: 0.0151092730  -Training_Accuracy:  98.16  -time: 778.44 
Iteration: 31/50[==============] -Error: 0.0147330607  -Training_Accuracy:  98.20  -time: 802.73 
Iteration: 32/50[==============] -Error: 0.0143563525  -Training_Accuracy:  98.21  -time: 828.85 
Iteration: 33/50[==============] -Error: 0.0140289896  -Training_Accuracy:  98.23  -time: 853.61 
Iteration: 34/50[==============] -Error: 0.0137129742  -Training_Accuracy:  98.25  -time: 878.22 
Iteration: 35/50[==============] -Error: 0.0134216763  -Training_Accuracy:  98.27  -time: 901.21 
Iteration: 36/50[==============] -Error: 0.0131508960  -Training_Accuracy:  98.29  -time: 924.70 
Iteration: 37/50[==============] -Error: 0.0128895251  -Training_Accuracy:  98.31  -time: 948.39 
Iteration: 38/50[==============] -Error: 0.0126235890  -Training_Accuracy:  98.32  -time: 971.99 
Iteration: 39/50[==============] -Error: 0.0124025151  -Training_Accuracy:  98.33  -time: 995.31 
Iteration: 40/50[==============] -Error: 0.0121626468  -Training_Accuracy:  98.34  -time: 1019.52 
Iteration: 41/50[==============] -Error: 0.0119705313  -Training_Accuracy:  98.37  -time: 1045.05 
Iteration: 42/50[==============] -Error: 0.0117681809  -Training_Accuracy:  98.37  -time: 1068.15 
Iteration: 43/50[==============] -Error: 0.0115829831  -Training_Accuracy:  98.39  -time: 1091.36 
Iteration: 44/50[==============] -Error: 0.0113947584  -Training_Accuracy:  98.41  -time: 1114.58 
Iteration: 45/50[==============] -Error: 0.0112390290  -Training_Accuracy:  98.43  -time: 1138.23 
Iteration: 46/50[==============] -Error: 0.0110951943  -Training_Accuracy:  98.44  -time: 1161.45 
Iteration: 47/50[==============] -Error: 0.0109260779  -Training_Accuracy:  98.45  -time: 1185.62 
Iteration: 48/50[==============] -Error: 0.0107876568  -Training_Accuracy:  98.46  -time: 1208.70 
Iteration: 49/50[==============] -Error: 0.0106577116  -Training_Accuracy:  98.47  -time: 1231.71 
Iteration: 50/50[==============] -Error: 0.0105247097  -Training_Accuracy:  98.50  -time: 1254.86 

In [33]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")



In [34]:
# 200 neurons in the hidden layer, same learning rate and number of iterations
my_mnist_net = NeuralNetwork(28*28, 200, 10, learning_rate=0.1, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1147831921  -Training_Accuracy:  91.00  -time: 47.79 
Iteration:  2/50[==============] -Error: 0.0750538955  -Training_Accuracy:  93.17  -time: 95.25 
Iteration:  3/50[==============] -Error: 0.0609653363  -Training_Accuracy:  94.14  -time: 142.30 
Iteration:  4/50[==============] -Error: 0.0521653289  -Training_Accuracy:  94.67  -time: 189.40 
Iteration:  5/50[==============] -Error: 0.0460117198  -Training_Accuracy:  95.53  -time: 236.63 
Iteration:  6/50[==============] -Error: 0.0412511240  -Training_Accuracy:  95.87  -time: 283.95 
Iteration:  7/50[==============] -Error: 0.0374865033  -Training_Accuracy:  96.32  -time: 331.19 
Iteration:  8/50[==============] -Error: 0.0343969228  -Training_Accuracy:  96.55  -time: 378.86 
Iteration:  9/50[==============] -Error: 0.0316770894  -Training_Accuracy:  96.84  -time: 426.05 
Iteration: 10/50[==============] -Error: 0.0295265670  -Training_Accuracy:  97.05  -time: 473.36 
Iteration: 11/50[==============] -Error: 0.0271802202  -Training_Accuracy:  97.16  -time: 521.31 
Iteration: 12/50[==============] -Error: 0.0254827165  -Training_Accuracy:  97.37  -time: 568.57 
Iteration: 13/50[==============] -Error: 0.0238098892  -Training_Accuracy:  97.37  -time: 615.79 
Iteration: 14/50[==============] -Error: 0.0223241136  -Training_Accuracy:  97.57  -time: 662.89 
Iteration: 15/50[==============] -Error: 0.0209411882  -Training_Accuracy:  97.67  -time: 710.39 
Iteration: 16/50[==============] -Error: 0.0197862823  -Training_Accuracy:  97.78  -time: 757.88 
Iteration: 17/50[==============] -Error: 0.0188045575  -Training_Accuracy:  97.87  -time: 804.89 
Iteration: 18/50[==============] -Error: 0.0178745860  -Training_Accuracy:  97.93  -time: 852.70 
Iteration: 19/50[==============] -Error: 0.0169322406  -Training_Accuracy:  98.01  -time: 899.95 
Iteration: 20/50[==============] -Error: 0.0161971739  -Training_Accuracy:  98.05  -time: 947.17 
Iteration: 21/50[==============] -Error: 0.0154766214  -Training_Accuracy:  98.10  -time: 994.31 
Iteration: 22/50[==============] -Error: 0.0148373205  -Training_Accuracy:  98.11  -time: 1041.56 
Iteration: 23/50[==============] -Error: 0.0142816570  -Training_Accuracy:  98.16  -time: 1088.77 
Iteration: 24/50[==============] -Error: 0.0137912522  -Training_Accuracy:  98.20  -time: 1136.40 
Iteration: 25/50[==============] -Error: 0.0132674890  -Training_Accuracy:  98.26  -time: 1183.67 
Iteration: 26/50[==============] -Error: 0.0129276897  -Training_Accuracy:  98.30  -time: 1230.87 
Iteration: 27/50[==============] -Error: 0.0125163232  -Training_Accuracy:  98.33  -time: 1278.01 
Iteration: 28/50[==============] -Error: 0.0121169302  -Training_Accuracy:  98.35  -time: 1325.16 
Iteration: 29/50[==============] -Error: 0.0117804126  -Training_Accuracy:  98.40  -time: 1372.96 
Iteration: 30/50[==============] -Error: 0.0114908575  -Training_Accuracy:  98.41  -time: 1420.50 
Iteration: 31/50[==============] -Error: 0.0112430227  -Training_Accuracy:  98.43  -time: 1467.66 
Iteration: 32/50[==============] -Error: 0.0109529691  -Training_Accuracy:  98.43  -time: 1514.76 
Iteration: 33/50[==============] -Error: 0.0106740548  -Training_Accuracy:  98.47  -time: 1562.29 
Iteration: 34/50[==============] -Error: 0.0104614579  -Training_Accuracy:  98.49  -time: 1609.40 
Iteration: 35/50[==============] -Error: 0.0102451955  -Training_Accuracy:  98.50  -time: 1656.46 
Iteration: 36/50[==============] -Error: 0.0100454140  -Training_Accuracy:  98.51  -time: 1703.80 
Iteration: 37/50[==============] -Error: 0.0098738691  -Training_Accuracy:  98.53  -time: 1751.22 
Iteration: 38/50[==============] -Error: 0.0097380863  -Training_Accuracy:  98.55  -time: 1798.54 
Iteration: 39/50[==============] -Error: 0.0096008583  -Training_Accuracy:  98.57  -time: 1845.77 
Iteration: 40/50[==============] -Error: 0.0094736610  -Training_Accuracy:  98.58  -time: 1893.04 
Iteration: 41/50[==============] -Error: 0.0093423156  -Training_Accuracy:  98.59  -time: 1940.21 
Iteration: 42/50[==============] -Error: 0.0091923274  -Training_Accuracy:  98.61  -time: 1987.33 
Iteration: 43/50[==============] -Error: 0.0090796101  -Training_Accuracy:  98.62  -time: 2034.68 
Iteration: 44/50[==============] -Error: 0.0089842476  -Training_Accuracy:  98.63  -time: 2082.14 
Iteration: 45/50[==============] -Error: 0.0088912105  -Training_Accuracy:  98.64  -time: 2129.39 
Iteration: 46/50[==============] -Error: 0.0087899338  -Training_Accuracy:  98.64  -time: 2176.91 
Iteration: 47/50[==============] -Error: 0.0086743969  -Training_Accuracy:  98.65  -time: 2223.97 
Iteration: 48/50[==============] -Error: 0.0085879363  -Training_Accuracy:  98.66  -time: 2271.39 
Iteration: 49/50[==============] -Error: 0.0084959069  -Training_Accuracy:  98.67  -time: 2318.52 
Iteration: 50/50[==============] -Error: 0.0084195919  -Training_Accuracy:  98.68  -time: 2365.87 

In [35]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")



In [36]:
# 300 neurons in the hidden layer, same learning rate and number of iterations
my_mnist_net = NeuralNetwork(28*28, 300, 10, learning_rate=0.1, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1078365927  -Training_Accuracy:  91.53  -time: 134.35 
Iteration:  2/50[==============] -Error: 0.0717383261  -Training_Accuracy:  93.81  -time: 244.43 
Iteration:  3/50[==============] -Error: 0.0586870443  -Training_Accuracy:  93.99  -time: 354.60 
Iteration:  4/50[==============] -Error: 0.0503120262  -Training_Accuracy:  95.34  -time: 464.77 
Iteration:  5/50[==============] -Error: 0.0440187541  -Training_Accuracy:  95.80  -time: 551.88 
Iteration:  6/50[==============] -Error: 0.0393003128  -Training_Accuracy:  95.98  -time: 638.81 
Iteration:  7/50[==============] -Error: 0.0349806051  -Training_Accuracy:  96.54  -time: 725.34 
Iteration:  8/50[==============] -Error: 0.0318702961  -Training_Accuracy:  96.91  -time: 812.03 
Iteration:  9/50[==============] -Error: 0.0290220216  -Training_Accuracy:  97.01  -time: 898.82 
Iteration: 10/50[==============] -Error: 0.0266755271  -Training_Accuracy:  97.32  -time: 985.42 
Iteration: 11/50[==============] -Error: 0.0246019062  -Training_Accuracy:  97.49  -time: 1072.20 
Iteration: 12/50[==============] -Error: 0.0225283922  -Training_Accuracy:  97.54  -time: 1159.75 
Iteration: 13/50[==============] -Error: 0.0211397584  -Training_Accuracy:  97.70  -time: 1246.26 
Iteration: 14/50[==============] -Error: 0.0196574672  -Training_Accuracy:  97.78  -time: 1332.97 
Iteration: 15/50[==============] -Error: 0.0183298938  -Training_Accuracy:  97.87  -time: 1419.59 
Iteration: 16/50[==============] -Error: 0.0172481024  -Training_Accuracy:  97.95  -time: 1506.32 
Iteration: 17/50[==============] -Error: 0.0162040472  -Training_Accuracy:  98.02  -time: 1592.84 
Iteration: 18/50[==============] -Error: 0.0153299989  -Training_Accuracy:  98.04  -time: 1679.65 
Iteration: 19/50[==============] -Error: 0.0145642026  -Training_Accuracy:  98.10  -time: 1767.09 
Iteration: 20/50[==============] -Error: 0.0138846898  -Training_Accuracy:  98.16  -time: 1853.72 
Iteration: 21/50[==============] -Error: 0.0134539083  -Training_Accuracy:  98.21  -time: 1940.24 
Iteration: 22/50[==============] -Error: 0.0128727232  -Training_Accuracy:  98.23  -time: 2026.84 
Iteration: 23/50[==============] -Error: 0.0124082902  -Training_Accuracy:  98.24  -time: 2113.50 
Iteration: 24/50[==============] -Error: 0.0120287710  -Training_Accuracy:  98.28  -time: 2200.05 
Iteration: 25/50[==============] -Error: 0.0116857025  -Training_Accuracy:  98.31  -time: 2287.06 
Iteration: 26/50[==============] -Error: 0.0114068184  -Training_Accuracy:  98.33  -time: 2373.90 
Iteration: 27/50[==============] -Error: 0.0111342481  -Training_Accuracy:  98.36  -time: 2460.55 
Iteration: 28/50[==============] -Error: 0.0109000105  -Training_Accuracy:  98.37  -time: 2547.12 
Iteration: 29/50[==============] -Error: 0.0106762419  -Training_Accuracy:  98.41  -time: 2633.75 
Iteration: 30/50[==============] -Error: 0.0104491740  -Training_Accuracy:  98.43  -time: 2720.44 
Iteration: 31/50[==============] -Error: 0.0102400213  -Training_Accuracy:  98.44  -time: 2807.12 
Iteration: 32/50[==============] -Error: 0.0100325417  -Training_Accuracy:  98.47  -time: 2893.69 
Iteration: 33/50[==============] -Error: 0.0098499423  -Training_Accuracy:  98.50  -time: 2980.73 
Iteration: 34/50[==============] -Error: 0.0096884728  -Training_Accuracy:  98.51  -time: 3067.38 
Iteration: 35/50[==============] -Error: 0.0095137701  -Training_Accuracy:  98.53  -time: 3154.00 
Iteration: 36/50[==============] -Error: 0.0093633392  -Training_Accuracy:  98.53  -time: 3240.60 
Iteration: 37/50[==============] -Error: 0.0091746386  -Training_Accuracy:  98.55  -time: 3327.36 
Iteration: 38/50[==============] -Error: 0.0090784179  -Training_Accuracy:  98.57  -time: 3413.94 
Iteration: 39/50[==============] -Error: 0.0089672770  -Training_Accuracy:  98.58  -time: 3500.66 
Iteration: 40/50[==============] -Error: 0.0088667348  -Training_Accuracy:  98.59  -time: 3587.65 
Iteration: 41/50[==============] -Error: 0.0087396892  -Training_Accuracy:  98.60  -time: 3674.37 
Iteration: 42/50[==============] -Error: 0.0086444439  -Training_Accuracy:  98.62  -time: 3760.96 
Iteration: 43/50[==============] -Error: 0.0085545332  -Training_Accuracy:  98.63  -time: 3847.50 
Iteration: 44/50[==============] -Error: 0.0084551882  -Training_Accuracy:  98.64  -time: 3934.63 
Iteration: 45/50[==============] -Error: 0.0083699784  -Training_Accuracy:  98.64  -time: 4021.12 
Iteration: 46/50[==============] -Error: 0.0082766187  -Training_Accuracy:  98.65  -time: 4107.73 
Iteration: 47/50[==============] -Error: 0.0082010501  -Training_Accuracy:  98.67  -time: 4194.76 
Iteration: 48/50[==============] -Error: 0.0081481795  -Training_Accuracy:  98.66  -time: 4281.60 
Iteration: 49/50[==============] -Error: 0.0080860521  -Training_Accuracy:  98.68  -time: 4368.06 
Iteration: 50/50[==============] -Error: 0.0080003270  -Training_Accuracy:  98.69  -time: 4454.56 

In [37]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")




In [50]:
# 100 hidden neurons again, but with a larger learning rate (1.0)
my_mnist_net = NeuralNetwork(28*28, 100, 10, learning_rate=1.0, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1044525482  -Training_Accuracy:  92.97  -time: 30.01 
Iteration:  2/50[==============] -Error: 0.0633675508  -Training_Accuracy:  93.33  -time: 53.12 
Iteration:  3/50[==============] -Error: 0.0537446218  -Training_Accuracy:  94.53  -time: 76.02 
Iteration:  4/50[==============] -Error: 0.0486671421  -Training_Accuracy:  95.31  -time: 100.37 
Iteration:  5/50[==============] -Error: 0.0451096115  -Training_Accuracy:  95.64  -time: 125.56 
Iteration:  6/50[==============] -Error: 0.0416869338  -Training_Accuracy:  95.89  -time: 148.60 
Iteration:  7/50[==============] -Error: 0.0409889159  -Training_Accuracy:  96.00  -time: 171.51 
Iteration:  8/50[==============] -Error: 0.0392427264  -Training_Accuracy:  96.42  -time: 194.43 
Iteration:  9/50[==============] -Error: 0.0365482051  -Training_Accuracy:  96.60  -time: 218.00 
Iteration: 10/50[==============] -Error: 0.0348344635  -Training_Accuracy:  96.33  -time: 241.34 
Iteration: 11/50[==============] -Error: 0.0331665900  -Training_Accuracy:  96.71  -time: 264.30 
Iteration: 12/50[==============] -Error: 0.0336390678  -Training_Accuracy:  96.80  -time: 287.55 
Iteration: 13/50[==============] -Error: 0.0333422312  -Training_Accuracy:  97.06  -time: 316.48 
Iteration: 14/50[==============] -Error: 0.0311217553  -Training_Accuracy:  97.19  -time: 345.99 
Iteration: 15/50[==============] -Error: 0.0309827751  -Training_Accuracy:  96.90  -time: 379.35 
Iteration: 16/50[==============] -Error: 0.0301843074  -Training_Accuracy:  96.97  -time: 405.48 
Iteration: 17/50[==============] -Error: 0.0289908138  -Training_Accuracy:  97.16  -time: 438.32 
Iteration: 18/50[==============] -Error: 0.0285188004  -Training_Accuracy:  97.11  -time: 462.21 
Iteration: 19/50[==============] -Error: 0.0268095502  -Training_Accuracy:  97.32  -time: 486.49 
Iteration: 20/50[==============] -Error: 0.0260671829  -Training_Accuracy:  97.59  -time: 510.64 
Iteration: 21/50[==============] -Error: 0.0248522437  -Training_Accuracy:  97.79  -time: 534.62 
Iteration: 22/50[==============] -Error: 0.0247450553  -Training_Accuracy:  97.62  -time: 564.91 
Iteration: 23/50[==============] -Error: 0.0243791724  -Training_Accuracy:  97.31  -time: 588.19 
Iteration: 24/50[==============] -Error: 0.0245431831  -Training_Accuracy:  97.76  -time: 611.23 
Iteration: 25/50[==============] -Error: 0.0243891688  -Training_Accuracy:  97.56  -time: 635.46 
Iteration: 26/50[==============] -Error: 0.0226318407  -Training_Accuracy:  97.85  -time: 659.90 
Iteration: 27/50[==============] -Error: 0.0225604924  -Training_Accuracy:  97.80  -time: 683.83 
Iteration: 28/50[==============] -Error: 0.0210511317  -Training_Accuracy:  98.01  -time: 707.34 
Iteration: 29/50[==============] -Error: 0.0211115230  -Training_Accuracy:  98.11  -time: 731.63 
Iteration: 30/50[==============] -Error: 0.0212092657  -Training_Accuracy:  97.84  -time: 754.96 
Iteration: 31/50[==============] -Error: 0.0219228154  -Training_Accuracy:  97.88  -time: 784.15 
Iteration: 32/50[==============] -Error: 0.0206109494  -Training_Accuracy:  98.26  -time: 809.21 
Iteration: 33/50[==============] -Error: 0.0193262676  -Training_Accuracy:  98.17  -time: 837.63 
Iteration: 34/50[==============] -Error: 0.0196226193  -Training_Accuracy:  98.24  -time: 865.10 
Iteration: 35/50[==============] -Error: 0.0197158189  -Training_Accuracy:  98.15  -time: 888.49 
Iteration: 36/50[==============] -Error: 0.0196174534  -Training_Accuracy:  98.33  -time: 912.12 
Iteration: 37/50[==============] -Error: 0.0179509287  -Training_Accuracy:  98.29  -time: 936.15 
Iteration: 38/50[==============] -Error: 0.0176158004  -Training_Accuracy:  98.45  -time: 960.07 
Iteration: 39/50[==============] -Error: 0.0168376186  -Training_Accuracy:  98.44  -time: 983.69 
Iteration: 40/50[==============] -Error: 0.0170818752  -Training_Accuracy:  98.30  -time: 1009.34 
Iteration: 41/50[==============] -Error: 0.0163666990  -Training_Accuracy:  98.39  -time: 1035.23 
Iteration: 42/50[==============] -Error: 0.0163327858  -Training_Accuracy:  98.43  -time: 1069.00 
Iteration: 43/50[==============] -Error: 0.0162884510  -Training_Accuracy:  98.53  -time: 1095.97 
Iteration: 44/50[==============] -Error: 0.0153743001  -Training_Accuracy:  98.58  -time: 1119.30 
Iteration: 45/50[==============] -Error: 0.0153998617  -Training_Accuracy:  98.46  -time: 1143.93 
Iteration: 46/50[==============] -Error: 0.0158380920  -Training_Accuracy:  98.52  -time: 1168.88 
Iteration: 47/50[==============] -Error: 0.0157551831  -Training_Accuracy:  98.51  -time: 1198.41 
Iteration: 48/50[==============] -Error: 0.0152686971  -Training_Accuracy:  98.57  -time: 1225.46 
Iteration: 49/50[==============] -Error: 0.0147065500  -Training_Accuracy:  98.49  -time: 1251.80 
Iteration: 50/50[==============] -Error: 0.0146741206  -Training_Accuracy:  98.54  -time: 1279.17 

In [51]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
Here we see that adding neurons to the hidden layer gives better results in terms of training accuracy. Unfortunately, the validation accuracy remains stuck around 96%, which suggests that our network is overfitting the training data.
We used 50 iterations as before, and a learning rate of 0.1 because it worked well in the previous experiments.
We then ran one experiment with 100 neurons and a learning rate of 1.0 and obtained better results: 98.5% training accuracy and 97% validation accuracy. This may mean that when we add neurons we should also increase the learning rate.

Our explanation is that with more neurons it is easier to overfit and end up in a local minimum, so using a larger learning rate gives a more robust network.
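
To quantify the overfitting mentioned above, one can compare the final training and validation accuracies returned by train(); a small sketch, assuming evaluations[1] and evaluations[2] hold the accuracy histories as in the plotting cells:

train_acc_final = evaluations[1][-1]   # training accuracy after the last iteration
val_acc_final = evaluations[2][-1]     # validation accuracy after the last iteration
gap = train_acc_final - val_acc_final  # a large positive gap indicates overfitting
print('train: %.2f  validation: %.2f  gap: %.2f' % (train_acc_final, val_acc_final, gap))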

In [54]:
# 200 hidden neurons with learning rate 1.0
my_mnist_net = NeuralNetwork(28*28, 200, 10, learning_rate=1.0, iterations=50)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1255351345  -Training_Accuracy:  91.87  -time: 56.96 
Iteration:  2/50[==============] -Error: 0.0735093105  -Training_Accuracy:  94.00  -time: 105.43 
Iteration:  3/50[==============] -Error: 0.0634531515  -Training_Accuracy:  94.46  -time: 157.52 
Iteration:  4/50[==============] -Error: 0.0571875516  -Training_Accuracy:  94.84  -time: 209.54 
Iteration:  5/50[==============] -Error: 0.0526363257  -Training_Accuracy:  94.83  -time: 263.47 
Iteration:  6/50[==============] -Error: 0.0505247631  -Training_Accuracy:  95.45  -time: 312.66 
Iteration:  7/50[==============] -Error: 0.0483123419  -Training_Accuracy:  95.54  -time: 361.94 
Iteration:  8/50[==============] -Error: 0.0482562760  -Training_Accuracy:  95.46  -time: 411.21 
Iteration:  9/50[==============] -Error: 0.0478075534  -Training_Accuracy:  96.15  -time: 469.00 
Iteration: 10/50[==============] -Error: 0.0461245338  -Training_Accuracy:  95.78  -time: 523.95 
Iteration: 11/50[==============] -Error: 0.0441368220  -Training_Accuracy:  96.07  -time: 578.42 
Iteration: 12/50[==============] -Error: 0.0444256510  -Training_Accuracy:  96.16  -time: 631.35 
Iteration: 13/50[==============] -Error: 0.0431527091  -Training_Accuracy:  96.21  -time: 683.46 
Iteration: 14/50[==============] -Error: 0.0414023699  -Training_Accuracy:  96.49  -time: 735.07 
Iteration: 15/50[==============] -Error: 0.0402369652  -Training_Accuracy:  96.54  -time: 790.18 
Iteration: 16/50[==============] -Error: 0.0387567209  -Training_Accuracy:  96.70  -time: 842.26 
Iteration: 17/50[==============] -Error: 0.0397942342  -Training_Accuracy:  95.56  -time: 892.29 
Iteration: 18/50[==============] -Error: 0.0393964239  -Training_Accuracy:  96.63  -time: 942.12 
Iteration: 19/50[==============] -Error: 0.0386824521  -Training_Accuracy:  96.52  -time: 996.07 
Iteration: 20/50[==============] -Error: 0.0384386842  -Training_Accuracy:  96.71  -time: 1046.11 
Iteration: 21/50[==============] -Error: 0.0371660950  -Training_Accuracy:  96.63  -time: 1098.88 
Iteration: 22/50[==============] -Error: 0.0391738283  -Training_Accuracy:  96.65  -time: 1150.96 
Iteration: 23/50[==============] -Error: 0.0390264829  -Training_Accuracy:  96.75  -time: 1208.95 
Iteration: 24/50[==============] -Error: 0.0371542953  -Training_Accuracy:  96.62  -time: 1263.55 
Iteration: 25/50[==============] -Error: 0.0367470983  -Training_Accuracy:  96.79  -time: 1319.99 
Iteration: 26/50[==============] -Error: 0.0369347279  -Training_Accuracy:  96.27  -time: 1373.95 
Iteration: 27/50[==============] -Error: 0.0363817053  -Training_Accuracy:  96.90  -time: 1424.00 
Iteration: 28/50[==============] -Error: 0.0365513724  -Training_Accuracy:  96.75  -time: 1474.50 
Iteration: 29/50[==============] -Error: 0.0357079260  -Training_Accuracy:  97.00  -time: 1526.38 
Iteration: 30/50[==============] -Error: 0.0348189483  -Training_Accuracy:  96.81  -time: 1575.84 
Iteration: 31/50[==============] -Error: 0.0337479968  -Training_Accuracy:  96.93  -time: 1628.79 
Iteration: 32/50[==============] -Error: 0.0354339403  -Training_Accuracy:  97.17  -time: 1683.45 
Iteration: 33/50[==============] -Error: 0.0357263070  -Training_Accuracy:  97.02  -time: 1732.80 
Iteration: 34/50[==============] -Error: 0.0353416288  -Training_Accuracy:  97.05  -time: 1781.47 
Iteration: 35/50[==============] -Error: 0.0365043639  -Training_Accuracy:  96.97  -time: 1831.85 
Iteration: 36/50[==============] -Error: 0.0349483231  -Training_Accuracy:  96.79  -time: 1882.76 
Iteration: 37/50[==============] -Error: 0.0344980819  -Training_Accuracy:  96.83  -time: 1932.59 
Iteration: 38/50[==============] -Error: 0.0351947011  -Training_Accuracy:  97.11  -time: 1983.01 
Iteration: 39/50[==============] -Error: 0.0332453204  -Training_Accuracy:  97.16  -time: 2032.89 
Iteration: 40/50[==============] -Error: 0.0333848796  -Training_Accuracy:  97.12  -time: 2081.72 
Iteration: 41/50[==============] -Error: 0.0316233339  -Training_Accuracy:  97.31  -time: 2133.94 
Iteration: 42/50[==============] -Error: 0.0333303964  -Training_Accuracy:  97.19  -time: 2183.28 
Iteration: 43/50[==============] -Error: 0.0310822615  -Training_Accuracy:  97.13  -time: 2232.26 
Iteration: 44/50[==============] -Error: 0.0328299277  -Training_Accuracy:  97.52  -time: 2280.74 
Iteration: 45/50[==============] -Error: 0.0312651995  -Training_Accuracy:  97.24  -time: 2329.41 
Iteration: 46/50[==============] -Error: 0.0310090707  -Training_Accuracy:  97.43  -time: 2379.90 
Iteration: 47/50[==============] -Error: 0.0305126594  -Training_Accuracy:  97.22  -time: 2428.46 
Iteration: 48/50[==============] -Error: 0.0311735601  -Training_Accuracy:  97.31  -time: 2481.19 
Iteration: 49/50[==============] -Error: 0.0309200503  -Training_Accuracy:  97.12  -time: 2537.48 
Iteration: 50/50[==============] -Error: 0.0319976716  -Training_Accuracy:  97.34  -time: 2590.02 

In [55]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
With a learning rate of 1.0 and 200 hidden neurons, the validation accuracy remains high but does not improve. We noted that with the same learning rate but 100 neurons we had a higher training accuracy. Now the training and validation accuracies are almost the same, so perhaps we are building a more robust model.

Question 2.2.5 : Add one additional hidden layer and train your network; discuss your results with different settings.
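
For clarity, here is a minimal sketch of the forward pass of a network with two hidden layers and sigmoid activations; it only illustrates the architecture, not the actual NeuralNetwork2 implementation used below (the weight matrices W1, W2, W3 and the bias handling are hypothetical):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, W2, W3):
    # x: input vector of size 28*28
    # W1: (n_hidden1, 28*28), W2: (n_hidden2, n_hidden1), W3: (10, n_hidden2)
    h1 = sigmoid(W1.dot(x))     # first hidden layer
    h2 = sigmoid(W2.dot(h1))    # second hidden layer
    return sigmoid(W3.dot(h2))  # 10 output activations, one per digit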


In [59]:
# Two hidden layers of 30 neurons each, learning rate 1.0
import NeuralNetwork2 as NN2
reload(NN2)
NeuralNetwork2 = NN2.NeuralNetwork2

my_mnist_net = NeuralNetwork2(28*28, 30, 30, 10, iterations=50, learning_rate=1.0)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.1189465449  -Training_Accuracy:  91.29  -time: 20.35 
Iteration:  2/50[==============] -Error: 0.0734038658  -Training_Accuracy:  91.53  -time: 36.01 
Iteration:  3/50[==============] -Error: 0.0629521285  -Training_Accuracy:  92.26  -time: 51.52 
Iteration:  4/50[==============] -Error: 0.0615263703  -Training_Accuracy:  93.66  -time: 67.73 
Iteration:  5/50[==============] -Error: 0.0569907713  -Training_Accuracy:  93.30  -time: 83.21 
Iteration:  6/50[==============] -Error: 0.0533270302  -Training_Accuracy:  93.36  -time: 99.06 
Iteration:  7/50[==============] -Error: 0.0524697758  -Training_Accuracy:  93.53  -time: 114.60 
Iteration:  8/50[==============] -Error: 0.0511637554  -Training_Accuracy:  94.15  -time: 130.16 
Iteration:  9/50[==============] -Error: 0.0477502216  -Training_Accuracy:  93.62  -time: 145.92 
Iteration: 10/50[==============] -Error: 0.0484027177  -Training_Accuracy:  94.62  -time: 161.51 
Iteration: 11/50[==============] -Error: 0.0479208472  -Training_Accuracy:  94.42  -time: 177.00 
Iteration: 12/50[==============] -Error: 0.0472455781  -Training_Accuracy:  95.15  -time: 192.49 
Iteration: 13/50[==============] -Error: 0.0460627460  -Training_Accuracy:  94.88  -time: 207.93 
Iteration: 14/50[==============] -Error: 0.0456912685  -Training_Accuracy:  94.94  -time: 223.88 
Iteration: 15/50[==============] -Error: 0.0424301667  -Training_Accuracy:  94.95  -time: 239.37 
Iteration: 16/50[==============] -Error: 0.0424327651  -Training_Accuracy:  95.26  -time: 254.92 
Iteration: 17/50[==============] -Error: 0.0439726886  -Training_Accuracy:  94.74  -time: 270.39 
Iteration: 18/50[==============] -Error: 0.0436724303  -Training_Accuracy:  94.61  -time: 285.91 
Iteration: 19/50[==============] -Error: 0.0430359728  -Training_Accuracy:  95.14  -time: 301.45 
Iteration: 20/50[==============] -Error: 0.0432932852  -Training_Accuracy:  94.74  -time: 316.95 
Iteration: 21/50[==============] -Error: 0.0409888625  -Training_Accuracy:  95.57  -time: 332.48 
Iteration: 22/50[==============] -Error: 0.0387176832  -Training_Accuracy:  95.79  -time: 348.57 
Iteration: 23/50[==============] -Error: 0.0395159464  -Training_Accuracy:  94.97  -time: 364.10 
Iteration: 24/50[==============] -Error: 0.0386749223  -Training_Accuracy:  95.52  -time: 379.63 
Iteration: 25/50[==============] -Error: 0.0404393167  -Training_Accuracy:  94.90  -time: 395.21 
Iteration: 26/50[==============] -Error: 0.0385666964  -Training_Accuracy:  95.69  -time: 410.74 
Iteration: 27/50[==============] -Error: 0.0398453481  -Training_Accuracy:  95.64  -time: 426.23 
Iteration: 28/50[==============] -Error: 0.0382712278  -Training_Accuracy:  95.59  -time: 441.71 
Iteration: 29/50[==============] -Error: 0.0380569160  -Training_Accuracy:  95.85  -time: 457.67 
Iteration: 30/50[==============] -Error: 0.0353623623  -Training_Accuracy:  95.81  -time: 473.33 
Iteration: 31/50[==============] -Error: 0.0373414443  -Training_Accuracy:  95.87  -time: 488.96 
Iteration: 32/50[==============] -Error: 0.0388266083  -Training_Accuracy:  95.54  -time: 504.49 
Iteration: 33/50[==============] -Error: 0.0380042051  -Training_Accuracy:  95.52  -time: 520.08 
Iteration: 34/50[==============] -Error: 0.0387818404  -Training_Accuracy:  95.56  -time: 535.64 
Iteration: 35/50[==============] -Error: 0.0378461318  -Training_Accuracy:  95.50  -time: 551.25 
Iteration: 36/50[==============] -Error: 0.0379583952  -Training_Accuracy:  95.40  -time: 566.85 
Iteration: 37/50[==============] -Error: 0.0365465259  -Training_Accuracy:  95.54  -time: 583.05 
Iteration: 38/50[==============] -Error: 0.0382723706  -Training_Accuracy:  96.06  -time: 599.85 
Iteration: 39/50[==============] -Error: 0.0363441149  -Training_Accuracy:  95.79  -time: 617.82 
Iteration: 40/50[==============] -Error: 0.0347637536  -Training_Accuracy:  96.16  -time: 636.02 
Iteration: 41/50[==============] -Error: 0.0368932180  -Training_Accuracy:  95.39  -time: 652.02 
Iteration: 42/50[==============] -Error: 0.0368948769  -Training_Accuracy:  95.86  -time: 668.21 
Iteration: 43/50[==============] -Error: 0.0369898081  -Training_Accuracy:  96.10  -time: 683.88 
Iteration: 44/50[==============] -Error: 0.0342532376  -Training_Accuracy:  95.30  -time: 699.88 
Iteration: 45/50[==============] -Error: 0.0352168514  -Training_Accuracy:  96.26  -time: 715.72 
Iteration: 46/50[==============] -Error: 0.0339134712  -Training_Accuracy:  96.18  -time: 731.19 
Iteration: 47/50[==============] -Error: 0.0348767174  -Training_Accuracy:  96.13  -time: 746.72 
Iteration: 48/50[==============] -Error: 0.0336727268  -Training_Accuracy:  96.04  -time: 762.28 
Iteration: 49/50[==============] -Error: 0.0341812462  -Training_Accuracy:  96.12  -time: 777.69 
Iteration: 50/50[==============] -Error: 0.0334475335  -Training_Accuracy:  96.10  -time: 794.09 

In [60]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
Here we see that the results are poor. The fluctuations are quite large, which suggests we should decrease the learning rate.

In [61]:
# Two hidden layers of 30 neurons each, with a lower learning rate (0.1)
import NeuralNetwork2 as NN2
reload(NN2)
NeuralNetwork2 = NN2.NeuralNetwork2

my_mnist_net = NeuralNetwork2(28*28, 30, 30, 10, iterations=50, learning_rate=0.1)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.2541963575  -Training_Accuracy:  84.88  -time: 17.01 
Iteration:  2/50[==============] -Error: 0.0970889344  -Training_Accuracy:  89.70  -time: 35.62 
Iteration:  3/50[==============] -Error: 0.0754924267  -Training_Accuracy:  91.69  -time: 52.74 
Iteration:  4/50[==============] -Error: 0.0649225454  -Training_Accuracy:  92.59  -time: 69.16 
Iteration:  5/50[==============] -Error: 0.0585373976  -Training_Accuracy:  93.45  -time: 84.79 
Iteration:  6/50[==============] -Error: 0.0539358436  -Training_Accuracy:  93.90  -time: 100.35 
Iteration:  7/50[==============] -Error: 0.0498412608  -Training_Accuracy:  93.94  -time: 115.81 
Iteration:  8/50[==============] -Error: 0.0471367139  -Training_Accuracy:  94.70  -time: 131.30 
Iteration:  9/50[==============] -Error: 0.0444789850  -Training_Accuracy:  95.09  -time: 146.83 
Iteration: 10/50[==============] -Error: 0.0423649478  -Training_Accuracy:  95.25  -time: 162.86 
Iteration: 11/50[==============] -Error: 0.0403956732  -Training_Accuracy:  95.51  -time: 178.41 
Iteration: 12/50[==============] -Error: 0.0388325457  -Training_Accuracy:  95.47  -time: 193.94 
Iteration: 13/50[==============] -Error: 0.0370175283  -Training_Accuracy:  95.82  -time: 209.51 
Iteration: 14/50[==============] -Error: 0.0355189369  -Training_Accuracy:  95.91  -time: 225.01 
Iteration: 15/50[==============] -Error: 0.0342377194  -Training_Accuracy:  96.27  -time: 240.92 
Iteration: 16/50[==============] -Error: 0.0330824899  -Training_Accuracy:  96.44  -time: 256.53 
Iteration: 17/50[==============] -Error: 0.0317651244  -Training_Accuracy:  96.47  -time: 272.16 
Iteration: 18/50[==============] -Error: 0.0310558883  -Training_Accuracy:  96.53  -time: 288.88 
Iteration: 19/50[==============] -Error: 0.0300491803  -Training_Accuracy:  96.72  -time: 308.27 
Iteration: 20/50[==============] -Error: 0.0292205484  -Training_Accuracy:  96.76  -time: 327.21 
Iteration: 21/50[==============] -Error: 0.0282722535  -Training_Accuracy:  96.84  -time: 343.10 
Iteration: 22/50[==============] -Error: 0.0277513121  -Training_Accuracy:  96.88  -time: 360.56 
Iteration: 23/50[==============] -Error: 0.0269693156  -Training_Accuracy:  96.96  -time: 379.33 
Iteration: 24/50[==============] -Error: 0.0263002327  -Training_Accuracy:  96.96  -time: 396.87 
Iteration: 25/50[==============] -Error: 0.0256617121  -Training_Accuracy:  97.09  -time: 414.87 
Iteration: 26/50[==============] -Error: 0.0250214603  -Training_Accuracy:  97.23  -time: 430.56 
Iteration: 27/50[==============] -Error: 0.0247602965  -Training_Accuracy:  97.31  -time: 446.08 
Iteration: 28/50[==============] -Error: 0.0241311330  -Training_Accuracy:  97.40  -time: 462.83 
Iteration: 29/50[==============] -Error: 0.0235197385  -Training_Accuracy:  97.36  -time: 478.44 
Iteration: 30/50[==============] -Error: 0.0230424750  -Training_Accuracy:  97.36  -time: 494.07 
Iteration: 31/50[==============] -Error: 0.0226575627  -Training_Accuracy:  97.30  -time: 509.70 
Iteration: 32/50[==============] -Error: 0.0221398220  -Training_Accuracy:  97.55  -time: 525.81 
Iteration: 33/50[==============] -Error: 0.0217971042  -Training_Accuracy:  97.50  -time: 541.47 
Iteration: 34/50[==============] -Error: 0.0213823621  -Training_Accuracy:  97.57  -time: 557.07 
Iteration: 35/50[==============] -Error: 0.0209267231  -Training_Accuracy:  97.70  -time: 573.00 
Iteration: 36/50[==============] -Error: 0.0205569983  -Training_Accuracy:  97.79  -time: 588.54 
Iteration: 37/50[==============] -Error: 0.0201865337  -Training_Accuracy:  97.76  -time: 604.18 
Iteration: 38/50[==============] -Error: 0.0197926916  -Training_Accuracy:  97.80  -time: 619.78 
Iteration: 39/50[==============] -Error: 0.0196688411  -Training_Accuracy:  97.80  -time: 636.11 
Iteration: 40/50[==============] -Error: 0.0191590313  -Training_Accuracy:  97.83  -time: 652.18 
Iteration: 41/50[==============] -Error: 0.0188510435  -Training_Accuracy:  97.78  -time: 667.87 
Iteration: 42/50[==============] -Error: 0.0183547695  -Training_Accuracy:  97.86  -time: 683.51 
Iteration: 43/50[==============] -Error: 0.0182330766  -Training_Accuracy:  97.92  -time: 700.34 
Iteration: 44/50[==============] -Error: 0.0179847118  -Training_Accuracy:  97.97  -time: 715.93 
Iteration: 45/50[==============] -Error: 0.0176362591  -Training_Accuracy:  97.83  -time: 731.48 
Iteration: 46/50[==============] -Error: 0.0173887523  -Training_Accuracy:  98.00  -time: 747.08 
Iteration: 47/50[==============] -Error: 0.0171784429  -Training_Accuracy:  98.02  -time: 763.17 
Iteration: 48/50[==============] -Error: 0.0169359788  -Training_Accuracy:  98.04  -time: 778.75 
Iteration: 49/50[==============] -Error: 0.0167721632  -Training_Accuracy:  98.01  -time: 794.38 
Iteration: 50/50[==============] -Error: 0.0163981076  -Training_Accuracy:  98.04  -time: 809.99 

In [62]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
Here we see that with a lower learning rate we obtain a higher training accuracy but a lower validation accuracy. Perhaps we should add more neurons.

In [63]:
# Two hidden layers of 100 neurons each, learning rate 0.1
import NeuralNetwork2 as NN2
reload(NN2)
NeuralNetwork2 = NN2.NeuralNetwork2

my_mnist_net = NeuralNetwork2(28*28, 100, 100, 10, iterations=50, learning_rate=0.1)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.2147527590  -Training_Accuracy:  89.41  -time: 34.32 
Iteration:  2/50[==============] -Error: 0.0761971170  -Training_Accuracy:  91.87  -time: 65.07 
Iteration:  3/50[==============] -Error: 0.0578224964  -Training_Accuracy:  94.09  -time: 1363.75 
Iteration:  4/50[==============] -Error: 0.0481329333  -Training_Accuracy:  95.04  -time: 1402.47 
Iteration:  5/50[==============] -Error: 0.0415289357  -Training_Accuracy:  95.62  -time: 1442.42 
Iteration:  6/50[==============] -Error: 0.0367287112  -Training_Accuracy:  96.12  -time: 1479.86 
Iteration:  7/50[==============] -Error: 0.0326946112  -Training_Accuracy:  96.60  -time: 1514.81 
Iteration:  8/50[==============] -Error: 0.0295592929  -Training_Accuracy:  96.89  -time: 1544.75 
Iteration:  9/50[==============] -Error: 0.0271667355  -Training_Accuracy:  97.15  -time: 1575.08 
Iteration: 10/50[==============] -Error: 0.0249136733  -Training_Accuracy:  97.20  -time: 1604.94 
Iteration: 11/50[==============] -Error: 0.0227941891  -Training_Accuracy:  97.65  -time: 1634.58 
Iteration: 12/50[==============] -Error: 0.0210459188  -Training_Accuracy:  97.87  -time: 1664.96 
Iteration: 13/50[==============] -Error: 0.0194237736  -Training_Accuracy:  97.96  -time: 1695.08 
Iteration: 14/50[==============] -Error: 0.0181465198  -Training_Accuracy:  98.13  -time: 1724.80 
Iteration: 15/50[==============] -Error: 0.0167777124  -Training_Accuracy:  98.21  -time: 1754.59 
Iteration: 16/50[==============] -Error: 0.0157379980  -Training_Accuracy:  98.37  -time: 1789.32 
Iteration: 17/50[==============] -Error: 0.0145860294  -Training_Accuracy:  98.48  -time: 1820.87 
Iteration: 18/50[==============] -Error: 0.0136353249  -Training_Accuracy:  98.55  -time: 1851.51 
Iteration: 19/50[==============] -Error: 0.0128384522  -Training_Accuracy:  98.59  -time: 1882.02 
Iteration: 20/50[==============] -Error: 0.0120253180  -Training_Accuracy:  98.69  -time: 1913.58 
Iteration: 21/50[==============] -Error: 0.0114093387  -Training_Accuracy:  98.71  -time: 1944.30 
Iteration: 22/50[==============] -Error: 0.0107674249  -Training_Accuracy:  98.77  -time: 1974.97 
Iteration: 23/50[==============] -Error: 0.0102374197  -Training_Accuracy:  98.80  -time: 2006.51 
Iteration: 24/50[==============] -Error: 0.0097095784  -Training_Accuracy:  98.85  -time: 2038.68 
Iteration: 25/50[==============] -Error: 0.0092110467  -Training_Accuracy:  98.87  -time: 2069.41 
Iteration: 26/50[==============] -Error: 0.0088195741  -Training_Accuracy:  98.91  -time: 2099.97 
Iteration: 27/50[==============] -Error: 0.0084791984  -Training_Accuracy:  98.92  -time: 2131.01 
Iteration: 28/50[==============] -Error: 0.0081384667  -Training_Accuracy:  98.94  -time: 2162.59 
Iteration: 29/50[==============] -Error: 0.0078668788  -Training_Accuracy:  98.97  -time: 2192.54 
Iteration: 30/50[==============] -Error: 0.0075737510  -Training_Accuracy:  98.99  -time: 2222.42 
Iteration: 31/50[==============] -Error: 0.0073310263  -Training_Accuracy:  99.01  -time: 2252.77 
Iteration: 32/50[==============] -Error: 0.0071495747  -Training_Accuracy:  99.01  -time: 2283.99 
Iteration: 33/50[==============] -Error: 0.0069836950  -Training_Accuracy:  99.02  -time: 2314.61 
Iteration: 34/50[==============] -Error: 0.0068064181  -Training_Accuracy:  99.04  -time: 2344.75 
Iteration: 35/50[==============] -Error: 0.0066782235  -Training_Accuracy:  99.04  -time: 2375.97 
Iteration: 36/50[==============] -Error: 0.0065458426  -Training_Accuracy:  99.06  -time: 2406.08 
Iteration: 37/50[==============] -Error: 0.0064345715  -Training_Accuracy:  99.06  -time: 2436.01 
Iteration: 38/50[==============] -Error: 0.0063123225  -Training_Accuracy:  99.07  -time: 2465.82 
Iteration: 39/50[==============] -Error: 0.0062300790  -Training_Accuracy:  99.07  -time: 2496.12 
Iteration: 40/50[==============] -Error: 0.0061338878  -Training_Accuracy:  99.09  -time: 2526.68 
Iteration: 41/50[==============] -Error: 0.0060701323  -Training_Accuracy:  99.09  -time: 2556.96 
Iteration: 42/50[==============] -Error: 0.0059787079  -Training_Accuracy:  99.10  -time: 2589.86 
Iteration: 43/50[==============] -Error: 0.0059073121  -Training_Accuracy:  99.11  -time: 2621.17 
Iteration: 44/50[==============] -Error: 0.0058159015  -Training_Accuracy:  99.12  -time: 2651.35 
Iteration: 45/50[==============] -Error: 0.0057423052  -Training_Accuracy:  99.12  -time: 2681.27 
Iteration: 46/50[==============] -Error: 0.0056694891  -Training_Accuracy:  99.12  -time: 2711.83 
Iteration: 47/50[==============] -Error: 0.0055927457  -Training_Accuracy:  99.13  -time: 2742.33 
Iteration: 48/50[==============] -Error: 0.0055304196  -Training_Accuracy:  99.14  -time: 2772.25 
Iteration: 49/50[==============] -Error: 0.0054746928  -Training_Accuracy:  99.15  -time: 2802.28 
Iteration: 50/50[==============] -Error: 0.0054173623  -Training_Accuracy:  99.16  -time: 2832.17 

In [64]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
Here we see that with 100 neurons per layer we achieve a better result: training accuracy reaches about 99% and validation accuracy about 96.5%. We still cannot do better than the network with a single hidden layer.
Considering also the longer training time, there is no reason to use a more complex network in this case.

In [65]:
# Two hidden layers of 100 neurons each, learning rate 1.0
import NeuralNetwork2 as NN2
reload(NN2)
NeuralNetwork2 = NN2.NeuralNetwork2

my_mnist_net = NeuralNetwork2(28*28, 100, 100, 10, iterations=50, learning_rate=1.0)
evaluations = my_mnist_net.train(training_data, validation_data)


Iteration:  1/50[==============] -Error: 0.2691610796  -Training_Accuracy:  92.06  -time: 33.13 
Iteration:  2/50[==============] -Error: 0.0572520071  -Training_Accuracy:  93.85  -time: 63.65 
Iteration:  3/50[==============] -Error: 0.0463233834  -Training_Accuracy:  95.07  -time: 94.36 
Iteration:  4/50[==============] -Error: 0.0401808939  -Training_Accuracy:  95.79  -time: 124.04 
Iteration:  5/50[==============] -Error: 0.0388679504  -Training_Accuracy:  95.69  -time: 155.13 
Iteration:  6/50[==============] -Error: 0.0362619725  -Training_Accuracy:  95.93  -time: 185.00 
Iteration:  7/50[==============] -Error: 0.0340694680  -Training_Accuracy:  95.74  -time: 214.80 
Iteration:  8/50[==============] -Error: 0.0314056647  -Training_Accuracy:  96.76  -time: 245.00 
Iteration:  9/50[==============] -Error: 0.0301799804  -Training_Accuracy:  96.66  -time: 275.78 
Iteration: 10/50[==============] -Error: 0.0276120490  -Training_Accuracy:  97.04  -time: 306.78 
Iteration: 11/50[==============] -Error: 0.0258119322  -Training_Accuracy:  97.10  -time: 336.88 
Iteration: 12/50[==============] -Error: 0.0258341426  -Training_Accuracy:  97.21  -time: 366.89 
Iteration: 13/50[==============] -Error: 0.0252982855  -Training_Accuracy:  97.22  -time: 398.04 
Iteration: 14/50[==============] -Error: 0.0254672920  -Training_Accuracy:  97.42  -time: 428.35 
Iteration: 15/50[==============] -Error: 0.0247100539  -Training_Accuracy:  97.25  -time: 458.65 
Iteration: 16/50[==============] -Error: 0.0241160307  -Training_Accuracy:  97.82  -time: 489.00 
Iteration: 17/50[==============] -Error: 0.0224951613  -Training_Accuracy:  97.50  -time: 519.77 
Iteration: 18/50[==============] -Error: 0.0225507708  -Training_Accuracy:  97.80  -time: 549.88 
Iteration: 19/50[==============] -Error: 0.0212095687  -Training_Accuracy:  97.64  -time: 580.59 
Iteration: 20/50[==============] -Error: 0.0207318953  -Training_Accuracy:  98.02  -time: 610.83 
Iteration: 21/50[==============] -Error: 0.0200879302  -Training_Accuracy:  97.78  -time: 641.47 
Iteration: 22/50[==============] -Error: 0.0204685099  -Training_Accuracy:  97.91  -time: 671.77 
Iteration: 23/50[==============] -Error: 0.0193763111  -Training_Accuracy:  98.00  -time: 701.90 
Iteration: 24/50[==============] -Error: 0.0189566197  -Training_Accuracy:  97.98  -time: 732.98 
Iteration: 25/50[==============] -Error: 0.0189918399  -Training_Accuracy:  98.25  -time: 764.32 
Iteration: 26/50[==============] -Error: 0.0177559162  -Training_Accuracy:  97.77  -time: 794.41 
Iteration: 27/50[==============] -Error: 0.0188961614  -Training_Accuracy:  97.96  -time: 824.47 
Iteration: 28/50[==============] -Error: 0.0170190966  -Training_Accuracy:  98.32  -time: 855.32 
Iteration: 29/50[==============] -Error: 0.0157436292  -Training_Accuracy:  98.50  -time: 885.90 
Iteration: 30/50[==============] -Error: 0.0153539496  -Training_Accuracy:  98.21  -time: 916.02 
Iteration: 31/50[==============] -Error: 0.0167261821  -Training_Accuracy:  97.83  -time: 945.96 
Iteration: 32/50[==============] -Error: 0.0170012677  -Training_Accuracy:  98.13  -time: 975.89 
Iteration: 33/50[==============] -Error: 0.0169507953  -Training_Accuracy:  98.04  -time: 1006.49 
Iteration: 34/50[==============] -Error: 0.0162530027  -Training_Accuracy:  98.18  -time: 1036.76 
Iteration: 35/50[==============] -Error: 0.0151743136  -Training_Accuracy:  98.43  -time: 1067.44 
Iteration: 36/50[==============] -Error: 0.0153680525  -Training_Accuracy:  98.16  -time: 1097.44 
Iteration: 37/50[==============] -Error: 0.0157386167  -Training_Accuracy:  98.51  -time: 1128.36 
Iteration: 38/50[==============] -Error: 0.0147276998  -Training_Accuracy:  97.96  -time: 1158.58 
Iteration: 39/50[==============] -Error: 0.0138099847  -Training_Accuracy:  98.61  -time: 1188.53 
Iteration: 40/50[==============] -Error: 0.0137706004  -Training_Accuracy:  98.54  -time: 1218.58 
Iteration: 41/50[==============] -Error: 0.0139069004  -Training_Accuracy:  98.46  -time: 1249.15 
Iteration: 42/50[==============] -Error: 0.0143867297  -Training_Accuracy:  98.68  -time: 1279.32 
Iteration: 43/50[==============] -Error: 0.0126389597  -Training_Accuracy:  98.75  -time: 1311.17 
Iteration: 44/50[==============] -Error: 0.0137102522  -Training_Accuracy:  98.30  -time: 1341.14 
Iteration: 45/50[==============] -Error: 0.0138588666  -Training_Accuracy:  98.79  -time: 1372.91 
Iteration: 46/50[==============] -Error: 0.0133378976  -Training_Accuracy:  98.57  -time: 1403.23 
Iteration: 47/50[==============] -Error: 0.0133145434  -Training_Accuracy:  98.60  -time: 1433.20 
Iteration: 48/50[==============] -Error: 0.0130364507  -Training_Accuracy:  98.56  -time: 1463.34 
Iteration: 49/50[==============] -Error: 0.0127471848  -Training_Accuracy:  98.75  -time: 1494.38 
Iteration: 50/50[==============] -Error: 0.0122893884  -Training_Accuracy:  98.84  -time: 1524.53 

In [66]:
UT.plot_curve(range(1,my_mnist_net.iterations+1),evaluations[0], "Error")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[1], "Training_Accuracy")
UT.plot_curve(range(1,my_mnist_net.iterations+1), evaluations[2], "Validation_Accuracy")


Observations
Changing the learning rate helped us achieve a better result: training accuracy is almost 99% and validation accuracy about 96.5%. Looking at the shape of the curves, we think more iterations would improve the result slightly, but not by much.