## 2 Hidden Layers

Programming a neural network with 2 hidden layers is essentially the same process as with a single hidden layer, with one extra step. Because there is an extra layer, there is a third weight array and an extra level of gradient descent. That means a bit more math in the derivation and a few extra lines of code.
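Concretely, the forward pass now chains three weight arrays instead of two. The sketch below is a stand-in with random weights (not the trained network); the layer sizes of 64 inputs, 60 and 50 hidden neurons, and 10 outputs mirror the network trained later in this notebook:

```python
import numpy as np

def sigma(x):
    """Sigmoid activation."""
    return 1 / (1 + np.exp(-x))

# Random placeholder weights for the three layers.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(64, 60))   # input -> first hidden layer
w2 = rng.normal(size=(60, 50))   # first hidden -> second hidden layer
w3 = rng.normal(size=(50, 10))   # second hidden -> output layer

K = rng.random(64)       # one flattened 8x8 digit image
L = K @ w1               # sums entering the first hidden layer
M = sigma(L) @ w2        # sums entering the second hidden layer
N = sigma(M) @ w3        # sums entering the output layer
y_hat = sigma(N)         # 10-element output vector
```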



In [17]:

import NeuralNetImport as NN
import numpy as np
from sklearn.datasets import load_digits
digits = load_digits()
import NNpix as npx
from IPython.display import HTML
from IPython.display import display



### Neuron with 2 Hidden Layers



In [18]:

npx.cneuron2




Out[18]:



### Gradient Descent with 2 Hidden Layers



In [19]:

npx.derivation2




Out[19]:




In [20]:

f = open("HTML2.html")




In [21]:

display(HTML(f.read()))




| | Diagram Equations | Partial Derivatives |
|---|---|---|
| $\hat{y}$ | $\hat{y}=\sigma(N)$ | $\frac{\partial\hat{y}}{\partial N} = \sigma'(N)$ |
| $N$ | $N = \sigma(M) \times w_3$ | $\frac{\partial N}{\partial M} = \sigma'(M) \times w_3 \qquad \frac{\partial N}{\partial w_3} = \sigma(M)$ |
| $M$ | $M = \sigma(L) \times w_2$ | $\frac{\partial M}{\partial L} = \sigma'(L) \times w_2 \qquad \frac{\partial M}{\partial w_2} = \sigma(L)$ |
| $L$ | $L = K \times w_1$ | $\frac{\partial L}{\partial w_1} = K$ |

| | Gradients with Chain Rule | Gradients with Substitution |
|---|---|---|
| $w_3$ | $\frac{\partial{C}}{\partial{w_3}} = -(y-\hat{y}) \frac{\partial{\hat{y}}}{\partial{N}} \frac{\partial{N}}{\partial{w_3}}$ | $\frac{\partial{C}}{\partial{w_3}} = -(y-\hat{y}) \times \sigma'(N) \times \sigma(M)$ |
| $w_2$ | $\frac{\partial{C}}{\partial{w_2}} = -(y-\hat{y}) \frac{\partial{\hat{y}}}{\partial{N}} \frac{\partial{N}}{\partial{M}} \frac{\partial{M}}{\partial{w_2}}$ | $\frac{\partial{C}}{\partial{w_2}} = -(y-\hat{y}) \times \sigma'(N) \times \sigma'(M) \times w_3 \times \sigma(L)$ |
| $w_1$ | $\frac{\partial{C}}{\partial{w_1}} = -(y-\hat{y}) \frac{\partial{\hat{y}}}{\partial{N}} \frac{\partial{N}}{\partial{M}} \frac{\partial{M}}{\partial{L}} \frac{\partial{L}}{\partial{w_1}}$ | $\frac{\partial{C}}{\partial{w_1}} = -(y-\hat{y}) \times \sigma'(N) \times \sigma'(M) \times w_3 \times \sigma'(L) \times w_2 \times K$ |
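The substituted gradients translate directly into code. The scalar sketch below follows the diagram's notation with hypothetical values (one neuron per layer, sigmoid activations, and a squared-error cost whose derivative with respect to $\hat{y}$ is $-(y-\hat{y})$):

```python
import numpy as np

def sigma(x):
    """Sigmoid activation."""
    return 1 / (1 + np.exp(-x))

def sigma_prime(x):
    """Derivative of the sigmoid."""
    s = sigma(x)
    return s * (1 - s)

# Hypothetical scalar values -- one neuron per layer, as in the diagram.
K, w1, w2, w3, y = 0.5, 0.1, 0.2, 0.3, 1.0

# Forward pass, matching the diagram equations.
L = K * w1
M = sigma(L) * w2
N = sigma(M) * w3
y_hat = sigma(N)

# Gradients, matching the substituted chain-rule expressions.
err = -(y - y_hat)
dC_dw3 = err * sigma_prime(N) * sigma(M)
dC_dw2 = err * sigma_prime(N) * sigma_prime(M) * w3 * sigma(L)
dC_dw1 = err * sigma_prime(N) * sigma_prime(M) * w3 * sigma_prime(L) * w2 * K
```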




In [22]:

f.close()



### Create Training Inputs and Solutions

Use 1000 random samples to generate the training input and solution arrays. The other 792 will be used for testing.



In [23]:

perm = np.random.permutation(1792)
training_input = np.array([digits.images[perm[i]].flatten() for i in range(1000)])/100




In [24]:

training_solution = NN.create_training_soln(digits.target[perm], 10)




In [25]:

train = NN.NN_training_2(training_input, training_solution, 64, 10, 60, 50, 80, 0.7)



### Getting Weights

To train the network and compute the weights from scratch, uncomment and run the line below. Here, previously saved weights are loaded from `2HiddenWeights.npz` instead.
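The saved file was presumably written with `np.savez`, which stores positional array arguments under the keys `arr_0`, `arr_1`, and so on; that is why the weights are read back by those names below. A minimal round-trip sketch (the array shapes are assumptions based on the layer sizes checked later):

```python
import numpy as np
import os
import tempfile

# Hypothetical weight arrays with 60, 50, and 10 rows respectively.
x = np.random.rand(60, 64)
y = np.random.rand(50, 60)
z = np.random.rand(10, 50)

# Positional arguments to np.savez are stored as 'arr_0', 'arr_1', 'arr_2'.
path = os.path.join(tempfile.gettempdir(), "2HiddenWeightsDemo.npz")
np.savez(path, x, y, z)

# Loading recovers the same arrays under those keys.
f = np.load(path)
x2, y2, z2 = f['arr_0'], f['arr_1'], f['arr_2']
```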



In [33]:

# x,y,z = train.train()




In [34]:

f = np.load("2HiddenWeights.npz")




In [35]:

x = f['arr_0']
y = f['arr_1']
z = f['arr_2']




In [36]:

assert len(x) == 60
assert len(y) == 50
assert len(z) == 10



### Find Solutions



In [37]:

ask = [NN.NN_ask_2(np.array([digits.images[perm[i]].flatten()])/100,x,y,z) for i in range(1000,1792)]




In [38]:

comp_vals = [ask[i].get_ans() for i in range(len(ask))]
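`get_ans` presumably reduces the network's 10-element output vector to a single digit. Assuming it takes the index of the strongest output, that step looks like this (the output vector here is made up for illustration):

```python
import numpy as np

# Hypothetical 10-element output vector from the network: one activation
# per digit class 0-9, with class 3 responding most strongly.
output = np.array([0.02, 0.01, 0.05, 0.9, 0.03, 0.01, 0.02, 0.04, 0.01, 0.02])

# The predicted digit is the index of the largest activation.
predicted_digit = int(np.argmax(output))
```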



### Calculate Accuracy



In [39]:

test_solutions = np.array([digits.target[perm[i]] for i in range(1000, 1792)])
print(sum((comp_vals - test_solutions == 0).astype(int) / 792 * 100), "%")




97.7272727273 %




In [ ]: