Logistic Neurons


In [ ]:
import numpy as np
from utils import make_classification, draw_decision_boundary, sigmoid
from sklearn.metrics import accuracy_score
from theano import tensor as T
from theano import function, shared
import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.rc('figure', figsize=(8, 6))
%matplotlib inline

Activation of a logistic neuron:

$$ z = \sum_{j} x_{j}w_{j} + b$$

where $j$ runs over the inputs to the neuron.

Predicted output:

$$ y = \frac{1}{1 + e^{-z}} $$

Loss function: Squared Error:

$$ E = \frac{1}{2}\sum_{i \in L} (t^{i} - y^{i})^{2} $$

Where $L$ is the set of training cases and $t^{i}$ is the target value for case $i$.
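
A quick numeric sketch of these formulas on a made-up two-case batch (the values below are illustrative only, and the logistic is written out explicitly rather than using the sigmoid helper):


In [ ]:
# Toy example of activation, prediction, and loss (illustrative values)
x_toy = np.array([[0.5, 1.0],
                  [2.0, -1.0]])       # two training cases, two inputs each
w_toy = np.array([[0.1], [0.2]])      # one weight per input
b_toy = 0.05                          # bias
t_toy = np.array([[1.0], [0.0]])      # targets

z_toy = np.dot(x_toy, w_toy) + b_toy        # z = sum_j x_j w_j + b
y_toy = 1.0 / (1.0 + np.exp(-z_toy))        # y = 1 / (1 + e^-z)
E_toy = 0.5 * ((t_toy - y_toy) ** 2).sum()  # E = 1/2 * sum (t - y)^2
print(z_toy.ravel(), y_toy.ravel(), E_toy)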

Logistic Neuron in NumPy:

Step 1: Make dummy data


In [ ]:
X, Y = make_classification()
W = np.random.rand(2, 1)
B = np.random.rand(1,)

In [ ]:
draw_decision_boundary(W.ravel().tolist() + [B[0]], X, Y)

Step 2: Get activation and prediction


In [ ]:
# activation
Z = np.dot(X, W) + B

# prediction
Y_pred = sigmoid(Z)

Step 3: Derive the gradient of the loss function

Gradient: $\nabla{E} = \frac{\partial{E}}{\partial{w_{j}}}$

Trick:

$$\frac{\partial{\mathbf{E}}}{\partial{\mathbf{W}}} = \frac{\partial{\mathbf{y}}}{\partial{\mathbf{W}}}\frac{\partial{\mathbf{E}}}{\partial{\mathbf{y}}}$$

Second term on RHS:

$$\frac{\partial{\mathbf{E}}}{\partial{\mathbf{y}}} = -(\mathbf{t} - \mathbf{y})$$

First term on RHS (using the same trick):

$$\frac{\partial{\mathbf{y}}}{\partial{\mathbf{W}}} = \frac{\partial{\mathbf{y}}}{\partial{\mathbf{z}}}\frac{\partial{\mathbf{z}}}{\partial{\mathbf{W}}}$$

From the first exercise, the first term on the RHS reduces to:

$$\frac{\partial{\mathbf{y}}}{\partial{\mathbf{z}}} = \mathbf{y}(1 - \mathbf{y})$$

From the definition of the logistic activation:

$$\mathbf{z} = \mathbf{X}\mathbf{W} + \mathbf{b} $$

Second term on the RHS:

$$\frac{\partial{\mathbf{z}}}{\partial{\mathbf{W}}} = \mathbf{X}$$

Substituting:

$$\frac{\partial{\mathbf{y}}}{\partial{\mathbf{W}}} = \mathbf{y}(1 - \mathbf{y})\mathbf{X}$$

Substituting back into the original equation:

$$\frac{\partial{\mathbf{E}}}{\partial{\mathbf{W}}} = -(\mathbf{t} - \mathbf{y})\mathbf{y}(1 - \mathbf{y})\mathbf{X}$$
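
As a quick sanity check on this result, the analytic gradient can be compared with a finite-difference estimate of the loss at the current (still random) W and B. This is only a sketch: analytic_grad and numeric_grad are ad-hoc helpers, and it assumes Y is a column vector of 0/1 targets.


In [ ]:
# Sanity check (sketch): analytic gradient vs. finite differences
def analytic_grad(X, Y, W, B):
    y = sigmoid(np.dot(X, W) + B)
    # dE/dW = -(t - y) * y * (1 - y) * X, summed over training cases
    return np.dot(X.T, -(Y - y) * y * (1 - y))

def numeric_grad(X, Y, W, B, eps=1e-6):
    grad = np.zeros_like(W)
    for j in range(W.shape[0]):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[j] += eps
        W_minus[j] -= eps
        E_plus = 0.5 * ((Y - sigmoid(np.dot(X, W_plus) + B)) ** 2).sum()
        E_minus = 0.5 * ((Y - sigmoid(np.dot(X, W_minus) + B)) ** 2).sum()
        grad[j] = (E_plus - E_minus) / (2 * eps)
    return grad

# the two estimates should agree to several decimal places
print(analytic_grad(X, Y, W, B).ravel())
print(numeric_grad(X, Y, W, B).ravel())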

Using this gradient to train the neuron with NumPy:


In [ ]:
def predict(X, weights, bias=None):
    # logistic neuron: z = XW + b, y = sigmoid(z)
    if bias is not None:
        z = np.dot(X, weights) + bias
    else:
        z = np.dot(X, weights)
    return sigmoid(z)

def train(X, Y, weights, bias, alpha=0.3):
    # one gradient-descent step on the squared-error loss
    y_hat = predict(X, weights, bias)
    # dE/dz = -(t - y) * y * (1 - y), one row per training case
    _gz = -1 * (Y - y_hat) * y_hat * (1 - y_hat)
    # dE/dW = X^T (dE/dz); dE/db = sum of dE/dz over training cases
    weights -= alpha * np.dot(X.T, _gz)
    bias -= alpha * _gz.sum(axis=0)
    return weights, bias

def loss(y1, y2):
    # squared error, summed over training cases
    return (0.5 * ((y1 - y2) ** 2)).sum()

In [ ]:
for i in range(10000):
    y_hat = predict(X, W, B)
    W, B = train(X, Y, W, B)
    if i % 1000 == 0:
        print("Loss: ", loss(Y, y_hat))

In [ ]:
draw_decision_boundary(W.ravel().tolist() + [B[0]], X, Y)
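
Since accuracy_score was imported above, one can also check how well the trained neuron classifies the training data by thresholding the logistic output at 0.5 (this assumes Y holds 0/1 labels):


In [ ]:
# Threshold the logistic output at 0.5 to get hard 0/1 predictions
Y_hard = (predict(X, W, B) > 0.5).astype(int)
print("Accuracy: ", accuracy_score(Y.ravel(), Y_hard.ravel()))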

Exercise: Implement a logistic neuron with Theano


In [ ]:
# enter code here
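
A minimal sketch of one possible solution, using the T, function, and shared imports from the top of the notebook and letting Theano derive the gradients symbolically (it assumes X and Y can be cast to float arrays of shape (n, 2) and (n, 1)):


In [ ]:
# Sketch of a Theano logistic neuron (assumes X is (n, 2), Y is (n, 1))
x = T.dmatrix('x')
t = T.dmatrix('t')
w = shared(np.random.rand(2, 1), name='w')
b = shared(np.random.rand(1,), name='b')

z = T.dot(x, w) + b                   # activation
y = 1 / (1 + T.exp(-z))               # logistic output
cost = 0.5 * T.sum((t - y) ** 2)      # squared-error loss

gw, gb = T.grad(cost, [w, b])         # symbolic gradients
alpha = 0.3
step = function([x, t], cost,
                updates=[(w, w - alpha * gw), (b, b - alpha * gb)])

for i in range(10000):
    c = step(X.astype(float), Y.astype(float))
    if i % 1000 == 0:
        print("Loss: ", c)

draw_decision_boundary(w.get_value().ravel().tolist() + [b.get_value()[0]], X, Y)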