In this notebook, we explore the concept of convolutional neural networks.
If you're not familiar with the concept of a convolution, you may want to read the Wikipedia page on convolutions first. In a convolutional neural network, the basic building block is the discrete convolution.
If we consider two functions $f$ and $g$ from $\mathbb{Z}$ to $\mathbb{R}$, then:
$ (f * g)[n] = \sum_{m = -\infty}^{+\infty} f[m] \cdot g[n - m] $
In our case, we consider the two vectors $x$ and $w$:
$ x = (x_1, x_2, ..., x_{n-1}, x_n) $
$ w = (w_1, w_2) $
And get:
$ x * w = (w_1 x_1 + w_2 x_2,\; w_1 x_2 + w_2 x_3,\; \ldots,\; w_1 x_{n-1} + w_2 x_n) $
Note that, like most deep learning frameworks, we slide $w$ over $x$ without flipping it, so strictly speaking this is a cross-correlation; since the weights are learned, the distinction doesn't matter.
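For example, with $x = (1, 2, 3, 4)$ and $w = (1, -1)$ (a simple difference filter), this gives $x * w = (1 - 2,\; 2 - 3,\; 3 - 4) = (-1, -1, -1)$.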
In most deep learning frameworks, you'll get to choose between three paddings: 'valid' (no padding, output of length $n - m + 1$ for a kernel of length $m$), 'same' (zero-pad so that the output has the same length $n$ as the input), and 'full' (zero-pad so that every partial overlap is counted, output of length $n + m - 1$).
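As a quick illustration, NumPy's np.convolve supports all three modes (a minimal sketch; note that np.convolve flips the kernel, as in the textbook definition above):

import numpy as np

x = np.array([1., 2., 3., 4.])
w = np.array([1., -1.])
print(np.convolve(x, w, mode='valid'))  # length n - m + 1 = 3
print(np.convolve(x, w, mode='same'))   # length n = 4
print(np.convolve(x, w, mode='full'))   # length n + m - 1 = 5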
In [122]:
# Which is easily implemented in Python:
import numpy as np

def _convolve(x, w, mode='valid'):
    # x and w are 1-D numpy vectors
    conv = []
    if mode == 'valid':
        # Only keep positions where w fully overlaps x
        for i in range(len(x) - len(w) + 1):
            conv.append((x[i: i + len(w)] * w).sum())
    return np.array(conv)
def convolve(X, w):
    # Convolve each row of the batch X with the kernel w
    w = np.array(w)
    X = np.array(X)
    conv = []
    for i in range(len(X)):
        conv.append(_convolve(X[i], w))
    return np.array(conv)
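A quick sanity check (a minimal sketch; since our convolve slides $w$ without flipping it, the NumPy reference is np.correlate rather than np.convolve):

x = [1., 2., 3., 4.]
w = [0.5, 0.5]
print(convolve([x], w))                  # [[ 1.5  2.5  3.5]]
print(np.correlate(x, w, mode='valid'))  # [ 1.5  2.5  3.5]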
As we use it, the convolution is parametrised by two vectors $x$ and $w$ and outputs a vector $z$. We have:
$ x * w = z$
$ z_i = (w_1 x_i + w_2 x_{i+1})$
We want to differentiate $z$ with respect to the weights $w_j$. Since $z_i = w_1 x_i + w_2 x_{i+1}$:
$\frac{\partial z_i}{\partial w_j} = x_{i+j-1}$
$\frac{\partial z_i}{\partial w} = (x_i, x_{i+1})$
or, for a kernel of length $m$ in general, $\frac{\partial z_i}{\partial w} = (x_i, x_{i+1}, \ldots, x_{i+m-1})$.
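We can verify this formula with a centered finite difference (a minimal sketch using the convolve defined above; eps is an arbitrary small step):

eps = 1e-6
x = np.array([1., 2., 3., 4.])
# Numerical d z_1 / d w_1: perturb w_1 and watch the first output
z_plus = convolve([x], (0.5 + eps, 0.5))[0][0]
z_minus = convolve([x], (0.5 - eps, 0.5))[0][0]
print((z_plus - z_minus) / (2 * eps))  # ~1.0, i.e. x_1, as predicted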
In [113]:
import utils
reload(utils)  # pick up any edits to utils.py (Python 2 builtin)
from utils import *
(x_train, y_train), (x_test, y_test) = load_up_down(50)
plt.plot(x_train.T)
plt.show()
In [120]:
# Wrap the training data: X is a list of input arrays, Y the targets
X, Y = [x_train, ], y_train
# Initialize the parameters
Ws = [0.5, 0.5]
alphas = (0.01, 0.01)
# Load Trainer
t = Trainer(X, Y, Ws, alphas)
# Define prediction and loss
t.pred = lambda X : convolve(X[0], (t.Ws[0], t.Ws[1])).mean(axis=1)
t.loss = lambda : (np.power(t.Y - t.pred(t.X), 2) / 2.).mean()
print(t.pred(X))
t.acc = lambda X, Y : t.pred(X)  # placeholder: reports raw predictions rather than an accuracy
# Define the gradient functions (chain rule: dL/dw_j = dL/dpred * dpred/dw_j)
dl_dp = lambda : -(t.Y - t.pred(t.X))                             # dL/dpred, per sample
dl_dw0 = lambda : (dl_dp() * t.X[0][:, :-1].mean(axis=1)).mean()  # dpred/dw_1 = mean of x_i
dl_dw1 = lambda : (dl_dp() * t.X[0][:, 1:].mean(axis=1)).mean()   # dpred/dw_2 = mean of x_{i+1}
t.dWs = (dl_dw0, dl_dw1)
# Start training
anim = t.animated_train(is_notebook=True)
from IPython.display import HTML
HTML(anim.to_html5_video())
Out[120]:
(HTML5 video of the training animation)
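To make sure the analytic gradients match the loss, we can run a quick finite-difference check (a sketch; it assumes t.Ws is the plain Python list passed to Trainer above, and reuses dl_dw0 from the previous cell):

eps = 1e-6
w0 = t.Ws[0]
t.Ws[0] = w0 + eps
loss_plus = t.loss()
t.Ws[0] = w0 - eps
loss_minus = t.loss()
t.Ws[0] = w0  # restore the weight
print((loss_plus - loss_minus) / (2 * eps))  # numerical dL/dw_1
print(dl_dw0())                              # analytic dL/dw_1: should match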
In [117]:
t.loss()
Out[117]:
In [42]:
from scipy import signal
# Load MNIST
(x_train, y_train), (x_test, y_test) = load_MNIST()
img = x_train[2]
# Design the kernels: each one responds to lines in a given orientation
kernels = [[[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]],  # vertical
           [[-1, -1, -1], [2, 2, 2], [-1, -1, -1]],  # horizontal
           [[2, -1, -1], [-1, 2, -1], [-1, -1, 2]],  # diagonal (\)
           [[-1, -1, 2], [-1, 2, -1], [2, -1, -1]]]  # diagonal (/)
# Plot each kernel next to the image convolved (and thresholded) with it
for i, k in enumerate(kernels):
    pos = 2 * i + 1
    plt.subplot(3, 4, pos)
    plt.imshow(k, cmap='gray')
    plt.subplot(3, 4, pos + 1)
    conv = signal.convolve2d(img, k)
    plt.imshow(conv > 1.5, cmap='gray')
# Show the original image below for reference
plt.subplot(3, 4, 9)
plt.imshow(img, cmap='gray')
plt.show()
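One detail worth noting: signal.convolve2d defaults to mode='full', so each conv above is two pixels larger than img along each axis. If you want the output aligned with the input, pass mode='same' (a minimal sketch on a random stand-in image):

from scipy import signal
import numpy as np

img = np.random.rand(28, 28)  # stand-in for an MNIST digit
k = [[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]
print(signal.convolve2d(img, k).shape)               # (30, 30) with the default mode='full'
print(signal.convolve2d(img, k, mode='same').shape)  # (28, 28), aligned with the input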