Using convolutions with TensorFlow

In this notebook, you'll be using TensorFlow to build a Convolutional Neural Network (CNN).

Convolution

Both this notebook and this Wikipedia page might help you understand what a convolution is.

Now, if we consider two functions $f$ and $g$ mapping $\mathbb{Z} \to \mathbb{R}$, the convolution is defined as:
$ (f * g)[n] = \sum_{m = -\infty}^{+\infty} f[m] \cdot g[n - m] $

In our case, we consider the two vectors $x$ and $w$ :
$ x = (x_1, x_2, ..., x_{n-1}, x_n) $
$ w = (w_1, w_2) $

And get :
$ x * w = (w_1 x_1 + w_2 x_2, w_1 x_2 + w_2 x_3, ..., w_1 x_{n-1} + w_2 x_n)$
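As a quick sanity check, the formula above can be reproduced with NumPy. Note that, as written, it does not flip the kernel, which is what `np.correlate` computes; the textbook convolution (with the $g[n - m]$ flip) is `np.convolve`. Deep learning "convolutions" are usually the unflipped variant:

```python
import numpy as np

# The vectors from the formula above
x = np.array([1, 2, 3])
w = np.array([10, 20])

# np.correlate matches the formula as written:
# (w1*x1 + w2*x2, w1*x2 + w2*x3, ...)
print(np.correlate(x, w, mode='valid'))  # [50 80]

# np.convolve flips the kernel first (the classic definition with
# g[n - m]); deep learning "convolutions" usually skip the flip.
print(np.convolve(x, w, mode='valid'))   # [40 70]
```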

Deep learning subtlety:

In most deep learning frameworks, you get to choose between three padding modes:

  • Same: $(f*g)$ has the same shape as $x$ (the input is padded with zeros)
  • Valid: $(f*g)$ has the shape of $x$ minus the shape of $w$ plus 1 (no padding on $x$)
  • Causal: $(f*g)[n_t]$ does not depend on any $x[n_{t'}]$ with $t' > t$ (the output at time $t$ only sees inputs up to time $t$)
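The three modes can be sketched by hand with NumPy and a size-2 kernel (the exact placement of the zero-padding for "same" varies between frameworks; padding at the end is just one possible choice used here for illustration):

```python
import numpy as np

x = np.array([1., 2., 3., 4.])
w = np.array([0.5, 0.5])  # kernel of size 2

# Valid: no padding, output length is n - k + 1 = 3
valid = np.correlate(x, w, mode='valid')

# Same: pad with a zero so the output keeps the length of x
same = np.correlate(np.append(x, 0.), w, mode='valid')

# Causal: pad at the front only, so output[t] never sees x[t+1]
causal = np.correlate(np.insert(x, 0, 0.), w, mode='valid')

print(valid)   # [1.5 2.5 3.5]
print(same)    # [1.5 2.5 3.5 2. ]
print(causal)  # [0.5 1.5 2.5 3.5]
```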

TensorFlow

"TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library, and also used for machine learning applications such as neural networks.[3] It is used for both research and production at Google often replacing its closed-source predecessor, DistBelief." - Wikipedia

We'll be using TensorFlow to build our models.

Below, we build an AND gate with a very simple neural network:


In [1]:
import tensorflow as tf
import numpy as np

tf.reset_default_graph()

# Define our Dataset
X = np.array([[0,0],[0,1],[1,0],[1,1]])
Y = np.array([0,0,0,1]).reshape(-1,1)


# Define the tensorflow tensors
x = tf.placeholder(tf.float32, [None, 2], name='X')  # inputs
y = tf.placeholder(tf.float32, [None, 1], name='Y')  # outputs
W = tf.Variable(tf.zeros([2, 1]), name='W')
b = tf.Variable(tf.zeros([1,]), name='b')

# Define the model
pred = tf.nn.sigmoid(tf.matmul(x, W) + b)  # Model

# Define the loss
with tf.name_scope("loss"):
    # Binary cross-entropy
    loss = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred) + (1 - y) * tf.log(1 - pred), axis=1))

# Define the optimizer method you want to use
with tf.name_scope("optimizer"):
    optimizer = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# Include some Tensorboard visualization
writer_train = tf.summary.FileWriter("./my_model/")


# Start training session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer_train.add_graph(sess.graph)
    
    for epoch in range(1000):
        _, c, p = sess.run([optimizer, loss, pred], feed_dict={x: X,
                                                      y: Y})
    print(p, Y)



To visualize the graph you just created, launch TensorBoard:
$ tensorboard --logdir=./my_model/ on Linux (with the corresponding log directory)


Get inspiration from the preceding code to build an XOR gate

Design a neural network with 2 layers.

  • Layer 1 has 2 neurons (sigmoid or tanh activation)
  • Layer 2 has 1 neuron (it outputs the prediction)

And train it

It's mandatory that you get a TensorBoard visualization of your graph; try to make it look good, plz :)

Below is a graph of the model you want to obtain (though your weights won't be the same).
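Before training, it can help to convince yourself that such a 2-layer network can represent XOR at all. Here is a plain NumPy forward pass with hand-picked weights (these weights are purely illustrative; your trained network will find its own): one hidden neuron approximates OR, the other approximates AND, and the output computes OR AND (NOT AND), i.e. XOR.

```python
import numpy as np

def sigmoid(z):
    return 1. / (1. + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-picked weights (illustration only):
# hidden neuron 1 ~ OR, hidden neuron 2 ~ AND
W1 = np.array([[20., 20.],
               [20., 20.]])
b1 = np.array([-10., -30.])
# output ~ OR AND (NOT AND) = XOR
W2 = np.array([[20.], [-20.]])
b2 = np.array([-10.])

hidden = sigmoid(X @ W1 + b1)     # shape (4, 2)
pred = sigmoid(hidden @ W2 + b2)  # shape (4, 1)
print(np.round(pred).ravel())     # [0. 1. 1. 0.]
```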


In [ ]:
### Code here

And give an interpretation of what they are doing


In [ ]:
### Code here

Build a CNN to predict the MNIST digits

You can now move to CNNs. You'll have to train a convolutional neural network to predict the digits from MNIST.

You might want to reuse some pieces of code from SNN

Your model should have 3 layers:

  • 1st layer: 6 convolutional kernels of shape (3, 3)
  • 2nd layer: 6 convolutional kernels of shape (3, 3)
  • 3rd layer: softmax layer

Train your model.

Explain all you do, and why, make it lovely to read, plz o:)
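Before writing the TensorFlow graph, it helps to check the shape bookkeeping of one (3, 3) kernel by hand. A minimal NumPy sketch of a valid 2D cross-correlation on an MNIST-sized image (the kernel values here are random placeholders, not trained weights):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Valid 2D cross-correlation: output is (H - kh + 1, W - kw + 1)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.random.rand(28, 28)   # one MNIST-sized image
kernel = np.random.rand(3, 3)  # one of the six kernels
out = conv2d_valid(img, kernel)
print(out.shape)               # (26, 26)
# With valid padding, the 2nd conv layer shrinks this to (24, 24);
# stacking 6 kernels per layer gives (26, 26, 6) then (24, 24, 6)
# before the softmax layer.
```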


In [ ]:
### code here

And give an interpretation of what they are doing


In [ ]:
### code here

Choose one (tell me what you chose...)

  • Show how the gradients (show only one kernel) evolve for good and wrong predictions. (hard)
  • Initialize the kernels with values that make sense to you and show how they evolve. (easy)
  • When training is finished, show the 6+6=12 results of some convolved images. (easy)
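For the second option, one classic choice of "values that make sense" is edge-detector kernels such as Sobel filters (an assumption for illustration; any interpretable initialization works). Applying one to a synthetic vertical edge shows why they are a sensible starting point:

```python
import numpy as np

# Sobel kernels: classic 3x3 edge detectors you could use as an
# interpretable initialization for some of the 6 kernels.
sobel_x = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])
sobel_y = sobel_x.T

# Synthetic image with a vertical edge in the middle
img = np.zeros((8, 8))
img[:, 4:] = 1.0

def conv2d_valid(img, k):
    """Valid 2D cross-correlation with a 3x3 kernel."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

# The vertical-edge detector fires on the edge; the horizontal one doesn't.
print(np.abs(conv2d_valid(img, sobel_x)).max())  # 4.0
print(np.abs(conv2d_valid(img, sobel_y)).max())  # 0.0
```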

In [ ]:
### Code here