Using convolutions with TensorFlow

In this notebook, you'll be using TensorFlow to build a Convolutional Neural Network (CNN).

Convolution

Both this notebook and this Wikipedia page might help you understand what a convolution is.

Now, if we consider two functions $f$ and $g$ mapping $\mathbb{Z} \to \mathbb{R}$, the convolution is defined as:
$ (f * g)[n] = \sum_{m = -\infty}^{+\infty} f[m] \cdot g[n - m] $

In our case, we consider the two vectors $x$ and $w$ :
$ x = (x_1, x_2, ..., x_{n-1}, x_n) $
$ w = (w_1, w_2) $

And get :
$ x * w = (w_1 x_1 + w_2 x_2, w_1 x_2 + w_2 x_3, ..., w_1 x_{n-1} + w_2 x_n)$
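As a quick sanity check, the formula above can be reproduced with NumPy. Note that, as written, it does not flip the kernel, which is what `np.correlate` computes; the textbook convolution (with the $g[n - m]$ flip) is `np.convolve`. Deep learning "convolutions" are usually the unflipped variant:

```python
import numpy as np

# The vectors from the formula above
x = np.array([1, 2, 3])
w = np.array([10, 20])

# np.correlate matches the formula as written:
# (w1*x1 + w2*x2, w1*x2 + w2*x3, ...)
print(np.correlate(x, w, mode='valid'))  # [50 80]

# np.convolve flips the kernel first (the classic definition with
# g[n - m]); deep learning "convolutions" usually skip the flip.
print(np.convolve(x, w, mode='valid'))   # [40 70]
```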

Deep learning subtlety:

In most deep learning frameworks, you get to choose between three padding modes:

  • Same: $(f*g)$ has the same shape as $x$ (the input is padded with zeros)
  • Valid: $(f*g)$ has the shape of $x$ minus the shape of $w$ plus 1 (no padding on $x$)
  • Causal: $(f*g)[n_t]$ does not depend on any $x[n_{t'}]$ with $t' > t$ (the output at time $t$ only sees inputs up to time $t$)
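The three modes can be sketched by hand with NumPy and a size-2 kernel (the exact placement of the zero-padding for "same" varies between frameworks; padding at the end is just one possible choice used here for illustration):

```python
import numpy as np

x = np.array([1., 2., 3., 4.])
w = np.array([0.5, 0.5])  # kernel of size 2

# Valid: no padding, output length is n - k + 1 = 3
valid = np.correlate(x, w, mode='valid')

# Same: pad with a zero so the output keeps the length of x
same = np.correlate(np.append(x, 0.), w, mode='valid')

# Causal: pad at the front only, so output[t] never sees x[t+1]
causal = np.correlate(np.insert(x, 0, 0.), w, mode='valid')

print(valid)   # [1.5 2.5 3.5]
print(same)    # [1.5 2.5 3.5 2. ]
print(causal)  # [0.5 1.5 2.5 3.5]
```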

TensorFlow

"TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library, and also used for machine learning applications such as neural networks.[3] It is used for both research and production at Google often replacing its closed-source predecessor, DistBelief." - Wikipedia

We'll be using TensorFlow to build our models.

Below, we build an AND gate with a very simple neural network:


In [1]:
import tensorflow as tf
import numpy as np

tf.reset_default_graph()

# Define our Dataset
X = np.array([[0,0],[0,1],[1,0],[1,1]])
Y = np.array([0,0,0,1]).reshape(-1,1)


# Define the tensorflow tensors
x = tf.placeholder(tf.float32, [None, 2], name='X')  # inputs
y = tf.placeholder(tf.float32, [None, 1], name='Y')  # outputs
W = tf.Variable(tf.zeros([2, 1]), name='W')
b = tf.Variable(tf.zeros([1,]), name='b')

# Define the model
pred = tf.nn.sigmoid(tf.matmul(x, W) + b)  # Model

# Define the loss
with tf.name_scope("loss"):
    # Binary cross-entropy
    loss = tf.reduce_mean(-tf.reduce_sum(y * tf.log(pred) + (1 - y) * tf.log(1 - pred), axis=1))

# Define the optimizer method you want to use
with tf.name_scope("optimizer"):
    optimizer = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# Include some Tensorboard visualization
writer_train = tf.summary.FileWriter("./my_model/")


# Start training session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer_train.add_graph(sess.graph)
    
    for epoch in range(1000):
        _, c, p = sess.run([optimizer, loss, pred], feed_dict={x: X,
                                                      y: Y})
    print(p, Y)



To visualize the graph you just created, launch TensorBoard:
$ tensorboard --logdir=./my_model/ on Linux (with the corresponding log directory)


Get inspiration from the preceding code to build an XOR gate

Design a neural network with 2 layers.

  • Layer 1 has 2 neurons (sigmoid or tanh activation)
  • Layer 2 has 1 neuron (it outputs the prediction)

And train it

It's mandatory that you get a TensorBoard visualization of your graph; try to make it look good, plz :)

Below is a graph of the model you want to obtain (though your weights won't be the same).
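Before training, it can help to convince yourself that such a 2-layer network can represent XOR at all. Here is a plain NumPy forward pass with hand-picked weights (these weights are purely illustrative; your trained network will find its own): one hidden neuron approximates OR, the other approximates AND, and the output computes OR AND (NOT AND), i.e. XOR.

```python
import numpy as np

def sigmoid(z):
    return 1. / (1. + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hand-picked weights (illustration only):
# hidden neuron 1 ~ OR, hidden neuron 2 ~ AND
W1 = np.array([[20., 20.],
               [20., 20.]])
b1 = np.array([-10., -30.])
# output ~ OR AND (NOT AND) = XOR
W2 = np.array([[20.], [-20.]])
b2 = np.array([-10.])

hidden = sigmoid(X @ W1 + b1)     # shape (4, 2)
pred = sigmoid(hidden @ W2 + b2)  # shape (4, 1)
print(np.round(pred).ravel())     # [0. 1. 1. 0.]
```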


In [ ]:
### Code here

And give an interpretation of what they are doing


In [ ]:
### Code here

Build a CNN to predict the MNIST digits

You can now move to CNNs. You'll have to train a convolutional neural network to predict the digits from MNIST.

You might want to reuse some pieces of code from SNN

Your model should have 3 layers:

  • 1st layer: 6 convolutional kernels of shape (3, 3)
  • 2nd layer: 6 convolutional kernels of shape (3, 3)
  • 3rd layer: softmax layer

Train your model.

Explain all you do, and why, make it lovely to read, plz o:)
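Before writing the TensorFlow graph, it helps to check the shape bookkeeping of one (3, 3) kernel by hand. A minimal NumPy sketch of a valid 2D cross-correlation on an MNIST-sized image (the kernel values here are random placeholders, not trained weights):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Valid 2D cross-correlation: output is (H - kh + 1, W - kw + 1)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.random.rand(28, 28)   # one MNIST-sized image
kernel = np.random.rand(3, 3)  # one of the six kernels
out = conv2d_valid(img, kernel)
print(out.shape)               # (26, 26)
# With valid padding, the 2nd conv layer shrinks this to (24, 24);
# stacking 6 kernels per layer gives (26, 26, 6) then (24, 24, 6)
# before the softmax layer.
```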


In [ ]:
### code here

And give an interpretation of what they are doing


In [ ]:
### code here

Choose one (tell me what you chose...)

  • Show how the gradients (show only one kernel) evolve for good and wrong predictions. (hard)
  • Initialize the kernels with values that make sense to you and show how they evolve. (easy)
  • When training is finished, show the 6+6=12 results of some convolved images. (easy)
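For the second option, one classic choice of "values that make sense" is edge-detector kernels such as Sobel filters (an assumption for illustration; any interpretable initialization works). Applying one to a synthetic vertical edge shows why they are a sensible starting point:

```python
import numpy as np

# Sobel kernels: classic 3x3 edge detectors you could use as an
# interpretable initialization for some of the 6 kernels.
sobel_x = np.array([[-1., 0., 1.],
                    [-2., 0., 2.],
                    [-1., 0., 1.]])
sobel_y = sobel_x.T

# Synthetic image with a vertical edge in the middle
img = np.zeros((8, 8))
img[:, 4:] = 1.0

def conv2d_valid(img, k):
    """Valid 2D cross-correlation with a 3x3 kernel."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

# The vertical-edge detector fires on the edge; the horizontal one doesn't.
print(np.abs(conv2d_valid(img, sobel_x)).max())  # 4.0
print(np.abs(conv2d_valid(img, sobel_y)).max())  # 0.0
```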

In [ ]:
### Code here