Deep Convolutional Neural Network in TensorFlow

In this notebook, we convert our LeNet-5-inspired deep convolutional network for classifying MNIST digits from Keras to TensorFlow, following Aymeric Damien's style. Compare the two notebooks side by side.

Load dependencies



In [1]:

    
import numpy as np
np.random.seed(42)
import tensorflow as tf
tf.set_random_seed(42)

Load data



In [2]:

    
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
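
Because the data are loaded with `one_hot=True`, each label arrives as a ten-element vector rather than a single digit. A minimal pure-Python sketch of that encoding follows; the helper name `one_hot` is ours for illustration and is not part of the TensorFlow loader.

```python
def one_hot(label, n_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    vector = [0.0] * n_classes
    vector[label] = 1.0
    return vector

encoded = one_hot(3)
print(encoded)

# Recover the digit by finding the position of the largest entry:
decoded = encoded.index(max(encoded))
print(decoded)
```

Taking the position of the largest entry is the same idea the notebook relies on later when it applies `tf.argmax` to `y`.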

Set neural network hyperparameters



In [3]:

    
epochs = 1
batch_size = 128
display_progress = 10 # after this many batches, output progress to screen
wt_init = tf.contrib.layers.xavier_initializer() # weight initializer

Set parameters for each layer



In [4]:

    
# input layer: 
n_input = 784

# first convolutional layer: 
n_conv_1 = 32
k_conv_1 = 3

# second convolutional layer: 
n_conv_2 = 64
k_conv_2 = 3

# max pooling layer:
pool_size = 2
mp_layer_dropout = 0.25

# dense layer: 
n_dense = 128
dense_layer_dropout = 0.5

# output layer: 
n_classes = 10

Define placeholder Tensors for inputs and labels



In [5]:

    
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

Define types of layers



In [7]:

    
# dense layer with ReLU activation:
def dense(x, W, b):
    z = tf.add(tf.matmul(x, W), b)
    a = tf.nn.relu(z)
    return a

# convolutional layer with ReLU activation:
def conv2d(x, W, b, stride_length=1):
    xW = tf.nn.conv2d(x, W, strides=[1, stride_length, stride_length, 1], padding='SAME')
    z = tf.nn.bias_add(xW, b)
    a = tf.nn.relu(z)
    return a

# max-pooling layer: 
def maxpooling2d(x, p_size):
    return tf.nn.max_pool(x, 
                          ksize=[1, p_size, p_size, 1],
                          strides=[1, p_size, p_size, 1],
                          padding='SAME'
                         )
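
With `padding='SAME'`, the spatial output size of a convolution or pooling op depends only on the input size and the stride: it is the input size divided by the stride, rounded up. The sketch below works through that arithmetic for our layers; the helper name `same_padding_output_size` is hypothetical and exists only for this illustration.

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size of a 'SAME'-padded convolution or pooling op."""
    return math.ceil(input_size / stride)

# A 28x28 MNIST image keeps its size through a stride-1 convolution...
after_conv = same_padding_output_size(28, stride=1)

# ...and is halved by 2x2 max-pooling with a stride of 2:
after_pool = same_padding_output_size(after_conv, stride=2)

print(after_conv, after_pool)
```

This is why both convolutional layers leave the image at 28x28, and only the max-pooling layer shrinks it.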

Define dictionaries for storing the weights and biases of each layer, and initialize them



In [8]:

    
bias_dict = {
    'b_c1': tf.Variable(tf.zeros([n_conv_1])),
    'b_c2': tf.Variable(tf.zeros([n_conv_2])),
    'b_d1': tf.Variable(tf.zeros([n_dense])),
    'b_out': tf.Variable(tf.zeros([n_classes]))
}

# calculate number of inputs to dense layer: 
full_square_length = np.sqrt(n_input)
pooled_square_length = int(full_square_length / pool_size)
dense_inputs = pooled_square_length**2 * n_conv_2

weight_dict = {
    'W_c1': tf.get_variable('W_c1', 
                            [k_conv_1, k_conv_1, 1, n_conv_1], initializer=wt_init),
    'W_c2': tf.get_variable('W_c2', 
                            [k_conv_2, k_conv_2, n_conv_1, n_conv_2], initializer=wt_init),
    'W_d1': tf.get_variable('W_d1', 
                            [dense_inputs, n_dense], initializer=wt_init),
    'W_out': tf.get_variable('W_out', 
                             [n_dense, n_classes], initializer=wt_init)
}
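
To sanity-check the shapes above, here is a rough sketch that repeats the dense-input calculation and counts the trainable parameters per layer. It re-declares the layer parameters so it runs on its own; the dictionary `param_counts` is our own bookkeeping, not something TensorFlow provides.

```python
import math

# Layer parameters, copied from the cells above:
n_input, pool_size = 784, 2
n_conv_1, k_conv_1 = 32, 3
n_conv_2, k_conv_2 = 64, 3
n_dense, n_classes = 128, 10

# Same arithmetic as the dense_inputs calculation:
full_square_length = int(math.sqrt(n_input))
pooled_square_length = full_square_length // pool_size
dense_inputs = pooled_square_length ** 2 * n_conv_2

# Weights plus biases for each layer:
param_counts = {
    'conv_1': k_conv_1 * k_conv_1 * 1 * n_conv_1 + n_conv_1,
    'conv_2': k_conv_2 * k_conv_2 * n_conv_1 * n_conv_2 + n_conv_2,
    'dense_1': dense_inputs * n_dense + n_dense,
    'output': n_dense * n_classes + n_classes,
}
total_params = sum(param_counts.values())

print(dense_inputs)
print(param_counts)
print(total_params)
```

Nearly all of the parameters sit in the dense layer, because it connects every one of the flattened pooled activations to every dense neuron.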

Design neural network architecture



In [9]:

    
def network(x, weights, biases, n_in, mp_psize, mp_dropout, dense_dropout):

    # reshape linear MNIST pixel input into square image: 
    square_dimensions = int(np.sqrt(n_in))
    square_x = tf.reshape(x, shape=[-1, square_dimensions, square_dimensions, 1])
    
    # convolutional and max-pooling layers:
    conv_1 = conv2d(square_x, weights['W_c1'], biases['b_c1'])
    conv_2 = conv2d(conv_1, weights['W_c2'], biases['b_c2'])
    pool_1 = maxpooling2d(conv_2, mp_psize)
    pool_1 = tf.nn.dropout(pool_1, 1-mp_dropout)
    
    # dense layer: 
    flat = tf.reshape(pool_1, [-1, weights['W_d1'].get_shape().as_list()[0]])
    dense_1 = dense(flat, weights['W_d1'], biases['b_d1'])
    dense_1 = tf.nn.dropout(dense_1, 1-dense_dropout)
    
    # output layer: 
    out_layer_z = tf.add(tf.matmul(dense_1, weights['W_out']), biases['b_out'])
    
    return out_layer_z
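
The second argument to `tf.nn.dropout` here is a keep probability, which is why the network passes `1-mp_dropout` and `1-dense_dropout`. Each element is kept with that probability and scaled up by its reciprocal, otherwise it is set to zero. The pure-Python sketch below imitates that behavior; `inverted_dropout` is a hypothetical helper, not a TensorFlow function.

```python
import random

def inverted_dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob, scaled by 1 / keep_prob; zero the rest."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(42)
activations = [1.0] * 8

# Training-style call, matching mp_layer_dropout = 0.25:
train_out = inverted_dropout(activations, keep_prob=1 - 0.25, rng=rng)

# A keep probability of 1.0 leaves the activations untouched:
eval_out = inverted_dropout(activations, keep_prob=1.0, rng=rng)

print(train_out)
print(eval_out)
```

Note that the keep probabilities in `network` are fixed Python numbers, so the dropout ops stay active whenever the graph is run, including on test data. One common way around this, which this notebook does not do, is to feed the keep probability through a placeholder and set it to 1.0 at evaluation time.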

Build model



In [10]:

    
predictions = network(x, weight_dict, bias_dict, n_input, 
                      pool_size, mp_layer_dropout, dense_layer_dropout)

Define model's loss and its optimizer



In [11]:

    
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=predictions, labels=y))
optimizer = tf.train.AdamOptimizer().minimize(cost)
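
The cost is the softmax cross-entropy between the logits and the one-hot labels, averaged over the batch. As a rough illustration of the arithmetic behind it (not TensorFlow's implementation), the sketch below computes the same quantity in pure Python; the function names are ours.

```python
import math

def softmax_cross_entropy(logits, one_hot_labels):
    """Cross-entropy between softmax(logits) and a one-hot label vector."""
    # Subtract the largest logit before exponentiating, for numerical stability:
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(t * lp for t, lp in zip(one_hot_labels, log_probs))

def mean_cost(batch_logits, batch_labels):
    """Average the per-example losses over a batch, like tf.reduce_mean."""
    losses = [softmax_cross_entropy(z, t) for z, t in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# An untrained network that outputs equal logits for all ten classes:
uniform_loss = softmax_cross_entropy([0.0] * 10, [1.0] + [0.0] * 9)
print(uniform_loss)
print(math.log(10))
```

Equal logits over ten classes give a loss of ln(10), roughly 2.3, which is a useful reference point when reading the cost early in training.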

Define evaluation metrics



In [12]:

    
correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
accuracy_pct = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) * 100
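
The accuracy metric compares the index of the largest logit with the index of the 1 in the one-hot label, then averages the matches. A minimal pure-Python sketch of the same calculation, with helper names chosen for this illustration only:

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=values.__getitem__)

def batch_accuracy_pct(batch_logits, batch_labels):
    """Percentage of examples whose predicted class matches the labeled class."""
    correct = [argmax(p) == argmax(t) for p, t in zip(batch_logits, batch_labels)]
    return 100.0 * sum(correct) / len(correct)

# Four toy examples with two classes; the last prediction is wrong:
toy_logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [[0, 1], [1, 0], [0, 1], [0, 1]]
print(batch_accuracy_pct(toy_logits, toy_labels))
```

The logits do not need to pass through a softmax first, because softmax preserves the ordering of its inputs.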

Create op for variable initialization



In [13]:

    
initializer_op = tf.global_variables_initializer()

Train the network in a session (identical to `intermediate_net_in_tensorflow.ipynb` except for the addition of `display_progress`)



In [ ]:

    
with tf.Session() as session:
    session.run(initializer_op)
    
    print("Training for", epochs, "epochs.")
    
    # loop over epochs: 
    for epoch in range(epochs):
        
        avg_cost = 0.0 # track cost to monitor performance during training
        avg_accuracy_pct = 0.0
        
        # loop over all batches of the epoch:
        n_batches = int(mnist.train.num_examples / batch_size)
        for i in range(n_batches):

            # to reassure you something's happening! 
            if i % display_progress == 0:
                print("Step ", i+1, " of ", n_batches, " in epoch ", epoch+1, ".", sep='')
            
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            
            # feed batch data to run optimization and fetch cost and accuracy: 
            _, batch_cost, batch_acc = session.run([optimizer, cost, accuracy_pct], 
                                                   feed_dict={x: batch_x, y: batch_y})
            
            # accumulate mean loss and accuracy over epoch: 
            avg_cost += batch_cost / n_batches
            avg_accuracy_pct += batch_acc / n_batches
            
        # output logs at end of each epoch of training:
        print("Epoch ", '%03d' % (epoch+1), 
              ": cost = ", '{:.3f}'.format(avg_cost), 
              ", accuracy = ", '{:.2f}'.format(avg_accuracy_pct), "%", 
              sep='')
    
    print("Training Complete. Testing Model.\n")
    
    test_cost = cost.eval({x: mnist.test.images, y: mnist.test.labels})
    test_accuracy_pct = accuracy_pct.eval({x: mnist.test.images, y: mnist.test.labels})
    
    print("Test Cost:", '{:.3f}'.format(test_cost))
    print("Test Accuracy: ", '{:.2f}'.format(test_accuracy_pct), "%", sep='')

Training for 1 epochs.
Step 1 of 429 in epoch 1.
Step 11 of 429 in epoch 1.
Step 21 of 429 in epoch 1.
Step 31 of 429 in epoch 1.
Step 41 of 429 in epoch 1.
Step 51 of 429 in epoch 1.
Step 61 of 429 in epoch 1.
Step 71 of 429 in epoch 1.
Step 81 of 429 in epoch 1.
Step 91 of 429 in epoch 1.
Step 101 of 429 in epoch 1.
Step 111 of 429 in epoch 1.
Step 121 of 429 in epoch 1.
Step 131 of 429 in epoch 1.
Step 141 of 429 in epoch 1.
Step 151 of 429 in epoch 1.
Step 161 of 429 in epoch 1.
Step 171 of 429 in epoch 1.
Step 181 of 429 in epoch 1.
Step 191 of 429 in epoch 1.
Step 201 of 429 in epoch 1.
Step 211 of 429 in epoch 1.
Step 221 of 429 in epoch 1.
Step 231 of 429 in epoch 1.
Step 241 of 429 in epoch 1.
Step 251 of 429 in epoch 1.
Step 261 of 429 in epoch 1.
Step 271 of 429 in epoch 1.
Step 281 of 429 in epoch 1.
Step 291 of 429 in epoch 1.
Step 301 of 429 in epoch 1.
Step 311 of 429 in epoch 1.
Step 321 of 429 in epoch 1.
Step 331 of 429 in epoch 1.
Step 341 of 429 in epoch 1.
Step 351 of 429 in epoch 1.
Step 361 of 429 in epoch 1.
Step 371 of 429 in epoch 1.
Step 381 of 429 in epoch 1.
Step 391 of 429 in epoch 1.
Step 401 of 429 in epoch 1.
Step 411 of 429 in epoch 1.
Step 421 of 429 in epoch 1.
Epoch 001: cost = 0.247, accuracy = 92.72%
Training Complete. Testing Model.
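
The 429 steps in the log come straight from the batch arithmetic in the training loop. The sketch below reproduces it, assuming the training split holds 55,000 examples (which is consistent with the step count logged above), and also shows that adding `batch_cost / n_batches` at every step gives the mean of the batch costs. The list `batch_costs` is made-up data for illustration.

```python
num_examples = 55000  # assumed size of mnist.train
batch_size = 128

# Same calculation as n_batches in the training loop:
n_batches = int(num_examples / batch_size)
examples_covered = n_batches * batch_size
print(n_batches, examples_covered, num_examples - examples_covered)

# Running average, as accumulated inside the loop (hypothetical costs):
batch_costs = [2.30, 1.10, 0.60, 0.40]
avg_cost = 0.0
for batch_cost in batch_costs:
    avg_cost += batch_cost / len(batch_costs)

print(avg_cost)
```

Because the division rounds down, the 429 full batches cover 54,912 examples, 88 fewer than the full training split.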

Compare with LeNet Keras results

Increase the dropout probability (or probabilities), or add dropout to the other conv layer? Stop earlier? Coming up in Lecture 5 :)



As an exercise, try converting our AlexNet from Keras to TensorFlow following the same style as this LeNet-5 notebook.


