Deep Convolutional Neural Network in TensorFlow

In this notebook, we convert our LeNet-5-inspired, MNIST-classifying deep convolutional network from Keras to TensorFlow (so you can compare the two implementations side by side), following Aymeric Damien's style.

Load dependencies


In [1]:
import numpy as np
np.random.seed(42)
import tensorflow as tf
tf.set_random_seed(42)

Load data


In [2]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)


Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
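
As a quick sanity check (an optional aside, not in the original notebook), we can confirm that the loader returns flattened 784-pixel images with intensities scaled to [0, 1] and one-hot label vectors:


In [ ]:
print(mnist.train.num_examples)  # 55000 training examples
print(mnist.train.images.shape)  # (55000, 784); pixel intensities scaled to [0.0, 1.0]
print(mnist.train.labels.shape)  # (55000, 10); one-hot label rows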

Set neural network hyperparameters


In [3]:
epochs = 20
batch_size = 128
display_progress = 40 # after this many batches, output progress to screen
wt_init = tf.contrib.layers.xavier_initializer() # weight initializer
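
Xavier (a.k.a. Glorot) initialization scales each layer's random starting weights to its fan-in and fan-out so that signal variance stays roughly constant from layer to layer; by default, tf.contrib.layers.xavier_initializer() samples uniformly from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). A minimal NumPy illustration of that bound (an aside, not used by the model):


In [ ]:
def xavier_uniform_limit(fan_in, fan_out):
    # sampling bound of the Glorot uniform distribution
    return np.sqrt(6.0 / (fan_in + fan_out))

print(xavier_uniform_limit(12544, 128))  # bound for our dense layer's weights: ~0.022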

Set parameters for each layer


In [4]:
# input layer: 
n_input = 784

# first convolutional layer: 
n_conv_1 = 32
k_conv_1 = 3 # k_size

# second convolutional layer: 
n_conv_2 = 64
k_conv_2 = 3

# max pooling layer:
pool_size = 2
mp_layer_dropout = 0.25

# dense layer: 
n_dense = 128
dense_layer_dropout = 0.5

# output layer: 
n_classes = 10
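
Given these settings and the 'SAME' padding used below, the shapes work out as follows: the 784-pixel input is reshaped to 28x28x1; each 3x3 convolution preserves the 28x28 spatial extent while deepening to 32 and then 64 channels; and the 2x2 max-pool halves that to 14x14x64, i.e., 12544 values flowing into the dense layer. A quick arithmetic check (illustrative only):


In [ ]:
side = int(np.sqrt(n_input))      # 28: MNIST images are 28x28
pooled_side = side // pool_size   # 14 after one 2x2 max-pool
print(pooled_side**2 * n_conv_2)  # 12544 inputs to the dense layer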

Define placeholder Tensors for inputs and labels


In [5]:
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

Define types of layers


In [6]:
# dense layer with ReLU activation:
def dense(x, W, b):
    z = tf.add(tf.matmul(x, W), b)
    a = tf.nn.relu(z)
    return a

# convolutional layer with ReLU activation:
def conv2d(x, W, b, stride_length=1):
    xW = tf.nn.conv2d(x, W, strides=[1, stride_length, stride_length, 1], padding='SAME')
    z = tf.nn.bias_add(xW, b)
    a = tf.nn.relu(z)
    return a

# max-pooling layer: 
def maxpooling2d(x, p_size):
    return tf.nn.max_pool(x, 
                          ksize=[1, p_size, p_size, 1], 
                          strides=[1, p_size, p_size, 1], 
                          padding='SAME')
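
As a quick smoke test (an aside with throwaway zero-valued tensors, not part of the model), we can confirm the static shapes these helpers produce: 'SAME' padding with stride 1 preserves the 28x28 spatial extent, while the 2x2 max-pool halves it:


In [ ]:
demo_x = tf.constant(np.zeros((1, 28, 28, 1), dtype=np.float32))  # one fake image
demo_W = tf.constant(np.zeros((3, 3, 1, 32), dtype=np.float32))   # 3x3 kernels, 32 filters
demo_b = tf.constant(np.zeros(32, dtype=np.float32))
demo_conv = conv2d(demo_x, demo_W, demo_b)
print(demo_conv.get_shape())                   # (1, 28, 28, 32)
print(maxpooling2d(demo_conv, 2).get_shape())  # (1, 14, 14, 32)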

Design neural network architecture


In [7]:
def network(x, weights, biases, n_in, mp_psize, mp_dropout, dense_dropout):

    # reshape linear MNIST pixel input into square image: 
    square_dimensions = int(np.sqrt(n_in))
    square_x = tf.reshape(x, shape=[-1, square_dimensions, square_dimensions, 1])
    
    # convolutional and max-pooling layers:
    conv_1 = conv2d(square_x, weights['W_c1'], biases['b_c1'])
    conv_2 = conv2d(conv_1, weights['W_c2'], biases['b_c2'])
    pool_1 = maxpooling2d(conv_2, mp_psize)
    pool_1 = tf.nn.dropout(pool_1, 1-mp_dropout) # N.B.: tf.nn.dropout takes a *keep* probability, hence 1 minus the drop rate
    
    # dense layer: 
    flat = tf.reshape(pool_1, [-1, weights['W_d1'].get_shape().as_list()[0]])
    dense_1 = dense(flat, weights['W_d1'], biases['b_d1'])
    dense_1 = tf.nn.dropout(dense_1, 1-dense_dropout)
    
    # output layer: 
    out_layer_z = tf.add(tf.matmul(dense_1, weights['W_out']), biases['b_out'])
    
    return out_layer_z

Define dictionaries for storing weights and biases for each layer -- and initialize


In [8]:
bias_dict = {
    'b_c1': tf.Variable(tf.zeros([n_conv_1])),
    'b_c2': tf.Variable(tf.zeros([n_conv_2])),
    'b_d1': tf.Variable(tf.zeros([n_dense])),
    'b_out': tf.Variable(tf.zeros([n_classes]))
}

# calculate number of inputs to dense layer: 
full_square_length = np.sqrt(n_input)
pooled_square_length = int(full_square_length / pool_size)
dense_inputs = pooled_square_length**2 * n_conv_2

weight_dict = {
    'W_c1': tf.get_variable('W_c1', 
                            [k_conv_1, k_conv_1, 1, n_conv_1], initializer=wt_init),
    'W_c2': tf.get_variable('W_c2', 
                            [k_conv_2, k_conv_2, n_conv_1, n_conv_2], initializer=wt_init),
    'W_d1': tf.get_variable('W_d1', 
                            [dense_inputs, n_dense], initializer=wt_init),
    'W_out': tf.get_variable('W_out', 
                             [n_dense, n_classes], initializer=wt_init)
}

Build model


In [9]:
predictions = network(x, weight_dict, bias_dict, n_input, 
                      pool_size, mp_layer_dropout, dense_layer_dropout)

Define model's loss and its optimizer


In [10]:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=predictions, labels=y))
optimizer = tf.train.AdamOptimizer().minimize(cost)
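
Here, softmax_cross_entropy_with_logits applies a numerically stable softmax to the raw output-layer logits and then measures cross-entropy against the one-hot labels, i.e., the negative log of the probability the network assigns to the true class. A minimal NumPy restatement for a single example (illustrative only):


In [ ]:
def softmax_xent(logits, onehot_label):
    shifted = logits - np.max(logits)             # shift for numerical stability
    probs = np.exp(shifted) / np.sum(np.exp(shifted))
    return -np.sum(onehot_label * np.log(probs))  # -log(prob. of the true class)

print(softmax_xent(np.array([2.0, 1.0, 0.1]), np.array([1.0, 0.0, 0.0])))  # ~0.417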

Define evaluation metrics


In [11]:
correct_prediction = tf.equal(tf.argmax(predictions, 1), tf.argmax(y, 1))
accuracy_pct = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) * 100
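
That is, each row's predicted class is its argmax over the ten logits, and accuracy is simply the mean of the resulting booleans, scaled to a percentage. A toy NumPy example (illustrative only):


In [ ]:
toy_logits = np.array([[2.0, 0.1, 0.3],
                       [0.2, 0.1, 3.0]])  # predicted classes: 0 and 2
toy_labels = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])  # true classes: 0 and 1
matches = np.argmax(toy_logits, 1) == np.argmax(toy_labels, 1)
print(matches.mean() * 100)  # 50.0 -- one of the two predictions is correct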

Create op for variable initialization


In [12]:
initializer_op = tf.global_variables_initializer()

Train the network in a session (identical to intermediate_net_in_tensorflow.ipynb except for the addition of display_progress)


In [13]:
with tf.Session() as session:
    session.run(initializer_op)
    
    print("Training for", epochs, "epochs.")
    
    # loop over epochs: 
    for epoch in range(epochs):
        
        avg_cost = 0.0 # track cost to monitor performance during training
        avg_accuracy_pct = 0.0
        
        # loop over all batches of the epoch:
        n_batches = int(mnist.train.num_examples / batch_size)
        for i in range(n_batches):

            # to reassure you something's happening! 
            if i % display_progress == 0:
                print("Step ", i+1, " of ", n_batches, " in epoch ", epoch+1, ".", sep='')
            
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            
            # feed batch data to run optimization and fetch cost and accuracy: 
            _, batch_cost, batch_acc = session.run([optimizer, cost, accuracy_pct], 
                                                   feed_dict={x: batch_x, y: batch_y})
            
            # accumulate mean loss and accuracy over epoch: 
            avg_cost += batch_cost / n_batches
            avg_accuracy_pct += batch_acc / n_batches
            
        # output logs at end of each epoch of training:
        print("Epoch ", '%03d' % (epoch+1), 
              ": cost = ", '{:.3f}'.format(avg_cost), 
              ", accuracy = ", '{:.2f}'.format(avg_accuracy_pct), "%", 
              sep='')
    
    print("Training Complete. Testing Model.\n")
    
    test_cost = cost.eval({x: mnist.test.images, y: mnist.test.labels})
    test_accuracy_pct = accuracy_pct.eval({x: mnist.test.images, y: mnist.test.labels})
    
    print("Test Cost:", '{:.3f}'.format(test_cost))
    print("Test Accuracy: ", '{:.2f}'.format(test_accuracy_pct), "%", sep='')


Training for 20 epochs.
Step 1 of 429 in epoch 1.
Step 41 of 429 in epoch 1.
Step 81 of 429 in epoch 1.
Step 121 of 429 in epoch 1.
Step 161 of 429 in epoch 1.
Step 201 of 429 in epoch 1.
Step 241 of 429 in epoch 1.
Step 281 of 429 in epoch 1.
Step 321 of 429 in epoch 1.
Step 361 of 429 in epoch 1.
Step 401 of 429 in epoch 1.
Epoch 001: cost = 0.266, accuracy = 91.86%
[... step-by-step progress lines omitted for epochs 2 through 20 ...]
Epoch 002: cost = 0.095, accuracy = 97.19%
Epoch 003: cost = 0.067, accuracy = 98.01%
Epoch 004: cost = 0.055, accuracy = 98.30%
Epoch 005: cost = 0.048, accuracy = 98.51%
Epoch 006: cost = 0.041, accuracy = 98.71%
Epoch 007: cost = 0.037, accuracy = 98.86%
Epoch 008: cost = 0.031, accuracy = 99.00%
Epoch 009: cost = 0.029, accuracy = 99.06%
Epoch 010: cost = 0.025, accuracy = 99.20%
Epoch 011: cost = 0.023, accuracy = 99.23%
Epoch 012: cost = 0.022, accuracy = 99.28%
Epoch 013: cost = 0.021, accuracy = 99.34%
Epoch 014: cost = 0.019, accuracy = 99.37%
Epoch 015: cost = 0.018, accuracy = 99.37%
Epoch 016: cost = 0.016, accuracy = 99.43%
Epoch 017: cost = 0.016, accuracy = 99.42%
Epoch 018: cost = 0.014, accuracy = 99.52%
Epoch 019: cost = 0.015, accuracy = 99.47%
Epoch 020: cost = 0.015, accuracy = 99.50%
Training Complete. Testing Model.

Test Cost: 0.059
Test Accuracy: 98.62%

Compare with LeNet Keras results

Could we do better by increasing the dropout probabilities, adding dropout after the convolutional layers, or stopping training earlier? One such tweak is sketched below; more coming up in Lecture 5 :)
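
For instance, dropout could be applied after each convolutional layer too, reusing the existing helpers inside network() (a sketch of one possible direction, not the lecture's solution; conv_dropout is an assumed new hyperparameter, e.g. 0.25):


In [ ]:
# hypothetical variant of the conv/pool block inside network():
conv_1 = conv2d(square_x, weights['W_c1'], biases['b_c1'])
conv_1 = tf.nn.dropout(conv_1, 1-conv_dropout)  # drop conv_1 activations as well
conv_2 = conv2d(conv_1, weights['W_c2'], biases['b_c2'])
conv_2 = tf.nn.dropout(conv_2, 1-conv_dropout)
pool_1 = maxpooling2d(conv_2, mp_psize)
pool_1 = tf.nn.dropout(pool_1, 1-mp_dropout)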


As an exercise, try converting our AlexNet from Keras to TensorFlow following the same style as this LeNet-5 notebook.


In [ ]: