E-63 Big Data Analytics - Assignment 11 - Convolutional NN

Shanaka De Soysa


In [1]:
## Printing versions for future reference
import sys
import tensorflow as tf

print(sys.version)
print(sys.version_info)
print("TensorFlow Version: {0}".format(tf.__version__))


3.5.2 |Anaconda 4.2.0 (x86_64)| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
sys.version_info(major=3, minor=5, micro=2, releaselevel='final', serial=0)
TensorFlow Version: 1.0.1

Problem 1.

Please find 2 files from Google's tutorial sets. I used the file mnist2.py in class yesterday and for the preparation of my notes. If you read the file carefully you will see that you can run it in at least two modes. The way it is set up now, it selects one learning rate and one particular neural network architecture and generates a TensorBoard graph in a particular directory. One problem with this script is that its accuracy is surprisingly low. Such a complex architecture and so many lines of code, and we get 70% or lower accuracy. We expected more from Convolutional Neural Networks. The file cnn_mnist.py is practically the same, at least it does all the same things, creates the same architecture, and sets the same or similar parameters, but does a much better job: its accuracy is in the high 90%s. Run the two files, compare results, and then fix the first file (mnist2.py) based on what you saw in cnn_mnist.py. Capture the Accuracy and Cross Entropy (summary) graphs from the corrected version of mnist2.py and provide a working and fixed version of that file. Please describe in detail the experiments you undertook and the fixes you made. (45%)

Results from cnn_mnist.py show accuracies up to 99%. Our objective is to improve mnist2.py to match this.

Since we are benchmarking against the cnn_mnist.py program, we choose parameters similar to that program, so we can compare apples to apples:

Iterations: 500
Learning rate: 0.005
use_two_fc: True
use_two_conv: True
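These settings map onto the script's own names as follows (a sketch; the identifiers match the fixed mnist2.py listed below):

GENERATIONS = 500        # training iterations
learning_rate = 0.005    # the single learning rate used in main()
use_two_fc = True        # two fully connected layers
use_two_conv = True      # two convolutional layers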

Bias values for both conv_layer and fc_layer had been set to the constant 0.1, which could cause problems. We change them to tf.zeros for conv_layer and to truncated_normal for fc_layer, which is what cnn_mnist.py uses as well.

## For conv_layer()
b = tf.Variable(tf.zeros([size_out], dtype=tf.float32), name="B")

## For fc_layer()
b = tf.Variable(tf.truncated_normal([size_out], stddev=0.1, dtype=tf.float32), name="B")

After removing the constants, accuracy didn't really improve much: it is still around 0.1, i.e., chance level for 10 classes.

The filter size for the conv_layer is set to a 5 x 5 matrix. This could be an issue since our images are 28 x 28. Let's try changing it to 4 x 4, similar to the cnn_mnist program.

w = tf.Variable(tf.truncated_normal([4, 4, size_in, size_out], stddev=0.1), name="W")

Changing the filter to 4 x 4 increased accuracy to about 40%.
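For reference, here is the spatial arithmetic behind the 7 * 7 * conv2_features flatten used later (a sketch; both conv layers use "SAME" padding, so only the 2 x 2 max pools shrink the image):

# 28 x 28 input
# -> conv ("SAME", stays 28 x 28) -> 2 x 2 max pool -> 14 x 14
# -> conv ("SAME", stays 14 x 14) -> 2 x 2 max pool -> 7 x 7
# flattened size = 7 * 7 * conv2_features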

Let's try changing the fully connected layer's output size from 1024 to 100.

if use_two_fc:
    fc1 = fc_layer(flattened, 7 * 7 * conv2_features, 100, "fc1")
    embedding_input = fc1
    embedding_size = 100
    logits = fc_layer(fc1, 100, 10, "fc2")

Changing the fully connected layer's output to 100 increased the accuracy to about 70%. We also experimented with the convolution layer feature counts, changing 32/64 to 25/50; in some cases this improved performance.

Let's try changing the optimizer to MomentumOptimizer.

train_step = tf.train.MomentumOptimizer(learning_rate, 0.9).minimize(xent)

We can see that changing the optimizer increased the accuracy to about 90%.
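For context, MomentumOptimizer keeps a per-variable velocity accumulator; a minimal sketch of the update it applies, with momentum 0.9 as above (conceptual, not part of the script):

# for each trainable variable theta with gradient g:
#     accum = 0.9 * accum + g                   # accumulate velocity
#     theta = theta - learning_rate * accum     # apply the update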

Let's change the size of fully connected layer 1 from 100 to 512. We can clearly see that this increased accuracy to 98%. It reaches 98% even with 100; since there is randomness in the model, we have to run it multiple times and average the results to measure its accuracy.
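In the final fixed code this size is passed in as the fully_connected_size1 parameter (set to 512 for the final run), so the snippet above becomes:

if use_two_fc:
    fc1 = fc_layer(flattened, 7 * 7 * conv2_features,
                   fully_connected_size1, "fc1")
    embedding_input = fc1
    embedding_size = fully_connected_size1
    logits = fc_layer(fc1, fully_connected_size1, 10, "fc2")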

Cross entropy

Fixed Code

# Copyright 2017 Google, Inc. All Rights Reserved.
#
# ==============================================================================
import os
import tensorflow as tf
import sys
import urllib


if sys.version_info[0] >= 3:
    from urllib.request import urlretrieve
else:
    from urllib import urlretrieve

LOGDIR = 'log_mnist_500_512_2/'
GITHUB_URL = 'https://raw.githubusercontent.com/mamcgrath/TensorBoard-TF-Dev-Summit-Tutorial/master/'
GENERATIONS = 500

### MNIST EMBEDDINGS ###
mnist = tf.contrib.learn.datasets.mnist.read_data_sets(
    train_dir=LOGDIR + 'data', one_hot=True)
### Get a sprite and labels file for the embedding projector ###
urlretrieve(GITHUB_URL + 'labels_1024.tsv', LOGDIR + 'labels_1024.tsv')
urlretrieve(GITHUB_URL + 'sprite_1024.png', LOGDIR + 'sprite_1024.png')

# Add convolution layer


def conv_layer(input, size_in, size_out, name="conv"):
    with tf.name_scope(name):
        #w = tf.Variable(tf.zeros([5, 5, size_in, size_out]), name="W")
        #b = tf.Variable(tf.zeros([size_out]), name="B")
        w = tf.Variable(tf.truncated_normal(
            [4, 4, size_in, size_out], stddev=0.1), name="W")
        b = tf.Variable(tf.zeros([size_out], dtype=tf.float32), name="B")
        conv = tf.nn.conv2d(input, w, strides=[1, 1, 1, 1], padding="SAME")
        act = tf.nn.relu(conv + b)
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return tf.nn.max_pool(act, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")


# Add fully connected layer
def fc_layer(input, size_in, size_out, name="fc"):
    with tf.name_scope(name):
        w = tf.Variable(tf.truncated_normal(
            [size_in, size_out], stddev=0.1), name="W")
        b = tf.Variable(tf.truncated_normal(
            [size_out], stddev=0.1, dtype=tf.float32), name="B")
        act = tf.nn.relu(tf.add(tf.matmul(input, w), b))
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return act


def mnist_model(learning_rate, use_two_conv, use_two_fc, conv1_features, conv2_features,
                hparam, generations=500, fully_connected_size1=100):
    tf.reset_default_graph()
    sess = tf.Session()

    # Setup placeholders, and reshape the data
    x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    tf.summary.image('input', x_image, 3)
    y = tf.placeholder(tf.float32, shape=[None, 10], name="labels")

    if use_two_conv:
        conv1 = conv_layer(x_image, 1, conv1_features, "conv1")
        conv_out = conv_layer(conv1, conv1_features, conv2_features, "conv2")
    else:
        conv1 = conv_layer(x_image, 1, conv2_features, "conv")
        conv_out = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[
                                  1, 2, 2, 1], padding="SAME")

    flattened = tf.reshape(conv_out, [-1, 7 * 7 * conv2_features])

    if use_two_fc:
        fc1 = fc_layer(flattened, 7 * 7 * conv2_features,
                       fully_connected_size1, "fc1")
        embedding_input = fc1
        embedding_size = fully_connected_size1
        logits = fc_layer(fc1, fully_connected_size1, 10, "fc2")
    else:
        embedding_input = flattened
        embedding_size = 7 * 7 * conv2_features
        logits = fc_layer(flattened, 7 * 7 * conv2_features, 10, "fc")

    with tf.name_scope("xent"):
        xent = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(
                logits=logits, labels=y), name="xent")
        tf.summary.scalar("xent", xent)

    with tf.name_scope("train"):
        train_step = tf.train.MomentumOptimizer(
            learning_rate, 0.9).minimize(xent)

    with tf.name_scope("accuracy"):
        correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.summary.scalar("accuracy", accuracy)

    summ = tf.summary.merge_all()

    embedding = tf.Variable(
        tf.zeros([1024, embedding_size]), name="test_embedding")
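    # "assignment" copies the current embedding-layer activations for 1024 test images into this variable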
    assignment = embedding.assign(embedding_input)
    saver = tf.train.Saver()

    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter(LOGDIR + hparam)
    writer.add_graph(sess.graph)

    config = tf.contrib.tensorboard.plugins.projector.ProjectorConfig()
    embedding_config = config.embeddings.add()
    embedding_config.tensor_name = embedding.name
    embedding_config.sprite.image_path = LOGDIR + 'sprite_1024.png'
    embedding_config.metadata_path = LOGDIR + 'labels_1024.tsv'
    # Specify the width and height of a single thumbnail.
    embedding_config.sprite.single_image_dim.extend([28, 28])
    tf.contrib.tensorboard.plugins.projector.visualize_embeddings(
        writer, config)

    for i in range(generations + 1):
        batch = mnist.train.next_batch(100)
        if i % 5 == 0:
            [train_accuracy, s] = sess.run([accuracy, summ], feed_dict={
                                           x: batch[0], y: batch[1]})
            writer.add_summary(s, i)
        if i % (generations // 4) == 0:
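            # snapshot the embedding and save a checkpoint every quarter of the run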
            sess.run(assignment, feed_dict={
                     x: mnist.test.images[:1024], y: mnist.test.labels[:1024]})
            saver.save(sess, os.path.join(LOGDIR, "model.ckpt"), i)
        sess.run(train_step, feed_dict={x: batch[0], y: batch[1]})


def make_hparam_string(learning_rate, use_two_fc, use_two_conv, conv1_features, conv2_features):
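    # e.g. (5e-3, True, True, 32, 64) -> "lr_5E-03conv2fc2_32_64"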
    conv_param = "conv2" if use_two_conv else "conv1"
    fc_param = "fc2" if use_two_fc else "fc1"
    return "lr_%.0E%s%s_%d_%d" % (learning_rate, conv_param, fc_param, conv1_features, conv2_features)


def main():
    # You can try adding some more learning rates
    # for learning_rate in [1E-3, 1E-4, 1E-5]:
    for learning_rate in [.005]:
        # Include "False" as a value to try different model architectures
        # for use_two_fc in [True, False]:
        for use_two_fc in [True]:
            # for use_two_conv in [True, False]:
            for use_two_conv in [True]:
                # for conv1_features in [25, 32]:
                for conv1_features in [32]:
                    # for conv2_features in [50, 64]:
                    for conv2_features in [64]:
                        # Construct a hyperparameter string for each one (example:
                        # "lr_1E-3fc2conv2")
                        hparam = make_hparam_string(
                            learning_rate, use_two_fc, use_two_conv, conv1_features, conv2_features)
                        print('Starting run for %s' % hparam)
                        # this forces print-ed lines to show up.
                        sys.stdout.flush()

                        # Actually run with the new settings
                        mnist_model(learning_rate, use_two_fc, use_two_conv, conv1_features,
                                    conv2_features, hparam, GENERATIONS, fully_connected_size1=512)


if __name__ == '__main__':
    main()

Problem 2.

Run the corrected version of mnist2.py for 4 different architectures (2 conv, 1 conv, 2 fully connected, 1 fully connected layer) and 3 values of the learning rate. As one learning rate, choose the one you selected in Problem 1 and then add one smaller and one larger learning rate around it. Capture Accuracy (summary) graphs and one of the Histograms to demonstrate that your code is working. Please also capture an image of the "colorful" T-SNE Embedding. Please be aware that you are running 12 models and the execution might take many minutes. You might want to run your models in smaller groups so that you see them finish their work without too much wait. Submit the working code of mnist2.py used in this problem. Collect execution times, final (smoothed) accuracies, and final cross entropies for the different models and provide a tabulated presentation of the final results. (20%)

Run the corrected version of mnist2.py for 4 different architectures (2 conv, 1 conv, 2 fully connected, 1 fully connected layer) and 3 values of the learning rate. As one learning rate, choose the one you selected in Problem 1 and then add one smaller and one larger learning rate around it.

Selected learning rates: 1E-03, 5E-03, 1E-04

Capture Accuracy (summary) graphs and one of the Histograms to demonstrate that the code is working.

TensorBoard for 12 models for 2000 steps

Convolution layer 25/50
Fully connected layer 100

Cross entropy

Collect execution times, final (smoothed) accuracies and final cross entropies for different models and provide tabulated presentation of the final results of different models.

Results for 2000 steps. We can see that the accuracies have improved significantly, coming close to 100% in some models.

Histograms

Please also capture an image of “colorful” T-SNE Embedding.

Submit working code of mnist2.py used in this problem.

# Copyright 2017 Google, Inc. All Rights Reserved.
#
# ==============================================================================
import os
import time
import sys
import tensorflow as tf
import urllib
import pandas as pd


if sys.version_info[0] >= 3:
    from urllib.request import urlretrieve
else:
    from urllib import urlretrieve

LOGDIR = 'log_mnist_fixed_25_50/'
GITHUB_URL = 'https://raw.githubusercontent.com/mamcgrath/TensorBoard-TF-Dev-Summit-Tutorial/master/'
GENERATIONS = 2000

### MNIST EMBEDDINGS ###
mnist = tf.contrib.learn.datasets.mnist.read_data_sets(
    train_dir=LOGDIR + 'data', one_hot=True)
### Get a sprite and labels file for the embedding projector ###
urlretrieve(GITHUB_URL + 'labels_1024.tsv', LOGDIR + 'labels_1024.tsv')
urlretrieve(GITHUB_URL + 'sprite_1024.png', LOGDIR + 'sprite_1024.png')

# Add convolution layer


def conv_layer(input, size_in, size_out, name="conv"):
    with tf.name_scope(name):
        #w = tf.Variable(tf.zeros([5, 5, size_in, size_out]), name="W")
        #b = tf.Variable(tf.zeros([size_out]), name="B")
        w = tf.Variable(tf.truncated_normal(
            [4, 4, size_in, size_out], stddev=0.1), name="W")
        #b = tf.Variable(tf.constant(0.1, shape=[size_out]), name="B")
        b = tf.Variable(tf.zeros([size_out], dtype=tf.float32), name="B")
        conv = tf.nn.conv2d(input, w, strides=[1, 1, 1, 1], padding="SAME")
        act = tf.nn.relu(conv + b)
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return tf.nn.max_pool(act,
                              ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1],
                              padding="SAME")


# Add fully connected layer
def fc_layer(input, size_in, size_out, name="fc"):
    with tf.name_scope(name):
        w = tf.Variable(tf.truncated_normal(
            [size_in, size_out], stddev=0.1), name="W")
        #b = tf.Variable(tf.constant(0.1, shape=[size_out]), name="B")
        #b = tf.Variable(tf.zeros([size_out], dtype=tf.float32), name="B")
        b = tf.Variable(tf.truncated_normal(
            [size_out], stddev=0.1, dtype=tf.float32), name="B")
        act = tf.nn.relu(tf.add(tf.matmul(input, w), b))
        tf.summary.histogram("weights", w)
        tf.summary.histogram("biases", b)
        tf.summary.histogram("activations", act)
        return act


def mnist_model(learning_rate, use_two_conv, use_two_fc,
                hparam, conv1_features=25, conv2_features=50,
                generations=500, fully_connected_size1=100):
    tf.reset_default_graph()
    sess = tf.Session()

    # Setup placeholders, and reshape the data
    x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
    x_image = tf.reshape(x, [-1, 28, 28, 1])
    tf.summary.image('input', x_image, 3)
    y = tf.placeholder(tf.float32, shape=[None, 10], name="labels")

    if use_two_conv:
        conv1 = conv_layer(x_image, 1, conv1_features, "conv1")
        conv_out = conv_layer(conv1, conv1_features, conv2_features, "conv2")
    else:
        conv1 = conv_layer(x_image, 1, conv2_features, "conv")
        conv_out = tf.nn.max_pool(conv1,
                                  ksize=[1, 2, 2, 1],
                                  strides=[1, 2, 2, 1], padding="SAME")

    flattened = tf.reshape(conv_out, [-1, 7 * 7 * conv2_features])

    if use_two_fc:
        fc1 = fc_layer(flattened, 7 * 7 * conv2_features,
                       fully_connected_size1, "fc1")
        embedding_input = fc1
        embedding_size = fully_connected_size1
        logits = fc_layer(fc1, fully_connected_size1, 10, "fc2")
    else:
        embedding_input = flattened
        embedding_size = 7 * 7 * conv2_features
        logits = fc_layer(flattened, 7 * 7 * conv2_features, 10, "fc")

    with tf.name_scope("xent"):
        xent = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                    labels=y), name="xent")
        tf.summary.scalar("xent", xent)

    with tf.name_scope("train"):
        #train_step = tf.train.AdamOptimizer(learning_rate).minimize(xent)
        train_step = tf.train.MomentumOptimizer(
            learning_rate, 0.9).minimize(xent)

    with tf.name_scope("accuracy"):
        correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        tf.summary.scalar("accuracy", accuracy)

    summ = tf.summary.merge_all()

    embedding = tf.Variable(tf.zeros([1024, embedding_size]),
                            name="test_embedding")
    assignment = embedding.assign(embedding_input)
    saver = tf.train.Saver()

    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter(LOGDIR + hparam)
    writer.add_graph(sess.graph)

    config = tf.contrib.tensorboard.plugins.projector.ProjectorConfig()
    embedding_config = config.embeddings.add()
    embedding_config.tensor_name = embedding.name
    embedding_config.sprite.image_path = LOGDIR + 'sprite_1024.png'
    embedding_config.metadata_path = LOGDIR + 'labels_1024.tsv'
    # Specify the width and height of a single thumbnail.
    embedding_config.sprite.single_image_dim.extend([28, 28])
    tf.contrib.tensorboard.plugins.projector.visualize_embeddings(
        writer, config)

    for i in range(generations + 1):
        batch = mnist.train.next_batch(100)
        if i % 5 == 0:
            [train_accuracy, s] = sess.run([accuracy, summ],
                                           feed_dict={x: batch[0], y: batch[1]})
            writer.add_summary(s, i)
        if i % (generations // 4) == 0:
            sess.run(assignment,
                     feed_dict={x: mnist.test.images[:1024], y: mnist.test.labels[:1024]})
            saver.save(sess, os.path.join(LOGDIR, "model.ckpt"), i)
        sess.run(train_step, feed_dict={x: batch[0], y: batch[1]})
    [train_accuracy, train_xent] = sess.run(
        [accuracy, xent], feed_dict={x: batch[0], y: batch[1]})
    return [train_accuracy, train_xent]


def make_hparam_string(learning_rate, use_two_fc, use_two_conv):
    conv_param = "conv2" if use_two_conv else "conv1"
    fc_param = "fc2" if use_two_fc else "fc1"
    return "lr_%.0E%s%s" % (learning_rate, conv_param, fc_param)


def main():
    model_metrics_cols = ['Exec. Time', 'Accuracy', 'Cross Entropy']
    model_metrics_result = []
    model_metrics_idx = []
    # You can try adding some more learning rates
    # for learning_rate in [1E-3, 1E-4, 1E-5]:
    for learning_rate in [0.005, 1E-4, 1E-3]:
        # Include "False" as a value to try different model architectures
        # for use_two_fc in [True, False]:
        for use_two_fc in [True, False]:
            # for use_two_conv in [True, False]:
            for use_two_conv in [True, False]:
                # Construct a hyperparameter string for each one (example:
                # "lr_1E-3fc2conv2")
                hparam = make_hparam_string(learning_rate,
                                            use_two_fc, use_two_conv)
                print('Starting run for %s' % hparam)
                # this forces print-ed lines to show up.
                sys.stdout.flush()
                start_time = time.time()
                # Actually run with the new settings
                accuracy, xent = mnist_model(
                    learning_rate, use_two_fc,
                    use_two_conv, hparam, generations=GENERATIONS,
                    fully_connected_size1=100)
                total_time = time.time() - start_time
                model_metrics_idx.append(hparam)
                model_metrics_result.append([total_time, accuracy, xent])
                # print(model_metrics_result)
    df = pd.DataFrame(model_metrics_result,
                      index=model_metrics_idx,
                      columns=model_metrics_cols)
    print(df)


if __name__ == '__main__':
    main()

Problem 3.

Modify the file cnn_mnist.py so that it publishes its summaries to TensorBoard. Describe the changes you are making and provide images of the Accuracy and Cross Entropy summaries as captured by TensorBoard. Provide the Graph of your model. Describe the differences, if any, between the graph of this program and the graph generated by the mnist2.py script running with 2 convolutional and 2 fully connected layers. Provide working code. (35%)

Describe the changes you are making and provide images of the Accuracy and Cross Entropy summaries as captured by TensorBoard.

  1. Added name scopes to organize models, variables, operations.
    Example:
    with tf.name_scope("graph"):
     with tf.name_scope("variables"):
    
  2. Added variable names for easy identification.
    Example:
    x_input = tf.placeholder(
        tf.float32, shape=x_input_shape, name="train_x")
    y_target = tf.placeholder(
        tf.int32, shape=(batch_size), name="train_y")
    conv1_weight = tf.Variable(tf.truncated_normal(
        [4, 4, num_channels, conv1_features],
        stddev=0.1, dtype=tf.float32), name="conv1_W")
    conv1_bias = tf.Variable(tf.zeros(
        [conv1_features],
        dtype=tf.float32), name="conv1_B")
    
  3. Added histogram summaries for variables.
    Example:
    tf.summary.histogram("weights", conv1_weight)
    tf.summary.histogram("biases", conv1_bias)
    
  4. Added name scopes for operations
    Example:

    with tf.name_scope("loss"):
         loss = tf.reduce_mean(
             tf.nn.sparse_softmax_cross_entropy_with_logits(
                 logits=model_output, labels=y_target))
         tf.summary.scalar("loss", loss)
    
     with tf.name_scope("optimizer"):
         my_optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9)
         train_step = my_optimizer.minimize(loss)
    
     with tf.name_scope("optimizer"):
         prediction = tf.nn.softmax(model_output)
    
  5. Added accuracy operation
    Example:

    with tf.name_scope("accuracy"):
         b_pred = tf.argmax(model_output, 1)
         # Debug
         # b_pred = tf.Print(b_pred, [b_pred], "b_pred = ")
         correct_prediction = tf.equal(tf.cast(b_pred, tf.int32), y_target)
         accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
     tf.summary.scalar("accuracy", accuracy)
    

TensorBoard accuracy

TensorBoard loss

TensorBoard Histograms

Describe the differences if any between the graph of this program and the graph generated by mnist2.py script running with 2 convolutional and 2 fully connected layers.

Full Graph

The main difference is that this script builds two graphs: a training graph and a test graph (the name_scope("train") and name_scope("test") blocks in the code below).

Train and Test graphs

Provide working code.

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Start a graph session
sess = tf.Session()

# Load data
data_dir = 'mnist/'
log_dir = 'logcnn/'
mnist = read_data_sets(data_dir)

# Convert images into 28x28 (they are downloaded as 1x784)
train_xdata = np.array([np.reshape(x, (28, 28)) for x in mnist.train.images])
test_xdata = np.array([np.reshape(x, (28, 28)) for x in mnist.test.images])

# Labels are plain integer class indices (read_data_sets was not called with one_hot=True)
train_labels = mnist.train.labels
test_labels = mnist.test.labels

# Set model parameters
generations = 500

batch_size = 100
learning_rate = 0.005
evaluation_size = 500
image_width = train_xdata[0].shape[0]
image_height = train_xdata[0].shape[1]
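# number of classes: max label (9) + 1 = 10 digits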
target_size = max(train_labels) + 1
num_channels = 1  # greyscale = 1 channel
eval_every = 5
conv1_features = 25
conv2_features = 50
max_pool_size1 = 2  # NxN window for 1st max pool layer
max_pool_size2 = 2  # NxN window for 2nd max pool layer
fully_connected_size1 = 100


with tf.name_scope("graph"):
    with tf.name_scope("variables"):
        x_input_shape = (batch_size, image_width, image_height, num_channels)
        x_input = tf.placeholder(
            tf.float32, shape=x_input_shape, name="train_x")
        y_target = tf.placeholder(
            tf.int32, shape=(batch_size),  name="train_y")

        eval_input_shape = (evaluation_size, image_width,
                            image_height, num_channels)
        eval_input = tf.placeholder(
            tf.float32, shape=eval_input_shape,  name="test_x")
        eval_target = tf.placeholder(
            tf.int32, shape=(evaluation_size), name="test_y")

        # Convolutional layer variables
        conv1_weight = tf.Variable(tf.truncated_normal(
            [4, 4, num_channels, conv1_features],
            stddev=0.1, dtype=tf.float32), name="conv1_W")
        conv1_bias = tf.Variable(tf.zeros(
            [conv1_features],
            dtype=tf.float32), name="conv1_B")
        conv2_weight = tf.Variable(tf.truncated_normal(
            [4, 4, conv1_features, conv2_features],
            stddev=0.1, dtype=tf.float32), name="conv2_W")
        conv2_bias = tf.Variable(tf.zeros(
            [conv2_features], dtype=tf.float32), name="conv2_B")

        # fully connected variables
        resulting_width = image_width // (max_pool_size1 * max_pool_size2)  # 7
        resulting_height = image_height // (max_pool_size1 * max_pool_size2) # 7
        full1_input_size = resulting_width * resulting_height * conv2_features  # 7*7*50=2450

        full1_weight = tf.Variable(tf.truncated_normal(
            [full1_input_size, fully_connected_size1],
            stddev=0.1, dtype=tf.float32), name="full1_W")
        full1_bias = tf.Variable(tf.truncated_normal(
            [fully_connected_size1],
            stddev=0.1, dtype=tf.float32), name="full1_B")
        full2_weight = tf.Variable(tf.truncated_normal(
            [fully_connected_size1, target_size],
            stddev=0.1, dtype=tf.float32), name="full2_W")
        full2_bias = tf.Variable(tf.truncated_normal(
            [target_size],
            stddev=0.1, dtype=tf.float32), name="full2_B")

    # Initialize Model Operations
    def my_conv_net(input_data, graph_name):
        # graph_name is currently unused; the per-call scope below is commented out
        # with tf.name_scope(graph_name):
        # First Conv-ReLU-MaxPool Layer
        with tf.name_scope("conv"):
            conv1 = tf.nn.conv2d(input_data, conv1_weight,
                                    strides=[1, 1, 1, 1], padding='SAME')
            relu1 = tf.nn.relu(tf.nn.bias_add(conv1, conv1_bias))
            max_pool1 = tf.nn.max_pool(relu1,
                                        ksize=[1, max_pool_size1,
                                                max_pool_size1, 1],
                                        strides=[1, max_pool_size1,
                                                max_pool_size1, 1],
                                        padding='SAME')
            tf.summary.histogram("weights", conv1_weight)
            tf.summary.histogram("biases", conv1_bias)

        # Second Conv-ReLU-MaxPool Layer
        with tf.name_scope("conv"):
            conv2 = tf.nn.conv2d(max_pool1, conv2_weight,
                                    strides=[1, 1, 1, 1], padding='SAME')
            relu2 = tf.nn.relu(tf.nn.bias_add(conv2, conv2_bias))
            max_pool2 = tf.nn.max_pool(relu2,
                                        ksize=[1, max_pool_size2,
                                                max_pool_size2, 1],
                                        strides=[1, max_pool_size2,
                                                max_pool_size2, 1],
                                        padding='SAME')
            tf.summary.histogram("weights", conv2_weight)
            tf.summary.histogram("biases", conv2_bias)

        # Transform Output into a 1xN layer for next fully connected layer
        with tf.name_scope("reshape"):
            final_conv_shape = max_pool2.get_shape().as_list()
            final_shape = final_conv_shape[1] * \
                final_conv_shape[2] * final_conv_shape[3]
            flat_output = tf.reshape(
                max_pool2, [final_conv_shape[0], final_shape])

        # First Fully Connected Layer
        with tf.name_scope("fc"):
            fully_connected1 = tf.nn.relu(
                tf.add(tf.matmul(flat_output, full1_weight), full1_bias))
            tf.summary.histogram("weights", full1_weight)
            tf.summary.histogram("biases", full1_bias)

        # Second Fully Connected Layer
        with tf.name_scope("fc"):
            final_model_output = tf.add(
                tf.matmul(fully_connected1, full2_weight), full2_bias)
            tf.summary.histogram("weights", full2_weight)
            tf.summary.histogram("biases", full2_bias)

        return final_model_output

    with tf.name_scope("train"):
        model_output = my_conv_net(x_input, "train")

        # Declare Loss Function (softmax cross entropy)
        with tf.name_scope("loss"):
            loss = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(
                    logits=model_output, labels=y_target))
            tf.summary.scalar("loss", loss)

        with tf.name_scope("accuracy"):
            b_pred = tf.argmax(model_output, 1)
            # Debug
            # b_pred = tf.Print(b_pred, [b_pred], "b_pred = ")
            correct_prediction = tf.equal(tf.cast(b_pred, tf.int32), y_target)
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
            tf.summary.scalar("accuracy", accuracy)

        # Create an optimizer
        with tf.name_scope("optimizer"):
            my_optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9)
            train_step = my_optimizer.minimize(loss)

        # Create a prediction function
        with tf.name_scope("optimizer"):
            prediction = tf.nn.softmax(model_output)


    with tf.name_scope("test"):
        test_model_output = my_conv_net(eval_input, "test")
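        # the test graph reuses the same weight/bias Variables as the training graph,
        # so it evaluates the parameters learned by train_step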

        # Create a prediction function
        with tf.name_scope("optimizer"):
            test_prediction = tf.nn.softmax(test_model_output)

    with tf.name_scope("global_ops"):
        # Initialize Variables
        init = tf.global_variables_initializer()
        summ = tf.summary.merge_all()

    # Create accuracy function
    def get_accuracy(logits, targets):
        batch_predictions = np.argmax(logits, axis=1)
        num_correct = np.sum(np.equal(batch_predictions, targets))
        ret_val = 100. * num_correct / batch_predictions.shape[0]
        return(ret_val)

sess.run(init)
writer = tf.summary.FileWriter(log_dir)
writer.add_graph(sess.graph)


# Start training loop
train_loss = []
train_acc = []
test_acc = []
for i in range(generations):
    rand_index = np.random.choice(len(train_xdata),
                                  size=batch_size)
    rand_x = train_xdata[rand_index]
    rand_x = np.expand_dims(rand_x, 3)
    rand_y = train_labels[rand_index]
    train_dict = {x_input: rand_x, y_target: rand_y}

    sess.run(train_step, feed_dict=train_dict)
    temp_train_loss, temp_train_preds, s = sess.run(
        [loss, prediction, summ], feed_dict=train_dict)
    temp_train_acc = get_accuracy(temp_train_preds, rand_y)

    if (i + 1) % eval_every == 0:
        # Write summaries
        writer.add_summary(s, i)
        eval_index = np.random.choice(len(test_xdata),
                                      size=evaluation_size)
        eval_x = test_xdata[eval_index]
        eval_x = np.expand_dims(eval_x, 3)
        eval_y = test_labels[eval_index]
        test_dict = {eval_input: eval_x, eval_target: eval_y}
        test_preds = sess.run(test_prediction, feed_dict=test_dict)
        temp_test_acc = get_accuracy(test_preds, eval_y)

        # Record and print results
        train_loss.append(temp_train_loss)
        train_acc.append(temp_train_acc)
        test_acc.append(temp_test_acc)
        acc_and_loss = [(i + 1), temp_train_loss,
                        temp_train_acc, temp_test_acc]
        acc_and_loss = [np.round(x, 2) for x in acc_and_loss]
        print('Generation # {}. Train Loss: {:.2f}. Train Acc (Test Acc): {:.2f} ({:.2f})'.
              format(*acc_and_loss))

writer.flush()
writer.close()
sess.close()