This notebook implements the TensorFlow MNIST tutorial, which trains a softmax (multinomial logistic) regression classifier on the MNIST dataset.


In [ ]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import time

Import MNIST Data


In [2]:
mnist = input_data.read_data_sets("../datasets/MNIST/", one_hot=True)


Extracting ../datasets/MNIST/train-images-idx3-ubyte.gz
Extracting ../datasets/MNIST/train-labels-idx1-ubyte.gz
Extracting ../datasets/MNIST/t10k-images-idx3-ubyte.gz
Extracting ../datasets/MNIST/t10k-labels-idx1-ubyte.gz

Look at the sizes of the training, validation and test sets. Each image is 28 x 28 pixels, and the labels are one-hot encoded for use with softmax.


In [3]:
print(mnist.train.num_examples)
print(mnist.validation.num_examples)
print(mnist.test.num_examples)
plt.imshow(mnist.train.images[30].reshape(28, 28), cmap="Greys")
plt.show()
print(mnist.train.labels[30])


55000
5000
10000
[ 0.  0.  0.  0.  0.  0.  0.  0.  1.  0.]

Declare Variables


In [4]:
# Input placeholder: each row is a flattened 28 x 28 image (784 pixels)
x = tf.placeholder(tf.float32, [None, 784])
# Weights and bias of the single softmax layer, initialized to zeros
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

Implement Model


In [5]:
# Softmax model: predicted class probabilities for each input image
y = tf.nn.softmax(tf.matmul(x, W) + b)
# Placeholder for the one-hot ground-truth labels
y_ = tf.placeholder(tf.float32, [None, 10])
# Mean cross-entropy over the batch (reduction_indices is the deprecated name for axis)
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
# Fraction of examples whose predicted class matches the true label
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
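
Computing tf.log(y) on top of an explicit softmax can underflow when a predicted probability reaches zero. Below is a minimal sketch of the more numerically stable alternative the TensorFlow docs recommend, using tf.nn.softmax_cross_entropy_with_logits on the raw logits; it is not used in the runs that follow.


In [ ]:
# Sketch of a numerically stabler loss: keep the raw logits and let
# TensorFlow combine softmax and cross-entropy in one fused op.
logits = tf.matmul(x, W) + b
stable_cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))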

Define a function that runs the model for a given number of batches and returns the training time and the accuracy on the training, validation and test sets.


In [16]:
def train_and_test_model(batches, verbose=False):
    """Train on `batches` mini-batches of 100 images and return
    (train_time, train_accuracy, validation_accuracy, test_accuracy)."""
    start = time.time()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(batches):
            batch_xs, batch_ys = mnist.train.next_batch(100)
            sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
        train_time = time.time() - start
        train_accuracy = sess.run(accuracy, feed_dict={x: mnist.train.images, y_: mnist.train.labels})
        validation_accuracy = sess.run(accuracy, feed_dict={x: mnist.validation.images, y_: mnist.validation.labels})
        test_accuracy = sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
        if verbose:
            print(batches, train_time, train_accuracy, validation_accuracy, test_accuracy)
        return (train_time, train_accuracy, validation_accuracy, test_accuracy)

The tutorial runs 1000 batches of size 100 each, which is about 2 epochs over the 55000 training images (55000 / 100 = 550 batches per epoch). Let's run the model for 1, 2, 5, 10, 20 and 25 epochs to see how accuracy changes with longer training, and record how long each run takes.


In [17]:
one_epoch = train_and_test_model(550,verbose=True)
two_epoch = train_and_test_model(1100,verbose=True)
five_epoch = train_and_test_model(2750,verbose=True)
ten_epoch = train_and_test_model(5500,verbose=True)
twenty_epoch = train_and_test_model(11000,verbose=True)
twentyfive_epoch = train_and_test_model(13750,verbose=True)
print(one_epoch,two_epoch,five_epoch,ten_epoch,twenty_epoch,twentyfive_epoch)


550 0.670133113861084 0.912527 0.9182 0.9185
1100 1.654205083847046 0.918 0.9234 0.9185
2750 3.115687131881714 0.924782 0.9232 0.9246
5500 6.5319788455963135 0.928273 0.9262 0.9219
11000 12.81721806526184 0.927345 0.9236 0.9228
13750 15.79944109916687 0.932582 0.9256 0.9268
(0.670133113861084, 0.91252726, 0.91820002, 0.91850001) (1.654205083847046, 0.91799998, 0.92339998, 0.91850001) (3.115687131881714, 0.9247818, 0.92320001, 0.92460001) (6.5319788455963135, 0.92827272, 0.92619997, 0.92189997) (12.81721806526184, 0.92734545, 0.92360002, 0.9228) (15.79944109916687, 0.93258184, 0.92559999, 0.92680001)
Num Epochs   Train Time (s)   Training Accuracy   Validation Accuracy   Test Accuracy
 1            0.6701          0.9125              0.9182                0.9185
 2            1.6542          0.9180              0.9234                0.9185
 5            3.1157          0.9248              0.9232                0.9246
10            6.5320          0.9282              0.9262                0.9219
20           12.8172          0.9273              0.9236                0.9228
25           15.7994          0.9326              0.9256                0.9268

Training for more than 5 epochs does not improve validation or test accuracy significantly; both plateau around 92-93%. So 5 epochs, about 3.1 seconds of training for roughly 92.5% accuracy, looks like the benchmark to aim for while developing a hand-coded NumPy model.
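
As a starting point for that NumPy version, here is a minimal sketch of one gradient-descent step for softmax regression. The helper names (softmax, sgd_step) are hypothetical; the sketch mirrors the TensorFlow model above under the same zero initialization and 0.5 learning rate, but is not the tutorial's code.


In [ ]:
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sgd_step(W, b, batch_xs, batch_ys, lr=0.5):
    # Forward pass: predicted class probabilities, shape (batch, 10)
    probs = softmax(batch_xs @ W + b)
    # Gradient of the mean cross-entropy w.r.t. the logits is (probs - labels) / batch_size
    grad_logits = (probs - batch_ys) / batch_xs.shape[0]
    # Backpropagate into the parameters and take one gradient-descent step
    W -= lr * (batch_xs.T @ grad_logits)
    b -= lr * grad_logits.sum(axis=0)
    return W, b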

