This notebook implements the TensorFlow tutorial that uses softmax (multinomial logistic) regression on the MNIST dataset.
In [ ]:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import matplotlib.pyplot as plt
import time
Import MNIST Data
In [2]:
mnist = input_data.read_data_sets("../datasets/MNIST/", one_hot=True)
Look at the sizes of the training, validation, and test sets. Each image is 28 × 28 pixels. Labels are one-hot encoded for use with softmax.
In [3]:
print(mnist.train.num_examples)
print(mnist.validation.num_examples)
print(mnist.test.num_examples)
plt.imshow(mnist.train.images[30].reshape(28,28),cmap="Greys")
plt.show()
print (mnist.train.labels[30])
Declare Variables
In [4]:
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
Implement Model
In [5]:
y = tf.nn.softmax(tf.matmul(x, W) + b)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
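The loss defined above is the mean over the batch of `-sum(y_ * log(y))`, the cross-entropy between the one-hot labels and the predicted distribution. Computing `log(softmax(...))` explicitly can be numerically unstable when a predicted probability underflows to zero; as a hedged alternative (not part of the original tutorial, assuming TensorFlow 1.x as used throughout this notebook), the same loss can be computed directly from the logits:
In [ ]:
# Hedged alternative (not from the original tutorial, assumes TensorFlow 1.x):
# compute the cross-entropy directly from the logits with
# tf.nn.softmax_cross_entropy_with_logits, which avoids taking the log of an
# explicit softmax and is more numerically stable.
logits = tf.matmul(x, W) + b
stable_cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))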
Define function that runs the model for given number of batches and returns the training time and accuracy on the training, validation and test data sets
In [16]:
def train_and_test_model(batches, verbose=False):
    start = time.time()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(batches):
            batch_xs, batch_ys = mnist.train.next_batch(100)
            sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
        train_time = time.time() - start
        train_accuracy = sess.run(accuracy, feed_dict={x: mnist.train.images, y_: mnist.train.labels})
        validation_accuracy = sess.run(accuracy, feed_dict={x: mnist.validation.images, y_: mnist.validation.labels})
        test_accuracy = sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
    if verbose:
        print(batches, train_time, train_accuracy, validation_accuracy, test_accuracy)
    return (train_time, train_accuracy, validation_accuracy, test_accuracy)
The tutorial runs 1,000 batches of size 100, which is about 2 epochs over the 55,000 training images (55,000 / 100 = 550 batches per epoch). Let's run the model for 1, 2, 5, 10, 20, and 25 epochs to see how accuracy changes with longer training, and record how long each run takes to train.
In [17]:
one_epoch = train_and_test_model(550,verbose=True)
two_epoch = train_and_test_model(1100,verbose=True)
five_epoch = train_and_test_model(2750,verbose=True)
ten_epoch = train_and_test_model(5500,verbose=True)
twenty_epoch = train_and_test_model(11000,verbose=True)
twentyfive_epoch = train_and_test_model(13750,verbose=True)
print(one_epoch,two_epoch,five_epoch,ten_epoch,twenty_epoch,twentyfive_epoch)
Num Epochs | Train Time (s) | Training Accuracy | Validation Accuracy | Test Accuracy |
---|---|---|---|---|
1 | 0.6701 | 0.9125 | 0.9182 | 0.9185 |
2 | 1.6542 | 0.9180 | 0.9234 | 0.9185 |
5 | 3.1157 | 0.9248 | 0.9232 | 0.9246 |
10 | 6.5320 | 0.9282 | 0.9262 | 0.9219 |
20 | 12.8172 | 0.9273 | 0.9236 | 0.9228 |
25 | 15.7994 | 0.9326 | 0.9256 | 0.9268 |
Training the model for more than 5 epochs does not improve training accuracy significantly, and validation/test accuracy has plateaued. So roughly 3.2 seconds of training for 5 epochs and about 92.5% accuracy is the benchmark to aim for when developing a hand-coded model in NumPy.
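As a starting point for that hand-coded version, here is a minimal NumPy sketch of one SGD step for the same softmax regression model (the names `softmax` and `sgd_step` are illustrative, not taken from this notebook):
In [ ]:
import numpy as np

def softmax(z):
    # Subtract the row-wise max before exponentiating for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sgd_step(X, Y, W, b, lr=0.5):
    # X: (batch, 784) flattened images, Y: (batch, 10) one-hot labels.
    probs = softmax(X @ W + b)
    # Average cross-entropy loss, matching the TensorFlow definition above.
    loss = -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))
    # Gradient of the loss w.r.t. the logits is (probs - Y) / batch_size.
    grad_logits = (probs - Y) / X.shape[0]
    W -= lr * (X.T @ grad_logits)
    b -= lr * grad_logits.sum(axis=0)
    return loss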
In [ ]: