This notebook corresponds to the simple MNIST linear classifier example, and is meant to be used with the accompanying readme.

First, do some imports...

In [1]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse

from six.moves import xrange
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets

DATA_DIR = "/tmp/MNIST_data"

Then load the data set that we'll use (via a predefined utility).

In [2]:
print("Downloading and reading data sets...")
mnist = read_data_sets(DATA_DIR, one_hot=True)

Downloading and reading data sets...
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
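The one_hot=True argument asks for each label as a 10-element vector with a single 1 at the digit's index, rather than as a bare integer. A minimal pure-Python sketch of that encoding follows; the one_hot helper is hypothetical and written only for illustration, it is not the utility's own code.

```python
def one_hot(label, num_classes=10):
    """Return a list of num_classes floats with 1.0 at index `label`.

    Hypothetical helper for illustration; it mirrors the label layout
    requested by one_hot=True but is not part of TensorFlow.
    """
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


# The digit 3 becomes a vector whose only nonzero entry is at index 3.
encoded = one_hot(3)
print(encoded)
```

This layout is what lets the loss later multiply the labels elementwise against the model's per-class outputs.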

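Next, create the model graph. It includes input 'placeholders': x receives the flattened 28x28 images (784 values each) and y_ receives the one-hot labels (10 classes). W and b are the trainable weights and bias, and y holds the unnormalized class scores (logits).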

In [3]:
# Create the model
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

y_ = tf.placeholder(tf.float32, [None, 10])

Define the loss and optimizer. The raw formulation of cross-entropy,

  tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)), reduction_indices=[1]))

can be numerically unstable, because the softmax can underflow to zero before the log is taken. So here we apply tf.nn.softmax_cross_entropy_with_logits to the raw logits y, and then average across the batch.

In [4]:
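# Keyword arguments keep this call valid in TensorFlow 1.x,
# which rejects positional arguments for this op.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))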
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
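To make the stability point concrete, here is a minimal pure-Python sketch (no TensorFlow) that compares a naive softmax-then-log computation with one based on the log-sum-exp trick. The function names are illustrative only, and the stable version is an assumption about the general idea behind fused ops of this kind, not a copy of TensorFlow's implementation.

```python
import math


def naive_cross_entropy(logits, one_hot_label):
    """Softmax first, then log. Breaks down for large-magnitude logits."""
    exps = [math.exp(z) for z in logits]  # math.exp overflows for big z
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)


def stable_cross_entropy(logits, one_hot_label):
    """Work in log space: log_softmax(z) = z - logsumexp(z)."""
    m = max(logits)  # shifting by the max keeps every exponent <= 0
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(t * lp for t, lp in zip(one_hot_label, log_probs) if t > 0)


# For moderate logits the two versions agree.
moderate = [2.0, 1.0, 0.1]
label = [1.0, 0.0, 0.0]
print(naive_cross_entropy(moderate, label), stable_cross_entropy(moderate, label))

# For extreme logits only the stable version returns a finite loss.
print(stable_cross_entropy([1000.0, 0.0], [0.0, 1.0]))
```

In plain Python the naive version raises an exception on the extreme input; in a floating-point tensor library the same computation would typically surface as inf or NaN in the loss instead.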

Next, create a session and initialize the graph variables. We'll also set the number of steps we'll train with.

In [5]:
NUM_STEPS = 10000
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)  # actually run the initializer so W and b get their starting values

Now we're ready to train the model. Each step feeds one batch of 100 training examples to the optimizer. (Wait for the "training finished" message; it won't take long.)

In [6]:
# Train the model
print("training for %s steps" % NUM_STEPS)
for _ in xrange(NUM_STEPS):
    # Fetch the next mini-batch and run one optimizer step on it.
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
print("training finished.")

training for 10000 steps
training finished.
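For intuition about what each training step does, the following is a rough pure-Python sketch of one gradient-descent update for a linear softmax classifier. It is a simplification under stated assumptions: it handles a single example rather than a batch of 100, it uses tiny dimensions, and softmax and sgd_step are hypothetical helpers, not TensorFlow functions. The learning rate of 0.5 matches the optimizer above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sgd_step(W, b, x, target, learning_rate=0.5):
    """Update W and b in place for one example; return the loss before the update.

    W has shape [num_features][num_classes], b has shape [num_classes].
    """
    num_classes = len(b)
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(num_classes)]
    probs = softmax(logits)
    loss = -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)
    # Gradient of softmax cross-entropy w.r.t. the logits: probs - target.
    delta = [p - t for p, t in zip(probs, target)]
    for i in range(len(x)):
        for j in range(num_classes):
            W[i][j] -= learning_rate * x[i] * delta[j]
    for j in range(num_classes):
        b[j] -= learning_rate * delta[j]
    return loss


# Two features, three classes, zero-initialized like the notebook's W and b.
W = [[0.0] * 3 for _ in range(2)]
b = [0.0] * 3
first_loss = sgd_step(W, b, [1.0, 2.0], [0.0, 1.0, 0.0])
second_loss = sgd_step(W, b, [1.0, 2.0], [0.0, 1.0, 0.0])
print(first_loss, second_loss)  # the loss drops after the first update
```

With all-zero parameters every class is equally likely, so the first loss equals the log of the number of classes; repeating the step on the same example drives it down.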

In [7]:
# Test the trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("accuracy: %s " % sess.run(accuracy,
                                 feed_dict={x: mnist.test.images,
                                            y_: mnist.test.labels}))

accuracy: 0.9242 
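The accuracy computation above compares, for each test example, the index of the largest model output with the index of the 1 in the one-hot label, then averages the matches. A small pure-Python sketch of the same idea, using made-up logits and hypothetical helper names:

```python
def argmax(values):
    """Index of the largest entry (the first one if there is a tie)."""
    return max(range(len(values)), key=lambda i: values[i])


def batch_accuracy(logits_batch, one_hot_labels):
    """Fraction of examples whose predicted class matches the label."""
    correct = [argmax(scores) == argmax(label)
               for scores, label in zip(logits_batch, one_hot_labels)]
    return sum(correct) / float(len(correct))


# Made-up scores for four examples and three classes (illustration only).
logits_batch = [[0.1, 2.0, -1.0],
                [3.0, 0.0, 0.5],
                [0.2, 0.1, 0.9],
                [1.0, 1.5, 0.0]]
one_hot_labels = [[0, 1, 0],
                  [1, 0, 0],
                  [0, 1, 0],
                  [0, 1, 0]]
print(batch_accuracy(logits_batch, one_hot_labels))  # 3 of 4 correct -> 0.75
```

Because argmax only looks at which score is largest, the raw logits can be used directly; applying softmax first would not change the predicted class.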

This accuracy is okay but not great :) ... as we'll see next, we can create a more accurate model using some hidden layers.