Copyright 2016 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


This notebook corresponds to the simple MNIST linear classifier example, and is meant to be used alongside the accompanying readme.

First, do some imports...


In [1]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from six.moves import xrange
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets

DATA_DIR = "/tmp/MNIST_data"

Then load the data set that we'll use (via a predefined utility).


In [2]:
print("Downloading and reading data sets...")
mnist = read_data_sets(DATA_DIR, one_hot=True)


Downloading and reading data sets...
Extracting /tmp/MNIST_data/train-images-idx3-ubyte.gz
Extracting /tmp/MNIST_data/train-labels-idx1-ubyte.gz
Extracting /tmp/MNIST_data/t10k-images-idx3-ubyte.gz
Extracting /tmp/MNIST_data/t10k-labels-idx1-ubyte.gz
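
If you want to sanity-check what was loaded, a quick optional inspection (not part of the original notebook) shows the array shapes. With one_hot=True, the labels are 10-wide indicator vectors:

  # Optional peek at the loaded arrays (illustrative only).
  print(mnist.train.images.shape)   # (55000, 784): 28x28 images, flattened
  print(mnist.train.labels.shape)   # (55000, 10): one-hot labels
  print(mnist.test.images.shape)    # (10000, 784)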

Next, create the model graph. It includes input 'placeholders': x will hold batches of flattened 28x28-pixel images (784 values each), and y_ will hold the corresponding one-hot labels. The model output y is the vector of raw (unnormalized) class scores, or logits.


In [3]:
# Create the model
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

y_ = tf.placeholder(tf.float32, [None, 10])
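
As an optional sanity check (not in the original notebook), you can print the static shapes of the tensors just defined; the leading dimension shows up as ? because the batch size is left flexible:

  print(x.get_shape())   # (?, 784)
  print(W.get_shape())   # (784, 10)
  print(y.get_shape())   # (?, 10)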

Define loss and optimizer. The raw formulation of cross-entropy,

  tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)),
                                reduction_indices=[1]))

can be numerically unstable. So here we use tf.nn.softmax_cross_entropy_with_logits on the unnormalized logits y, and then average across the batch.


In [4]:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
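
To see why the fused op is preferred, here is a small illustrative sketch (the constants are made up for this demo and are not part of the original example): with an extremely confident logit, the naive formula underflows the softmax to 0, so log(0) yields inf, while the fused op returns the correct value.

  big_logits = tf.constant([[1000.0, 0.0]])   # one extremely confident logit
  one_hot    = tf.constant([[0.0, 1.0]])      # true class is the other one
  naive  = -tf.reduce_sum(one_hot * tf.log(tf.nn.softmax(big_logits)),
                          reduction_indices=[1])
  stable = tf.nn.softmax_cross_entropy_with_logits(logits=big_logits,
                                                   labels=one_hot)
  with tf.Session() as demo_sess:
      print(demo_sess.run(naive))   # [inf]  -- log of an underflowed zero
      print(demo_sess.run(stable))  # [1000.] -- the correct cross-entropy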

Next, create a session and initialize the graph variables. We'll also set the number of training steps.


In [5]:
NUM_STEPS = 10000
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
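
Aside (not in the original notebook): in later TensorFlow 1.x releases, initialize_all_variables() is deprecated; if you see a deprecation warning, the drop-in replacement is:

  init = tf.global_variables_initializer()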

Now we're ready to train the model. (Wait for the "training finished" message; it won't take long.)


In [6]:
# Train the model
print("training for %s steps" % NUM_STEPS)
for _ in xrange(NUM_STEPS):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
print("training finished.")


training for 10000 steps
training finished.
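
If you'd rather watch progress than just wait for the final message, a small variation on the loop above (not part of the original example) also fetches the loss and prints it occasionally:

  for step in xrange(NUM_STEPS):
      batch_xs, batch_ys = mnist.train.next_batch(100)
      _, loss = sess.run([train_step, cross_entropy],
                         feed_dict={x: batch_xs, y_: batch_ys})
      if step % 1000 == 0:
          print("step %d, batch loss %.4f" % (step, loss))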

In [7]:
# Test the trained model
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print("accuracy: %s " % sess.run(accuracy,
                                 feed_dict={x: mnist.test.images,
                                            y_: mnist.test.labels}))


accuracy: 0.9242 
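
To make the accuracy computation concrete: tf.argmax picks the index of the largest entry in each row, tf.equal compares predicted and true class indices, and the mean of the resulting 0/1 values is the fraction of correct predictions. A toy illustration with made-up values (not part of the original example):

  toy_logits = tf.constant([[0.1, 2.0, 0.3],
                            [1.5, 0.2, 0.1]])
  toy_labels = tf.constant([[0.0, 1.0, 0.0],    # predicted class 1: correct
                            [0.0, 0.0, 1.0]])   # predicted class 0: wrong
  toy_correct = tf.equal(tf.argmax(toy_logits, 1), tf.argmax(toy_labels, 1))
  print(sess.run(tf.reduce_mean(tf.cast(toy_correct, tf.float32))))  # 0.5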

This accuracy is okay, but not great :) As we'll see next, we can build a more accurate model by adding some hidden layers.