Handwritten Number Recognition with TFLearn and MNIST

In this notebook, we'll be building a neural network that recognizes handwritten numbers 0-9.

This kind of neural network is used in a variety of real-world applications, including recognizing phone numbers and sorting postal mail by address. To build the network, we'll be using the MNIST data set, which consists of images of handwritten digits along with their correct labels, 0-9.

We'll be using TFLearn, a high-level library built on top of TensorFlow, to build the neural network. We'll start off by importing all the modules we'll need, then load the data, and finally build the network.


In [1]:
# Import Numpy, TensorFlow, TFLearn, and MNIST data
import numpy as np
import tensorflow as tf
import tflearn
import tflearn.datasets.mnist as mnist

Retrieving training and test data

The MNIST data set already contains both training and test data. There are 55,000 data points of training data, and 10,000 points of test data.

Each MNIST data point has:

  1. an image of a handwritten digit and
  2. a corresponding label (a number 0-9 that identifies the image)

We'll call the images, which will be the input to our neural network, X and their corresponding labels Y.

We're going to want our labels as one-hot vectors, which are vectors that hold all 0's except for a single 1. It's easiest to see this in an example. As a one-hot vector, the number 0 is represented as [1, 0, 0, 0, 0, 0, 0, 0, 0, 0], and 4 is represented as [0, 0, 0, 0, 1, 0, 0, 0, 0, 0].
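
One-hot encoding can be sketched in plain NumPy (the `one_hot` helper below is just for illustration; `mnist.load_data(one_hot=True)` does this for us):

```python
import numpy as np

def one_hot(label, num_classes=10):
    """Return a vector of all 0's with a single 1 at position `label`."""
    vec = np.zeros(num_classes, dtype=int)
    vec[label] = 1
    return vec

print(one_hot(0).tolist())  # [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(one_hot(4).tolist())  # [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```

Going the other way, `argmax` recovers the digit from a one-hot vector, which is exactly what we'll do later when reading off predictions.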

Flattened data

For this example, we'll be using flattened data, a representation of MNIST images in one dimension rather than two. So, each handwritten number image, which is 28x28 pixels, will be represented as a one-dimensional array of 784 pixel values.

Flattening the data throws away information about the 2D structure of the image, but it simplifies our data so that all of the training data can be contained in one array whose shape is [55000, 784]; the first dimension is the number of training images and the second dimension is the number of pixels in each image. This is the kind of data that is easy to analyze using a simple neural network.
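
A quick sketch of what flattening does, using a dummy array in place of a real MNIST image:

```python
import numpy as np

# A dummy 28x28 "image" stands in for one MNIST digit
image_2d = np.arange(28 * 28).reshape(28, 28)

# Flatten the 28x28 grid into a 784-element vector, as the network expects
flat = image_2d.reshape(-1)
print(flat.shape)  # (784,)

# The pixel values survive; only the 2D arrangement is discarded,
# so reshaping recovers the original image exactly
restored = flat.reshape(28, 28)
print(np.array_equal(restored, image_2d))  # True
```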


In [2]:
# Retrieve the training and test data
trainX, trainY, testX, testY = mnist.load_data(one_hot=True)


Extracting mnist/train-images-idx3-ubyte.gz
Extracting mnist/train-labels-idx1-ubyte.gz
Extracting mnist/t10k-images-idx3-ubyte.gz
Extracting mnist/t10k-labels-idx1-ubyte.gz

Visualize the training data

Provided below is a function that will help you visualize the MNIST data. By passing in the index of a training example, the function show_digit will display that training image along with its corresponding label in the title.


In [3]:
# Visualizing the data
import matplotlib.pyplot as plt
%matplotlib inline

# Function for displaying a training image by its index in the MNIST set
def show_digit(index):
    # Convert the one-hot label back to a digit
    label = trainY[index].argmax(axis=0)
    # Reshape 784 array into 28x28 image
    image = trainX[index].reshape([28,28])
    plt.title('Training data, index: %d,  Label: %d' % (index, label))
    plt.imshow(image, cmap='gray_r')
    plt.show()
    
# Display the first (index 0) training image
show_digit(0)


Building the network

TFLearn lets you build the network by defining the layers in that network.

For this example, you'll define:

  1. The input layer, which tells the network the number of inputs it should expect for each piece of MNIST data,
  2. Hidden layers, which recognize patterns in the data and connect the input to the output layer, and
  3. The output layer, which defines how the network learns and outputs a label for a given image.

Let's start with the input layer; to define the input layer, you'll define the type of data that the network expects. For example,

net = tflearn.input_data([None, 100])

would create a network with 100 inputs. The number of inputs to your network needs to match the size of your data. For this example, we're using 784-element vectors to encode our input data, so we need 784 input units.

Adding layers

To add new hidden layers, you use

net = tflearn.fully_connected(net, n_units, activation='ReLU')

This adds a fully connected layer where every unit (or node) in the previous layer is connected to every unit in this layer. The first argument net is the network you created in the tflearn.input_data call; it designates the input to the hidden layer. You can set the number of units in the layer with n_units, and set the activation function with the activation keyword. You can keep adding layers to your network by repeatedly calling tflearn.fully_connected(net, n_units).

Then, to set how you train the network, use:

net = tflearn.regression(net, optimizer='sgd', learning_rate=0.1, loss='categorical_crossentropy')

Again, this is passing in the network you've been building. The keywords:

  • optimizer sets the training method, here stochastic gradient descent
  • learning_rate is the learning rate
  • loss determines how the network error is calculated; in this example, categorical cross-entropy.
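
To make the loss concrete, here's a small NumPy sketch of categorical cross-entropy (illustrative only; TFLearn computes this internally when you pass loss='categorical_crossentropy'):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred))

label = np.array([0, 0, 0, 0, 1, 0, 0, 0, 0, 0])  # one-hot for digit 4

confident = np.full(10, 0.01)
confident[4] = 0.91          # network is confident and correct
uncertain = np.full(10, 0.1)  # network spreads probability evenly

# The loss is lower when probability mass sits on the correct digit
print(categorical_crossentropy(label, confident) <
      categorical_crossentropy(label, uncertain))  # True
```

Training with stochastic gradient descent nudges the weights in whichever direction shrinks this loss.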

Finally, you put all this together to create the model with tflearn.DNN(net).

Exercise: Below in the build_model() function, you'll put together the network using TFLearn. You get to choose how many layers to use, how many hidden units, etc.

Hint: The final output layer must have 10 output nodes (one for each digit 0-9). It's also recommended to use a softmax activation for your final output layer.
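
A quick aside on why softmax fits the output layer: it turns the 10 raw output scores into a probability distribution, so the largest score becomes the most probable digit. A minimal NumPy sketch (illustrative only; TFLearn applies this for you via activation='softmax'):

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability; the result is unchanged
    exp = np.exp(scores - np.max(scores))
    return exp / exp.sum()

# Ten raw scores, one per digit 0-9; index 2 has the largest score
scores = np.array([1.0, 2.0, 5.0, 0.5, 1.5, 0.2, 0.1, 3.0, 1.2, 0.8])
probs = softmax(scores)

print(round(probs.sum(), 6))  # 1.0 -- probabilities sum to one
print(probs.argmax())         # 2 -- the predicted digit
```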


In [4]:
# Define the neural network
def build_model():
    # This resets all parameters and variables, leave this here
    tf.reset_default_graph()
    
    #### Your code ####
    net = tflearn.input_data([None, 28*28])
    net = tflearn.fully_connected(net, 128, activation='relu')   # Hidden
    net = tflearn.fully_connected(net, 32, activation='relu')   # Hidden
    net = tflearn.fully_connected(net, 10, activation='softmax')   # Output
    net = tflearn.regression(net, optimizer='sgd',
                             learning_rate=0.1,
                             loss='categorical_crossentropy')
    
    # This model assumes that your network is named "net"    
    model = tflearn.DNN(net)
    return model

In [5]:
# Build the model
model = build_model()

Training the network

Now that we've constructed the network, saved as the variable model, we can fit it to the data. Here we use the model.fit method. You pass in the training features trainX and the training targets trainY. Below I set validation_set=0.1 which reserves 10% of the data set as the validation set. You can also set the batch size and number of epochs with the batch_size and n_epoch keywords, respectively.
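
With validation_set=0.1 and 55,000 training points, the split works out to:

```python
total = 55000
val_fraction = 0.1

validation = int(total * val_fraction)  # 5500 samples held out for validation
training = total - validation           # 49500 samples used for fitting

print(training, validation)  # 49500 5500
```

These are exactly the "Training samples" and "Validation samples" counts that model.fit reports below.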

Too few epochs don't effectively train your network, and too many take a long time to execute. Choose wisely!


In [11]:
# Training
# Note: pinning the whole graph to the GPU with tf.device('/gpu:0') raises an
# InvalidArgumentError here, because summary ops such as MergeSummary have no
# GPU kernel. Let TensorFlow place operations on devices automatically instead.
model.fit(trainX, trainY, validation_set=0.1, show_metric=True, batch_size=100, n_epoch=10)


---------------------------------
Run id: RLUHT9
Log directory: /tmp/tflearn_logs/
---------------------------------
Training samples: 49500
Validation samples: 5500
--

In [9]:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()


Out[9]:
[name: "/cpu:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 12033015852942285969, name: "/gpu:0"
 device_type: "GPU"
 memory_limit: 246349824
 locality {
   bus_id: 1
 }
 incarnation: 3420385210092450779
 physical_device_desc: "device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:23:00.0"]

Testing

After you're satisfied with the training output and accuracy, you can then run the network on the test data set to measure its performance! Remember, only do this after you've finished training and are satisfied with the results.

A good result will be higher than 95% accuracy. Some simple models have been known to get up to 99.7% accuracy!


In [23]:
# Compare the labels that our model predicts with the actual labels

# Find the indices of the most confident prediction for each item. That tells us the predicted digit for that sample.
predictions = np.array(model.predict(testX)).argmax(axis=1)

# Calculate the accuracy, which is the percentage of times the predicted labels matched the actual labels
actual = testY.argmax(axis=1)
test_accuracy = np.mean(predictions == actual, axis=0)

# Print out the result
print("Test accuracy: ", test_accuracy)


Test accuracy:  0.9721

In [ ]: