TensorFlow's Deep MNIST tutorial

https://www.tensorflow.org/get_started/mnist/pros

  • start a tf.Session
  • define a model
  • define a training loss function
  • train using TensorFlow

In [1]:
#load mnist data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)


Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

Start a tf.Session

We're eventually going to define a graph that represents a "dataflow computation". Before we start building the graph by creating nodes, we first start a tf.Session. A session is what actually executes graphs, and it also lets us specify resource allocation (more than one CPU/GPU/machine). The session holds the values of variables and of intermediate results during training.


In [3]:
#Start TensorFlow InteractiveSession
import tensorflow as tf
sess = tf.InteractiveSession()
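
As a quick illustration of the graph/session split (a sketch, not part of the tutorial): building nodes does no computation; the session does.

# build nodes in the default graph; nothing is computed yet
node1 = tf.constant(2.0)
node2 = tf.constant(3.0)
total = node1 + node2

# with an InteractiveSession installed as the default, .eval() runs the graph
print(total.eval())  # 5.0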

In [4]:
# create placeholder nodes for the input images and target output

#x is a 2d tensor of floating point numbers. 784 = 28*28 pixels per image
#the first dimension is the batch size; 'None' means it can be any size
x = tf.placeholder(tf.float32, shape=[None, 784])

#y_ is another 2d tensor, where each row is a one-hot 10-dimensional vector
#the shape argument lets TF automatically catch bugs due to inconsistent
# tensor shapes
y_ = tf.placeholder(tf.float32, shape=[None, 10])
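
To see how placeholders get their values at run time, here is a sketch feeding a made-up batch (the zero array is purely illustrative):

import numpy as np

# a fake mini-batch of 2 flattened 28x28 images
fake_batch = np.zeros((2, 784), dtype=np.float32)

# any op that depends on x can be evaluated by feeding a value for it
print(tf.shape(x).eval(feed_dict={x: fake_batch}))  # [  2 784]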

Variables

A Variable is a value that "lives in TensorFlow's computation graph".


In [7]:
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

Before we can use them, we have to initialize them.


In [8]:
sess.run(tf.global_variables_initializer())
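
As a sanity check (not in the original tutorial), we can read the variables back to confirm the initializer ran:

print(sess.run(b))        # ten zeros
print(sess.run(W).shape)  # (784, 10)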

Let's define a regression model.


In [9]:
#x: input images
#W: weight matrix
#b: bias
y = tf.matmul(x,W) + b

Specify a cross-entropy loss function: the cross-entropy between the target and the softmax activation applied to the model's prediction.

Note that tf.nn.softmax_cross_entropy_with_logits internally applies the softmax to the model's unnormalized prediction and sums across all classes, and tf.reduce_mean takes the average of these sums over the batch.
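
Written out with NumPy on made-up logits, the op computes the following (a sketch for intuition only):

import numpy as np

logits = np.array([[2.0, 1.0, 0.1]])  # one example, three classes
labels = np.array([[1.0, 0.0, 0.0]])  # one-hot target

softmax = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -(labels * np.log(softmax)).sum(axis=1)  # sum across classes
print(loss.mean())  # ~0.417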


In [11]:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))

Train the model


In [15]:
%%time
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

for _ in range(1000):
    batch = mnist.train.next_batch(100)
    train_step.run(feed_dict={x: batch[0], y_: batch[1]})


CPU times: user 1.85 s, sys: 324 ms, total: 2.18 s
Wall time: 1.38 s

In [15]:
batch = mnist.train.next_batch(100)
batch[1].shape


Out[15]:
(100, 10)

Evaluate the model


In [20]:
#tf.argmax gives the index of the highest entry in a tensor along some axis
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

#we can take this list of booleans and calculate the fraction correct
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

#print the accuracy
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))


0.9225
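
The same argmax -> equal -> mean pipeline, spelled out with NumPy on made-up predictions (illustration only):

import numpy as np

preds  = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
labels = np.array([[0, 1], [0, 1], [0, 1]])
correct = np.argmax(preds, 1) == np.argmax(labels, 1)  # [True, False, True]
print(correct.astype(np.float32).mean())               # 0.6666667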

Now let's build a multilayer convolutional network


In [21]:
#we'll be creating lots of weights and biases, so define helper functions.
#weights get a little noise for symmetry breaking; since we're using ReLU
#neurons, biases are slightly positive to avoid "dead neurons"
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

Convolution and pooling


In [24]:
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')
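
With these settings, a stride-1 SAME convolution preserves the spatial size and each 2x2 pool halves it, which is the 28 -> 14 -> 7 bookkeeping used below. A quick shape check (the probe tensor is illustrative, not part of the tutorial):

probe = tf.zeros([1, 28, 28, 1])
w_probe = weight_variable([5, 5, 1, 32])
print(conv2d(probe, w_probe).get_shape())                # (1, 28, 28, 32)
print(max_pool_2x2(conv2d(probe, w_probe)).get_shape())  # (1, 14, 14, 32)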

In [25]:
#first convolutional layer

W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

#reshape x to a 4d tensor: [batch, height, width, channels]
x_image = tf.reshape(x, [-1, 28, 28, 1])

#convolve x_image with the weight tensor, add bias, 
#apply the ReLU function, and finally max pool

h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

In [26]:
#second convolutional layer

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
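
After two conv+pool rounds the 28x28 image has been reduced to 7x7 with 64 feature maps, which is where the 7*7*64 in the next layer comes from. A quick check:

print(h_pool1.get_shape())  # (?, 14, 14, 32)
print(h_pool2.get_shape())  # (?, 7, 7, 64)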

In [27]:
# densely connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

In [28]:
# dropout

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
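
Dropout zeroes each activation with probability 1 - keep_prob and scales the survivors by 1/keep_prob, so the expected activations are unchanged. A small sketch (illustration only):

ones = tf.ones([1, 10])
# roughly half the entries become 0, the rest become 2.0
print(tf.nn.dropout(ones, 0.5).eval())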

In [29]:
# readout layer

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

In [30]:
%%time
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
sess.run(tf.global_variables_initializer())
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))


step 0, training accuracy 0.1
step 100, training accuracy 0.86
step 200, training accuracy 0.84
step 300, training accuracy 0.92
step 400, training accuracy 0.96
step 500, training accuracy 0.98
step 600, training accuracy 0.92
step 700, training accuracy 1
step 800, training accuracy 0.94
step 900, training accuracy 0.94
step 1000, training accuracy 1
step 1100, training accuracy 0.9
step 1200, training accuracy 0.94
step 1300, training accuracy 0.98
step 1400, training accuracy 0.98
step 1500, training accuracy 0.96
step 1600, training accuracy 1
step 1700, training accuracy 0.98
step 1800, training accuracy 1
step 1900, training accuracy 0.92
step 2000, training accuracy 0.98
step 2100, training accuracy 0.98
step 2200, training accuracy 0.98
step 2300, training accuracy 0.98
step 2400, training accuracy 0.98
step 2500, training accuracy 0.96
step 2600, training accuracy 0.98
step 2700, training accuracy 0.94
step 2800, training accuracy 1
step 2900, training accuracy 0.98
step 3000, training accuracy 0.98
step 3100, training accuracy 0.98
step 3200, training accuracy 0.98
step 3300, training accuracy 1
step 3400, training accuracy 0.98
step 3500, training accuracy 0.98
step 3600, training accuracy 1
step 3700, training accuracy 1
step 3800, training accuracy 0.98
step 3900, training accuracy 1
step 4000, training accuracy 1
step 4100, training accuracy 0.98
step 4200, training accuracy 0.98
step 4300, training accuracy 1
step 4400, training accuracy 0.98
step 4500, training accuracy 0.98
step 4600, training accuracy 0.98
step 4700, training accuracy 1
step 4800, training accuracy 0.98
step 4900, training accuracy 0.98
step 5000, training accuracy 1
step 5100, training accuracy 1
step 5200, training accuracy 1
step 5300, training accuracy 0.98
step 5400, training accuracy 0.98
step 5500, training accuracy 1
step 5600, training accuracy 1
step 5700, training accuracy 1
step 5800, training accuracy 0.98
step 5900, training accuracy 1
step 6000, training accuracy 1
step 6100, training accuracy 1
step 6200, training accuracy 1
step 6300, training accuracy 1
step 6400, training accuracy 1
step 6500, training accuracy 1
step 6600, training accuracy 1
step 6700, training accuracy 1
step 6800, training accuracy 1
step 6900, training accuracy 1
step 7000, training accuracy 1
step 7100, training accuracy 1
step 7200, training accuracy 1
step 7300, training accuracy 1
step 7400, training accuracy 1
step 7500, training accuracy 0.98
step 7600, training accuracy 1
step 7700, training accuracy 1
step 7800, training accuracy 1
step 7900, training accuracy 1
step 8000, training accuracy 0.98
step 8100, training accuracy 1
step 8200, training accuracy 0.98
step 8300, training accuracy 0.98
step 8400, training accuracy 1
step 8500, training accuracy 1
step 8600, training accuracy 0.98
step 8700, training accuracy 1
step 8800, training accuracy 1
step 8900, training accuracy 1
step 9000, training accuracy 1
step 9100, training accuracy 1
step 9200, training accuracy 1
step 9300, training accuracy 1
step 9400, training accuracy 1
step 9500, training accuracy 1
step 9600, training accuracy 1
step 9700, training accuracy 1
step 9800, training accuracy 1
KeyboardInterrupt: training was interrupted by hand after step 9800, so the final test-accuracy print never ran. (The tutorial itself reports roughly 99.2% test accuracy after the full 20,000 steps.)
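
Since long runs like this can get interrupted, one option is to checkpoint periodically with tf.train.Saver (a hedged sketch, not part of the original notebook; the checkpoint path is hypothetical):

saver = tf.train.Saver()
save_path = saver.save(sess, './mnist_cnn.ckpt')  # hypothetical path
print('Model saved to %s' % save_path)

# later, in the same graph:
# saver.restore(sess, './mnist_cnn.ckpt')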
