Serving a Keras ResNet Model

Keras is a high-level API that can be used to build deep neural nets with only a few lines of code, and it supports a number of backends for computation (TensorFlow, Theano, and CNTK). Keras also contains a library of pre-trained models, including a 50-layer ResNet model trained on the ImageNet dataset, which we will use for this exercise.

This notebook teaches how to create a servable version of the ImageNet ResNet50 model in Keras using the TensorFlow backend. The servable model can be served using TensorFlow Serving, which runs very efficiently in C++ and supports multiple platforms (different OSes, as well as hardware with different types of accelerators such as GPUs). The model will need to handle RPC prediction calls coming from a client that sends requests containing a batch of jpeg images.

See https://github.com/keras-team/keras/blob/master/keras/applications/resnet50.py for the implementation of ResNet50.

Preamble

Import the required libraries.


In [0]:
# Import Keras libraries
import keras.applications.resnet50 as resnet50
from keras.preprocessing import image
from keras import backend as K
import numpy as np
import os

# Import TensorFlow saved model libraries
import tensorflow as tf
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import utils
from tensorflow.python.saved_model import tag_constants, signature_constants
from tensorflow.python.saved_model.signature_def_utils_impl import build_signature_def, predict_signature_def
from tensorflow.contrib.session_bundle import exporter

Constants


In [0]:
_DEFAULT_IMAGE_SIZE = 224  # Default width and height of input images to ResNet model
_LABEL_CLASSES = 1001  # Number of classes that model predicts

Setting the Output Directory

Set a version number and directory for the output of the servable model. Note that with TensorFlow, if you've already saved a servable in a directory, trying to save another servable to the same directory will fail. Always increment the version number, or delete the output directory, before re-running the servable creation code.


In [0]:
VERSION_NUMBER = 1  #Increment this if you want to generate more than one servable model
SERVING_DIR = "keras_servable/" + str(VERSION_NUMBER)
SAMPLE_DIR = "../client"
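For example, here is a minimal sketch of an optional guard (not part of the original notebook flow, and it assumes you are fine discarding any previous export) that clears a stale servable directory so the export step at the end of the notebook can be re-run:

import shutil

# Optional guard: remove an existing export at this path so that
# builder.save() at the end of the notebook will not fail on it.
if os.path.exists(SERVING_DIR):
    shutil.rmtree(SERVING_DIR)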

Build the Servable Model from Keras

Keras has a prepackaged ImageNet-trained ResNet50 model which takes in a 4d input tensor (i.e. a tensor of RGB-color images) and outputs a list of class probabilities for all of the classes.

We will create a servable model that takes in a batch of jpeg-encoded images and outputs a dictionary containing the top k classes and probabilities for each image in the batch. We've refactored the input preprocessing and output postprocessing into helper functions.

Helper Functions for Building a TensorFlow Graph

TensorFlow is essentially a computation graph with variables and state. The graph must be built before it can ingest and process data. Typically, a TensorFlow graph contains a set of input nodes (called placeholders) from which data can be ingested, and a set of TensorFlow functions that take existing nodes as inputs and produce a dependent node that performs a computation on the input nodes. Each node can be referenced as an "output" node through which processed data can be read.

It is often useful to create helper functions for building a TensorFlow graph, for two reasons (a toy sketch follows the list below):

  1. Modularity: you can reuse functions in different places; for instance, a different image model or ResNet architecture can reuse functions.
  2. Testability: you can attach placeholders at the input of graph building helper functions, and read the output to ensure that your result matches expected behavior.
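As a toy illustration of both points (a hypothetical build_scale_graph helper, unrelated to the ResNet graph):

# Hypothetical helper: attaches a multiply node to an existing node and returns the new node.
def build_scale_graph(x, factor=2.0):
    return x * factor

toy_input = tf.placeholder(dtype=tf.float32, shape=[None], name='toy_input')
toy_scaled = build_scale_graph(toy_input)  # A graph node, not a numpy array
print(toy_scaled)

The same helper could be reused on any float tensor, and the placeholder/output pair gives a hook for testing it with real data, as shown in the session example further below.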

Helper function: convert JPEG strings to Normalized 3D Tensors

The client (resnet_client.py) packs jpeg-encoded images into an array of strings (one entry per image) and sends it to the server. These jpegs are already resized to 224x224x3, so they do not need resizing on the server side before entering the ResNet model. However, the ResNet50 model was trained with pixel values normalized using a predefined method in the Keras resnet50 package (resnet50.preprocess_input()). We will need to extract the raw 3D tensor from each jpeg string and normalize its values.
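As a hint (a sketch only, not the exercise cell itself), the standard op for this kind of decoding is tf.image.decode_jpeg, which turns a scalar jpeg string tensor into an HxWx3 uint8 image tensor:

# Sketch: decode a jpeg read from a hypothetical file path into an HxWx3 uint8 tensor.
sketch_decoded = tf.image.decode_jpeg(tf.read_file('some_image.jpg'), channels=3)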

Exercise: Add a command in the helper function to build a node that decodes a jpeg string into a 3D RGB image tensor.


In [0]:
# Preprocessing helper function similar to `resnet_training_to_serving_solution.ipynb`.

def build_jpeg_to_image_graph(jpeg_image):
  """Build graph elements to preprocess an image by subtracting out the mean from all channels.
  Args:
    jpeg_image: A jpeg-formatted byte stream represented as a string.
  Returns:
    A 3d tensor of image pixels normalized for the Keras ResNet50 model.
      The canned ResNet50 pretrained model was trained after running
      keras.applications.resnet50.preprocess_input in 'caffe' mode, which
      flips the RGB channels and subtracts out the channel means [103.939, 116.779, 123.68].
      There is no normalizing on the range.
  """
  image = ???
  image = tf.to_float(image)
  image = resnet50.preprocess_input(image)  
  return image

Unit test the helper function

Exercise: We are going to construct an input placeholder node in our TensorFlow graph to read data into TensorFlow, and use the helper function to attach computational elements to the input node, resulting in an output node where data is collected. Next, we will run the graph by providing sample input into the placeholder (input data can be Python floats, ints, strings, numpy arrays, etc.) and returning the value at the output node.

A placeholder can store a Tensor of arbitrary dimension, and arbitrary length in any dimension.

An example of a placeholder that holds a 1d tensor of floating values is:

x = tf.placeholder(dtype=tf.float32, shape=[10], name='my_input_node')

An example of a 2d tensor (matrix) of dimensions 10x20 holding string values is:

x = tf.placeholder(dtype=tf.string, shape=[10, 20], name='my_string_matrix')

Note that we assigned the Python variable x to reference the placeholder, but simply calling tf.placeholder() with a name would still create an element in the TensorFlow graph that can be looked up later by the name 'my_input_node'. However, it helps to keep a Python variable so you can track the element and pass it into helper functions.

Any dependent node in the graph can serve as an output node. For instance, passing an input node x through y = build_jpeg_to_image_graph(x) returns a node, referenced by the Python variable y, which is the result of processing the input through the graph built by the helper function. When we run the test graph with real data below, you will see how to return the output of y.

Remember: TensorFlow helper functions are used to help construct a computational graph! build_jpeg_to_image_graph() does not return a 3D array. It returns a graph node that returns a 3D array after processing a jpeg-encoded string!

Useful References:

TensorFlow shapes, TensorFlow data types


In [0]:
# Defining input test graph nodes: only needs to be run once!
test_jpeg_ph = ???  # A placeholder for a single string, which is a dimensionless (0D) tensor.
test_decoded_tensor = ??? # Output node, which returns output of the jpeg to image graph.

# Print the graph elements to check shapes. ? indicates that TensorFlow does not know the length of those dimensions.
print(test_jpeg_ph)
print(test_decoded_tensor)

Run the Test Graph

To run data through a test graph, a TensorFlow session must be created. TensorFlow will only run the portion of the graph required to compute the requested output node (or list of output nodes), given a dictionary of inputs with entries of the form {placeholder_node: python data}, e.g.:

with tf.Session() as sess:
    sess.run(output_node,
             {placeholder_1: input_data_1, placeholder_2: input_data_2, ...})

or

with tf.Session() as sess:
    sess.run([output_node_1, output_node_2, ...],
             {placeholder_1: input_data_1, placeholder_2: input_data_2, ...})
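For instance, reusing the hypothetical toy_input and toy_scaled nodes from the earlier sketch:

with tf.Session() as sess:
    print(sess.run(toy_scaled, {toy_input: [1.0, 2.0, 3.0]}))  # prints [2. 4. 6.]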

Exercise: Let's read a jpeg image as an encoded string, pass it into the input placeholder, and return a 3D tensor result which is the normalized image. The expected image size is 224x224x3. Please add more potentially useful assert statements to test the output.

Hint: The ??? below are just examples of possible assertions you can make about the range of output values in the preprocessed image. The more robust the assertions, the better you can validate your code! See the docstring of build_jpeg_to_image_graph() above for a description of how resnet50.preprocess_input() transforms images, and use it to fill in the ??? below.


In [0]:
# Run the graph! Validate the result of the function using a sample image SAMPLE_DIR/cat_sample.jpg
ERROR_TOLERANCE = 1e-4

with open(os.path.join(SAMPLE_DIR, "cat_sample.jpg"), "rb") as imageFile:
    jpeg_str = imageFile.read()
    with tf.Session() as sess:
        result = sess.run(test_decoded_tensor, feed_dict={test_jpeg_ph: jpeg_str})
        assert result.shape == (224, 224, 3)
        # TODO: Replace with assert statements to check max and min normalized pixel values
        assert result.max() <= ??? + ERROR_TOLERANCE  # Max pixel value after subtracting mean
        assert result.min() >= ??? - ERROR_TOLERANCE  # Min pixel value after subtracting mean
        print('Hooray! JPEG decoding test passed!')

Remarks

The approach above uses vanilla TensorFlow to perform unit testing. You may notice that the code is more verbose than ideal, since you have to create a session, feed input through a dictionary, etc. We encourage you to investigate the options below at a later time:

TensorFlow Eager was introduced in TensorFlow 1.5 as a way to execute TensorFlow graphs in a way similar to numpy operations. After testing individual parts of the graph using Eager, you will need to rebuild a graph with the Eager option turned off in order to build a performance optimized TensorFlow graph. Also, keep in mind that you will need another virtual environment with TensorFlow 1.5 in order to run eager execution, which may not be compatible with TensorFlow Serving 1.4 used in this tutorial.
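A rough sketch of what such a check could look like with eager execution (assuming a TensorFlow release where tf.enable_eager_execution() is available at the top level, which is not the environment used for the rest of this notebook):

# Sketch only: run in a separate environment, and call before building any graph ops.
import tensorflow as tf
tf.enable_eager_execution()

x = tf.constant([1.0, 2.0, 3.0])
print((x * 2).numpy())  # Tensors evaluate immediately: prints [2. 4. 6.]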

TensorFlow unit testing is a more software-engineering-oriented approach to running tests. You write test classes that can be invoked individually when building a project; calling tf.test.main() runs all tests and reports which ones succeeded and failed, allowing you to inspect errors. Because we are in a notebook environment, such a test would not succeed due to an already running kernel that tf.test cannot access. The tests must be run from the command line, e.g. python test_my_graph.py.
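A minimal sketch of such a test file (a hypothetical test_my_graph.py testing a toy graph, not the actual notebook helpers):

import tensorflow as tf

class ToyGraphTest(tf.test.TestCase):

  def test_scale(self):
    # Build a tiny graph and check its output inside a managed test session.
    x = tf.placeholder(dtype=tf.float32, shape=[None])
    y = x * 2.0
    with self.test_session() as sess:
      self.assertAllClose(sess.run(y, {x: [1.0, 2.0]}), [2.0, 4.0])

if __name__ == '__main__':
  tf.test.main()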

We've provided both eager execution and unit test examples in the testing directory showing how to unit test various components in this notebook. Because these examples contain solutions to the exercises below, please complete all notebook exercises before reading through them.

Now that we know how to run TensorFlow tests, let's create and test more helper functions!

Helper Function: Preprocessing Server Input

The server receives a client request in the form of a dictionary {'images': 1D_tensor_of_jpeg_encoded_strings}, which must be preprocessed into a 4D tensor before feeding into the Keras ResNet50 model.

Exercise: We will be using a ResNet client to send requests to our server. You will need to create a graph that preprocesses client requests into the form the model expects. Using tf.map_fn and build_jpeg_to_image_graph, fill in the missing line (marked ???) to convert the client request into an array of 3D floating-point, preprocessed tensors. The following lines stack and reshape this array into a 4D tensor.
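As a generic illustration of tf.map_fn (a toy example; note that the dtype argument is required whenever the mapped function returns a different dtype than its input, as is the case when mapping strings to floats):

# Toy example: map a string-to-float conversion over a 1D tensor of strings.
toy_strings = tf.constant(['1.5', '2.5', '3.5'])
toy_floats = tf.map_fn(tf.string_to_number, toy_strings, dtype=tf.float32)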


In [0]:
def preprocess_input(jpeg_tensor):
    processed_images = ???  # Convert a 1D-tensor of JPEGs to a list of 3D tensors
    processed_images = tf.stack(processed_images)  # Convert list of 3D tensors to tensor of tensors (4D tensor)
    processed_images = tf.reshape(tensor=processed_images,  # Reshape to ensure TF graph knows the final dimensions
                                shape=[-1, _DEFAULT_IMAGE_SIZE, _DEFAULT_IMAGE_SIZE, 3])
    return processed_images

Unit Test the Input Preprocessing Helper Function

Exercise: Construct a TensorFlow unit test graph for the input function.

Hint: the input node test_jpeg_tensor should be a tf.placeholder. You need to define the shape parameter in tf.placeholder. None inside an array indicates that the length can vary along that dimension.


In [0]:
# Build a Test Input Preprocessing Network: only needs to be run once!
test_jpeg_tensor = ???  # A placeholder for a 1D tensor (of arbitrary length) of jpeg-encoded strings.
test_processed_images = ???  # Output node, which returns a 4D tensor after processing.

# Print the graph elements to check shapes. ? indicates that TensorFlow does not know the length of those dimensions.
print(test_jpeg_tensor)
print(test_processed_images)

In [0]:
# Run test network using a sample image SAMPLE_DIR/cat_sample.jpg

with open(os.path.join(SAMPLE_DIR, "cat_sample.jpg"), "rb") as imageFile:
    jpeg_str = imageFile.read()
    with tf.Session() as sess:
        result = sess.run(test_processed_images, feed_dict={test_jpeg_tensor: np.array([jpeg_str, jpeg_str])})  # Duplicate for length 2 array
        assert result.shape == (2, 224, 224, 3)  # 4D tensor with first dimension length 2, since we have 2 images
        # TODO: add a test for min and max normalized pixel values
        assert result.max() <= ??? + ERROR_TOLERANCE  # Max pixel value after subtracting mean
        assert result.min() >= ??? - ERROR_TOLERANCE  # Min pixel value after subtracting mean
        # TODO: add a test to verify that the resulting tensor for image 0 and image 1 are identical.
        assert np.array_equal(result[0], result[1])
        print('Hooray! Input unit test succeeded!')

Helper Function: Postprocess Server Output

Exercise: The Keras model returns a 1D tensor of probabilities for each class. We want to write a postprocess_output() that returns only the top k classes and probabilities.
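As a generic hint, tf.nn.top_k returns a (values, indices) pair, e.g.:

# Toy example: top-2 entries of a small probability vector.
toy_probs = tf.constant([0.1, 0.7, 0.2])
toy_top_probs, toy_top_classes = tf.nn.top_k(toy_probs, k=2)  # values [0.7, 0.2], indices [1, 2]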


In [0]:
TOP_K = 5

def postprocess_output(model_output):
    '''Return top k classes and probabilities.'''
    top_k_probs, top_k_classes = ???
    return {'classes': top_k_classes, 'probabilities': top_k_probs}

Unit Test the Output Postprocessing Helper Function

Exercise: Fill in the shape field for the model output, which should be a 1D tensor of probabilities.

Hint: how many image classes are there?


In [0]:
# Build Test Output Postprocessing Network: only needs to be run once!
test_model_output = tf.placeholder(dtype=tf.float32, shape=???, name='test_logits_tensor')
test_prediction_output = postprocess_output(test_model_output)

# Print the graph elements to check shapes.
print(test_model_output)
print(test_prediction_output)

In [0]:
# Import numpy testing framework for float comparisons
import numpy.testing as npt

# Run test network
# Input a tensor with clear winners, and perform checks

# Be very specific about what is expected from your mock model.
model_probs = np.ones(???)  # TODO: use the same dimensions as your test_model_output placeholder.
model_probs[2] = 2.5  # TODO: you can create your own tests as well
model_probs[5] = 3.5
model_probs[10] = 4
model_probs[49] = 3
model_probs[998] = 2
TOTAL_WEIGHT = np.sum(model_probs)
model_probs = model_probs / TOTAL_WEIGHT

with tf.Session() as sess:
    result = sess.run(test_prediction_output, {test_model_output: model_probs})
    classes = result['classes']
    probs = result['probabilities']
    # Check values
    assert len(probs) == 5
    npt.assert_almost_equal(probs[0], model_probs[10])
    npt.assert_almost_equal(probs[1], model_probs[5])
    npt.assert_almost_equal(probs[2], model_probs[49])
    npt.assert_almost_equal(probs[3], model_probs[2])
    npt.assert_almost_equal(probs[4], model_probs[998])
    assert len(classes) == 5
    assert classes[0] == 10
    assert classes[1] == 5
    assert classes[2] == 49
    assert classes[3] == 2
    assert classes[4] == 998
    print('Hooray! Output unit test succeeded!')

Load the Keras Model and Build the Graph

The Keras model uses TensorFlow as its backend, and therefore its inputs and outputs can be treated as elements of a TensorFlow graph. In other words, you can provide an input that is a TensorFlow tensor and read the model output like a TensorFlow tensor!
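As a small illustration (a toy Dense layer, not the ResNet50 model), a TensorFlow placeholder can be fed directly into a Keras layer, and the result is again a TensorFlow tensor:

from keras.layers import Dense

toy_features = tf.placeholder(dtype=tf.float32, shape=[None, 4], name='toy_features')
toy_logits = Dense(3)(toy_features)  # A TensorFlow tensor produced by a Keras layer
print(toy_logits)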

Exercise: Build the end to end network by filling in the TODOs below.


In [0]:
# TODO: Create a placeholder for your arbitrary-length 1D Tensor of JPEG strings
images = tf.placeholder(???)

# TODO: Call preprocess_input to return processed_images
processed_images = ???

# Load (and download if missing) the ResNet50 Keras Model (may take a while to run)
# TODO: Use processed_images as input
model = resnet50.ResNet50(???)
# Rename the model to 'resnet' for serving
model.name = 'resnet'

# TODO: Call postprocess_output on the output of the model to create predictions to send back to the client
predictions = ???

Creating the Input-Output Signature

Exercise: The final step in creating a servable model is to define the end-to-end input and output API. Edit the inputs and outputs parameters to predict_signature_def below to ensure that the signature correctly handles client requests. The inputs parameter should be a dictionary {'images': tensor_of_strings}, and the outputs parameter should be a dictionary {'classes': tensor_of_top_k_classes, 'probabilities': tensor_of_top_k_probs}.
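For reference, the general form of a call to predict_signature_def (with hypothetical tensors example_in and example_out) is:

# The dictionary keys become the names the client uses when building its
# request and when reading the response.
example_signature = predict_signature_def(inputs={'some_input_name': example_in},
                                          outputs={'some_output_name': example_out})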


In [0]:
# Create a SavedModel builder that will write the servable to SERVING_DIR
builder = saved_model_builder.SavedModelBuilder(SERVING_DIR)

# TODO: set the inputs and outputs parameters in predict_signature_def()
signature = predict_signature_def(inputs=???,
                                  outputs=???)

Export the Servable Model


In [0]:
with K.get_session() as sess:
    builder.add_meta_graph_and_variables(sess=sess,
                                         tags=[tag_constants.SERVING],
                                         signature_def_map={'predict': signature})
    builder.save()