Keras is a high-level API for building deep neural networks in only a few lines of code, and it supports several computation backends (TensorFlow, Theano, and CNTK). Keras also includes a library of pre-trained models, including a 50-layer ResNet model trained on the ImageNet dataset, which we will use for this exercise.
This notebook shows how to create a servable version of the ImageNet ResNet50 model in Keras using the TensorFlow backend. The servable model can be served with TensorFlow Serving, which runs efficiently in C++ and supports multiple platforms (different OSes, as well as hardware with different types of accelerators such as GPUs). The model will need to handle RPC prediction calls from a client that sends requests containing a batch of JPEG images.
See https://github.com/keras-team/keras/blob/master/keras/applications/resnet50.py for the implementation of ResNet50.
In [0]:
# Import Keras libraries
import keras.applications.resnet50 as resnet50
from keras.preprocessing import image
from keras import backend as K
import numpy as np
import os
# Import TensorFlow saved model libraries
import tensorflow as tf
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import utils
from tensorflow.python.saved_model import tag_constants, signature_constants
from tensorflow.python.saved_model.signature_def_utils_impl import build_signature_def, predict_signature_def
from tensorflow.contrib.session_bundle import exporter
In [0]:
_DEFAULT_IMAGE_SIZE = 224 # Default width and height of input images to ResNet model
_LABEL_CLASSES = 1000 # Number of classes that the Keras ImageNet ResNet50 model predicts
Set a version number and directory for the output of the servable model. Note that once TensorFlow has successfully saved a servable in a directory, trying to save another servable to that same directory will fail. Either increment the version number, or delete the output directory and re-run the servable creation code.
In [0]:
VERSION_NUMBER = 1  # Increment this if you want to generate more than one servable model
SERVING_DIR = "keras_servable/" + str(VERSION_NUMBER)
SAMPLE_DIR = "../client"
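As a minimal sketch of the "always increment the version number" advice, the helper below picks the next unused numeric subdirectory under a base directory. The function name next_version_dir is hypothetical and not part of this notebook; the demonstration runs in a temporary directory rather than in keras_servable/.

```python
import os
import tempfile

def next_version_dir(base_dir):
    """Return base_dir/<n>, where n is one more than the highest numeric subdirectory."""
    existing = []
    if os.path.isdir(base_dir):
        existing = [int(name) for name in os.listdir(base_dir) if name.isdigit()]
    return os.path.join(base_dir, str(max(existing, default=0) + 1))

# Demonstration in a throwaway directory so that nothing real is touched.
demo_base = tempfile.mkdtemp()
first = next_version_dir(demo_base)
os.makedirs(first)                      # pretend a servable was saved here
second = next_version_dir(demo_base)
print(os.path.basename(first), os.path.basename(second))
```

With a helper like this, SERVING_DIR could be computed rather than edited by hand, at the cost of making the version number implicit.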
Keras has a prepackaged ImageNet-trained ResNet50 model which takes in a 4d input tensor (i.e. a tensor of RGB-color images) and outputs a list of class probabilities for all of the classes.
We will create a servable model whose input takes in a batch of jpeg-encoded images, and outputs a dictionary containing the top k classes and probabilities for each image in the batch. We've refactored the input preprocessing and output postprocessing into helper functions.
A TensorFlow model is essentially a computation graph with variables and state. The graph must be built before it can ingest and process data. Typically, a TensorFlow graph contains a set of input nodes (called placeholders) through which data is ingested, and a set of TensorFlow functions that take existing nodes as inputs and produce a dependent node that performs a computation on them. Any node can be referenced as an "output" node from which processed data can be read.
It is often useful to create helper functions for building TensorFlow graphs, for two reasons:
The client (resnet_client.py) packs JPEG-encoded images into an array of JPEGs (each entry a string) and sends it to the server. These JPEGs are all already resized to 224x224x3, so they need no resizing on the server side before entering the ResNet model. However, the ResNet50 model was trained with pixel values normalized using a predefined method in the Keras resnet50 package (resnet50.preprocess_input()). We will need to extract the raw 3D tensor from each JPEG string and normalize the values.
Exercise: Add a command in the helper function to build a node that decodes a jpeg string into a 3D RGB image tensor.
Useful References:
In [0]:
# Preprocessing helper function similar to `resnet_training_to_serving_solution.ipynb`.
def build_jpeg_to_image_graph(jpeg_image):
    """Build graph elements to preprocess an image by subtracting out the mean from all channels.

    Args:
        jpeg_image: A jpeg-formatted byte stream represented as a string.

    Returns:
        A 3d tensor of image pixels normalized for the Keras ResNet50 model.
        The canned ResNet50 pretrained model was trained after running
        keras.applications.resnet50.preprocess_input in 'caffe' mode, which
        flips the RGB channels and subtracts out the channel means [103.939, 116.779, 123.68].
        There is no normalizing on the range.
    """
    image = ???
    image = tf.to_float(image)
    image = resnet50.preprocess_input(image)
    return image
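The normalization described in the docstring can be pictured outside TensorFlow. The following is a NumPy sketch of 'caffe'-style preprocessing, assuming only what the docstring states (channel flip, then mean subtraction, no range scaling). The function name caffe_style_preprocess and the tiny all-red image are hypothetical; this is not the Keras implementation itself.

```python
import numpy as np

# Channel means in BGR order, as quoted in the docstring.
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb_image):
    """Flip RGB -> BGR and subtract the per-channel means; no range scaling."""
    bgr = rgb_image[..., ::-1].astype(np.float32)
    return bgr - BGR_MEANS

# A hypothetical 2x2 image in which every pixel is pure red (R=255, G=0, B=0).
red = np.zeros((2, 2, 3), dtype=np.uint8)
red[..., 0] = 255
out = caffe_style_preprocess(red)

# After the flip, the red value sits in the last channel.
print(out[0, 0])
```

Working through a single pixel by hand like this is a useful way to decide which bounds are sensible in the assertions on preprocessed images.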
Exercise: We are going to construct an input placeholder node in our TensorFlow graph to read data into TensorFlow, and use the helper function to attach computational elements to the input node, resulting in an output node where data is collected. We will then run the graph by feeding sample input into the placeholder (input data can be Python floats, ints, strings, numpy arrays, etc.) and returning the value at the output node.
A placeholder can store a Tensor of arbitrary dimension, and arbitrary length in any dimension.
An example of a placeholder that holds a 1d tensor of floating values is:
x = tf.placeholder(dtype=tf.float32, shape=[10], name='my_input_node')
An example of a 2d tensor (matrix) of dimensions 10x20 holding string values is:
x = tf.placeholder(dtype=tf.string, shape=[10, 20], name='my_string_matrix')
Note that we assigned a Python variable x to point to the placeholder. Simply calling tf.placeholder() with a name would already create an element in the TensorFlow graph that can be looked up by that name, e.g. 'my_input_node'. However, keeping a Python variable makes it easier to keep track of the element without looking it up by name, and to pass it into helper functions.
Any dependent node in the graph can serve as an output node. For instance, passing an input node x through y = build_jpeg_to_image_graph(x) would return a node referenced by the Python variable y, which is the result of processing the input through the graph built by the helper function. When we run the test graph with real data below, you will see how to return the output of y.
Remember: TensorFlow helper functions are used to help construct a computational graph! build_jpeg_to_image_graph() does not return a 3D array. It returns a graph node that returns a 3D array after processing a jpeg-encoded string!
Useful References:
In [0]:
# Defining input test graph nodes: only needs to be run once!
test_jpeg_ph = ??? # A placeholder for a single string, which is a dimensionless (0D) tensor.
test_decoded_tensor = ??? # Output node, which returns output of the jpeg to image graph.
# Print the graph elements to check shapes. ? indicates that TensorFlow does not know the length of those dimensions.
print(test_jpeg_ph)
print(test_decoded_tensor)
To run data through a test graph, a TensorFlow session must be created. TensorFlow runs only the portion of the graph required to map a set of inputs, given as a dictionary with entries of type {placeholder_node: python data}, to an output graph node (or list of graph nodes), e.g.:
with tf.Session() as sess:
    sess.run(output_node,
             {placeholder_1: input_data_1, placeholder_2: input_data_2, ...})
or
with tf.Session() as sess:
    sess.run([output_node_1, output_node_2, ...],
             {placeholder_1: input_data_1, placeholder_2: input_data_2, ...})
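The build-then-run model described above can be pictured with a toy, pure-Python analogy. This is not TensorFlow code: the Node class and the placeholder, build_double_plus_one and run functions are all hypothetical stand-ins, meant only to show that a helper function returns a node, and that values appear only when the node is evaluated with a feed dictionary.

```python
class Node:
    """A toy graph node: either a placeholder (fn is None) or an op over input nodes."""
    def __init__(self, name, fn=None, inputs=()):
        self.name = name
        self.fn = fn
        self.inputs = inputs

def placeholder(name):
    """Create an input node whose value is supplied at run time."""
    return Node(name)

def build_double_plus_one(x):
    """Helper that only builds nodes; nothing is computed here."""
    doubled = Node("doubled", lambda v: 2 * v, (x,))
    return Node("plus_one", lambda v: v + 1, (doubled,))

def run(node, feed_dict):
    """Evaluate node, pulling placeholder values from feed_dict (loosely like sess.run)."""
    if node.fn is None:
        return feed_dict[node]
    return node.fn(*(run(i, feed_dict) for i in node.inputs))

x = placeholder("my_input_node")
y = build_double_plus_one(x)     # y is a Node, not a number
print(type(y).__name__)          # prints "Node"
print(run(y, {x: 10}))           # prints 21
```

The same separation applies to the real helpers in this notebook: calling them wires up graph elements, and data only flows when a session runs the output node.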
Exercise: Let's read a jpeg image as an encoded string, pass it into the input placeholder, and return a 3D tensor result which is the normalized image. The expected image size is 224x224x3. Please add more potentially useful assert statements to test the output.
Hint: The ??? below are just examples of possible assertions you can make about the range of output values in the preprocessed image. The more robust the assertions, the better you can validate your code! To fill in the ??? below, see the documentation for build_jpeg_to_image_graph() above for a description of how the resnet50.preprocess_input() function transforms images.
In [0]:
# Run the graph! Validate the result of the function using a sample image SAMPLE_DIR/cat_sample.jpg
ERROR_TOLERANCE = 1e-4
with open(os.path.join(SAMPLE_DIR, "cat_sample.jpg"), "rb") as imageFile:
    jpeg_str = imageFile.read()
with tf.Session() as sess:
    result = sess.run(test_decoded_tensor, feed_dict={test_jpeg_ph: jpeg_str})
assert result.shape == (224, 224, 3)
# TODO: Replace with assert statements to check max and min normalized pixel values
assert result.max() <= ??? + ERROR_TOLERANCE  # Max pixel value after subtracting mean
assert result.min() >= ??? - ERROR_TOLERANCE  # Min pixel value after subtracting mean
print('Hooray! JPEG decoding test passed!')
The approach above uses vanilla TensorFlow to perform unit testing. You may notice that the code is more verbose than ideal, since you have to create a session, feed input through a dictionary, etc. We encourage the student to investigate some options below at a later time:
TensorFlow Eager was introduced in TensorFlow 1.5 as a way to execute TensorFlow graphs in a way similar to numpy operations. After testing individual parts of the graph using Eager, you will need to rebuild a graph with the Eager option turned off in order to build a performance optimized TensorFlow graph. Also, keep in mind that you will need another virtual environment with TensorFlow 1.5 in order to run eager execution, which may not be compatible with TensorFlow Serving 1.4 used in this tutorial.
TensorFlow unit testing is a more software-engineering-oriented approach to running tests. You write test classes that can be invoked individually when building the project; calling tf.test.main() then runs all tests and reports which ones succeeded and which failed, allowing you to inspect errors. Because we are in a notebook environment, such a test would not succeed due to an already running kernel that tf.test cannot access. The tests must be run from the command line, e.g. python test_my_graph.py.
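The test-class pattern described above has the same shape as Python's built-in unittest module, so it can be sketched without TensorFlow. In the sketch below, subtract_mean is a hypothetical function under test, and the suite is run programmatically so that the example does not exit the interpreter; a standalone test file would typically end with a call to a test main function instead.

```python
import unittest

def subtract_mean(pixels, mean):
    """Hypothetical function under test: subtract a scalar mean from each pixel value."""
    return [p - mean for p in pixels]

class SubtractMeanTest(unittest.TestCase):
    def test_values(self):
        self.assertEqual(subtract_mean([10, 20, 30], 10), [0, 10, 20])

    def test_length_preserved(self):
        self.assertEqual(len(subtract_mean([1, 2, 3, 4], 1)), 4)

# Load and run every test method in the class, collecting the results.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SubtractMeanTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Each test method checks one property, so a failure report points directly at the behavior that broke.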
We've provided both eager execution and unit test examples in the testing directory showing how to unit test various components in this notebook. Note that because these examples contain the solution to exercises below, please complete all notebook exercises prior to reading through these examples.
Now that we know how to run TensorFlow tests, let's create and test more helper functions!
The server receives a client request in the form of a dictionary {'images': 1D_tensor_of_jpeg_encoded_strings}, which must be preprocessed into a 4D tensor before feeding into the Keras ResNet50 model.
Exercise: We will be using a ResNet client to send requests to our server. You will need to create a graph that preprocesses client requests so that the server is compatible with the client. Using tf.map_fn and build_jpeg_to_image_graph, fill in the missing line (marked ???) to convert the client request into an array of preprocessed 3D floating-point tensors. The following lines stack and reshape this array into a 4D tensor.
Useful References:
In [0]:
def preprocess_input(jpeg_tensor):
    processed_images = ???  # Convert a 1D-tensor of JPEGs to a list of 3D tensors
    processed_images = tf.stack(processed_images)  # Convert list of 3D tensors to tensor of tensors (4D tensor)
    processed_images = tf.reshape(tensor=processed_images,  # Reshape to ensure TF graph knows the final dimensions
                                  shape=[-1, _DEFAULT_IMAGE_SIZE, _DEFAULT_IMAGE_SIZE, 3])
    return processed_images
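The idea of mapping a per-element function over a 1D tensor and ending up with a higher-dimensional batch can be pictured in NumPy. The sketch below is only an analogy for the shape bookkeeping, not the tf.map_fn call itself: map_and_stack and fake_decode are hypothetical names, and fake_decode stands in for a decoder by turning a scalar into a small 2x2x3 "image".

```python
import numpy as np

def map_and_stack(fn, elems):
    """Apply fn to each element along axis 0 and stack the results into one array."""
    return np.stack([fn(e) for e in elems])

def fake_decode(value):
    """Hypothetical stand-in for a decoder: turns a scalar into a 2x2x3 'image'."""
    return np.full((2, 2, 3), float(value))

# Three scalar inputs become a batch of three 3D outputs, i.e. a 4D array.
batch = map_and_stack(fake_decode, np.array([1, 2, 5]))
print(batch.shape)   # prints (3, 2, 2, 3)
```

The leading dimension of the result always equals the number of input elements, which is why the batch dimension can stay unspecified in the graph.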
Exercise: Construct a TensorFlow unit test graph for the input function.
Hint: the input node test_jpeg_tensor should be a tf.placeholder. You need to define the shape parameter in tf.placeholder. None inside an array indicates that the length can vary along that dimension.
In [0]:
# Build a Test Input Preprocessing Network: only needs to be run once!
test_jpeg_tensor = ??? # A placeholder for a 1D tensor (of arbitrary length) of jpeg-encoded strings.
test_processed_images = ??? # Output node, which returns a 4D tensor after processing.
# Print the graph elements to check shapes. ? indicates that TensorFlow does not know the length of those dimensions.
print(test_jpeg_tensor)
print(test_processed_images)
In [0]:
# Run test network using a sample image SAMPLE_DIR/cat_sample.jpg
with open(os.path.join(SAMPLE_DIR, "cat_sample.jpg"), "rb") as imageFile:
    jpeg_str = imageFile.read()
with tf.Session() as sess:
    result = sess.run(test_processed_images, feed_dict={test_jpeg_tensor: np.array([jpeg_str, jpeg_str])})  # Duplicate for length 2 array
assert result.shape == (2, 224, 224, 3)  # 4D tensor with first dimension length 2, since we have 2 images
# TODO: add a test for min and max normalized pixel values
assert result.max() <= ??? + ERROR_TOLERANCE  # Max pixel value after subtracting mean
assert result.min() >= ??? - ERROR_TOLERANCE  # Min pixel value after subtracting mean
# Verify that the resulting tensors for image 0 and image 1 are identical, element by element.
assert np.array_equal(result[0], result[1])
print('Hooray! Input unit test succeeded!')
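When asserting that two image tensors are identical, it matters how the comparison is written. The short sketch below, using two small hypothetical arrays, contrasts comparing the .all() reductions of two arrays with an element-by-element comparison via np.array_equal.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([3.0, 2.0, 1.0])   # same values, different order

# Each .all() reduces an array to one boolean ("are all elements nonzero?"),
# so this comparison is True even though the arrays differ.
reductions_match = (a.all() == b.all())

# np.array_equal compares the shapes and every element.
elementwise_equal = np.array_equal(a, b)

print(reductions_match, elementwise_equal)   # prints: True False
```

For floating-point results that may differ by rounding, np.allclose with an explicit tolerance is a common alternative to exact equality.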
In [0]:
TOP_K = 5
def postprocess_output(model_output):
    '''Return top k classes and probabilities.'''
    top_k_probs, top_k_classes = ???
    return {'classes': top_k_classes, 'probabilities': top_k_probs}
In [0]:
# Build Test Output Postprocessing Network: only needs to be run once!
test_model_output = tf.placeholder(dtype=tf.float32, shape=???, name='test_logits_tensor')
test_prediction_output = postprocess_output(test_model_output)
# Print the graph elements to check shapes.
print(test_model_output)
print(test_prediction_output)
In [0]:
# Import numpy testing framework for float comparisons
import numpy.testing as npt
# Run test network
# Input a tensor with clear winners, and perform checks
# Be very specific about what is expected from your mock model.
model_probs = np.ones(???) # TODO: use the same dimensions as your test_model_output placeholder.
model_probs[2] = 2.5 # TODO: you can create your own tests as well
model_probs[5] = 3.5
model_probs[10] = 4
model_probs[49] = 3
model_probs[998] = 2
TOTAL_WEIGHT = np.sum(model_probs)
model_probs = model_probs / TOTAL_WEIGHT
with tf.Session() as sess:
    result = sess.run(test_prediction_output, {test_model_output: model_probs})
classes = result['classes']
probs = result['probabilities']
# Check values
assert len(probs) == 5
npt.assert_almost_equal(probs[0], model_probs[10])
npt.assert_almost_equal(probs[1], model_probs[5])
npt.assert_almost_equal(probs[2], model_probs[49])
npt.assert_almost_equal(probs[3], model_probs[2])
npt.assert_almost_equal(probs[4], model_probs[998])
assert len(classes) == 5
assert classes[0] == 10
assert classes[1] == 5
assert classes[2] == 49
assert classes[3] == 2
assert classes[4] == 998
print('Hooray! Output unit test succeeded!')
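The behavior that the output test expects, the k largest probabilities in descending order together with their class indices, can be sketched in NumPy as a reference. The function top_k_numpy and the 8-class score vector below are hypothetical and chosen only for illustration; this is a description of the expected result, not the TensorFlow op used in the graph.

```python
import numpy as np

def top_k_numpy(scores, k):
    """Return (values, indices) of the k largest scores, largest first."""
    indices = np.argsort(scores)[::-1][:k]
    return scores[indices], indices

# Hypothetical scores for 8 classes; classes 6, 1 and 4 are the clear winners.
scores = np.array([0.05, 0.20, 0.05, 0.05, 0.15, 0.05, 0.40, 0.05])
values, indices = top_k_numpy(scores, k=3)
print(indices.tolist())   # prints [6, 1, 4]
```

Using a mock input with clear winners, as the test above does, avoids ambiguity about how ties are ordered.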
The Keras Model uses TensorFlow as its backend, and therefore its inputs and outputs can be treated as elements of a TensorFlow graph. In other words, you can provide an input that is a TensorFlow tensor, and read the model output like a TensorFlow tensor!
Exercise: Build the end to end network by filling in the TODOs below.
Useful References:
In [0]:
# TODO: Create a placeholder for your arbitrary-length 1D Tensor of JPEG strings
images = tf.placeholder(???)
# TODO: Call preprocess_input to return processed_images
processed_images = ???
# Load (and download if missing) the ResNet50 Keras Model (may take a while to run)
# TODO: Use processed_images as input
model = resnet50.ResNet50(???)
# Rename the model to 'resnet' for serving
model.name = 'resnet'
# TODO: Call postprocess_output on the output of the model to create predictions to send back to the client
predictions = ???
Exercise: The final step to creating a servable model is to define the end-to-end input and output API. Edit the inputs and outputs parameters to predict_signature_def below to ensure that the signature correctly handles client request. The inputs parameter should be a dictionary {'images': tensor_of_strings}, and the outputs parameter a dictionary {'classes': tensor_of_top_k_classes, 'probabilities': tensor_of_top_k_probs}.
In [0]:
# Create a saved model builder as an endpoint to dataflow execution
builder = saved_model_builder.SavedModelBuilder(SERVING_DIR)
# TODO: set the inputs and outputs parameters in predict_signature_def()
signature = predict_signature_def(inputs=???,
                                  outputs=???)
In [0]:
with K.get_session() as sess:
    builder.add_meta_graph_and_variables(sess=sess,
                                         tags=[tag_constants.SERVING],
                                         signature_def_map={'predict': signature})
    builder.save()
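After saving, it can be worth checking that the version directory looks like a servable. The sketch below assumes the usual SavedModel layout, a saved_model.pb file next to a variables/ subdirectory; the helper name looks_like_saved_model is hypothetical, and the demonstration runs against a fake export in a temporary directory rather than against SERVING_DIR.

```python
import os
import tempfile

def looks_like_saved_model(version_dir):
    """Check for the files a SavedModel export is assumed to contain."""
    has_graph = os.path.isfile(os.path.join(version_dir, "saved_model.pb"))
    has_variables = os.path.isdir(os.path.join(version_dir, "variables"))
    return has_graph and has_variables

# Demonstration on a fake export built in a temporary directory.
fake_export = os.path.join(tempfile.mkdtemp(), "1")
os.makedirs(os.path.join(fake_export, "variables"))
before = looks_like_saved_model(fake_export)     # graph file is still missing
open(os.path.join(fake_export, "saved_model.pb"), "wb").close()
after = looks_like_saved_model(fake_export)
print(before, after)   # prints: False True
```

A check like this only inspects the directory layout; it does not validate the signature or the graph contents.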