In [ ]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
A SavedModel contains a complete TensorFlow program, including weights and computation. It does not require the original model building code to run, which makes it useful for sharing or deploying (with TFLite, TensorFlow.js, TensorFlow Serving, or TensorFlow Hub).
This document dives into some of the details of how to use the low-level tf.saved_model API:
- If you are using a tf.keras.Model, the keras.Model.save(output_path) method may be all you need: see the Keras save and serialize guide.
- If you just want to save/load weights during training, see the guide to training checkpoints.
In [ ]:
import os
import tempfile
from matplotlib import pyplot as plt
import numpy as np
import tensorflow as tf
tmpdir = tempfile.mkdtemp()
In [ ]:
physical_devices = tf.config.experimental.list_physical_devices('GPU')
if physical_devices:
  tf.config.experimental.set_memory_growth(physical_devices[0], True)
In [ ]:
file = tf.keras.utils.get_file(
    "grace_hopper.jpg",
    "https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg")
img = tf.keras.preprocessing.image.load_img(file, target_size=[224, 224])
plt.imshow(img)
plt.axis('off')
x = tf.keras.preprocessing.image.img_to_array(img)
x = tf.keras.applications.mobilenet.preprocess_input(
    x[tf.newaxis,...])
We'll use an image of Grace Hopper as a running example, and a Keras pre-trained image classification model since it's easy to use. Custom models work too, and are covered in detail later.
In [ ]:
labels_path = tf.keras.utils.get_file(
    'ImageNetLabels.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
imagenet_labels = np.array(open(labels_path).read().splitlines())
In [ ]:
pretrained_model = tf.keras.applications.MobileNet()
result_before_save = pretrained_model(x)
decoded = imagenet_labels[np.argsort(result_before_save)[0,::-1][:5]+1]
print("Result before saving:\n", decoded)
The top prediction for this image is "military uniform".
In [ ]:
mobilenet_save_path = os.path.join(tmpdir, "mobilenet/1/")
tf.saved_model.save(pretrained_model, mobilenet_save_path)
The save-path follows a convention used by TensorFlow Serving where the last path component (1/ here) is a version number for your model - it allows tools like TensorFlow Serving to reason about the relative freshness.
We can load the SavedModel back into Python with tf.saved_model.load and see how Admiral Hopper's image is classified.
In [ ]:
loaded = tf.saved_model.load(mobilenet_save_path)
print(list(loaded.signatures.keys())) # ["serving_default"]
Imported signatures always return dictionaries. To customize signature names and output dictionary keys, see Specifying signatures during export.
In [ ]:
infer = loaded.signatures["serving_default"]
print(infer.structured_outputs)
Running inference from the SavedModel gives the same result as the original model.
In [ ]:
labeling = infer(tf.constant(x))[pretrained_model.output_names[0]]
decoded = imagenet_labels[np.argsort(labeling)[0,::-1][:5]+1]
print("Result after saving and loading:\n", decoded)
SavedModels are usable from Python (more on that below), but production environments typically use a dedicated service for inference without running Python code. This is easy to set up from a SavedModel using TensorFlow Serving.
See the TensorFlow Serving REST tutorial for more details about serving, including instructions for installing tensorflow_model_server in a notebook or on your local machine. As a quick sketch, to serve the mobilenet model exported above just point the model server at the SavedModel directory:
nohup tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=mobilenet \
  --model_base_path="/tmp/mobilenet" >server.log 2>&1
Then send a request.
!pip install requests
import json
import numpy
import requests
data = json.dumps({"signature_name": "serving_default",
                   "instances": x.tolist()})
headers = {"content-type": "application/json"}
json_response = requests.post('http://localhost:8501/v1/models/mobilenet:predict',
                              data=data, headers=headers)
predictions = numpy.array(json.loads(json_response.text)["predictions"])
The resulting predictions are identical to the results from Python.
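For instance, assuming the model server above is actually running and the request succeeded, the response can be decoded with the same ImageNet labels used earlier (a quick sketch, not part of the original walkthrough):
# Decode the REST response the same way as the in-process predictions above.
decoded = imagenet_labels[np.argsort(predictions)[0, ::-1][:5] + 1]
print("Result from TensorFlow Serving:\n", decoded)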
In [ ]:
!ls {mobilenet_save_path}
The saved_model.pb file stores the actual TensorFlow program, or model, and a set of named signatures, each identifying a function that accepts tensor inputs and produces tensor outputs.
SavedModels may contain multiple variants of the model (multiple v1.MetaGraphDefs, identified with the --tag_set flag to saved_model_cli), but this is rare. APIs which create multiple variants of a model include tf.Estimator.experimental_export_all_saved_models and, in TensorFlow 1.x, tf.saved_model.Builder.
In [ ]:
!saved_model_cli show --dir {mobilenet_save_path} --tag_set serve
The variables directory contains a standard training checkpoint (see the guide to training checkpoints).
In [ ]:
!ls {mobilenet_save_path}/variables
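As a quick sketch (not part of the original example), since this is a standard checkpoint, the checkpoint utilities can list what it contains; "variables/variables" is the conventional checkpoint prefix inside a SavedModel:
# Assumption: "variables/variables" is the checkpoint prefix written by
# tf.saved_model.save; tf.train.list_variables yields (name, shape) pairs.
ckpt_prefix = os.path.join(mobilenet_save_path, "variables/variables")
for name, shape in tf.train.list_variables(ckpt_prefix)[:5]:
  print(name, shape)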
The assets directory contains files used by the TensorFlow graph, for example text files used to initialize vocabulary tables. It is unused in this example.
SavedModels may have an assets.extra directory for any files not used by the TensorFlow graph, for example information for consumers about what to do with the SavedModel. TensorFlow itself does not use this directory.
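As a hedged illustration of how the assets directory gets populated (the module and vocabulary file below are made up for this sketch, not part of the original guide), a tf.saved_model.Asset tracks an external file so it is copied into assets/ and re-resolved on load:
# Write a small vocabulary file to track as an asset (illustrative only).
vocab_path = os.path.join(tmpdir, 'vocab.txt')
with open(vocab_path, 'w') as f:
  f.write('zero\none\ntwo\n')

class ModuleWithAsset(tf.Module):
  def __init__(self, filename):
    super(ModuleWithAsset, self).__init__()
    # Tracking the file as an Asset copies it into the SavedModel's assets/.
    self.filename = tf.saved_model.Asset(filename)

  @tf.function(input_signature=[])
  def read(self):
    # After loading, the asset path resolves to the copy inside assets/.
    return tf.io.read_file(self.filename.asset_path)

asset_module_path = os.path.join(tmpdir, 'module_with_asset')
tf.saved_model.save(ModuleWithAsset(vocab_path), asset_module_path)
print(tf.saved_model.load(asset_module_path).read())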
In [ ]:
class CustomModule(tf.Module):

  def __init__(self):
    super(CustomModule, self).__init__()
    self.v = tf.Variable(1.)

  @tf.function
  def __call__(self, x):
    print('Tracing with', x)
    return x * self.v

  @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
  def mutate(self, new_v):
    self.v.assign(new_v)

module = CustomModule()
When you save a tf.Module, any tf.Variable attributes, tf.function-decorated methods, and tf.Modules found via recursive traversal are saved. (See the Checkpoint tutorial for more about this recursive traversal.) However, any Python attributes, functions, and data are lost. This means that when a tf.function is saved, no Python code is saved.
If no Python code is saved, how does SavedModel know how to restore the function?
Briefly, tf.function works by tracing the Python code to generate a ConcreteFunction (a callable wrapper around tf.Graph). When saving a tf.function, you're really saving the tf.function's cache of ConcreteFunctions.
To learn more about the relationship between tf.function and ConcreteFunctions, see the tf.function guide.
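As a small illustration of that tracing behavior (using a throwaway instance so the traces cached on module, and therefore what gets saved below, are unaffected):
demo = CustomModule()
demo(tf.constant(0.))   # first call with this signature: prints "Tracing with ..."
demo(tf.constant(1.))   # same signature: reuses the cached ConcreteFunction, no re-tracing
# get_concrete_function traces (if needed) and returns the ConcreteFunction itself.
print(demo.__call__.get_concrete_function(tf.TensorSpec([], tf.float32)))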
In [ ]:
module_no_signatures_path = os.path.join(tmpdir, 'module_no_signatures')
module(tf.constant(0.))
print('Saving model...')
tf.saved_model.save(module, module_no_signatures_path)
When you load a SavedModel in Python, all tf.Variable attributes, tf.function-decorated methods, and tf.Modules are restored in the same object structure as the original saved tf.Module.
In [ ]:
imported = tf.saved_model.load(module_no_signatures_path)
assert imported(tf.constant(3.)).numpy() == 3
imported.mutate(tf.constant(2.))
assert imported(tf.constant(3.)).numpy() == 6
Because no Python code is saved, calling a tf.function with a new input signature will fail:
imported(tf.constant([3.]))
ValueError: Could not find matching function to call for canonicalized inputs ((,), {}). Only existing signatures are [((TensorSpec(shape=(), dtype=tf.float32, name=u'x'),), {})].
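One workaround (a sketch rather than part of the original example; the path name below is illustrative) is to trace the extra shapes you care about before saving, so their ConcreteFunctions are included in the SavedModel:
# Trace a rank-1 float32 signature in addition to the scalar one, then save.
module.__call__.get_concrete_function(tf.TensorSpec([None], tf.float32))
module_extra_traces_path = os.path.join(tmpdir, 'module_extra_traces')
tf.saved_model.save(module, module_extra_traces_path)
# The re-loaded function now accepts rank-1 inputs as well.
print(tf.saved_model.load(module_extra_traces_path)(tf.constant([3.])))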
Variable objects from the imported model are available, and you can backprop through imported functions. That is enough to fine-tune a SavedModel in simple cases:
In [ ]:
optimizer = tf.optimizers.SGD(0.05)

def train_step():
  with tf.GradientTape() as tape:
    loss = (10. - imported(tf.constant(2.))) ** 2
    variables = tape.watched_variables()
  grads = tape.gradient(loss, variables)
  optimizer.apply_gradients(zip(grads, variables))
  return loss
In [ ]:
for _ in range(10):
# "v" approaches 5, "loss" approaches 0
print("loss={:.2f} v={:.2f}".format(train_step(), imported.v.numpy()))
A SavedModel from Keras provides more details than a plain __call__ to address more advanced cases of fine-tuning. TensorFlow Hub recommends providing the following of those, if applicable, in SavedModels shared for the purpose of fine-tuning:
- The __call__ method takes an optional, Python-valued training= argument that defaults to False but can be set to True.
- Next to the __call__ attribute, there are .variables and .trainable_variables attributes with the corresponding lists of variables. A variable that was originally trainable but is meant to be frozen during fine-tuning is omitted from .trainable_variables.
- There may be a .regularization_losses attribute. It holds a list of zero-argument functions whose values are meant for addition to the total loss.
Going back to the initial MobileNet example, we can see some of those in action:
In [ ]:
loaded = tf.saved_model.load(mobilenet_save_path)
print("MobileNet has {} trainable variables: {}, ...".format(
len(loaded.trainable_variables),
", ".join([v.name for v in loaded.trainable_variables[:5]])))
In [ ]:
trainable_variable_ids = {id(v) for v in loaded.trainable_variables}
non_trainable_variables = [v for v in loaded.variables
                           if id(v) not in trainable_variable_ids]
print("MobileNet also has {} non-trainable variables: {}, ...".format(
    len(non_trainable_variables),
    ", ".join([v.name for v in non_trainable_variables[:3]])))
Tools like TensorFlow Serving and saved_model_cli can interact with SavedModels. To help these tools determine which ConcreteFunctions to use, we need to specify serving signatures. tf.keras.Models automatically specify serving signatures, but we'll have to explicitly declare a serving signature for our custom modules.
By default, no signatures are declared in a custom tf.Module.
In [ ]:
assert len(imported.signatures) == 0
To declare a serving signature, specify a ConcreteFunction using the signatures kwarg. When specifying a single signature, its signature key will be 'serving_default', which is saved as the constant tf.saved_model.DEFAULT_SERVING_SIGNATURE_DEF_KEY.
In [ ]:
module_with_signature_path = os.path.join(tmpdir, 'module_with_signature')
call = module.__call__.get_concrete_function(tf.TensorSpec(None, tf.float32))
tf.saved_model.save(module, module_with_signature_path, signatures=call)
In [ ]:
imported_with_signatures = tf.saved_model.load(module_with_signature_path)
list(imported_with_signatures.signatures.keys())
To export multiple signatures, pass a dictionary of signature keys to ConcreteFunctions. Each signature key corresponds to one ConcreteFunction.
In [ ]:
module_multiple_signatures_path = os.path.join(tmpdir, 'module_with_multiple_signatures')
signatures = {"serving_default": call,
"array_input": module.__call__.get_concrete_function(tf.TensorSpec([None], tf.float32))}
tf.saved_model.save(module, module_multiple_signatures_path, signatures=signatures)
In [ ]:
imported_with_multiple_signatures = tf.saved_model.load(module_multiple_signatures_path)
list(imported_with_multiple_signatures.signatures.keys())
By default, the output tensor names are fairly generic, like output_0. To control the names of outputs, modify your tf.function to return a dictionary that maps output names to outputs. The names of inputs are derived from the Python function arg names.
In [ ]:
class CustomModuleWithOutputName(tf.Module):

  def __init__(self):
    super(CustomModuleWithOutputName, self).__init__()
    self.v = tf.Variable(1.)

  @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
  def __call__(self, x):
    return {'custom_output_name': x * self.v}

module_output = CustomModuleWithOutputName()
call_output = module_output.__call__.get_concrete_function(tf.TensorSpec(None, tf.float32))
module_output_path = os.path.join(tmpdir, 'module_with_output_name')
tf.saved_model.save(module_output, module_output_path,
                    signatures={'serving_default': call_output})
In [ ]:
imported_with_output_name = tf.saved_model.load(module_output_path)
imported_with_output_name.signatures['serving_default'].structured_outputs
Estimators export SavedModels through tf.Estimator.export_saved_model. See the Estimator guide for details.
In [ ]:
input_column = tf.feature_column.numeric_column("x")
estimator = tf.estimator.LinearClassifier(feature_columns=[input_column])

def input_fn():
  return tf.data.Dataset.from_tensor_slices(
      ({"x": [1., 2., 3., 4.]}, [1, 1, 0, 0])).repeat(200).shuffle(64).batch(16)

estimator.train(input_fn)

serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(
    tf.feature_column.make_parse_example_spec([input_column]))
estimator_base_path = os.path.join(tmpdir, 'from_estimator')
estimator_path = estimator.export_saved_model(estimator_base_path, serving_input_fn)
This SavedModel accepts serialized tf.Example protocol buffers, which are useful for serving. But we can also load it with tf.saved_model.load and run it from Python.
In [ ]:
imported = tf.saved_model.load(estimator_path)

def predict(x):
  example = tf.train.Example()
  example.features.feature["x"].float_list.value.extend([x])
  return imported.signatures["predict"](
      examples=tf.constant([example.SerializeToString()]))
In [ ]:
print(predict(1.5))
print(predict(3.5))
tf.estimator.export.build_raw_serving_input_receiver_fn allows you to create input functions which take raw tensors rather than tf.train.Examples.
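As a rough sketch (not part of the original guide), an equivalent raw-tensor receiver can also be written by hand; the Estimator calls this function in graph mode during export, so the v1 placeholder is created there rather than eagerly. Names and paths below are illustrative:
def raw_serving_input_fn():
  # A raw float32 input named "x", matching the feature column defined above.
  x = tf.compat.v1.placeholder(dtype=tf.float32, shape=[None], name="x")
  return tf.estimator.export.ServingInputReceiver(
      features={"x": x}, receiver_tensors={"x": x})

raw_estimator_path = estimator.export_saved_model(
    os.path.join(tmpdir, 'from_estimator_raw'), raw_serving_input_fn)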
The C++ version of the SavedModel loader provides an API to load a SavedModel from a path, while allowing SessionOptions and RunOptions. You have to specify the tags associated with the graph to be loaded. The loaded version of SavedModel is referred to as SavedModelBundle and contains the MetaGraphDef and the session within which it is loaded.
const string export_dir = ...
SavedModelBundle bundle;
...
LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain},
               &bundle);
You can use the SavedModel Command Line Interface (CLI) to inspect and
execute a SavedModel.
For example, you can use the CLI to inspect the model's SignatureDefs.
The CLI enables you to quickly confirm that the input
Tensor dtype and shape match the model. Moreover, if you
want to test your model, you can use the CLI to do a sanity check by
passing in sample inputs in various formats (for example, Python
expressions) and then fetching the output.
Broadly speaking, you can install TensorFlow in either of the following two ways:
If you installed TensorFlow through a pre-built TensorFlow binary, then the SavedModel CLI is already installed on your system at pathname bin/saved_model_cli.
If you built TensorFlow from source code, you must run the following additional command to build saved_model_cli:
$ bazel build tensorflow/python/tools:saved_model_cli
The SavedModel CLI supports the following two commands on a SavedModel:
- show, which shows the computations available from a SavedModel.
- run, which runs a computation from a SavedModel.

show command

A SavedModel contains one or more model variants (technically, v1.MetaGraphDefs), identified by their tag-sets. To serve a model, you might wonder what kind of SignatureDefs are in each model variant, and what their inputs and outputs are. The show command lets you examine the contents of the SavedModel in hierarchical order. Here's the syntax:
usage: saved_model_cli show [-h] --dir DIR [--all]
[--tag_set TAG_SET] [--signature_def SIGNATURE_DEF_KEY]
For example, the following command shows all available tag-sets in the SavedModel:
$ saved_model_cli show --dir /tmp/saved_model_dir
The given SavedModel contains the following tag-sets:
serve
serve, gpu
The following command shows all available SignatureDef keys for a tag set:
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve
The given SavedModel `MetaGraphDef` contains `SignatureDefs` with the
following keys:
SignatureDef key: "classify_x2_to_y3"
SignatureDef key: "classify_x_to_y"
SignatureDef key: "regress_x2_to_y3"
SignatureDef key: "regress_x_to_y"
SignatureDef key: "regress_x_to_y2"
SignatureDef key: "serving_default"
If there are multiple tags in the tag-set, you must specify all tags, each tag separated by a comma. For example:
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve,gpu
To show all inputs and outputs TensorInfo for a specific SignatureDef, pass the SignatureDef key to the signature_def option. This is very useful when you want to know the tensor key value, dtype and shape of the input tensors for executing the computation graph later. For example:
$ saved_model_cli show --dir \
/tmp/saved_model_dir --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
  inputs['x'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: x:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['y'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: y:0
Method name is: tensorflow/serving/predict
To show all available information in the SavedModel, use the --all option. For example:
$ saved_model_cli show --dir /tmp/saved_model_dir --all

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['classify_x2_to_y3']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['inputs'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: x2:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['scores'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: y3:0
  Method name is: tensorflow/serving/classify

...

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['x'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: x:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['y'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1)
        name: y:0
  Method name is: tensorflow/serving/predict
run command

Invoke the run command to run a graph computation, passing inputs and then displaying (and optionally saving) the outputs. Here's the syntax:
usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
SIGNATURE_DEF_KEY [--inputs INPUTS]
[--input_exprs INPUT_EXPRS]
[--input_examples INPUT_EXAMPLES] [--outdir OUTDIR]
[--overwrite] [--tf_debug]
The run command provides the following three ways to pass inputs to the model:
- The --inputs option enables you to pass numpy ndarrays in files.
- The --input_exprs option enables you to pass Python expressions.
- The --input_examples option enables you to pass tf.train.Example.

--inputs

To pass input data in files, specify the --inputs option, which takes the following general format:

--inputs <INPUTS>
where INPUTS is either of the following formats:
<input_key>=<filename>
<input_key>=<filename>[<variable_name>]
You may pass multiple INPUTS. If you do pass multiple inputs, use a semicolon to separate each of the INPUTS.
saved_model_cli uses numpy.load to load the filename. The filename may be in any of the following formats:
- .npy
- .npz
- pickle format
A .npy file always contains a numpy ndarray. Therefore, when loading from a .npy file, the content will be directly assigned to the specified input tensor. If you specify a variable_name with that .npy file, the variable_name will be ignored and a warning will be issued.
When loading from a .npz
(zip) file, you may optionally specify a
variable_name to identify the variable within the zip file to load for
the input tensor key. If you don't specify a variable_name, the SavedModel
CLI will check that only one file is included in the zip file and load it
for the specified input tensor key.
When loading from a pickle file, if no variable_name
is specified in the
square brackets, whatever that is inside the pickle file will be passed to the
specified input tensor key. Otherwise, the SavedModel CLI will assume a
dictionary is stored in the pickle file and the value corresponding to
the variable_name will be used.
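As a made-up illustration (the paths and the input key x below are chosen to match the module_with_signature example saved earlier in this guide), you could save an input with numpy and feed it to the run command via --inputs:
$ python -c "import numpy as np; np.save('/tmp/x_input.npy', np.float32(10.))"
$ saved_model_cli run --dir /path/to/module_with_signature --tag_set serve \
    --signature_def serving_default --inputs x=/tmp/x_input.npy
For a .npz archive, you would select the array with the bracketed form, for example x=inputs.npz[array_name].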
--input_exprs

To pass inputs through Python expressions, specify the --input_exprs option. This can be useful when you don't have data files lying around, but still want to sanity check the model with some simple inputs that match the dtype and shape of the model's SignatureDefs. For example:

`<input_key>=[[1],[2],[3]]`
In addition to Python expressions, you may also pass numpy functions. For example:

`<input_key>=np.ones((32,32,3))`

(Note that the numpy module is already available to you as np.)
--input_examples

To pass tf.train.Example as inputs, specify the --input_examples option. For each input key, it takes a list of dictionaries, where each dictionary is an instance of tf.train.Example. The dictionary keys are the features and the values are the value lists for each feature.
For example:
`<input_key>=[{"age":[22,24],"education":["BS","MS"]}]`
By default, the SavedModel CLI writes output to stdout. If a directory is passed to the --outdir option, the outputs will be saved as .npy files named after output tensor keys under the given directory.
Use --overwrite to overwrite existing output files.
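For example (a sketch with made-up paths, reusing the scalar module signature from earlier), the following evaluates the model on a Python expression and writes each output as a .npy file under /tmp/out, replacing any earlier results:
$ saved_model_cli run --dir /path/to/module_with_signature --tag_set serve \
    --signature_def serving_default --input_exprs 'x=np.float32(10.)' \
    --outdir /tmp/out --overwrite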