In [ ]:
# Copyright 2019 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
TF 2.0 adds many new features and a more powerful representation. This notebook demonstrates some of the newer features for building (custom) input preprocessing into the graph. The benefits of this are:
1. Since the preprocessing is part of the model, it does not need to be re-implemented on the inference side.
2. Since it is added as graph ops, the preprocessing happens on the GPU (instead of upstream on the CPU) and is faster.
3. The preprocessing graph operations can be optimized by the TensorFlow compiler.
We will be using the following TF 2.0 features / recommendations:
1. [Recommendation] Use tf.keras for the model building.
2. [Recommendation] Put preprocessing into the model.
3. [Feature] Use the @tf.function decorator to convert the Python code for preprocessing into graph ops (see the short sketch after this list).
4. [Feature] Use subclassing of layers to define a new layer for the preprocessing.
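As a quick, hedged illustration of feature 3 (this cell and the function name scale are illustrative only and not part of the notebook's model): decorating a plain Python function with @tf.function causes TensorFlow to trace it into graph operations, and calling the decorated function executes the traced graph.
In [ ]:
import tensorflow as tf

# Illustrative only: a Python function traced into graph ops by @tf.function.
@tf.function
def scale(x):
    return x / 255.0

# Calling the decorated function runs the traced graph.
print(scale(tf.constant([0.0, 127.5, 255.0])))  # expected: [0.  0.5 1. ]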
In [ ]:
# If not already installed
%pip install tensorflow==2.0.0-beta1
Now let's import what we will use in this demonstration.
In [ ]:
import tensorflow as tf
from tensorflow.keras import Model, Input, layers
from tensorflow.keras.layers import Flatten, Dense
# expected output: 2.0.0-beta1
print(tf.__version__)
We will start by subclassing the tf.keras layers.Layer class to make a new layer type, as follows:
1. Takes an input vector whose shape is specified at instantiation.
2. Normalizes the data between 0 and 1 (assumes pixel data in the range 0..255).
3. Outputs the normalized input.
4. Has no trainable parameters.
Let's start by showing a basic template for subclassing layers and then explain it:
class NewLayer(layers.Layer):
    def __init__(self):
        super(NewLayer, self).__init__()
        self.my_vars = ...       # blah, blah

    def build(self, input_shape):
        """ Handler for building the layer """
        self.kernel = ...        # blah, blah

    def call(self, inputs):
        """ Handler for layer object as callable """
        outputs = ...            # do something with inputs
        return outputs
The first line in the above template, class NewLayer(layers.Layer), indicates we want to create a new class object named NewLayer which is subclassed (derived) from the tf.keras layers.Layer class. This will give us a custom layer definition.
The __init__() method is the initializer (constructor) for the class object instantiation. We use the initializer to initialize layer-specific variables.
The build() method handles the building of the layer when the model is compiled. A typical action is to define the shape of the kernel (the trainable parameters) and the initialization of the kernel.
The call() method handles calling the layer as a callable (function call) for execution in the graph.
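To make the role of build() more concrete, here is a minimal, hedged sketch (not part of this notebook; the name SimpleDense and its details are hypothetical) of a layer that does have trainable parameters, where build() creates and initializes the kernel with add_weight():
In [ ]:
class SimpleDense(layers.Layer):
    """ Hypothetical example of a layer with a trainable kernel """
    def __init__(self, units):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Create the trainable kernel; its shape depends on the input shape.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(int(input_shape[-1]), self.units),
                                      initializer='glorot_uniform',
                                      trainable=True)

    def call(self, inputs):
        # Apply the layer: a matrix multiply of the inputs with the kernel.
        return tf.matmul(inputs, self.kernel)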
In the code below, we subclass a custom layer for doing preprocessing of the input, where the preprocessing is converted to graph operations in the model.
The first line in the code, class Normalize(layers.Layer), indicates we want to create a new class object named Normalize which is subclassed (derived) from the tf.keras layers.Layer class.
In the __init__() method, since we won't have any constants or variables to preserve, we don't need to add anything beyond calling the parent initializer.
In the build() method, our custom layer won't have any trainable parameters. We tell the compile process not to set up any gradient descent updates on the kernel during training by setting the layer's class variable self.kernel to None.
The call() method is where we add our preprocessing. The parameter inputs is the input tensor to the layer during training and prediction. A TF tensor object implements polymorphism to overload operators. We use the overloaded division operator, which broadcasts the division operation across the entire tensor, so each element is divided by 255.0.
Finally, we add the decorator @tf.function to tell TensorFlow AutoGraph to convert the Python code in this method to graph operations in the model.
In [ ]:
class Normalize(layers.Layer):
    """ Custom Layer for Preprocessing Input """
    def __init__(self):
        """ Constructor """
        super(Normalize, self).__init__()

    def build(self, input_shape):
        """ Handler for Input Shape """
        self.kernel = None

    @tf.function
    def call(self, inputs):
        """ Handler for layer object as callable """
        inputs = inputs / 255.0
        return inputs
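As a quick sanity check (this cell is illustrative and not required for the model), we can call the layer directly on a small tensor; the overloaded division broadcasts across every element, so the output values land between 0 and 1.
In [ ]:
# Illustrative only: call the custom layer on a small tensor.
sample = tf.constant([[0.0, 127.5, 255.0]])
# expected output: tf.Tensor([[0.  0.5 1. ]], shape=(1, 3), dtype=float32)
print(Normalize()(sample))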
In [ ]:
# Create the input vector for 28x28 MNIST images
inputs = Input((28, 28))
# The first layer is the preprocessing layer, which is bound to the input vector
x = Normalize()(inputs)
# Next layer, we flatten the preprocessed input into a 1D vector
x = Flatten()(x)
# Create a hidden dense layer of 128 nodes
x = Dense(128, activation='relu')(x)
# Create an output layer for classifying the 10 digits
outputs = Dense(10, activation='softmax')(x)
# Instantiate the model
model = Model(inputs, outputs)
# Compile the model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['acc'])
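Optionally (this check is not part of the original flow), printing the model summary shows the Normalize layer as the first layer after the input, confirming that the preprocessing is part of the model graph.
In [ ]:
# Optional: verify the preprocessing layer is part of the model graph.
model.summary()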
We will get the tf.keras builtin dataset for MNIST. The dataset is pre-split into train and test data. The data is separated into numpy multi-dimensional arrays for images and labels. The image data is not preprocessed, i.e., all the values are between 0 and 255. The label data is not one-hot-encoded, which is why we compiled with loss='sparse_categorical_crossentropy'.
In [ ]:
from tensorflow.keras.datasets import mnist
# Load the train and test data into memory
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Expected output: (60000, 28, 28) , (60000,)
print(x_train.shape, y_train.shape)
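To make that distinction concrete (an illustrative cell, not part of the original notebook): the labels are plain integers, and sparse_categorical_crossentropy lets us feed them directly, whereas categorical_crossentropy would require the one-hot form shown by to_categorical.
In [ ]:
# Illustrative only: sparse (integer) labels vs. their one-hot equivalents.
print(y_train[:3])                                      # e.g. [5 0 4]
print(tf.keras.utils.to_categorical(y_train[:3], 10))   # one-hot vectors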
In [ ]:
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.1, verbose=1)
In [ ]:
# evaluate() returns the loss and the metrics (here, accuracy)
loss, acc = model.evaluate(x_test, y_test)
print(acc)
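Finally, a hedged illustration (not part of the original notebook) of benefit 1: because normalization is built into the graph, raw 0..255 pixel values can be fed directly at inference time without any client-side preprocessing.
In [ ]:
import numpy as np

# Illustrative only: predict on a raw (unnormalized) test image.
probs = model.predict(x_test[:1])
# predicted digit vs. the true label
print(np.argmax(probs, axis=1), y_test[:1])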