In [ ]:
# Copyright 2019 Google LLC
# 
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

TF 2.0 Transition - Building Pre-processing into the Model

TF 2.0 adds many new features and a more expressive API. This notebook demonstrates some of the newer features by building (custom) input preprocessing into the model graph. The benefits of this approach:

1. Since the preprocessing is part of the model, one does not need to re-implement it on the inference side.
2. Since it is added as graph ops, the preprocessing can run on the same device as the rest of the model (e.g., the GPU) instead of upstream on the CPU, which can make it faster.
3. The preprocessing graph operations can be optimized by TensorFlow along with the rest of the graph.

Objective

We will be using the following TF 2.0 features / recommendations:

1. [Recommendation] Use tf.keras for the model building.
2. [Recommendation] Put preprocessing into the model.
3. [Feature] Use the @tf.function decorator to convert the Python code for preprocessing into graph ops.
4. [Feature] Use subclassing of layers to define a new layer for the preprocessing.

Imports

If you haven't already, you need to install the TF 2.0 beta. If you are running this notebook in Colab (which ships TF 1.13 as of this writing), you will need to install TF 2.0 in a cell, as shown below:


In [ ]:
# If not already installed
%pip install tensorflow==2.0.0-beta1

Now let's import what we will use in this demonstration.


In [ ]:
import tensorflow as tf
from tensorflow.keras import Model, Input, layers
from tensorflow.keras.layers import Flatten, Dense

# expected output: 2.0.0-beta1
print(tf.__version__)

Layer Subclassing

We will start by subclassing the tf.keras layers.Layer class to make a new layer type that:

1. Takes an input vector whose shape is specified at instantiation.
2. Normalizes the data to between 0 and 1 (assumes pixel data in the range 0 to 255).
3. Outputs the normalized input.
4. Has no trainable parameters.

Let's start by showing a basic template for subclassing layers and then explain it:

class NewLayer(layers.Layer):
    def __init__(self):
        super(NewLayer, self).__init__()
        self.my_vars = ...  # placeholder: layer-specific variables

    def build(self, input_shape):
        """ Handler for building the layer """
        self.kernel = ...  # placeholder: create the layer's weights here

    def call(self, inputs):
        """ Handler for layer object as callable """
        outputs = inputs  # placeholder: do something with inputs
        return outputs

Subclassing

The first line in the above template, class NewLayer(layers.Layer), indicates that we want to create a new class named NewLayer that is subclassed (derived) from the tf.keras layers.Layer class. This gives us a custom layer definition.

__init__() method

This is the initializer (constructor), which runs when the layer object is instantiated. We use it to set layer-specific variables.

build() method

This method handles the building of the layer. Keras calls it once, the first time the layer is called on an input, when the input shape is known. A typical action is to define the shape of the kernel (trainable parameters) and initialize it.

call() method

This method runs when the layer object is invoked as a callable (function call). It defines the computation the layer performs in the graph.
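
The order in which these three methods run can be checked with a small sketch. The Scale layer, its factor argument, and the build_calls counter are hypothetical names introduced only for illustration, and the sketch assumes a TF 2.x install with eager execution enabled.

```python
import tensorflow as tf
from tensorflow.keras import layers


class Scale(layers.Layer):
    """Hypothetical layer that multiplies its input by a constant."""

    def __init__(self, factor):
        super(Scale, self).__init__()
        self.factor = factor   # layer-specific setting kept from the constructor
        self.build_calls = 0   # illustration only: counts build() invocations

    def build(self, input_shape):
        # Runs once, when the layer first sees an input and its shape is known
        self.build_calls += 1

    def call(self, inputs):
        # Runs every time the layer object is invoked
        return inputs * self.factor


layer = Scale(2.0)
built_before = layer.built        # no input seen yet

first = layer(tf.ones((1, 3)))    # triggers build(), then call()
second = layer(tf.ones((1, 3)))   # only call() runs this time
built_after = layer.built

print(built_before, built_after, layer.build_calls)
print(first.numpy())
```

The layer should report that it is unbuilt before the first call and built afterwards, with build() having run only once across both calls.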

Subclassing Our Custom Layer

In the code below, we define a custom layer that preprocesses the input, with the preprocessing converted to graph operations in the model.

The first line in the code, class Normalize(layers.Layer), indicates that we want to create a new class named Normalize that is subclassed (derived) from the tf.keras layers.Layer class.

__init__() method

Since we don't have any constants or variables to preserve, we don't need to add anything to this method beyond calling the parent initializer.

build() method

Our custom layer won't have any trainable parameters. Because build() never creates any weights, there is nothing for gradient descent to update during training. We set the instance attribute self.kernel to None simply to make that explicit.

call() method

This is where we add our preprocessing. The parameter inputs is the input tensor to the layer during training and prediction. TF tensor objects overload the arithmetic operators. We use the overloaded division operator, which broadcasts the division across the entire tensor, so each element is divided by 255.0.

Finally, we add the decorator @tf.function to tell TensorFlow AutoGraph to convert the Python code in this method to graph operations in the model.
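
As a minimal sketch of the broadcast division and the decorator outside of any layer: the standalone normalize function and the three sample pixel values below are assumptions made for illustration. The explicit cast is a precaution for integer inputs such as uint8 image data. It is not part of the layer defined in this notebook.

```python
import tensorflow as tf

# Three sample pixel values stored as uint8 (hypothetical data)
pixels = tf.constant([[0, 51, 255]], dtype=tf.uint8)


@tf.function
def normalize(x):
    # Cast to float first, then divide; the scalar 255.0 is broadcast
    # across every element of the tensor
    return tf.cast(x, tf.float32) / 255.0


scaled = normalize(pixels)
print(scaled.dtype, scaled.shape)
print(scaled.numpy())
```

The result keeps the input's shape, and every element lands between 0 and 1.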


In [ ]:
class Normalize(layers.Layer):
    """ Custom Layer for Preprocessing Input """
    def __init__(self):
        """ Constructor """
        super(Normalize, self).__init__()
    
    def build(self, input_shape):
        """ Handler for Input Shape """
        self.kernel = None
    
    @tf.function
    def call(self, inputs):
        """ Handler for layer object is callable """
        inputs = inputs / 255.0
        return inputs
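
A quick way to sanity-check a layer like this is to call it directly on a small tensor. The sketch below uses a trimmed copy of the layer (the @tf.function decorator is omitted only to keep the sketch minimal) and a made-up 2x2 "image"; both are assumptions for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


class Normalize(layers.Layer):
    """Trimmed copy of the preprocessing layer, for illustration."""

    def build(self, input_shape):
        self.kernel = None  # no weights are created

    def call(self, inputs):
        return inputs / 255.0


layer = Normalize()

# One hypothetical 2x2 image with raw pixel values in 0..255
batch = tf.constant([[[0.0, 128.0], [255.0, 64.0]]])
out = layer(batch)

print(out.numpy())
print(len(layer.trainable_weights))
```

The output values should all fall between 0 and 1, and the list of trainable weights should be empty.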

Build the Model

Let's build a model to train on the MNIST dataset. We will keep it really basic:

1. Use the Functional API method for defining the model.
2. Make the first layer of our model the custom preprocessing layer.
3. The remaining layers are a basic DNN for MNIST.

In [ ]:
# Create the input vector for 28x28 MNIST images
inputs = Input((28, 28))

# The first layer is the preprocessing layer, which is bound to the input vector
x = Normalize()(inputs)

# Next layer, we flatten the preprocessed input into a 1D vector
x = Flatten()(x)

# Create a hidden dense layer of 128 nodes
x = Dense(128, activation='relu')(x)

# Create an output layer for classifying the 10 digits
outputs = Dense(10, activation='softmax')(x)

# Instantiate the model
model = Model(inputs, outputs)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['acc'])

Get the Dataset

We will get the tf.keras built-in dataset for MNIST. The dataset is pre-split into train and test data. The data is separated into NumPy multi-dimensional arrays for images and labels. The image data is not preprocessed, i.e., all the values are between 0 and 255. The label data is not one-hot encoded, which is why we compiled with loss='sparse_categorical_crossentropy'.
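
To illustrate the difference between the two label formats, the sketch below computes the loss for a single example both ways. The 3-class problem and the predicted probabilities are hypothetical values chosen for illustration.

```python
import tensorflow as tf

# One example whose true class is 2, in both label formats
sparse_label = tf.constant([2])                    # integer class index, as in y_train
one_hot_label = tf.one_hot(sparse_label, depth=3)  # one-hot equivalent

# Hypothetical predicted probabilities for a 3-class problem
probs = tf.constant([[0.1, 0.2, 0.7]])

sparse_loss = tf.keras.losses.sparse_categorical_crossentropy(sparse_label, probs)
dense_loss = tf.keras.losses.categorical_crossentropy(one_hot_label, probs)

print(sparse_loss.numpy(), dense_loss.numpy())
```

Both losses should agree at about -ln(0.7), roughly 0.357. The sparse variant simply saves the one-hot encoding step.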


In [ ]:
from tensorflow.keras.datasets import mnist

# Load the train and test data into memory
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Expected output: (60000, 28, 28) , (60000,)
print(x_train.shape, y_train.shape)

Train the Model

Let's now train the model (with the preprocessing built into the model graph) on the unpreprocessed MNIST data.


In [ ]:
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.1, verbose=1)

Evaluate the Model

Let's now evaluate the trained model on unpreprocessed test examples.


In [ ]:
# evaluate() returns [loss, accuracy] since we compiled with metrics=['acc']
loss, acc = model.evaluate(x_test, y_test)
print(loss, acc)
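
Because the normalization lives inside the model, inference code can pass unscaled pixel values straight to predict(). The sketch below shows this with a small untrained model and random stand-in data rather than MNIST. The trimmed Normalize copy, the softmax output layer, and the random batch are all assumptions made for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, Input, layers
from tensorflow.keras.layers import Flatten, Dense


class Normalize(layers.Layer):
    """Trimmed copy of the preprocessing layer, for illustration."""

    def call(self, inputs):
        return inputs / 255.0


# Small untrained model with the preprocessing as its first layer
inputs = Input((28, 28))
x = Normalize()(inputs)
x = Flatten()(x)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs, outputs)

# Stand-in for raw images: random values in 0..255 (hypothetical data)
raw_batch = np.random.randint(0, 256, size=(4, 28, 28)).astype('float32')

# No scaling is done here; the model normalizes internally
probs = model.predict(raw_batch, verbose=0)

print(probs.shape)
print(probs.sum(axis=1))
```

Each row of the output is a probability distribution over the 10 classes, so each row should sum to approximately 1.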