Serving Function

This notebook demonstrates the Serving Function design pattern using Keras.

Simple text classification model

This model uses transfer learning with TensorFlow Hub and Keras. It is based on https://www.tensorflow.org/tutorials/keras/text_classification_with_hub

It classifies movie reviews as positive or negative using the text of the review. The reviews come from an IMDB dataset that contains the text of 50,000 movie reviews from the Internet Movie Database. These are split into 25,000 reviews for training and 25,000 reviews for testing.


In [1]:
# Already installed if you are using Cloud AI Platform Notebooks
#!pip install -q tensorflow-hub
#!pip install -q tfds-nightly

In [2]:
import numpy as np
import tensorflow as tf


import tensorflow_hub as hub
import tensorflow_datasets as tfds
train_data, test_data = tfds.load(
    name="imdb_reviews", 
    split=('train', 'test'),
    as_supervised=True)

In [3]:
split = 3 # of every 4 records, 3 go to training and 1 to validation
dataset_train = train_data.window(split, split + 1).flat_map(lambda *ds: ds[0] if len(ds) == 1 else tf.data.Dataset.zip(ds))
dataset_validation = train_data.skip(split).window(1, split + 1).flat_map(lambda *ds: ds[0] if len(ds) == 1 else tf.data.Dataset.zip(ds))
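The window/flat_map pipeline above interleaves records rather than taking a contiguous slice. As a pure-Python sketch (split_indices is a hypothetical helper, not part of the notebook), the effect on record indices is:

```python
# window(split, split + 1) keeps records whose index modulo (split + 1)
# is less than split; skip(split).window(1, split + 1) keeps the rest.
def split_indices(num_records, split=3):
    train = [i for i in range(num_records) if i % (split + 1) < split]
    validation = [i for i in range(num_records) if i % (split + 1) == split]
    return train, validation

train_idx, val_idx = split_indices(8)
print(train_idx)   # [0, 1, 2, 4, 5, 6]
print(val_idx)     # [3, 7]
```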

In [4]:
embedding = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim-with-oov/1"
hub_layer = hub.KerasLayer(embedding, input_shape=[], 
                           dtype=tf.string, trainable=True, name='full_text')

In [5]:
model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Dense(16, activation='relu', name='h1_dense'))
model.add(tf.keras.layers.Dense(1, name='positive_review_logits'))

model.summary()


Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
full_text (KerasLayer)       (None, 20)                389380    
_________________________________________________________________
h1_dense (Dense)             (None, 16)                336       
_________________________________________________________________
positive_review_logits (Dens (None, 1)                 17        
=================================================================
Total params: 389,733
Trainable params: 389,733
Non-trainable params: 0
_________________________________________________________________

In [6]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])



history = model.fit(dataset_train.shuffle(10000).batch(512),
                    epochs=10,
                    validation_data=dataset_validation.batch(512),
                    verbose=1)


Epoch 1/10
37/37 [==============================] - 12s 331ms/step - loss: 0.7690 - accuracy: 0.5320 - val_loss: 0.6935 - val_accuracy: 0.5750
Epoch 2/10
37/37 [==============================] - 11s 294ms/step - loss: 0.6586 - accuracy: 0.6024 - val_loss: 0.6317 - val_accuracy: 0.6261
Epoch 3/10
37/37 [==============================] - 11s 300ms/step - loss: 0.6114 - accuracy: 0.6423 - val_loss: 0.5916 - val_accuracy: 0.6547
Epoch 4/10
37/37 [==============================] - 11s 295ms/step - loss: 0.5732 - accuracy: 0.6747 - val_loss: 0.5569 - val_accuracy: 0.6843
Epoch 5/10
37/37 [==============================] - 11s 303ms/step - loss: 0.5367 - accuracy: 0.7116 - val_loss: 0.5227 - val_accuracy: 0.7294
Epoch 6/10
37/37 [==============================] - 11s 299ms/step - loss: 0.4965 - accuracy: 0.7439 - val_loss: 0.4854 - val_accuracy: 0.7560
Epoch 7/10
37/37 [==============================] - 11s 303ms/step - loss: 0.4550 - accuracy: 0.7765 - val_loss: 0.4493 - val_accuracy: 0.7870
Epoch 8/10
37/37 [==============================] - 11s 297ms/step - loss: 0.4134 - accuracy: 0.8061 - val_loss: 0.4161 - val_accuracy: 0.8061
Epoch 9/10
37/37 [==============================] - 11s 303ms/step - loss: 0.3761 - accuracy: 0.8296 - val_loss: 0.3886 - val_accuracy: 0.8267
Epoch 10/10
37/37 [==============================] - 11s 300ms/step - loss: 0.3419 - accuracy: 0.8490 - val_loss: 0.3658 - val_accuracy: 0.8392

In [7]:
results = model.evaluate(test_data.batch(512), verbose=2)

for name, value in zip(model.metrics_names, results):
  print("%s: %.3f" % (name, value))


loss: 0.381
accuracy: 0.827

Predict using the trained model in-memory


In [28]:
review1 = 'The film is based on a prize-winning novel.' # neutral
review2 = 'The film is fast moving and has several great action scenes.' # positive
review3 = 'The film was very boring. I walked out half-way.' # negative

logits = model.predict(x=tf.constant([review1, review2, review3]))
print(logits)


[[ 0.6965847]
 [ 1.61773  ]
 [-0.7543597]]
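Because the model outputs logits, a positive logit corresponds to a sigmoid probability above 0.5, i.e. a positive review. A hypothetical helper makes the decision rule explicit, applied to the logits printed above:

```python
# A logit > 0 means sigmoid(logit) > 0.5, so the review is classified positive.
def label_from_logit(logit):
    return 'positive' if logit > 0 else 'negative'

for review_logit in [0.6965847, 1.61773, -0.7543597]:
    print(label_from_logit(review_logit))
# positive
# positive
# negative
```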

In [36]:
## how big is the model in memory?
import sys

# From https://goshippo.com/blog/measure-real-size-any-python-object/
def get_size(obj, seen=None):
    """Recursively finds size of objects"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    try:
        if isinstance(obj, dict):
            size += sum([get_size(v, seen) for v in obj.values()])
            size += sum([get_size(k, seen) for k in obj.keys()])
        elif hasattr(obj, '__dict__'):
            size += get_size(obj.__dict__, seen)
        elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
            size += sum([get_size(i, seen) for i in obj])
    except Exception:  # some objects cannot be introspected; skip them
        pass
    return size
print('{} MB'.format(get_size(model)/(1000*1000)))


8.203439 MB

Export the model for serving

model.save() writes out a SavedModel with the "serve" tag-set


In [8]:
import os, datetime, shutil
shutil.rmtree('export/default', ignore_errors=True)
export_path = os.path.join('export', 'default', 'sentiment_{}'.format(datetime.datetime.now().strftime("%Y%m%d_%H%M%S")))
model.save(export_path)


WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
INFO:tensorflow:Assets written to: export/default/sentiment_20200505_232612/assets
INFO:tensorflow:Assets written to: export/default/sentiment_20200505_232612/assets

In [9]:
!find export/default


export/default
export/default/sentiment_20200505_232612
export/default/sentiment_20200505_232612/variables
export/default/sentiment_20200505_232612/variables/variables.data-00000-of-00001
export/default/sentiment_20200505_232612/variables/variables.index
export/default/sentiment_20200505_232612/assets
export/default/sentiment_20200505_232612/assets/tokens.txt
export/default/sentiment_20200505_232612/saved_model.pb

Note how much smaller the serving graph itself (saved_model.pb) is ... the assets and variables are constants and can be shared in a thread-safe way


In [42]:
!ls -lh {export_path}/saved_model.pb
!ls -lh {export_path}/assets/tokens.txt
!ls -lh {export_path}/variables/variables.*


-rw-r--r-- 1 jupyter jupyter 112K May  5 23:26 export/default/sentiment_20200505_232612/saved_model.pb
-rw-r--r-- 1 jupyter jupyter 148K May  5 23:26 export/default/sentiment_20200505_232612/assets/tokens.txt
-rw-r--r-- 1 jupyter jupyter 4.5M May  5 23:26 export/default/sentiment_20200505_232612/variables/variables.data-00000-of-00001
-rw-r--r-- 1 jupyter jupyter 1.6K May  5 23:26 export/default/sentiment_20200505_232612/variables/variables.index

In [10]:
!saved_model_cli show --dir {export_path} --tag_set serve --signature_def serving_default


The given SavedModel SignatureDef contains the following input(s):
  inputs['full_text_input'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_full_text_input:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['positive_review_logits'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall_2:0
Method name is: tensorflow/serving/predict

In [22]:
## illustrates how we can load this model and do inference based on the signature above
restored = tf.keras.models.load_model(export_path)
review1 = 'The film is based on a prize-winning novel.' # neutral
review2 = 'The film is fast moving and has several great action scenes.' # positive
review3 = 'The film was very boring. I walked out half-way.' # negative

infer = restored.signatures['serving_default']
outputs = infer(full_text_input=tf.constant([review1, review2, review3])) # note input name
logit = outputs['positive_review_logits']  # note output name
print(logit)


tf.Tensor(
[[ 0.6965847]
 [ 1.61773  ]
 [-0.7543597]], shape=(3, 1), dtype=float32)

In [23]:
print(1 / (1 + np.exp(-logit))) # probability


[[0.6674301 ]
 [0.83448184]
 [0.31987208]]
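As a quick sanity check (pure Python, using the logits and probabilities printed above), the probabilities are exactly the sigmoid of the logits:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# logits and probabilities copied from the cell outputs above
logits = [0.6965847, 1.61773, -0.7543597]
probs = [0.6674301, 0.83448184, 0.31987208]
for logit, prob in zip(logits, probs):
    assert abs(sigmoid(logit) - prob) < 1e-6
```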

Custom serving function

Let's write out a new signature. This time, we also carry out the sigmoid operation inside the serving function, so that the model outputs the probability alongside the logits.


In [13]:
@tf.function(input_signature=[tf.TensorSpec([None], dtype=tf.string)])
def add_prob(reviews):
    logits = model(reviews, training=False) # the model is captured via closure
    probs = tf.sigmoid(logits)
    return {
        'positive_review_logits' : logits,
        'positive_review_probability' : probs
    }
shutil.rmtree('export/probs', ignore_errors=True)
probs_export_path = os.path.join('export', 'probs', 'sentiment_{}'.format(datetime.datetime.now().strftime("%Y%m%d_%H%M%S")))
model.save(probs_export_path, signatures={'serving_default': add_prob})


INFO:tensorflow:Assets written to: export/probs/sentiment_20200505_232617/assets
INFO:tensorflow:Assets written to: export/probs/sentiment_20200505_232617/assets
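Stripped of TensorFlow, the serving-function pattern is just a wrapper that adds post-processing to the model and returns a dictionary of named outputs, mirroring add_prob above. A minimal sketch, with a hypothetical toy_model (made-up weights) standing in for the trained Keras model:

```python
import math

# Toy stand-in for the trained model: maps an input to a logit.
def toy_model(x):
    return 2.0 * x - 1.0

# The serving function wraps the model, applies the sigmoid,
# and returns named outputs, just like add_prob does for the Keras model.
def serving_fn(inputs):
    logits = [toy_model(x) for x in inputs]
    probs = [1.0 / (1.0 + math.exp(-logit)) for logit in logits]
    return {'logits': logits, 'probabilities': probs}

out = serving_fn([0.0, 1.0])
print(out['logits'])   # [-1.0, 1.0]
```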

In [14]:
!saved_model_cli show --dir {probs_export_path} --tag_set serve --signature_def serving_default


The given SavedModel SignatureDef contains the following input(s):
  inputs['reviews'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: serving_default_reviews:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['positive_review_logits'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall_2:0
  outputs['positive_review_probability'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 1)
      name: StatefulPartitionedCall_2:1
Method name is: tensorflow/serving/predict

In [15]:
restored = tf.keras.models.load_model(probs_export_path)
infer = restored.signatures['serving_default']
outputs = infer(reviews=tf.constant([review1, review2, review3])) # note input name
probs = outputs['positive_review_probability']  # note output name
print(probs)


tf.Tensor(
[[0.6674301 ]
 [0.83448184]
 [0.31987208]], shape=(3, 1), dtype=float32)

Deploy to Cloud AI Platform Prediction

We can deploy the model to AI Platform Prediction, which takes care of scaling the service to match prediction traffic.


In [16]:
!find export/probs | head -2 | tail -1


export/probs/sentiment_20200505_232617

In [17]:
%%bash

MODEL_LOCATION=$(find export/probs | head -2 | tail -1)
MODEL_NAME=imdb
MODEL_VERSION=v1

TFVERSION=2.1
REGION=us-central1
BUCKET=ai-analytics-solutions-kfpdemo

# create the model if it doesn't already exist
modelname=$(gcloud ai-platform models list | grep -w "$MODEL_NAME")
echo $modelname
if [ -z "$modelname" ]; then
   echo "Creating model $MODEL_NAME"
   gcloud ai-platform models create ${MODEL_NAME} --regions $REGION
else
   echo "Model $MODEL_NAME already exists"
fi

# delete the model version if it already exists
modelver=$(gcloud ai-platform versions list --model "$MODEL_NAME" | grep -w "$MODEL_VERSION")
echo $modelver
if [ "$modelver" ]; then
   echo "Deleting version $MODEL_VERSION"
   yes | gcloud ai-platform versions delete ${MODEL_VERSION} --model ${MODEL_NAME}
   sleep 10
fi


echo "Creating version $MODEL_VERSION from $MODEL_LOCATION"
gcloud ai-platform versions create ${MODEL_VERSION} \
       --model ${MODEL_NAME} --origin ${MODEL_LOCATION} --staging-bucket gs://${BUCKET} \
       --runtime-version $TFVERSION


imdb v1
Model imdb already exists
v1 gs://ai-analytics-solutions-kfpdemo/26de3a7744a524256244830bbeb35625e78ad0bd0f7ecd5bda9c8c04d787320b/ READY
Deleting version v1
Creating version v1 from export/probs/sentiment_20200505_232617
This will delete version [v1]...

Do you want to continue (Y/n)?  
Deleting version [v1]......
..........................................................................................................done.
Creating version (this might take a few minutes)......
..................................................................................................................................................................................................................................................................................................done.

In [18]:
%%writefile input.json
{"reviews": "The film is based on a prize-winning novel."}
{"reviews": "The film is fast moving and has several great action scenes."}
{"reviews": "The film was very boring. I walked out half-way."}


Overwriting input.json
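gcloud's --json-instances flag expects newline-delimited JSON: one instance per line, with keys matching the serving signature's input names. A small sketch of building and parsing that format:

```python
import json

# One JSON object per line; the key "reviews" matches the signature input name.
reviews = [
    "The film is based on a prize-winning novel.",
    "The film is fast moving and has several great action scenes.",
    "The film was very boring. I walked out half-way.",
]
ndjson = "\n".join(json.dumps({"reviews": r}) for r in reviews)
instances = [json.loads(line) for line in ndjson.splitlines()]
print(len(instances))   # 3
```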

In [19]:
!gcloud ai-platform predict --model imdb --json-instances input.json --version v1


POSITIVE_REVIEW_LOGITS  POSITIVE_REVIEW_PROBABILITY
[0.6965846419334412]    [0.6674301028251648]
[1.6177300214767456]    [0.8344818353652954]
[-0.754359781742096]    [0.31987208127975464]

In [20]:
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
import json

credentials = GoogleCredentials.get_application_default()
api = discovery.build("ml", "v1", credentials = credentials,
            discoveryServiceUrl = "https://storage.googleapis.com/cloud-ml/discovery/ml_v1_discovery.json")

request_data = {"instances":
  [
      {"reviews": "The film is based on a prize-winning novel."},
      {"reviews": "The film is fast moving and has several great action scenes."},
      {"reviews": "The film was very boring. I walked out half-way."}
  ]
}

parent = "projects/{}/models/{}".format("ai-analytics-solutions", "imdb") # no version specified, so the default version is used

response = api.projects().predict(body = request_data, name = parent).execute()
print("response = {0}".format(response))


response = {'predictions': [{'positive_review_probability': [0.6674301028251648], 'positive_review_logits': [0.6965846419334412]}, {'positive_review_probability': [0.8344818353652954], 'positive_review_logits': [1.6177300214767456]}, {'positive_review_probability': [0.31987208127975464], 'positive_review_logits': [-0.754359781742096]}]}

In [21]:
print(response['predictions'][0]['positive_review_probability'][0])


0.6674301028251648

Stateful function

A serving function should be stateless: the same input must always produce the same output, no matter which replica handles the request. The cells below contrast stateless functions with a stateful callable whose answer depends on how many times it has been called.


In [4]:
def stateless_fn(x):
    return 3*x + 15

class Stateless:
    def __init__(self):
        self.weight = 3
        self.bias = 15
    def __call__(self, x):
        return self.weight*x + self.bias

class State:
    def __init__(self):
        self.counter = 0
    def __call__(self, x):
        self.counter = self.counter + 1
        if self.counter % 2 == 0:
            return 3*x + 15
        else:
            return 3*x - 15

a1 = Stateless()
a = State()
print(stateless_fn(3))
print(stateless_fn(3))
print(a1(3))
print(a1(3))
print(a(3))
print(a(3))
print(a(3))
print(a(3))


24
24
24
24
-6
24
-6
24
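The danger for serving: behind an autoscaled, load-balanced service there are many replicas, and a stateful callable gives different answers to the same request depending on which replica the load balancer happens to pick. A sketch using a minimal copy of the State class from above:

```python
# Minimal copy of the State class above, to show why stateful callables
# are dangerous behind an autoscaled, load-balanced service.
class State:
    def __init__(self):
        self.counter = 0
    def __call__(self, x):
        self.counter += 1
        if self.counter % 2 == 0:
            return 3 * x + 15
        else:
            return 3 * x - 15

replica_a, replica_b = State(), State()
warmup = replica_a(3)  # replica_a has already served one request
# The identical request now yields different answers on the two replicas.
same_request_a = replica_a(3)
same_request_b = replica_b(3)
print(same_request_a, same_request_b)   # 24 -6
```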

Online prediction

Train model in BigQuery


In [ ]:
%%bigquery
CREATE OR REPLACE MODEL mlpatterns.neutral_3classes
OPTIONS(model_type='logistic_reg', input_label_cols=['health']) AS

SELECT 
  IF(apgar_1min = 10, 'Healthy', IF(apgar_1min >= 8, 'Neutral', 'NeedsAttention')) AS health,
  plurality,
  mother_age,
  gestation_weeks,
  ever_born
FROM `bigquery-public-data.samples.natality`
WHERE apgar_1min <= 10
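The nested IF in the SQL above labels each birth by its one-minute Apgar score. Restated as a pure-Python sketch (health_label is a hypothetical helper, not part of the notebook):

```python
# Labeling rule from the BigQuery training query:
# Apgar 10 -> Healthy, 8-9 -> Neutral, below 8 -> NeedsAttention.
def health_label(apgar_1min):
    if apgar_1min == 10:
        return 'Healthy'
    elif apgar_1min >= 8:
        return 'Neutral'
    else:
        return 'NeedsAttention'

print([health_label(a) for a in (10, 9, 8, 7, 3)])
# ['Healthy', 'Neutral', 'Neutral', 'NeedsAttention', 'NeedsAttention']
```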

This works, but a round trip to BigQuery for every request is too slow for online prediction


In [48]:
%%bigquery
SELECT * FROM ML.PREDICT(MODEL mlpatterns.neutral_3classes,
    (SELECT 
     2 AS plurality,
     32 AS mother_age,
     41 AS gestation_weeks,
     1 AS ever_born
    )
)


Out[48]:
predicted_health predicted_health_probs plurality mother_age gestation_weeks ever_born
0 Neutral [{'prob': 0.6166873057666589, 'label': 'Neutra... 2 32 41 1

It is better to export the model and then deploy the exported model to AI Platform for low-latency serving ...


In [ ]:
%%bash
BUCKET=ai-analytics-solutions-kfpdemo
bq extract -m --destination_format=ML_TF_SAVED_MODEL mlpatterns.neutral_3classes  gs://${BUCKET}/export/baby_health

In [49]:
%%bash

TFVERSION=1.15
REGION=us-central1
BUCKET=ai-analytics-solutions-kfpdemo
MODEL_LOCATION=gs://${BUCKET}/export/baby_health
MODEL_NAME=babyhealth
MODEL_VERSION=v1

# create the model if it doesn't already exist
modelname=$(gcloud ai-platform models list | grep -w "$MODEL_NAME")
echo $modelname
if [ -z "$modelname" ]; then
   echo "Creating model $MODEL_NAME"
   gcloud ai-platform models create ${MODEL_NAME} --regions $REGION
else
   echo "Model $MODEL_NAME already exists"
fi

# delete the model version if it already exists
modelver=$(gcloud ai-platform versions list --model "$MODEL_NAME" | grep -w "$MODEL_VERSION")
echo $modelver
if [ "$modelver" ]; then
   echo "Deleting version $MODEL_VERSION"
   yes | gcloud ai-platform versions delete ${MODEL_VERSION} --model ${MODEL_NAME}
   sleep 10
fi


echo "Creating version $MODEL_VERSION from $MODEL_LOCATION"
gcloud ai-platform versions create ${MODEL_VERSION} \
       --model ${MODEL_NAME} --origin ${MODEL_LOCATION} --staging-bucket gs://${BUCKET} \
       --runtime-version $TFVERSION


Creating model babyhealth

Creating version v1 from gs://ai-analytics-solutions-kfpdemo/export/baby_health
Created ml engine model [projects/ai-analytics-solutions/models/babyhealth].
Listed 0 items.
Creating version (this might take a few minutes)......
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................done.

In [50]:
%%writefile input.json
{"plurality": 2, "mother_age": 32, "gestation_weeks": 41, "ever_born": 1}


Overwriting input.json

In [51]:
!gcloud ai-platform predict --model babyhealth --json-instances input.json --version v1


HEALTH_PROBS                                                     HEALTH_VALUES                                PREDICTED_HEALTH
[0.020730846529093874, 0.6166873057666589, 0.36258184770424734]  [u'Healthy', u'Neutral', u'NeedsAttention']  [u'Neutral']
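The predicted class is simply the label with the highest probability. Using the values returned above:

```python
# The three class probabilities align positionally with the three labels;
# the prediction is the label with the maximum probability.
health_values = ['Healthy', 'Neutral', 'NeedsAttention']
health_probs = [0.020730846529093874, 0.6166873057666589, 0.36258184770424734]
predicted = max(zip(health_probs, health_values))[1]
print(predicted)   # Neutral
```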

Copyright 2020 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.