Deploying and Making Predictions with a Trained Model

Learning Objectives

  • Deploy a model on Google CMLE
  • Make online and batch predictions with a deployed model

Introduction

In this notebook, we will deploy the model we trained to predict birthweight and we will use that deployed model to make predictions using our cloud-hosted machine learning model. Cloud ML Engine provides two ways to get predictions from trained models; i.e., online prediction and batch prediction; and we do both in this notebook.

Have a look at this blog post on Online vs Batch Prediction to see the trade-offs of both approaches.

As usual we start by setting our environment variables to reference our Project and Bucket.


In [ ]:
PROJECT = "cloud-training-demos"  # Replace with your PROJECT
BUCKET = "cloud-training-bucket"  # Replace with your BUCKET
REGION = "us-central1"            # Choose an available region for Cloud MLE
TFVERSION = "1.14"                # TF version for CMLE to use

In [ ]:
import os
os.environ["BUCKET"] = BUCKET
os.environ["PROJECT"] = PROJECT
os.environ["REGION"] = REGION
os.environ["TFVERSION"] = TFVERSION

In [ ]:
%%bash
if ! gsutil ls -r gs://${BUCKET} | grep -q gs://${BUCKET}/babyweight/trained_model/; then
    gsutil mb -l ${REGION} gs://${BUCKET}
    # copy canonical model if you didn't do previous notebook
    gsutil -m cp -R gs://cloud-training-demos/babyweight/trained_model gs://${BUCKET}/babyweight/trained_model
fi

Deploy trained model

Next we'll deploy the trained model to act as a REST web service using a simple gcloud call. To start, we'll check if our model and version already exists and if so, we'll delete them.


In [ ]:
%%bash
MODEL_NAME="babyweight"
MODEL_VERSION="ml_on_gcp"

# Check to see if the model and version already exist, 
# if so, delete them to deploy anew
if gcloud ai-platform models list | grep "$MODEL_NAME \+ $MODEL_VERSION"; then
    echo "Deleting the version '$MODEL_VERSION' of model '$MODEL_NAME'"
    yes | gcloud ai-platform versions delete ${MODEL_VERSION} --model=$MODEL_NAME
    
    echo "Deleting the model '$MODEL_NAME'"
    yes | gcloud ai-platform models delete ${MODEL_NAME}
else 
    echo "The model '$MODEL_NAME' with version '$MODEL_VERSION' does not exist."
fi

We'll now deploy our model. This will take a few minutes. Once the cell below completes, you should be able to see your newly deployed model in the 'Models' portion of the ML Engine section of the GCP console.


In [ ]:
%%bash
MODEL_NAME="babyweight"
MODEL_VERSION="ml_on_gcp"
MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/babyweight/trained_model/export/exporter/ | tail -1)

echo "Deploying the model '$MODEL_NAME', version '$MODEL_VERSION' from $MODEL_LOCATION"
echo "... this will take a few minutes"

gcloud ai-platform models create ${MODEL_NAME} --regions $REGION
gcloud ai-platform versions create ${MODEL_VERSION} \
  --model ${MODEL_NAME} \
  --origin ${MODEL_LOCATION} \
  --runtime-version $TFVERSION

Use the deployed model to make online predictions

To make online predictions, we'll send a JSON request to the endpoint of the service to make it predict a baby's weight. The order of the responses are the order of the instances.


In [ ]:
from oauth2client.client import GoogleCredentials
import requests
import json

MODEL_NAME = "babyweight"
MODEL_VERSION = "ml_on_gcp"

token = GoogleCredentials.get_application_default().get_access_token().access_token
api = "https://ml.googleapis.com/v1/projects/{}/models/{}/versions/{}:predict" \
         .format(PROJECT, MODEL_NAME, MODEL_VERSION)
headers = {"Authorization": "Bearer " + token }
data = {
  "instances": [
    {
      "is_male": "True",
      "mother_age": 26.0,
      "plurality": "Single(1)",
      "gestation_weeks": 39
    },
    {
      "is_male": "False",
      "mother_age": 29.0,
      "plurality": "Single(1)",
      "gestation_weeks": 38
    },
    {
      "is_male": "True",
      "mother_age": 26.0,
      "plurality": "Triplets(3)",
      "gestation_weeks": 39
    },
    {
      "is_male": "Unknown",
      "mother_age": 29.0,
      "plurality": "Multiple(2+)",
      "gestation_weeks": 38
    },
  ]
}
response = requests.post(api, json=data, headers=headers)
print(response.content)

When I ran the cell above, the predictions that I received for the four instances were 7.64, 7.17, 6.24 and 6.13 pounds, respectively. Your results might be different.

Use model for batch prediction

Batch prediction is commonly used when you want to make thousands to millions of predictions at a time. To perform batch prediction we'll create a file with one instance per line and submit the entire prediction job through a gcloud command.

To illustrate this, let's create a file inputs.json which has two instances on which we want to predict.


In [ ]:
%%writefile inputs.json
{"is_male": "True", "mother_age": 26.0, "plurality": "Single(1)", "gestation_weeks": 39}
{"is_male": "False", "mother_age": 26.0, "plurality": "Single(1)", "gestation_weeks": 39}

When making batch predictions, we specify the Google Cloud Storage location of the input json file as well as the locatin to deposit the predictions. The cell below submits a batch prediction job to the cloud. We can monitor the status from the 'Jobs' portion of the ML Engine section of the GCP console. Once the jobs shows that it's completed there, we can examine the predictions uploaded to the OUTPUT location we specify below.


In [ ]:
%%bash
INPUT=gs://${BUCKET}/babyweight/batchpred/inputs.json
OUTPUT=gs://${BUCKET}/babyweight/batchpred/outputs

gsutil cp inputs.json $INPUT
gsutil -m rm -rf $OUTPUT 
gcloud ai-platform jobs submit prediction babypred_$(date -u +%y%m%d_%H%M%S) \
    --data-format=TEXT \
    --region ${REGION} \
    --input-paths=$INPUT \
    --output-path=$OUTPUT \
    --model=babyweight \
    --version=ml_on_gcp

Check the AI Platform jobs submitted to the GCP console to make sure the prediction job has completed, then let's have a look at the results of our predictions.


In [ ]:
!gsutil ls gs://$BUCKET/babyweight/batchpred/outputs

In [ ]:
!gsutil cat gs://$BUCKET/babyweight/batchpred/outputs/prediction.results*

Copyright 2017 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License