Deploying and Making Predictions with a Trained Model

Learning Objectives

Deploy a model on Google CMLE
Make online and batch predictions with a deployed model

Introduction

In this notebook, we will deploy the model we trained to predict birthweight and use that cloud-hosted model to make predictions. Cloud ML Engine provides two ways to get predictions from trained models, online prediction and batch prediction, and we use both in this notebook.

Have a look at this blog post on Online vs Batch Prediction to see the trade-offs of both approaches.

As usual, we start by setting our environment variables to reference our Project and Bucket.



In [ ]:

    
PROJECT = "cloud-training-demos"  # Replace with your PROJECT
BUCKET = "cloud-training-bucket"  # Replace with your BUCKET
REGION = "us-central1"            # Choose an available region for Cloud MLE
TFVERSION = "1.14"                # TF version for CMLE to use



In [ ]:

    
import os
os.environ["BUCKET"] = BUCKET
os.environ["PROJECT"] = PROJECT
os.environ["REGION"] = REGION
os.environ["TFVERSION"] = TFVERSION



In [ ]:

    
%%bash
if ! gsutil ls -r gs://${BUCKET} | grep -q gs://${BUCKET}/babyweight/trained_model/; then
    gsutil mb -l ${REGION} gs://${BUCKET}
    # copy canonical model if you didn't do previous notebook
    gsutil -m cp -R gs://cloud-training-demos/babyweight/trained_model gs://${BUCKET}/babyweight/trained_model
fi

Deploy trained model

Next, we'll deploy the trained model to act as a REST web service using a simple gcloud call. To start, we'll check whether our model and version already exist and, if so, delete them.



In [ ]:

    
%%bash
MODEL_NAME="babyweight"
MODEL_VERSION="ml_on_gcp"

# Check to see if the model and version already exist, 
# if so, delete them to deploy anew
if gcloud ai-platform models list | grep "$MODEL_NAME \+ $MODEL_VERSION"; then
    echo "Deleting the version '$MODEL_VERSION' of model '$MODEL_NAME'"
    yes | gcloud ai-platform versions delete ${MODEL_VERSION} --model=$MODEL_NAME
    
    echo "Deleting the model '$MODEL_NAME'"
    yes | gcloud ai-platform models delete ${MODEL_NAME}
else 
    echo "The model '$MODEL_NAME' with version '$MODEL_VERSION' does not exist."
fi

We'll now deploy our model. This will take a few minutes. Once the cell below completes, you should be able to see your newly deployed model in the 'Models' portion of the AI Platform section of the GCP console.

Let's have a look at the contents of the exporter bucket to see which model binaries we have. We can deploy a model by specifying any of these locations. To make sure we grab the model from the most recent training job, we'll take the last entry of the listing with tail -1.
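The "most recent export" selection can also be sketched in Python. This is only an illustration under stated assumptions: it assumes each export directory is named with a numeric timestamp, and the bucket name and paths in `listing` are hypothetical stand-ins for real `gsutil ls` output. The helper name `latest_export` is made up for this sketch.

```python
import re


def latest_export(paths):
    """Return the export directory with the largest numeric timestamp."""
    stamped = []
    for path in paths:
        path = path.strip()
        # Keep only paths whose last component is all digits (assumed timestamp).
        match = re.search(r"/(\d+)/?$", path)
        if match:
            stamped.append((int(match.group(1)), path))
    if not stamped:
        raise ValueError("no timestamped export directories found")
    # Tuples compare by timestamp first, so max() picks the newest export.
    return max(stamped)[1]


# Hypothetical listing, shaped like the output of `gsutil ls` on the exporter directory.
listing = [
    "gs://my-bucket/babyweight/trained_model/export/exporter/",
    "gs://my-bucket/babyweight/trained_model/export/exporter/1529355466/",
    "gs://my-bucket/babyweight/trained_model/export/exporter/1560000000/",
]
print(latest_export(listing))
```

Comparing the timestamps as integers avoids depending on the order of the listing, which is what `tail -1` relies on.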



In [ ]:

    
%%bash
gsutil ls gs://${BUCKET}/babyweight/trained_model/export/exporter/

Exercise 1

After completing the TODOs in the code cell below, we will be able to deploy our saved model to the cloud and make predictions. There are two TODOs below.

For the first TODO, write a gcloud command to create a model called babyweight.
In the second TODO, write a gcloud command to create a version called ml_on_gcp.

Look up the Cloud AI Platform documentation to remind yourself how to create these commands. You'll need to use the MODEL_NAME, MODEL_VERSION, MODEL_LOCATION, REGION and TFVERSION provided for you.



In [ ]:

    
%%bash
MODEL_NAME="babyweight"
MODEL_VERSION="ml_on_gcp"
MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/babyweight/trained_model/export/exporter/ | tail -1)

echo "Deploying the model '$MODEL_NAME', version '$MODEL_VERSION' from $MODEL_LOCATION"
echo "... this will take a few minutes"

gcloud # TODO: Your code goes here
gcloud # TODO: Your code goes here

Use the deployed model to make online predictions

To make online predictions, we'll send a JSON request to the endpoint of the service to make it predict a baby's weight. The responses are returned in the same order as the instances.
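Because responses come back in instance order, each prediction can be matched to its input by position. The following is a minimal sketch of that pairing, with no network call. The response text is made up, and its shape (a top-level "predictions" list with one entry per instance) is an assumption about the service output; the helper name `pair_predictions` is hypothetical.

```python
import json


def pair_predictions(instances, response_text):
    """Match each instance with the prediction at the same position."""
    predictions = json.loads(response_text)["predictions"]
    if len(predictions) != len(instances):
        raise ValueError("expected one prediction per instance")
    return list(zip(instances, predictions))


instances = [
    {"is_male": "True", "mother_age": 26.0, "plurality": "Single(1)", "gestation_weeks": 39},
    {"is_male": "False", "mother_age": 29.0, "plurality": "Single(1)", "gestation_weeks": 38},
]

# The request body wraps the instances in a single JSON object.
body = json.dumps({"instances": instances})

# Made-up response text; real values and schema depend on the deployed model.
fake_response = '{"predictions": [{"predictions": [7.5]}, {"predictions": [7.0]}]}'

for instance, prediction in pair_predictions(instances, fake_response):
    print(instance["is_male"], prediction)
```

The length check is a cheap guard: if the counts differ, positional matching would silently mislabel predictions.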

Exercise 2

In the cell below, we'll make online predictions with the model we just deployed. To do that, we need an access token and the URL of the prediction API so that we can build the POST request at the bottom of the cell. Complete the TODOs below. You will need to:

Specify the correct MODEL_NAME and MODEL_VERSION we want to use for prediction
Use GoogleCredentials library to create an access token
Create a variable called api which specifies the Google prediction API using the Project, model name, and model version

Add an additional instance for prediction with the following properties

'is_male': 'Unknown',
'mother_age': 29.0,
'plurality': 'Multiple(2+)',
'gestation_weeks': 38

Create a variable called response which posts a request to our model API to make a prediction



In [ ]:

    
from oauth2client.client import GoogleCredentials
import requests
import json

MODEL_NAME = # TODO: Your code goes here
MODEL_VERSION = # TODO: Your code goes here

token = # TODO: Your code goes here
api = # TODO: Your code goes here

headers = {"Authorization": "Bearer " + token }
data = {
  "instances": [
    {
      "is_male": "True",
      "mother_age": 26.0,
      "plurality": "Single(1)",
      "gestation_weeks": 39
    },
    {
      "is_male": "False",
      "mother_age": 29.0,
      "plurality": "Single(1)",
      "gestation_weeks": 38
    },
    {
      "is_male": "True",
      "mother_age": 26.0,
      "plurality": "Triplets(3)",
      "gestation_weeks": 39
    },
    # TODO: Your code goes here
  ]
}
response = # TODO: Your code goes here
print(response.content)

Use model for batch prediction

Batch prediction is commonly used when you want to make thousands to millions of predictions at a time. To perform batch prediction we'll create a file with one instance per line and submit the entire prediction job through a gcloud command.
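The "one instance per line" layout is newline-delimited JSON. As a rough sketch of how such content could be produced from Python objects rather than typed by hand, the hypothetical helper `to_json_lines` below serializes each instance onto its own line. It only builds a string and does not touch Cloud Storage.

```python
import json


def to_json_lines(instances):
    """Serialize instances as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(instance) + "\n" for instance in instances)


instances = [
    {"is_male": "True", "mother_age": 26.0, "plurality": "Single(1)", "gestation_weeks": 39},
    {"is_male": "False", "mother_age": 26.0, "plurality": "Single(1)", "gestation_weeks": 39},
]

text = to_json_lines(instances)
print(text, end="")
```

Note that the lines are not wrapped in a JSON list and have no separating commas; each line must parse as a complete object on its own.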

To illustrate this, let's create a file inputs.json which has two instances on which we want to predict.



In [ ]:

    
%%writefile inputs.json
{"is_male": "True", "mother_age": 26.0, "plurality": "Single(1)", "gestation_weeks": 39}
{"is_male": "False", "mother_age": 26.0, "plurality": "Single(1)", "gestation_weeks": 39}

Exercise 3

In the cells below, we'll write the inputs.json file we just created to our Cloud Storage bucket, then submit a batch prediction job to the cloud pointing at that file. We'll also need to specify the output location in GCS where we'd like the final predictions to be deposited. In the TODOs below, you will need to:

Use gsutil to copy the inputs.json file to the location specified by INPUT
Use gsutil to clear out the directory specified by OUTPUT. This ensures that the only things in that location are our predictions
Complete the gcloud command to submit a batch prediction job.
Specify the values of all the arguments for the gcloud command

Have a look at the documentation for submitting batch predictions via gcloud to remind yourself of the format.



In [ ]:

    
%%bash
INPUT=gs://${BUCKET}/babyweight/batchpred/inputs.json
OUTPUT=gs://${BUCKET}/babyweight/batchpred/outputs

gsutil # TODO: Your code goes here
gsutil # TODO: Your code goes here

gcloud ai-platform # TODO: Your code goes here
    --data-format= # TODO: Your code goes here
    --region= # TODO: Your code goes here
    --input-paths= # TODO: Your code goes here
    --output-path= # TODO: Your code goes here
    --model= # TODO: Your code goes here
    --version= # TODO: Your code goes here

Check the ML Engine jobs in the GCP console to make sure the prediction job has completed, then let's have a look at the results of our predictions.



In [ ]:

    
!gsutil ls gs://$BUCKET/babyweight/batchpred/outputs



In [ ]:

    
!gsutil cat gs://$BUCKET/babyweight/batchpred/outputs/prediction.results*
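Once the results have been printed, they can be post-processed in Python. The sketch below assumes the results file holds one JSON object per line, each with a "predictions" list. That shape is an assumption, the sample text is made up, and the helper name `parse_results` is hypothetical, so check it against the actual output of the cell above.

```python
import json


def parse_results(text):
    """Parse newline-delimited JSON results, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up sample; real files contain whatever the deployed model returns.
sample = '{"predictions": [7.5]}\n{"predictions": [7.25]}\n'

rows = parse_results(sample)
# Assumed layout: the predicted weight is the first element of "predictions".
weights = [row["predictions"][0] for row in rows]
print(weights)
```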

Copyright 2017 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License