In [ ]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
This tutorial shows how to train a Keras model on tabular data and deploy it to the AI Explanations service to get feature attributions on your deployed model.
If you've already got a trained model and want to deploy it to AI Explanations, skip to the Export the model as a TF 1 SavedModel section.
The dataset used for this tutorial was created by combining two BigQuery Public Datasets: London Bikeshare data and NOAA weather data.
The goal is to train a model using the Keras Sequential API that predicts how long a bike trip took based on the trip start time, distance, day of week, and various weather data during that day.
This tutorial focuses more on deploying the model to AI Explanations than on the design of the model itself.
This tutorial uses billable components of Google Cloud Platform (GCP):
Learn about AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
This tutorial assumes you are running the notebook either in Colab or Cloud AI Platform Notebooks.
The following steps are required, regardless of your notebook environment.
Enable the AI Platform Training & Prediction and Compute Engine APIs.
Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.
Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.
In [ ]:
PROJECT_ID = "<your-project-id>"
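As a quick, purely illustrative check of the shell behavior described in the note above, you can echo the Python variable straight into a shell command:
In [ ]:
# Illustrative only: lines starting with ! run as shell commands, and
# $PROJECT_ID is interpolated from the Python variable set above.
!echo "Project ID set to: $PROJECT_ID"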
If you are using Colab, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.
In [ ]:
import sys, os
import warnings

import googleapiclient.discovery

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# If you are running this notebook in Colab, follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

def install_dlvm_packages():
    !pip install tabulate

if 'google.colab' in sys.modules:
    from google.colab import auth as google_auth
    google_auth.authenticate_user()
    !pip install witwidget --quiet
    !pip install tensorflow==1.15.0 --quiet
    !gcloud config set project $PROJECT_ID
elif "DL_PATH" in os.environ:
    install_dlvm_packages()
The following steps are required, regardless of your notebook environment.
An AI Platform model version serves predictions from a SavedModel stored in Cloud Storage. In this tutorial, you train the model in this notebook, export the resulting SavedModel (along with its explanation metadata) to a Cloud Storage bucket, and then create an AI Platform model version based on that output in order to serve online predictions and explanations.
Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets.
You may also change the REGION variable, which is used for operations throughout the rest of this notebook. Make sure to choose a region where Cloud AI Platform services are available. You may not use a Multi-Regional Storage bucket for training with AI Platform.
In [ ]:
BUCKET_NAME = "<your-bucket-name>"
REGION = "us-central1"
Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.
In [ ]:
!gsutil mb -l $REGION gs://$BUCKET_NAME
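Optionally, confirm the bucket was created in the region you chose; the -L -b flags print the bucket's metadata, including its location:
In [ ]:
# Optional check: show the bucket's metadata, including its Location constraint
!gsutil ls -L -b gs://$BUCKET_NAME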
In [ ]:
import tensorflow as tf
import pandas as pd
import numpy as np
import json
import time
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import MinMaxScaler
from tabulate import tabulate
# Should be 1.15.0
print(tf.__version__)
In this section you'll download the data to train your model from a public GCS bucket. The original data is from the BigQuery datasets linked above. For your convenience, we've joined the London bike and NOAA weather tables, done some preprocessing, and provided a subset of that dataset here.
In [ ]:
# Copy the data to your notebook instance
!gsutil cp 'gs://explanations_sample_data/bike-data.csv' ./
In [ ]:
data = pd.read_csv('bike-data.csv')

# Shuffle the data
data = data.sample(frac=1, random_state=2)

# Drop rows where the weather columns contain missing-value sentinels
data = data[data['wdsp'] != 999.9]
data = data[data['dewp'] != 9999.9]

# Rename some columns for readability
data = data.rename(columns={'day_of_week': 'weekday', 'max': 'max_temp', 'dewp': 'dew_point'})

# Drop columns we won't use to train this model
data = data.drop(columns=['start_station_name', 'end_station_name', 'bike_id', 'snow_ice_pellets'])

# Convert trip duration from seconds to minutes so it's easier to understand
data['duration'] = data['duration'].apply(lambda x: float(x / 60))
In [ ]:
# Preview the first 5 rows
data.head()
In [ ]:
# Save duration to its own DataFrame and remove it from the original DataFrame
labels = data['duration']
data = data.drop(columns=['duration'])
In [ ]:
# Use 80/20 train/test split
train_size = int(len(data) * .8)
print ("Train size: %d" % train_size)
print ("Test size: %d" % (len(data) - train_size))
# Split our data into train and test sets
train_data = data[:train_size]
train_labels = labels[:train_size]
test_data = data[train_size:]
test_labels = labels[train_size:]
In [ ]:
# Build our model
model = tf.keras.Sequential(name="bike_predict")
model.add(tf.keras.layers.Dense(64, input_dim=len(train_data.iloc[0]), activation='relu'))
model.add(tf.keras.layers.Dense(32, activation='relu'))
model.add(tf.keras.layers.Dense(1))
In [ ]:
# Compile the model and see a summary
optimizer = tf.keras.optimizers.Adam(0.001)
model.compile(loss='mean_squared_logarithmic_error', optimizer=optimizer)
model.summary()
In [ ]:
batch_size = 256
epochs = 3
# Build a tf.data input pipeline: batch the features and labels separately,
# repeat them so we can train for multiple epochs, then zip them into
# (features, labels) pairs
input_train = tf.data.Dataset.from_tensor_slices(train_data)
output_train = tf.data.Dataset.from_tensor_slices(train_labels)
input_train = input_train.batch(batch_size).repeat()
output_train = output_train.batch(batch_size).repeat()
train_dataset = tf.data.Dataset.zip((input_train, output_train))
In [ ]:
# This will take about a minute to run
# To keep training time short, we're not using the full dataset
model.fit(train_dataset, steps_per_epoch=train_size // batch_size, epochs=epochs)
In [ ]:
# Run evaluation
results = model.evaluate(test_data, test_labels)
print(results)
In [ ]:
# Send test instances to model for prediction
predict = model.predict(test_data[:5])
In [ ]:
# Preview predictions on the first 5 examples from our test dataset
for i, val in enumerate(predict):
    print('Predicted duration: {}'.format(round(val[0])))
    print('Actual duration: {} \n'.format(test_labels.iloc[i]))
AI Explanations currently supports TensorFlow 1.x. In order to deploy our model in a format compatible with AI Explanations, we'll follow the steps below to convert our Keras model to a TF Estimator, and then use the export_saved_model method to generate the SavedModel and save it to Cloud Storage.
In [ ]:
## Convert our Keras model to an estimator
keras_estimator = tf.keras.estimator.model_to_estimator(keras_model=model, model_dir='savedmodel_export')
In [ ]:
# We need this serving input function to export our model in the next cell
serving_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(
{'dense_input': model.input}
)
In [ ]:
export_path = keras_estimator.export_saved_model(
'gs://' + BUCKET_NAME + '/explanations',
serving_input_receiver_fn=serving_fn
).decode('utf-8')
print(export_path)
Use TensorFlow's saved_model_cli to inspect the model's SignatureDef. We'll use this information when we deploy our model to AI Explanations in the next section.
In [ ]:
!saved_model_cli show --dir $export_path --all
We need to tell AI Explanations the names of the input and output tensors our model is expecting, which we print below.
The value for input_baselines tells the explanations service what the baseline input should be for our model. Here we're using the median for all of our input features. That means the baseline prediction for this model will be the trip duration our model predicts for the median of each feature in our dataset.
Since this model accepts a single numpy array with all numerical features, we can optionally pass an index_feature_mapping list to AI Explanations to make the API response easier to parse. When we provide a list of feature names via this parameter, the service will return a key / value mapping of each feature with its corresponding attribution value.
In [ ]:
# Print the names of our tensors
print('Model input tensor: ', model.input.name)
print('Model output tensor: ', model.output.name)
In [ ]:
explanation_metadata = {
    "inputs": {
        "data": {
            "input_tensor_name": model.input.name,
            "input_baselines": [train_data.median().values.tolist()],
            "encoding": "bag_of_features",
            "index_feature_mapping": train_data.columns.tolist()
        }
    },
    "outputs": {
        "duration": {
            "output_tensor_name": model.output.name
        }
    },
    "framework": "tensorflow"
}
Since this is a regression model (predicting a numerical value), the baseline prediction will be the same for every example we send to the model. If this were instead a classification model, each class would have a different baseline prediction.
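As an optional, purely illustrative check, you can approximate that baseline locally by predicting with the in-notebook Keras model on the median feature values; the result should be close to the baseline_score the service returns later.
In [ ]:
# Optional, illustrative check: the explanation baseline for this model is its
# prediction on the median of each training feature
baseline_input = train_data.median().values.reshape(1, -1)
print('Local baseline prediction (minutes):', model.predict(baseline_input)[0][0])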
In [ ]:
# Write the json to a local file
with open('explanation_metadata.json', 'w') as output_file:
    json.dump(explanation_metadata, output_file)
In [ ]:
!gsutil cp explanation_metadata.json $export_path
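Optionally, list the export directory to confirm that explanation_metadata.json now sits alongside the exported SavedModel:
In [ ]:
# Optional check: the explanation metadata should be in the same Cloud Storage
# directory as the exported SavedModel
!gsutil ls $export_path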
In [ ]:
MODEL = 'bike'
In [ ]:
# Create the model if it doesn't exist yet (you only need to run this once)
!gcloud ai-platform models create $MODEL --enable-logging --regions=us-central1
In [ ]:
# Each time you create a version the name should be unique
VERSION = 'v1'
In [ ]:
# Create the version with gcloud
explain_method = 'integrated-gradients'
!gcloud beta ai-platform versions create $VERSION \
--model $MODEL \
--origin $export_path \
--runtime-version 1.15 \
--framework TENSORFLOW \
--python-version 3.7 \
--machine-type n1-standard-4 \
--explanation-method $explain_method \
--num-integral-steps 25
In [ ]:
# Make sure the model deployed correctly. State should be `READY` in the following log
!gcloud ai-platform versions describe $VERSION --model $MODEL
Now that your model is deployed, you can use the AI Platform Prediction API to get feature attributions. We'll pass it a single test example and see which features were most important in the model's prediction. We'll use the Python client library here; you can also use gcloud (a rough CLI sketch follows the next cell).
In [ ]:
# Format data for prediction to our model
prediction_json = {'dense_input': test_data.iloc[0].values.tolist()}
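If you prefer the command line over the Python client, the sketch below is a rough equivalent: it writes the single instance to a newline-delimited JSON file and calls the gcloud beta explain command. Check gcloud beta ai-platform explain --help for the flags available in your SDK version.
In [ ]:
# Rough CLI alternative (optional sketch). Writes one newline-delimited JSON
# instance to a local file, then requests an explanation with gcloud.
# Verify the flags against your installed SDK version.
with open('instance.json', 'w') as f:
    json.dump(prediction_json, f)

!gcloud beta ai-platform explain --model $MODEL --version $VERSION --json-instances instance.json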
In [ ]:
# This is adapted from a sample in the docs
# Find it here: https://cloud.google.com/ai-platform/prediction/docs/online-predict#python
def predict_json(project, model, instances, version=None):
    """Send JSON data to a deployed model for prediction and explanation.

    Args:
        project (str): project where the AI Platform Model is deployed.
        model (str): model name.
        instances ([Mapping[str: Any]]): Keys should be the names of Tensors
            your deployed model expects as inputs. Values should be datatypes
            convertible to Tensors, or (potentially nested) lists of datatypes
            convertible to tensors.
        version: str, version of the model to target.
    Returns:
        Mapping[str: any]: dictionary of prediction and explanation results
            defined by the model.
    """
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)

    if version is not None:
        name += '/versions/{}'.format(version)

    # Call the explain method, which returns predictions and attributions
    response = service.projects().explain(
        name=name,
        body={'instances': instances}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])

    return response
In [ ]:
response = predict_json(PROJECT_ID, MODEL, prediction_json, VERSION)
print(response)
In [ ]:
explanations = response['explanations'][0]['attributions_by_label'][0]
predicted = round(explanations['example_score'], 2)
print('Predicted duration: ' + str(predicted) + ' minutes')
print('Actual duration: ' + str(test_labels.iloc[0]) + ' minutes')
Next let's look at the feature attributions for this particular example. Positive attribution values mean a particular feature pushed our model prediction up by that amount, and vice versa for negative attribution values.
In [ ]:
feature_names = test_data.columns.tolist()
attributions = explanations['attributions']
rows = []
for i, val in enumerate(feature_names):
    rows.append([val, test_data.iloc[0].tolist()[i], attributions[val][0]])
print(tabulate(rows, headers=['Feature name', 'Feature value', 'Attribution value']))
To better make sense of the feature attributions we're getting, we should compare them with our model's baseline. In most cases, the sum of your attribution values plus the baseline should be very close to your model's predicted value for each input. Also note that for regression models, the baseline_score returned from AI Explanations will be the same for each example sent to your model. For classification models, each class will have its own baseline.
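As a quick illustration of that relationship, using the single-example response parsed in the previous cells (the sanity_check_explanations function below performs the same comparison more carefully):
In [ ]:
# Quick illustration with the single example from above:
# baseline_score + sum(attributions) should be close to example_score
attr_sum = np.sum([v[0] for v in explanations['attributions'].values()])
print('Baseline + attributions:', explanations['baseline_score'] + attr_sum)
print('Predicted value:        ', explanations['example_score'])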
In this section we'll send 10 test examples to our model for prediction in order to compare the feature attributions with the baseline. Then we'll run each test example's attributions through two sanity checks in the sanity_check_explanations method.
In [ ]:
# Prepare 10 test examples to send to our model for prediction
pred_batch = []
for i in range(10):
    pred_batch.append({'dense_input': test_data.iloc[i].values.tolist()})
In [ ]:
# Make the request using the method we defined above
batch_explain = predict_json(PROJECT_ID, MODEL, pred_batch, VERSION)
In the function below we perform two sanity checks for models using Integrated Gradient (IG) explanations and one sanity check for models using Sampled Shapley.
In [ ]:
def sanity_check_explanations(example, mean_tgt_value=None, variance_tgt_value=None):
    passed_test = 0
    total_test = 1

    # `attributions` is a dict where keys are the feature names
    # and values are the feature attributions for each feature
    attribution_vals = [x[0] for x in example['attributions_by_label'][0]['attributions'].values()]
    baseline_score = example['attributions_by_label'][0]['baseline_score']
    sum_with_baseline = np.sum(attribution_vals) + baseline_score
    predicted_val = example['attributions_by_label'][0]['example_score']

    # Sanity check 1
    # The prediction at the input should differ from the prediction at the baseline.
    # If they are too close, use a different baseline; some suggestions are a random
    # input or the training-set mean.
    if abs(predicted_val - baseline_score) <= 0.05:
        print('Warning: example score and baseline score are too close.')
        print('You might not get attributions.')
    else:
        passed_test += 1

    # Sanity check 2 (only for models using Integrated Gradient explanations)
    # Ideally, the sum of the integrated gradients must be equal to the difference
    # in the prediction probability at the input and baseline. Any discrepancy in
    # these two values is due to the errors in approximating the integral.
    if explain_method == 'integrated-gradients':
        total_test += 1
        want_integral = predicted_val - baseline_score
        got_integral = sum(attribution_vals)
        if abs(want_integral - got_integral) / abs(want_integral) > 0.05:
            print('Warning: Integral approximation error exceeds 5%.')
            print('Please try increasing the number of integrated gradient steps.')
        else:
            passed_test += 1

    print(passed_test, ' out of ', total_test, ' sanity checks passed.')
In [ ]:
for i in batch_explain['explanations']:
    sanity_check_explanations(i)
In this section we'll use the What-If Tool to better understand how our model is making predictions. See the cell below the What-If Tool for visualization ideas.
The What-If Tool expects data with keys for each feature name, but our model expects a flat list. The functions below convert data to the format required by the What-If Tool.
In [ ]:
# This is the number of data points we'll send to the What-If Tool
WHAT_IF_TOOL_SIZE = 500

from witwidget.notebook.visualization import WitWidget, WitConfigBuilder
from collections import OrderedDict

def create_list(ex_dict):
    new_list = []
    for i in feature_names:
        new_list.append(ex_dict[i])
    return new_list

def example_dict_to_input(example_dict):
    return {'dense_input': create_list(example_dict)}

wit_data = test_data.iloc[:WHAT_IF_TOOL_SIZE].copy()
wit_data['duration'] = test_labels[:WHAT_IF_TOOL_SIZE]
wit_data_dict = wit_data.to_dict(orient='records', into=OrderedDict)
In [ ]:
config_builder = WitConfigBuilder(
wit_data_dict
).set_ai_platform_model(
PROJECT_ID,
MODEL,
VERSION,
adjust_example=example_dict_to_input
).set_target_feature('duration').set_model_type('regression')
WitWidget(config_builder)
On the x-axis, you'll see the predicted trip duration for the test inputs you passed to the What-If Tool. Each circle represents one of your test examples. If you click on a circle, you'll be able to see the feature values for that example along with the attribution values for each feature.
Try changing the value of a feature, such as distance, then click Run inference and see how that affects the model's prediction.
To clean up all GCP resources used in this project, you can delete the GCP project you used for the tutorial.
Alternatively, you can clean up individual resources by running the following commands:
In [ ]:
# Delete the model version resource
!gcloud ai-platform versions delete $VERSION --quiet --model $MODEL

# Delete the model resource
!gcloud ai-platform models delete $MODEL --quiet

# Delete the Cloud Storage objects that were created under the export path
!gsutil -m rm -r gs://$BUCKET_NAME/explanations
If your Cloud Storage bucket doesn't contain any other objects and you would like to delete it, run gsutil rm -r gs://$BUCKET_NAME.