By deploying or using this software you agree to comply with the AI Hub Terms of Service and the Google APIs Terms of Service. To the extent of a direct conflict of terms, the AI Hub Terms of Service will control.

Overview

This notebook provides an example workflow for using the PCA ML container to train a dimensionality-reduction (principal component analysis) model.

Dataset

The notebook uses the Iris dataset. It consists of the sepal and petal measurements of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x5 table (four feature columns plus a species label). Since this algorithm is unsupervised, we disregard the labels.
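
As a quick orientation, the snippet below loads the dataset with scikit-learn and confirms its shape (a minimal sketch; the full data-preparation code appears in the "Create a dataset" section of this notebook):

from sklearn import datasets

# Load the Iris dataset: 150 samples, 4 feature columns, 1 label column.
iris = datasets.load_iris()
print(iris.data.shape)    # (150, 4) sepal/petal measurements
print(iris.target.shape)  # (150,) species labels, ignored by PCA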

Objective

The goal of this notebook is to go through a common training workflow:

  • Create a dataset
  • Train an ML model using the AI Platform Training service
  • Monitor the training job with TensorBoard
  • Identify if the model was trained successfully by looking at the generated "Run Report"
  • Deploy the model for serving using the AI Platform Prediction service
  • Use the endpoint for online predictions
  • Interactively inspect the deployed ML model with the What-If Tool

Costs

This tutorial uses billable components of Google Cloud Platform (GCP):

  • Cloud AI Platform
  • Cloud Storage

Learn about Cloud AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.

Set up your local development environment

If you are using Colab or AI Platform Notebooks, your environment already meets all the requirements to run this notebook. You can skip this step.

Otherwise, make sure your environment meets this notebook's requirements. You need the following:

  • The Google Cloud SDK
  • Git
  • Python 3
  • virtualenv
  • Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to Setting up a Python development environment and the Jupyter installation guide provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:

  1. Install and initialize the Cloud SDK.

  2. Install Python 3.

  3. Install virtualenv and create a virtual environment that uses Python 3.

  4. Activate that environment and run pip install jupyter in a shell to install Jupyter.

  5. Run jupyter notebook in a shell to launch Jupyter.

  6. Open this notebook in the Jupyter Notebook Dashboard.

Set up your GCP project

The following steps are required, regardless of your notebook environment.

  1. Select or create a GCP project. When you first create an account, you get a $300 free credit towards your compute/storage costs.

  2. Make sure that billing is enabled for your project.

  3. Enable the AI Platform APIs and Compute Engine APIs.

  4. Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.

Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.


In [ ]:
PROJECT_ID = "[your-project-id]" #@param {type:"string"}
! gcloud config set project $PROJECT_ID

Authenticate your GCP account

If you are using AI Platform Notebooks, your environment is already authenticated. Skip this step.

If you are using Colab, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.

Otherwise, follow these steps:

  1. In the GCP Console, go to the Create service account key page.

  2. From the Service account drop-down list, select New service account.

  3. In the Service account name field, enter a name.

  4. From the Role drop-down list, select Machine Learning Engine > AI Platform Admin and Storage > Storage Object Admin.

  5. Click Create. A JSON file that contains your key downloads to your local environment.

  6. Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.


In [ ]:
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

if 'google.colab' in sys.modules:
  from google.colab import auth as google_auth
  google_auth.authenticate_user()

# If you are running this notebook locally, replace the string below with the
# path to your service account key and run this cell to authenticate your GCP
# account.
else:
  %env GOOGLE_APPLICATION_CREDENTIALS ''

Create a Cloud Storage bucket

The following steps are required, regardless of your notebook environment.

You need a "workspace" bucket that will hold the dataset and the output from the ML container. Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets.

You may also change the REGION variable, which is used for operations throughout the rest of this notebook. Make sure to choose a region where Cloud AI Platform services are available. You may not use a Multi-Regional Storage bucket for training with AI Platform.


In [ ]:
BUCKET_NAME = "[your-bucket-name]" #@param {type:"string"}
REGION = 'us-central1' #@param {type:"string"}

Only if your bucket doesn't already exist: Run the following cell to create your Cloud Storage bucket.


In [ ]:
! gsutil mb -l $REGION gs://$BUCKET_NAME

Finally, validate access to your Cloud Storage bucket by examining its contents:


In [ ]:
! gsutil ls -al gs://$BUCKET_NAME

Install packages and dependencies with pip


In [ ]:
! pip install witwidget

Import libraries and define constants


In [ ]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import time
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from IPython.core.display import HTML
from googleapiclient import discovery

Create a dataset


In [4]:
# load Iris dataset
iris = datasets.load_iris()
names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
data = pd.DataFrame(iris.data, columns=names)

# split
training, validation = train_test_split(data, test_size=50)

# standardization 
data_mean = training.mean(axis=0)
data_std = training.std(axis=0)
training = (training - data_mean) / data_std
validation = (validation - data_mean) / data_std

print('Training data head')
display(training.head())

training_data = os.path.join('gs://', BUCKET_NAME, 'data/train.csv')
validation_data = os.path.join('gs://', BUCKET_NAME, 'data/valid.csv')

print('Copying the data to the bucket ...')
with tf.io.gfile.GFile(training_data, 'w') as f:
  training.to_csv(f, index=False)
with tf.io.gfile.GFile(validation_data, 'w') as f:
  validation.to_csv(f, index=False)


Training data head
sepal_length sepal_width petal_length petal_width
116 0.862523 -0.061862 0.970659 0.741731
148 0.478611 0.821882 0.911724 1.412374
30 -1.312980 0.159074 -1.327805 -1.404326
118 2.398172 -0.945606 1.795748 1.412374
37 -1.185009 1.263754 -1.445675 -1.538455
Copying the data to the bucket ...
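
As an optional sanity check, the snippet below confirms that both CSV files were written to the bucket (a minimal sketch using the tf.io.gfile API already imported above):

# Verify that the training and validation CSVs exist in the bucket.
for path in (training_data, validation_data):
  print(path, 'exists:', tf.io.gfile.exists(path))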

Cloud training

Accelerator and distribution support

GPU   Multi-GPU Node   TPU   Workers   Parameter Server
Yes   No               No    Yes       Yes

To add distribution and/or accelerators to your AI Platform training job, use parameters similar to the example shown below.

--master-machine-type standard_gpu
--worker-machine-type standard_gpu
--worker-count 2

AI Platform training


In [ ]:
output_location = os.path.join('gs://', BUCKET_NAME, 'output')

job_name = "pca_{}".format(time.strftime("%Y%m%d%H%M%S"))
!gcloud ai-platform jobs submit training $job_name \
    --master-image-uri gcr.io/aihub-c2t-containers/kfp-components/oob_algorithm/pca:latest \
    --region $REGION \
    --scale-tier CUSTOM \
    --master-machine-type standard \
    -- \
    --output-location $output_location \
    --training-data $training_data \
    --validation-data $validation_data \
    --data-type csv \
    --number-components 2
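
The job submission above returns immediately. You can optionally poll the job state through the AI Platform REST API (a minimal sketch; gcloud ai-platform jobs describe gives the same information from the shell):

from googleapiclient import discovery

# Look up the current state of the submitted training job.
ml = discovery.build('ml', 'v1')
job_id = 'projects/{}/jobs/{}'.format(PROJECT_ID, job_name)
job = ml.projects().jobs().get(name=job_id).execute()
print('Job state:', job['state'])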

Local training snippet

Note that the training can also be run locally with Docker:

docker run \
    -v /tmp:/tmp \
    -it gcr.io/aihub-c2t-containers/kfp-components/oob_algorithm/pca:latest \
    --output-location /tmp/pca \
    --training-data /tmp/train.csv \
    --validation-data /tmp/valid.csv \
    --data-type csv \
    --number-components 2

Monitor the training with TensorBoard


In [ ]:
# Use the TensorBoard notebook extension when available; otherwise fall back
# to launching TensorBoard from the shell.
try:
  %load_ext tensorboard
  %tensorboard --logdir {output_location}
except Exception:
  !tensorboard --logdir {output_location}

Inspect the Run Report

The "Run Report" will help you identify if the model was successfully trained.


In [5]:
if not tf.io.gfile.exists(os.path.join(output_location, 'report.html')):
  raise RuntimeError('The file report.html was not found. Did the training job finish?')

with tf.io.gfile.GFile(os.path.join(output_location, 'report.html')) as f:
  display(HTML(f.read()))



Runtime arguments

argument                           value
training_data                      gs://aihub-content-test/pca/data/train.csv
validation_data                    gs://aihub-content-test/pca/data/valid.csv
number_components                  2
output_location                    gs://aihub-content-test/pca/output
data_type                          csv
fresh_start                        False
projection_dimension               0
projection_type                    sparse
projection_sparsity_coefficient    1
batch_size                         64
sparse_data_flag                   False
training_steps                     -1
num_gpus                           0
remainder                          None

Tensorboard snippet

To see the training progress, you may need to install the latest TensorBoard with the command pip install -U tensorboard and then run one of the following commands.

Local tensorboard

tensorboard --logdir gs://aihub-content-test/pca/output

Publicly shared tensorboard

tensorboard dev upload --logdir gs://aihub-content-test/pca/output

Datasets

Data reading snippet

import tensorflow as tf
import pandas as pd

# Maximum number of rows to read into the sample.
sample_size = 100

sample = pd.DataFrame()
for filename in tf.io.gfile.glob('gs://aihub-content-test/pca/data/valid.csv'):
  with tf.io.gfile.GFile(filename, 'r') as f:
    sample = sample.append(
      pd.read_csv(f, nrows=sample_size - len(sample)))
  if len(sample) >= sample_size:
    break

Training dataset sample

sepal_length sepal_width petal_length petal_width
0 0.7053 0.1411 0.9918 0.7776
1 0.5839 -1.8053 0.3677 0.1263
... ... ... ... ...
98 2.2835 1.8443 1.6727 1.2987
99 0.5839 -1.3187 0.6514 0.3869

100 rows × 4 columns

Validation dataset sample

sepal_length sepal_width petal_length petal_width
0 0.4625 -0.3455 0.3109 0.1263
1 0.5839 0.6277 0.5379 0.5171
... ... ... ... ...
48 -0.1445 3.3041 -1.2778 -1.0460
49 -0.8729 1.8443 -1.2211 -1.3065

50 rows × 4 columns

Dataset inspection

You can use AI Platform to create a detailed inspection report for your dataset with the following console snippet:

DATA=gs://aihub-content-test/pca/data/valid.csv
#DATA=gs://aihub-content-test/pca/data/train.csv
OUTPUT_LOCATION=gs://aihub-content-test/pca/output
# can be one of: tfrecord, parquet, avro, csv, json, bigquery
DATA_TYPE=csv
MAX_SAMPLE_SIZE=10000
JOB_NAME=tabular_data_inspection_$(date '+%Y%m%d_%H%M%S')

gcloud ai-platform jobs submit training $JOB_NAME \
  --stream-logs \
  --master-image-uri gcr.io/kf-pipeline-contrib/kfp-components/oob_algorithm/tabular_data_inspection:latest \
  -- \
  --output-location $OUTPUT_LOCATION \
  --data $DATA \
  --data-type $DATA_TYPE \
  --max-sample-size $MAX_SAMPLE_SIZE

Predictions

Local predictions snippet

import tensorflow as tf

# Example input with illustrative values; the serving signature expects a dict
# of feature name -> batched values: {f1: [[1], [2]], f2: [[4, 2], [3, 1]], ...}
estimator_input = {'sepal_length': [[0.46]], 'sepal_width': [[-0.35]],
                   'petal_length': [[0.31]], 'petal_width': [[0.13]]}

saved_model = 'gs://aihub-content-test/pca/output/export/1581023709'
predict_fn = tf.contrib.predictor.from_saved_model(saved_model)  # TF 1.x API
predictions = predict_fn(estimator_input)

Deploy for serving snippet

MODEL_NAME='REPLACE_WITH_YOUR_MODEL_NAME'
MODEL_VERSION='v1'

# create the model
gcloud ai-platform models create $MODEL_NAME

# create a model version
gcloud ai-platform versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --origin gs://aihub-content-test/pca/output/export/1581023709 \
  --runtime-version=1.15 \
  --framework=tensorflow \
  --python-version=3.7

Training transformation sample

result
0 1
0 1.2770 0.5201
1 1.0897 -1.4078
... ... ...
98 2.2382 2.7737
99 1.2498 -0.9211

100 rows × 2 columns

Validation transformation sample

result
0 1
0 0.5454 -0.1206
1 0.6567 0.8701
... ... ...
48 -2.4631 2.7874
49 -2.4948 1.1447

50 rows × 2 columns

[Figure: scatter plot of the first two principal components]

Prediction tables

Training data and transformation sample

Predictions (result columns 0 and 1) with the corresponding input data features:

index  result[0]  result[1]  sepal_length  sepal_width  petal_length  petal_width
26 1.1482 -0.2591 0.4625 -0.5888 0.5947 0.7776
86 0.6685 -0.0272 0.7053 -0.3455 0.3109 0.1263
2 1.4993 0.8095 0.7053 0.3844 0.8784 1.4289
55 0.2998 -1.1274 -0.3873 -1.0754 0.3677 -0.0039
75 0.8904 0.0737 0.9481 -0.3455 0.4812 0.1263
93 0.8221 0.3057 0.9481 -0.1022 0.3677 0.2566
16 1.2592 1.1641 0.4625 0.8710 0.9351 1.4289
73 -2.1993 0.2954 -0.7515 0.8710 -1.3346 -1.3065
54 0.0002 -2.7486 -0.9943 -2.5352 -0.1430 -0.2644
95 0.8087 0.5751 1.0695 0.1411 0.3677 0.2566
53 -2.1987 -0.5246 -1.1157 0.1411 -1.2778 -1.4367
92 0.4016 -1.5561 -1.1157 -1.3187 0.4244 0.6474
78 0.1892 -0.2377 -0.5087 -0.1022 0.4244 0.3869
13 -2.3107 -0.6503 -1.4799 0.1411 -1.2778 -1.3065
7 0.5803 0.0773 0.3411 -0.1022 0.4812 0.2566
30 1.1485 -0.6466 -0.0231 -0.8321 0.7649 0.9079
22 -2.7638 1.8427 -0.7515 2.5742 -1.2778 -1.4367
24 1.2270 -0.0363 1.1909 -0.5888 0.5947 0.2566
33 -2.3900 -0.3860 -1.3585 0.3844 -1.3913 -1.3065
8 1.7387 1.0319 0.5839 0.6277 1.2756 1.6894
43 1.4367 -0.3522 -0.0231 -0.5888 0.7649 1.5592
62 1.9304 0.7487 1.0695 0.1411 1.0486 1.5592
3 -2.2279 0.2513 -0.8729 0.8710 -1.2778 -1.3065
71 -2.8503 0.4507 -1.4799 1.3576 -1.5615 -1.3065
45 -2.3458 1.9069 -0.0231 2.3309 -1.4480 -1.3065
48 0.0207 -1.0287 -0.7515 -0.8321 0.0840 0.2566
6 -2.2807 0.1213 -1.2371 0.8710 -1.0509 -1.3065
99 1.2498 -0.9211 0.5839 -1.3187 0.6514 0.3869
82 -2.4247 -0.9715 -1.7227 -0.1022 -1.3913 -1.3065
76 -2.4186 -0.4301 -1.4799 0.3844 -1.3346 -1.3065

Validation data and transformation sample

Predictions (result columns 0 and 1) with the corresponding input data features:

index  result[0]  result[1]  sepal_length  sepal_width  petal_length  petal_width
28 0.2468 -0.5818 -0.1445 -0.5888 0.1975 0.1263
11 -2.2632 0.4858 -0.8729 1.1143 -1.3346 -1.1762
10 2.2445 0.5150 1.7979 -0.3455 1.4458 0.7776
41 0.3785 -0.5717 -0.1445 -0.5888 0.4244 0.1263
2 -2.1547 -0.5545 -1.2371 0.1411 -1.2211 -1.3065
27 2.7129 0.9911 2.2835 -0.1022 1.3323 1.4289
38 -2.0200 1.4417 -0.1445 1.8443 -1.1643 -1.1762
31 0.2931 -0.8487 -0.2659 -0.8321 0.2542 0.1263
22 2.1171 2.8309 2.5263 1.8443 1.5025 1.0381
4 1.2766 -1.1270 -0.1445 -1.3187 0.7081 1.0381
33 2.1006 2.1227 1.6765 1.3576 1.3323 1.6894
35 -1.8692 0.0766 -0.8729 0.6277 -1.1643 -0.9157
26 -2.6065 0.5865 -1.1157 1.3576 -1.3346 -1.4367
34 -2.6145 2.2174 -0.3873 2.8175 -1.3346 -1.3065
18 1.3177 -0.4326 0.5839 -0.8321 0.6514 0.7776
7 0.0571 -0.1482 -0.1445 -0.1022 0.2542 -0.0039
14 1.3230 -1.7268 0.2197 -2.0486 0.7081 0.3869
45 0.6637 0.4918 0.0983 0.3844 0.5947 0.7776
48 -2.4631 2.7874 -0.1445 3.3041 -1.2778 -1.0460
29 -1.9774 0.3964 -0.5087 0.8710 -1.1643 -1.3065
15 1.8029 0.4090 0.8267 -0.1022 1.1621 1.2987
30 0.4600 -0.3976 0.3411 -0.5888 0.1407 0.1263
32 1.9432 1.0114 1.1909 0.3844 1.2188 1.4289
16 -0.2906 -2.0854 -0.9943 -1.8053 -0.2565 -0.2644
42 2.0103 -0.6687 1.0695 -1.3187 1.1621 0.7776
20 -0.4272 -1.9094 -1.1157 -1.5620 -0.2565 -0.2644
43 0.1795 -1.6023 -0.3873 -1.5620 0.0272 -0.1342
8 1.5955 1.1537 1.0695 0.6277 1.1053 1.1684
13 1.8160 0.5272 1.1909 -0.1022 0.9918 1.1684
25 -1.7657 -2.4696 -1.6013 -1.8053 -1.3913 -1.1762

Deployment parameters


In [ ]:
#@markdown ---
model = 'pca_iris' #@param {type:"string"}
version = 'v1' #@param {type:"string"}
#@markdown ---

In [ ]:
# the exact location of the model is in model_uri.txt
with tf.io.gfile.GFile(os.path.join(output_location, 'model_uri.txt')) as f:
  model_uri = f.read()

# create a model
! gcloud ai-platform models create {model} --regions {REGION}

# create a version
! gcloud ai-platform versions create {version} \
  --model $model \
  --runtime-version 1.15 \
  --origin $model_uri \
  --project $PROJECT_ID
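
Before sending prediction requests, you can optionally confirm that the new version is ready (a minimal sketch using the same AI Platform REST API that serves the online predictions below; a READY state means the version can accept traffic):

from googleapiclient import discovery

# Check the state of the deployed model version.
version_name = 'projects/{}/models/{}/versions/{}'.format(PROJECT_ID, model, version)
info = discovery.build('ml', 'v1').projects().models().versions().get(
    name=version_name).execute()
print('Version state:', info.get('state'))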

Use the endpoint for online predictions


In [11]:
# format the data for serving
instances = validation.to_dict(orient='records')
display(instances[:2])

# make a REST call for online inference
service = discovery.build('ml', 'v1')
name = 'projects/{project}/models/{model}/versions/{version}'.format(project=PROJECT_ID,
                                                                     model=model,
                                                                     version=version)
body = {'instances': instances}

response = service.projects().predict(name=name, body=body).execute()
if 'error' in response:
    raise RuntimeError(response['error'])

reduced_dimension = [row['result'] for row in response['predictions']]
print('First 5 rows of the result:')
reduced_dimension[:5]


[{'sepal_length': 0.46253253969737423,
  'sepal_width': -0.3454944346334629,
  'petal_length': 0.31094454340312366,
  'petal_width': 0.12634912850538776},
 {'sepal_length': 0.5839321564158446,
  'sepal_width': 0.6277293248974221,
  'petal_length': 0.5379113634054036,
  'petal_width': 0.517119629037516}]

First 5 rows of the result:
Out[11]:
[[0.5453819036483765, -0.12061499804258347],
 [0.6567033529281616, 0.8700851202011108],
 [-2.1546521186828613, -0.5544523000717163],
 [1.4599521160125732, 0.3652043044567108],
 [1.276591181755066, -1.126971960067749]]
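
The reduced-dimension predictions can also be collected into a DataFrame for quick inspection or plotting, for example to reproduce the scatter plot of the first two principal components shown in the run report (a minimal sketch, assuming matplotlib is installed):

import matplotlib.pyplot as plt
import pandas as pd

# Put the two principal components into a DataFrame and plot them.
components = pd.DataFrame(reduced_dimension, columns=['pc_1', 'pc_2'])
components.plot.scatter(x='pc_1', y='pc_2',
                        title='First two principal components')
plt.show()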

Inspect the ML model


In [ ]:
import witwidget
from witwidget.notebook.visualization import WitWidget, WitConfigBuilder

config_builder = WitConfigBuilder(examples=validation.to_dict(orient='records'))
config_builder.set_ai_platform_model(project=PROJECT_ID,
                                     model=model,
                                     version=version)
config_builder.set_predict_output_tensor('result')
config_builder.set_model_type('regression')
WitWidget(config_builder)

Cleaning up

To clean up all GCP resources used in this project, you can delete the GCP project you used for the tutorial. Alternatively, run the cell below to delete only the individual resources created by this notebook.


In [ ]:
# Delete model version resource
! gcloud ai-platform versions delete $version --quiet --model $model 

# Delete model resource
! gcloud ai-platform models delete $model --quiet

# If training job is still running, cancel it
! gcloud ai-platform jobs cancel $job_name --quiet

# Delete Cloud Storage objects that were created
! gsutil -m rm -r gs://$BUCKET_NAME