In [ ]:
# Copyright 2019 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

Train and deploy Xgboost (Scikit-learn) on Kubeflow from Notebooks

This notebook introduces the use of Kubeflow Fairing to train and deploy a model to Kubeflow on Google Kubernetes Engine (GKE) and to Google Cloud AI Platform training. This notebook demonstrates how to:

  • Train an XGBoost model in a local notebook,
  • Use Kubeflow Fairing to train an XGBoost model remotely on Kubeflow cluster,
  • Use Kubeflow Fairing to train an XGBoost model remotely on AI Platform training,
  • Use Kubeflow Fairing to deploy a trained model to Kubeflow, and call the deployed endpoint for predictions.

You need Python 3.6 to use Kubeflow Fairing.

Setup

  • Pre-conditions

    • Deploy a Kubeflow cluster through https://deploy.kubeflow.cloud/
    • Have the following environment variables ready:
      • PROJECT_ID # the project that hosts the Kubeflow cluster or runs AI Platform training
      • DEPLOYMENT_NAME # the Kubeflow deployment name; this is the same as the cluster name after deployment
      • GCP_BUCKET # a Google Cloud Storage bucket
  • Create service account

    export SA_NAME=[service account name]
    gcloud iam service-accounts create ${SA_NAME}
    gcloud projects add-iam-policy-binding ${PROJECT_ID} \
      --member serviceAccount:${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com \
      --role 'roles/editor'
    gcloud iam service-accounts keys create ~/key.json \
      --iam-account ${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
    
  • Authorize Docker to push to Google Container Registry

    gcloud auth configure-docker
    
  • Update local kubeconfig (for submitting jobs to the Kubeflow cluster)

    export CLUSTER_NAME=${DEPLOYMENT_NAME} # this is the deployment name, i.e. the Kubernetes cluster name
    export ZONE=us-central1-c
    gcloud container clusters get-credentials ${CLUSTER_NAME} --zone ${ZONE}
    
  • Set the environment variable GOOGLE_APPLICATION_CREDENTIALS

    export GOOGLE_APPLICATION_CREDENTIALS=....
    
    # or, from within Python:
    os.environ['GOOGLE_APPLICATION_CREDENTIALS']=...
    
  • Install the latest version of Kubeflow Fairing

    pip install git+https://github.com/kubeflow/fairing@master
    
  • Upload training file

    # upload train.csv to a GCS bucket that can be accessed from both CMLE and the Kubeflow cluster
    gsutil cp ./train.csv ${GCP_BUCKET}/train.csv
    

Please note that the configuration above is required only when the notebook service runs outside the Kubeflow environment. The examples in this notebook have also been fully tested on a notebook service running outside the Kubeflow cluster.

For notebooks running inside the cluster, the environment variables (service account, project, and so on) should have been pre-configured when the cluster was set up.
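Before running the cells below, it can help to fail fast if any of the required variables is missing. A minimal sketch, using only the standard library (the helper name `check_env` and the variable list are our own, not part of Fairing):

```python
import os

# Required variables from the setup steps above.
REQUIRED_VARS = ["PROJECT_ID", "DEPLOYMENT_NAME", "GCP_BUCKET",
                 "GOOGLE_APPLICATION_CREDENTIALS"]

def check_env(required=REQUIRED_VARS, environ=os.environ):
    """Return the names of required variables that are missing or empty."""
    return [name for name in required if not environ.get(name)]

missing = check_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```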

Set up your notebook for training an XGBoost model

Import the libraries required to train this model.


In [26]:
import argparse
import logging
import joblib
import sys
import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from xgboost import XGBClassifier

In [2]:
logging.basicConfig(format='%(message)s')
logging.getLogger().setLevel(logging.INFO)

In [3]:
import os
import fairing

# Set up a Google Container Registry (GCR) repository for storing output containers
# You can use any Docker container registry instead of GCR
# For a local notebook, GCP_PROJECT should be set explicitly
GCP_PROJECT = fairing.cloud.gcp.guess_project_name()
GCP_Bucket = os.environ['GCP_BUCKET'] # e.g., 'gs://kubeflow-demo-g/'

# This is for local notebook instead of that in kubeflow cluster
# os.environ['GOOGLE_APPLICATION_CREDENTIALS']=


/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/google/auth/_default.py:66: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK. We recommend that most server applications use service accounts instead. If your application continues to use end user credentials from Cloud SDK, you might receive a "quota exceeded" or "API not enabled" error. For more information about service accounts, see https://cloud.google.com/docs/authentication/
  warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)

Define the model logic

Define a function to split the input file into training and testing datasets.


In [4]:
def gcs_copy(src_path, dst_path):
    import subprocess
    print(subprocess.run(['gsutil', 'cp', src_path, dst_path],
                         stdout=subprocess.PIPE).stdout.decode('utf-8').rstrip('\n'))

def gcs_download(src_path, file_name):
    import subprocess
    print(subprocess.run(['gsutil', 'cp', src_path, file_name],
                         stdout=subprocess.PIPE).stdout.decode('utf-8').rstrip('\n'))
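These helpers print gsutil's output but ignore its exit code, so a failed copy (such as the 403 AccessDenied that appears in a log further down) passes silently. A hedged variant with error checking; `gcs_copy_checked` and `gcs_cp_command` are our own names, not part of this notebook's pipeline:

```python
import subprocess

def gcs_cp_command(src_path, dst_path):
    """Build the gsutil copy command as an argument list (no shell quoting issues)."""
    return ["gsutil", "cp", src_path, dst_path]

def gcs_copy_checked(src_path, dst_path):
    """Copy via gsutil, raising CalledProcessError if the copy fails."""
    result = subprocess.run(gcs_cp_command(src_path, dst_path),
                            stdout=subprocess.PIPE, check=True)
    return result.stdout.decode("utf-8").rstrip("\n")
```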

In [37]:
def read_input(source_path, test_size=0.25):
    """Read input data and split it into train and test."""
    
    file_name = source_path.split('/')[-1]
    gcs_download(source_path, file_name)
    data = pd.read_csv(file_name)
    data.dropna(axis=0, inplace=True)

    y = data.Class
    X = data.drop(['Class', 'Amount', 'Time'], axis=1).select_dtypes(exclude=['object'])

    train_X, test_X, train_y, test_y = train_test_split(X.values,
                                                      y.values,
                                                      test_size=test_size,
                                                      shuffle=True)

    imputer = SimpleImputer()
    train_X = imputer.fit_transform(train_X)
    test_X = imputer.transform(test_X)

    return (train_X, train_y), (test_X, test_y)
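For readers unfamiliar with `train_test_split`, the shuffle-and-split step can be sketched with the standard library alone (illustration only; the notebook itself uses scikit-learn, and `simple_split` is a hypothetical helper):

```python
import random

def simple_split(rows, labels, test_size=0.25, seed=0):
    """Shuffle indices, then carve off a test_size fraction as the test set."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test
```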

Define functions to train, evaluate, and save the trained model.


In [38]:
def train_model(train_X,
                train_y,
                test_X,
                test_y,
                n_estimators,
                learning_rate):
    """Train the model using XGBRegressor."""
    model = XGBClassifier(n_estimators=n_estimators, learning_rate=learning_rate)

    model.fit(train_X,
            train_y,
            early_stopping_rounds=40,
            eval_set=[(test_X, test_y)])

    print("Best loss on eval: %.2f with %d rounds",
               model.best_score,
               model.best_iteration+1)
    return model

def eval_model(model, test_X, test_y):
    """Evaluate the model performance."""
    predictions = model.predict_proba(test_X)
    logging.info("auc=%.2f", roc_auc_score(test_y, predictions[:,1]))

def save_model(model, model_file):
    """Save XGBoost model for serving."""
    joblib.dump(model, model_file)
    gcs_copy(model_file, GCP_Bucket + model_file)
    logging.info("Model export success: %s", model_file)

Define a class for your model, with methods for training and prediction.


In [39]:
class FraudServe(object):
    
    def __init__(self):
        self.train_input = GCP_Bucket + "train_fraud.csv"
        self.n_estimators = 50
        self.learning_rate = 0.1
        self.model_file = "trained_fraud_model.joblib"
        self.model = None

    def train(self):
        (train_X, train_y), (test_X, test_y) = read_input(self.train_input)
        model = train_model(train_X,
                          train_y,
                          test_X,
                          test_y,
                          self.n_estimators,
                          self.learning_rate)

        eval_model(model, test_X, test_y)
        save_model(model, self.model_file)

    def predict(self, X, feature_names):
        """Predict using the model for given ndarray."""
        if not self.model:
            self.model = joblib.load(self.model_file)
        # Do any preprocessing
        prediction = self.model.predict(data=X)
        # Do any postprocessing
        return [[prediction.item(0), prediction.item(0)]]
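Note how `predict` loads the model lazily on first call, so the class can be serialized and shipped to the serving container without the model object attached. The same pattern in isolation, with a stub loader standing in for `joblib.load` (names here are illustrative):

```python
class LazyServe:
    """Load an expensive resource only when it is first used."""

    def __init__(self, loader):
        self._loader = loader  # e.g. lambda: joblib.load(model_file)
        self.model = None

    def predict(self, X):
        if self.model is None:          # first call pays the load cost
            self.model = self._loader()
        return self.model(X)
```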

Train an XGBoost model in a notebook

Call FraudServe().train() to train your model, and then evaluate and save your trained model.


In [40]:
FraudServe().train()


[0]	validation_0-error:0.037534
Will train until validation_0-error hasn't improved in 40 rounds.
[1]	validation_0-error:0.024129
[2]	validation_0-error:0.021448
[3]	validation_0-error:0.021448
[4]	validation_0-error:0.021448
[5]	validation_0-error:0.02681
[6]	validation_0-error:0.032172
[7]	validation_0-error:0.02681
[8]	validation_0-error:0.02681
[9]	validation_0-error:0.029491
[10]	validation_0-error:0.02681
[11]	validation_0-error:0.018767
[12]	validation_0-error:0.021448
[13]	validation_0-error:0.018767
[14]	validation_0-error:0.018767
[15]	validation_0-error:0.018767
[16]	validation_0-error:0.018767
[17]	validation_0-error:0.018767
[18]	validation_0-error:0.018767
[19]	validation_0-error:0.018767
[20]	validation_0-error:0.018767
[21]	validation_0-error:0.021448
[22]	validation_0-error:0.018767
[23]	validation_0-error:0.021448
[24]	validation_0-error:0.024129
[25]	validation_0-error:0.021448
[26]	validation_0-error:0.021448
[27]	validation_0-error:0.018767
[28]	validation_0-error:0.018767
[29]	validation_0-error:0.018767
[30]	validation_0-error:0.018767
[31]	validation_0-error:0.018767
[32]	validation_0-error:0.018767
[33]	validation_0-error:0.021448
[34]	validation_0-error:0.021448
[35]	validation_0-error:0.021448
[36]	validation_0-error:0.021448
[37]	validation_0-error:0.024129
[38]	validation_0-error:0.024129
[39]	validation_0-error:0.024129
[40]	validation_0-error:0.024129
[41]	validation_0-error:0.024129
[42]	validation_0-error:0.024129
[43]	validation_0-error:0.024129
[44]	validation_0-error:0.024129
[45]	validation_0-error:0.024129
[46]	validation_0-error:0.024129
[47]	validation_0-error:0.02681
[48]	validation_0-error:0.02681
[49]	validation_0-error:0.02681
auc=0.99
Best loss on eval: %.2f with %d rounds 0.018767 12
Model export success: trained_fraud_model.joblib

Make Use of Fairing

Specify an image registry that will hold the images built by Fairing


In [ ]:
# This demo uses gsutil, so we build a custom base image with the Google Cloud SDK installed
base_image = 'gcr.io/{}/fairing-predict-example:latest'.format(GCP_PROJECT)
!docker build --build-arg PY_VERSION=3.6.4 . -t {base_image}
!docker push {base_image}

In [11]:
DOCKER_REGISTRY = 'gcr.io/{}/fairing-job-xgboost'.format(GCP_PROJECT)
BASE_IMAGE = base_image

Train an XGBoost model remotely on Kubeflow

Import the TrainJob and GKEBackend classes. Kubeflow Fairing packages the FraudServe class, the training data, and the training job's software prerequisites as a Docker image. Then Kubeflow Fairing deploys and runs the training job on Kubeflow.


In [12]:
from fairing import TrainJob
from fairing.backends import GKEBackend

train_job = TrainJob(FraudServe, BASE_IMAGE, input_files=["requirements.txt"],
                     docker_registry=DOCKER_REGISTRY, backend=GKEBackend())
train_job.submit()


Using preprocessor: <class 'fairing.preprocessors.function.FunctionPreProcessor'>
Using docker registry: gcr.io/gojek-kubeflow/fairing-job-xgboost
Using builder: <class 'fairing.builders.docker.docker.DockerBuilder'>
Building the docker image.
Building image using docker
Docker command: ['python', '/app/function_shim.py', '--serialized_fn_file', '/app/pickled_fn.p']
/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py already exists in Fairing context, skipping...
Creating docker context: /tmp/fairing_context_dip8f0em
/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py already exists in Fairing context, skipping...
Context: /tmp/fairing_context_dip8f0em, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py at /app/fairing/__init__.py
Context: /tmp/fairing_context_dip8f0em, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/runtime_config.py at /app/fairing/runtime_config.py
Context: /tmp/fairing_context_dip8f0em, Adding requirements.txt at /app/requirements.txt
Context: /tmp/fairing_context_dip8f0em, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/functions/function_shim.py at /app/function_shim.py
Context: /tmp/fairing_context_dip8f0em, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/cloudpickle/__init__.py at /app/cloudpickle/__init__.py
Context: /tmp/fairing_context_dip8f0em, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/cloudpickle/cloudpickle.py at /app/cloudpickle/cloudpickle.py
Context: /tmp/fairing_context_dip8f0em, Adding /var/folders/1c/dnk5c85905ngk3qvcc9fhlnm00hm6n/T/tmpw4wpdn9j at /app/pickled_fn.p
Context: /tmp/fairing_context_dip8f0em, Adding /var/folders/1c/dnk5c85905ngk3qvcc9fhlnm00hm6n/T/tmpcnblrwou at /app/HousingServe.py
Context: /tmp/fairing_context_dip8f0em, Adding /tmp/fairing_dockerfile_1mwzspba at Dockerfile
Building docker image gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:C0F59817...
Build output: Step 1/7 : FROM gcr.io/gojek-kubeflow/fairing-predict-example:latest
Build output: 
Build output: ---> 07b0c0a773a2
Build output: Step 2/7 : WORKDIR /app/
Build output: 
Build output: ---> Using cache
Build output: ---> e38aad2dc182
Build output: Step 3/7 : ENV FAIRING_RUNTIME 1
Build output: 
Build output: ---> Using cache
Build output: ---> 597bd070338a
Build output: Step 4/7 : COPY /app//requirements.txt /app/
Build output: 
Build output: ---> Using cache
Build output: ---> 05e78d5eb908
Build output: Step 5/7 : RUN if [ -e requirements.txt ];then pip install --no-cache -r requirements.txt; fi
Build output: 
Build output: ---> Using cache
Build output: ---> e31aa3ffcc59
Build output: Step 6/7 : COPY /app/ /app/
Build output: 
Build output: ---> 2ad6caba9fab
Build output: Step 7/7 : CMD python /app/function_shim.py --serialized_fn_file /app/pickled_fn.p
Build output: 
Build output: ---> Running in 22adbd7dcf8c
Build output: ---> 3837c52fb3b9
Push finished: {'ID': 'sha256:3837c52fb3b9bb8545fcbddb83b4c6b7ecaca5a6d01d7a3637ccd803c84dbad3'}
Build output: Successfully built 3837c52fb3b9
Build output: Successfully tagged gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:C0F59817
Publishing image gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:C0F59817...
Push output: The push refers to repository [gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job] None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Pushing [>                                                  ]     512B/47.07kB
Push output: Pushing [==================================================>]  56.32kB
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Pushed None
Push output: C0F59817: digest: sha256:fe0da235a59bdea843178a21920be6056f466926abd850c20c52c52374ad3d04 size: 3473 None
Push finished: {'Tag': 'C0F59817', 'Digest': 'sha256:fe0da235a59bdea843178a21920be6056f466926abd850c20c52c52374ad3d04', 'Size': 3473}
Not able to find gcp credentials secret: user-gcp-sa
Training job fairing-job-ppql8 launched.
Waiting for fairing-job-ppql8-2wk8f to start...
Pod started running True
Copying gs://kubeflow-demo-g/train.csv...
/ [1 files][449.9 KiB/449.9 KiB]                                                
Operation completed over 1 objects/449.9 KiB.

[0]	validation_0-rmse:177514
Will train until validation_0-rmse hasn't improved in 40 rounds.
[1]	validation_0-rmse:161858
[2]	validation_0-rmse:147237
[3]	validation_0-rmse:134132
[4]	validation_0-rmse:122224
[5]	validation_0-rmse:111538
[6]	validation_0-rmse:102142
[7]	validation_0-rmse:93392.3
[8]	validation_0-rmse:85824.6
[9]	validation_0-rmse:79667.6
[10]	validation_0-rmse:73463.4
[11]	validation_0-rmse:68059.4
[12]	validation_0-rmse:63350.5
[13]	validation_0-rmse:59732.1
[14]	validation_0-rmse:56260.7
[15]	validation_0-rmse:53392.6
[16]	validation_0-rmse:50770.8
[17]	validation_0-rmse:48107.8
[18]	validation_0-rmse:45923.9
[19]	validation_0-rmse:44154.2
[20]	validation_0-rmse:42488.1
[21]	validation_0-rmse:41263.3
[22]	validation_0-rmse:40212.8
[23]	validation_0-rmse:39089.1
[24]	validation_0-rmse:37691.1
[25]	validation_0-rmse:36875.2
[26]	validation_0-rmse:36276.2
[27]	validation_0-rmse:35444.1
[28]	validation_0-rmse:34831.5
[29]	validation_0-rmse:34205.4
[30]	validation_0-rmse:33831.9
[31]	validation_0-rmse:33183.6
[32]	validation_0-rmse:33019.4
[33]	validation_0-rmse:32680
[34]	validation_0-rmse:32438.5
[35]	validation_0-rmse:32130.4
[36]	validation_0-rmse:31644.2
[37]	validation_0-rmse:31248.9
[38]	validation_0-rmse:31059.8
[39]	validation_0-rmse:30862.4
[40]	validation_0-rmse:30754
[41]	validation_0-rmse:30561.6
[42]	validation_0-rmse:30416.9
[43]	validation_0-rmse:30156.4
[44]	validation_0-rmse:29852.9
[45]	validation_0-rmse:29486.6
[46]	validation_0-rmse:29158.8
[47]	validation_0-rmse:29017
[48]	validation_0-rmse:28973.9
[49]	validation_0-rmse:28787.7
mean_absolute_error=18173.15
Copying file://trained_ames_model.joblib [Content-Type=application/octet-stream]...
AccessDeniedException: 403 Insufficient Permission                              
Model export success: trained_ames_model.joblib
Best RMSE on eval: %.2f with %d rounds 28787.720703 50

Cleaning up job fairing-job-ppql8...

Train an XGBoost model remotely on Cloud ML Engine

Import the TrainJob and GCPManagedBackend classes. Kubeflow Fairing packages the FraudServe class, the training data, and the training job's software prerequisites as a Docker image. Then Kubeflow Fairing deploys and runs the training job on Cloud ML Engine.


In [13]:
from fairing import TrainJob
from fairing.backends import GCPManagedBackend
train_job = TrainJob(FraudServe, BASE_IMAGE, input_files=["requirements.txt"],
                     docker_registry=DOCKER_REGISTRY, backend=GCPManagedBackend())
train_job.submit()


Using preprocessor: <class 'fairing.preprocessors.function.FunctionPreProcessor'>
Using docker registry: gcr.io/gojek-kubeflow/fairing-job-xgboost
Using builder: <class 'fairing.builders.docker.docker.DockerBuilder'>
Building the docker image.
Building image using docker
Docker command: ['python', '/app/function_shim.py', '--serialized_fn_file', '/app/pickled_fn.p']
/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py already exists in Fairing context, skipping...
Creating docker context: /tmp/fairing_context_t8_nr827
/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py already exists in Fairing context, skipping...
Context: /tmp/fairing_context_t8_nr827, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py at /app/fairing/__init__.py
Context: /tmp/fairing_context_t8_nr827, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/runtime_config.py at /app/fairing/runtime_config.py
Context: /tmp/fairing_context_t8_nr827, Adding requirements.txt at /app/requirements.txt
Context: /tmp/fairing_context_t8_nr827, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/functions/function_shim.py at /app/function_shim.py
Context: /tmp/fairing_context_t8_nr827, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/cloudpickle/__init__.py at /app/cloudpickle/__init__.py
Context: /tmp/fairing_context_t8_nr827, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/cloudpickle/cloudpickle.py at /app/cloudpickle/cloudpickle.py
Context: /tmp/fairing_context_t8_nr827, Adding /var/folders/1c/dnk5c85905ngk3qvcc9fhlnm00hm6n/T/tmpi9ejaors at /app/pickled_fn.p
Context: /tmp/fairing_context_t8_nr827, Adding /var/folders/1c/dnk5c85905ngk3qvcc9fhlnm00hm6n/T/tmp8aqop6nc at /app/HousingServe.py
Context: /tmp/fairing_context_t8_nr827, Adding /tmp/fairing_dockerfile_7yyuw6a7 at Dockerfile
Building docker image gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:11CD9CDA...
Build output: Step 1/7 : FROM gcr.io/gojek-kubeflow/fairing-predict-example:latest
Build output: 
Build output: ---> 07b0c0a773a2
Build output: Step 2/7 : WORKDIR /app/
Build output: 
Build output: ---> Using cache
Build output: ---> e38aad2dc182
Build output: Step 3/7 : ENV FAIRING_RUNTIME 1
Build output: 
Build output: ---> Using cache
Build output: ---> 597bd070338a
Build output: Step 4/7 : COPY /app//requirements.txt /app/
Build output: 
Build output: ---> Using cache
Build output: ---> 05e78d5eb908
Build output: Step 5/7 : RUN if [ -e requirements.txt ];then pip install --no-cache -r requirements.txt; fi
Build output: 
Build output: ---> Using cache
Build output: ---> e31aa3ffcc59
Build output: Step 6/7 : COPY /app/ /app/
Build output: 
Build output: ---> Using cache
Build output: ---> 2ad6caba9fab
Build output: Step 7/7 : CMD python /app/function_shim.py --serialized_fn_file /app/pickled_fn.p
Build output: 
Build output: ---> Using cache
Build output: ---> 3837c52fb3b9
Push finished: {'ID': 'sha256:3837c52fb3b9bb8545fcbddb83b4c6b7ecaca5a6d01d7a3637ccd803c84dbad3'}
Build output: Successfully built 3837c52fb3b9
Build output: Successfully tagged gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:11CD9CDA
Publishing image gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:11CD9CDA...
Push output: The push refers to repository [gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job] None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: 11CD9CDA: digest: sha256:fe0da235a59bdea843178a21920be6056f466926abd850c20c52c52374ad3d04 size: 3473 None
Push finished: {'Tag': '11CD9CDA', 'Digest': 'sha256:fe0da235a59bdea843178a21920be6056f466926abd850c20c52c52374ad3d04', 'Size': 3473}
file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth
Traceback (most recent call last):
  File "/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/googleapiclient/discovery_cache/__init__.py", line 36, in autodetect
    from google.appengine.api import memcache
ModuleNotFoundError: No module named 'google.appengine'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/googleapiclient/discovery_cache/file_cache.py", line 33, in <module>
    from oauth2client.contrib.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.contrib.locked_file'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/googleapiclient/discovery_cache/file_cache.py", line 37, in <module>
    from oauth2client.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.locked_file'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/googleapiclient/discovery_cache/__init__.py", line 41, in autodetect
    from . import file_cache
  File "/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/googleapiclient/discovery_cache/file_cache.py", line 41, in <module>
    'file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth')
ImportError: file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth
URL being requested: GET https://www.googleapis.com/discovery/v1/apis/ml/v1/rest
URL being requested: POST https://ml.googleapis.com/v1/projects/gojek-kubeflow/jobs?alt=json
Creating training job with the following options: {'jobId': 'fairing_job_7df6e38a', 'trainingInput': {'scaleTier': 'BASIC', 'masterConfig': {'imageUri': 'gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:11CD9CDA'}, 'region': 'us-central1'}}
Job submitted successfully.
Access job logs at the following URL:
https://console.cloud.google.com/mlengine/jobs/fairing_job_7df6e38a?project=gojek-kubeflow

Deploy the trained model to Kubeflow for predictions

Import the PredictionEndpoint and KubeflowGKEBackend classes. Kubeflow Fairing packages the FraudServe class, the trained model, and the prediction endpoint's software prerequisites as a Docker image. Then Kubeflow Fairing deploys and runs the prediction endpoint on Kubeflow.

This part only works with Kubeflow Fairing version >= 0.5.2.


In [14]:
from fairing import PredictionEndpoint
from fairing.backends import KubeflowGKEBackend
# The trained_fraud_model.joblib is exported during the local training above
endpoint = PredictionEndpoint(FraudServe, BASE_IMAGE, input_files=['trained_fraud_model.joblib', "requirements.txt"],
                              docker_registry=DOCKER_REGISTRY, backend=KubeflowGKEBackend())
endpoint.create()


Using preprocessor: <class 'fairing.preprocessors.function.FunctionPreProcessor'>
Using docker registry: gcr.io/gojek-kubeflow/fairing-job-xgboost
Using builder: <class 'fairing.builders.docker.docker.DockerBuilder'>
Building the docker image.
Building image using docker
Docker command: ['python', '/app/function_shim.py', '--serialized_fn_file', '/app/pickled_fn.p']
/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py already exists in Fairing context, skipping...
Creating docker context: /tmp/fairing_context_yonwsrbc
/Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py already exists in Fairing context, skipping...
Context: /tmp/fairing_context_yonwsrbc, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/__init__.py at /app/fairing/__init__.py
Context: /tmp/fairing_context_yonwsrbc, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/runtime_config.py at /app/fairing/runtime_config.py
Context: /tmp/fairing_context_yonwsrbc, Adding requirements.txt at /app/requirements.txt
Context: /tmp/fairing_context_yonwsrbc, Adding trained_ames_model.joblib at /app/trained_ames_model.joblib
Context: /tmp/fairing_context_yonwsrbc, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/fairing/functions/function_shim.py at /app/function_shim.py
Context: /tmp/fairing_context_yonwsrbc, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/cloudpickle/__init__.py at /app/cloudpickle/__init__.py
Context: /tmp/fairing_context_yonwsrbc, Adding /Users/luoshixin/LocalSim/virtualPython36/lib/python3.6/site-packages/cloudpickle/cloudpickle.py at /app/cloudpickle/cloudpickle.py
Context: /tmp/fairing_context_yonwsrbc, Adding /var/folders/1c/dnk5c85905ngk3qvcc9fhlnm00hm6n/T/tmp7nk9d31l at /app/pickled_fn.p
Context: /tmp/fairing_context_yonwsrbc, Adding /var/folders/1c/dnk5c85905ngk3qvcc9fhlnm00hm6n/T/tmp40474wwu at /app/HousingServe.py
Context: /tmp/fairing_context_yonwsrbc, Adding /tmp/fairing_dockerfile_tv9eejm2 at Dockerfile
Building docker image gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:28D8F94A...
Build output: Step 1/7 : FROM gcr.io/gojek-kubeflow/fairing-predict-example:latest
Build output: 
Build output: ---> 07b0c0a773a2
Build output: Step 2/7 : WORKDIR /app/
Build output: 
Build output: ---> Using cache
Build output: ---> e38aad2dc182
Build output: Step 3/7 : ENV FAIRING_RUNTIME 1
Build output: 
Build output: ---> Using cache
Build output: ---> 597bd070338a
Build output: Step 4/7 : COPY /app//requirements.txt /app/
Build output: 
Build output: ---> Using cache
Build output: ---> 05e78d5eb908
Build output: Step 5/7 : RUN if [ -e requirements.txt ];then pip install --no-cache -r requirements.txt; fi
Build output: 
Build output: ---> Using cache
Build output: ---> e31aa3ffcc59
Build output: Step 6/7 : COPY /app/ /app/
Build output: 
Build output: ---> 67e07b1b0768
Build output: Step 7/7 : CMD python /app/function_shim.py --serialized_fn_file /app/pickled_fn.p
Build output: 
Build output: ---> Running in 005665229ac6
Build output: ---> d93b497cc2fb
Push finished: {'ID': 'sha256:d93b497cc2fb1e2aad5df6b33b1ef1fc2fd85d9b5ac44bde28424983d0d07d8b'}
Build output: Successfully built d93b497cc2fb
Build output: Successfully tagged gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:28D8F94A
Publishing image gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job:28D8F94A...
Push output: The push refers to repository [gcr.io/gojek-kubeflow/fairing-job-xgboost/fairing-job] None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Preparing None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Waiting None
Push output: Pushing [>                                                  ]  1.024kB/82.13kB
Push output: Pushing [==================================================>]  92.16kB
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Layer already exists None
Push output: Pushed None
Push output: 28D8F94A: digest: sha256:d450620b10c82d58557203ef25211b4cb0c11cc7463401c475e24d87af89e610 size: 3473 None
Push finished: {'Tag': '28D8F94A', 'Digest': 'sha256:d450620b10c82d58557203ef25211b4cb0c11cc7463401c475e24d87af89e610', 'Size': 3473}
Deploying the endpoint.
Waiting for prediction endpoint to come up...
Cluster endpoint: http://35.239.163.72:5000/predict
Prediction endpoint: http://35.239.163.72:5000/predict

Deploy to GCP


In [ ]:
# Deploy model to gcp
# from fairing.deployers.gcp.gcpserving import GCPServingDeployer
# deployer = GCPServingDeployer()
# deployer.deploy(VERSION_DIR, MODEL_NAME, VERSION_NAME)

Call the prediction endpoint

Create a test dataset, then call the endpoint on Kubeflow for predictions.


In [15]:
(train_X, train_y), (test_X, test_y) = read_input(GCP_Bucket + "train_fraud.csv")
endpoint.predict_nparray(test_X)



{"data":{"names":["t:0","t:1"],"tensor":{"shape":[1,2],"values":[165164.875,165164.875]}},"meta":{}}

Clean up the prediction endpoint

Delete the prediction endpoint created by this notebook.


In [16]:
endpoint.delete()


Deleted service: kubeflow/fairing-service-hpxqk
Deleted deployment: kubeflow/fairing-deployer-sntlx

In [ ]: