In [ ]:
# Copyright 2019 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

Deploying a Kubeflow Cluster on Google Cloud Platform (GCP)

This notebook provides instructions for setting up a Kubeflow cluster on GCP using the command-line interface (CLI). For additional help, see the guide to deploying Kubeflow using the CLI.

There are two alternatives to the CLI deployment:

  • Deploy the Kubeflow cluster using the Kubeflow deployment web app. The instructions can be found here.
  • Use the recently launched AI Platform Pipelines. Note that AI Platform Pipelines is a standalone Kubeflow Pipelines deployment, so many of the components in a full Kubeflow deployment are not pre-installed. The instructions can be found here.

The CLI deployment gives you more control over the deployment process and configuration than you get if you use the deployment UI.

Prerequisites

Running Environment

This notebook creates a Kubeflow cluster on GCP. You must run it in an environment with the Cloud SDK installed, such as Cloud Shell. Learn more about installing Cloud SDK.

Setting up a Kubeflow cluster

  1. Download kfctl
  2. Set up environment variables
  3. Create a dedicated service account for deployment
  4. Deploy Kubeflow
  5. Install Kubeflow Pipelines SDK
  6. Sanity check

Create a working directory

Create a new working directory in your current directory. The default name is kubeflow, but you can change the name.


In [ ]:
work_directory_name = 'kubeflow'

! mkdir -p $work_directory_name

%cd $work_directory_name

Download kfctl

Download kfctl to your working directory. The default version used is v0.7.0, but you can find the latest release here.


In [ ]:
## Download kfctl v0.7.0
! curl -LO https://github.com/kubeflow/kubeflow/releases/download/v0.7.0/kfctl_v0.7.0_linux.tar.gz
    
## Unpack the tarball
! tar -xvf kfctl_v0.7.0_linux.tar.gz
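
Before moving on, it can help to confirm that the archive produced a runnable binary. The following is a minimal sketch, assuming the tarball unpacks a file named kfctl into the current directory; the helper name is_executable_file is hypothetical and not part of kfctl.

```python
import os


def is_executable_file(path):
    """Return True if path is a regular file that the current user can execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


# Assumption: the tarball unpacks a binary named "kfctl" into the current directory.
kfctl_path = os.path.join(os.getcwd(), "kfctl")
if is_executable_file(kfctl_path):
    print("kfctl found at", kfctl_path)
else:
    print("kfctl not found or not executable at", kfctl_path)
```

If the check fails, re-run the download cell and confirm that the release URL matches your platform.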

If you are using AI Platform Notebooks, your environment is already authenticated. Skip the following cell.


In [ ]:
## Create user credentials
! gcloud auth application-default login

Set up environment variables

Set up environment variables to use while installing Kubeflow. Replace variable placeholders (for example, <VARIABLE NAME>) with the correct values for your environment.


In [ ]:
# Set your GCP project ID and the zone where you want to create the Kubeflow deployment
%env PROJECT=<ADD GCP PROJECT HERE>
%env ZONE=<ADD GCP ZONE TO LAUNCH KUBEFLOW CLUSTER HERE>

# Google Cloud Storage bucket
%env GCP_BUCKET=gs://<ADD STORAGE LOCATION HERE>

# Use the following kfctl configuration file for authentication with 
# Cloud IAP (recommended):
uri = "https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_gcp_iap.0.7.0.yaml"
uri = uri.strip()
%env CONFIG_URI=$uri

# To use Cloud IAP for authentication, create environment variables
# from your OAuth client ID and secret:
%env CLIENT_ID=<ADD OAuth CLIENT ID HERE>
%env CLIENT_SECRET=<ADD OAuth CLIENT SECRET HERE>

# Set KF_NAME to the name of your Kubeflow deployment. This value is also
# used as the directory name when creating your configuration directory.
# For example, your deployment name can be 'my-kubeflow' or 'kf-test'.
%env KF_NAME=<ADD KUBEFLOW DEPLOYMENT NAME HERE>

# Set the name of the service account to create and use
# when creating the Kubeflow cluster
%env SA_NAME=<ADD SERVICE ACCOUNT NAME TO BE CREATED HERE>
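
A forgotten placeholder in the cell above typically only surfaces later as a confusing gcloud or kfctl error. The sketch below checks the variables up front. It assumes that an unreplaced value still contains angle brackets, as in the placeholders above; the helper name find_unset_vars is hypothetical.

```python
import os
import re

# Variable names taken from the cell above.
REQUIRED_VARS = ["PROJECT", "ZONE", "GCP_BUCKET", "CONFIG_URI",
                 "CLIENT_ID", "CLIENT_SECRET", "KF_NAME", "SA_NAME"]

# Matches an unreplaced placeholder such as <ADD GCP PROJECT HERE>.
PLACEHOLDER = re.compile(r"<[^<>]*>")


def find_unset_vars(env, names=REQUIRED_VARS):
    """Return the names that are missing, blank, or still contain a <...> placeholder."""
    problems = []
    for name in names:
        value = env.get(name, "")
        if not value.strip() or PLACEHOLDER.search(value):
            problems.append(name)
    return problems


unset = find_unset_vars(os.environ)
if unset:
    print("Still need values for:", ", ".join(unset))
else:
    print("All required variables are set.")
```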

Configure gcloud and add kfctl to your path.


In [ ]:
! gcloud config set project ${PROJECT}

! gcloud config set compute/zone ${ZONE}


# Set the path to the base directory where you want to store one or more
# Kubeflow deployments. For example, /opt/.
# Here we use the current working directory as the base directory.
# Then set the Kubeflow application directory for this deployment.

import os
base = os.getcwd()
%env BASE_DIR=$base

kf_dir = os.getenv('BASE_DIR') + "/" + os.getenv('KF_NAME')
%env KF_DIR=$kf_dir

# The following command is optional. It adds the kfctl binary to your path.
# If you don't add kfctl to your path, you must use the full path
# each time you run kfctl. In this example, the kfctl binary is in
# the current directory.
new_path = os.getenv('PATH') + ":" + os.getenv('BASE_DIR')
%env PATH=$new_path
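
The cell above appends BASE_DIR to PATH every time it runs and fails with a TypeError if KF_NAME is unset. A minimal sketch of a more defensive variant follows; the helper names build_kf_dir and append_to_path are hypothetical and only illustrate the idea.

```python
import os


def build_kf_dir(base_dir, kf_name):
    """Join the base directory and deployment name, failing early if the name is missing."""
    if not kf_name:
        raise ValueError("KF_NAME is not set")
    return os.path.join(base_dir, kf_name)


def append_to_path(current_path, new_dir):
    """Append new_dir to a PATH-style string unless it is already listed."""
    entries = [p for p in current_path.split(os.pathsep) if p]
    if new_dir not in entries:
        entries.append(new_dir)
    return os.pathsep.join(entries)


# Only compute KF_DIR when KF_NAME has been set in an earlier cell.
kf_name = os.getenv("KF_NAME")
if kf_name:
    print("KF_DIR would be", build_kf_dir(os.getcwd(), kf_name))
print("PATH would be", append_to_path(os.getenv("PATH", ""), os.getcwd()))
```

The returned strings can be assigned with %env in the same way as in the cell above. Re-running the cell then leaves PATH unchanged.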

Create service account


In [ ]:
! gcloud iam service-accounts create ${SA_NAME}
! gcloud projects add-iam-policy-binding ${PROJECT} \
  --member serviceAccount:${SA_NAME}@${PROJECT}.iam.gserviceaccount.com \
  --role 'roles/owner'
! gcloud iam service-accounts keys create key.json \
  --iam-account ${SA_NAME}@${PROJECT}.iam.gserviceaccount.com

Set GOOGLE_APPLICATION_CREDENTIALS


In [ ]:
key_path = os.getenv('BASE_DIR') + "/" + 'key.json'
%env GOOGLE_APPLICATION_CREDENTIALS=$key_path
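
If the key creation step failed, GOOGLE_APPLICATION_CREDENTIALS points at a missing or invalid file and kfctl fails later. The sketch below reads the key file and reports which service account it belongs to. It assumes the standard JSON key layout with "type" and "client_email" fields; the helper name read_service_account_email is hypothetical.

```python
import json
import os


def read_service_account_email(key_path):
    """Return client_email from a service account key file, or None if the file does not look like one."""
    try:
        with open(key_path) as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Missing or unreadable file, or content that is not valid JSON.
        return None
    if not isinstance(data, dict) or data.get("type") != "service_account":
        return None
    return data.get("client_email")


# Falls back to key.json in the current directory if the variable is not set.
key_file = os.getenv("GOOGLE_APPLICATION_CREDENTIALS", "key.json")
print("Service account in key file:", read_service_account_email(key_file))
```

The printed address should match ${SA_NAME}@${PROJECT}.iam.gserviceaccount.com from the previous step.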

Set up and deploy Kubeflow


In [ ]:
! mkdir -p ${KF_DIR}
%cd $kf_dir
! kfctl apply -V -f ${CONFIG_URI}

Install Kubeflow Pipelines SDK


In [ ]:
%%capture

# Install the Kubeflow Pipelines SDK (skip this cell if the SDK is already installed)
! pip3 install 'kfp>=0.1.36' --quiet --user
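
Because the install cell suppresses its output, a quick check that the package is visible to the kernel can be useful. This is a minimal sketch using importlib.metadata from the standard library (Python 3.8 or later); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print("kfp version:", installed_version("kfp"))
```

If this prints None right after a --user install, restarting the kernel may be needed before the package becomes visible.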

Sanity check: Check the created ingress


In [ ]:
! kubectl -n istio-system describe ingress

Access the Kubeflow cluster at https://<KF_NAME>.endpoints.<gcp_project_id>.cloud.goog/

Note that it may take 15 to 20 minutes for the URL above to become functional.
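
Rather than refreshing the browser, the URL can be built from the environment variables and probed from the notebook. The sketch below uses only the standard library. It treats any HTTP response, including an error status, as a sign that the endpoint is up; that is an assumption about what counts as functional, not a Kubeflow health check. The helper names kubeflow_url and endpoint_responds are hypothetical.

```python
import os
import urllib.error
import urllib.request


def kubeflow_url(kf_name, project):
    """Build the endpoint URL in the form shown above."""
    return "https://{}.endpoints.{}.cloud.goog/".format(kf_name, project)


def endpoint_responds(url, timeout=10):
    """Return True if the server answers the request at all, even with an HTTP error status."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        # The server answered, for example with an authentication error.
        return True
    except (urllib.error.URLError, OSError):
        # DNS, TLS, or connection failures: the endpoint is not ready yet.
        return False


# Falls back to the literal placeholders if the variables are not set.
url = kubeflow_url(os.getenv("KF_NAME", "<KF_NAME>"),
                   os.getenv("PROJECT", "<gcp_project_id>"))
print(url)
```

Calling endpoint_responds(url) every minute or so gives a rough indication of when the deployment is reachable.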

