CI/CD for TFX pipelines

Learning Objectives

  1. Develop a CI/CD workflow with Cloud Build to build and deploy a machine learning pipeline.
  2. Integrate with GitHub to trigger workflows when the pipeline source repository changes.

In this lab, you will walk through authoring a Cloud Build CI/CD workflow that automatically builds and deploys the same TFX pipeline from lab-02.ipynb. You will also integrate your workflow with GitHub by setting up a trigger that starts the workflow when a new tag is applied to the GitHub repo hosting the pipeline's code.

Setup


In [ ]:
import yaml

# Set `PATH` to include the directory containing TFX CLI.
PATH=%env PATH
%env PATH=/home/jupyter/.local/bin:{PATH}
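The same PATH change can be made without IPython magics, which can help when the setup runs as a plain script. The following is a minimal sketch; the helper name `prepend_to_path` is hypothetical, and the directory is the one used in the cell above.

```python
import os
import shutil


def prepend_to_path(directory, current_path):
    """Return current_path with directory placed first, without duplicating it."""
    parts = [p for p in current_path.split(os.pathsep) if p and p != directory]
    return os.pathsep.join([directory] + parts)


# Directory assumed to hold the TFX CLI after `pip install --user` in this lab.
LOCAL_BIN = "/home/jupyter/.local/bin"

os.environ["PATH"] = prepend_to_path(LOCAL_BIN, os.environ.get("PATH", ""))

# shutil.which returns None when no `tfx` executable is found on PATH.
print("tfx CLI found at:", shutil.which("tfx"))
```

Filtering out an existing copy of the directory keeps PATH from growing each time the cell is re-run.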

In [ ]:
!python -c "import tfx; print('TFX version: {}'.format(tfx.__version__))"

Note: this lab was built and tested with the following package versions:

TFX version: 0.21.4


In [ ]:
%pip install --upgrade --user tfx==0.21.4

Note: you may need to restart the kernel to pick up the correct package versions.

Understanding the Cloud Build workflow

Review the cloudbuild.yaml file to understand how the CI/CD workflow is implemented and how environment-specific settings are abstracted using Cloud Build variables.

The Cloud Build CI/CD workflow automates the steps you walked through manually during lab-02:

  1. Builds the custom TFX image to be used as a runtime execution environment for TFX components and as the AI Platform Training training container.
  2. Compiles the pipeline and uploads it to the KFP environment.
  3. Pushes the custom TFX image to your project's Container Registry

The Cloud Build workflow configuration uses both standard and custom Cloud Build builders. The custom builder encapsulates the TFX CLI.
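To illustrate how substitution variables keep environment-specific values out of the workflow definition, the sketch below uses Python's `string.Template`, whose `$NAME` and `${NAME}` placeholders resemble Cloud Build's. It is an analogy only, not how Cloud Build is implemented, and the argument strings are hypothetical rather than copied from cloudbuild.yaml.

```python
from string import Template

# Hypothetical step arguments written with placeholders instead of fixed values.
step_args = [
    "endpoint=$_ENDPOINT",
    "pipeline=${_PIPELINE_NAME}-$TAG_NAME",
]

# Values supplied at build time; the endpoint here is a made-up example.
substitutions = {
    "_ENDPOINT": "my-host.pipelines.googleusercontent.com",
    "_PIPELINE_NAME": "tfx_covertype_continuous_training",
    "TAG_NAME": "test",
}

# Resolve every placeholder; a missing key raises KeyError.
resolved = [Template(arg).substitute(substitutions) for arg in step_args]

for arg in resolved:
    print(arg)
```

The same definition can then serve several environments by changing only the mapping of values.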

Configuring environment settings

Navigate to the AI Platform Pipelines page in the Google Cloud Console.

1. Create or select an existing Kubernetes cluster (GKE) and deploy AI Platform Pipelines. Make sure to select "Allow access to the following Cloud APIs https://www.googleapis.com/auth/cloud-platform" to allow programmatic access to your pipeline by the Kubeflow SDK for the rest of the lab. Also, provide an App instance name such as "tfx" or "mlops". Note that you may have already deployed an AI Platform Pipelines instance during the Setup for the lab series. If so, you can use that instance in the next step.

Validate the deployment of your AI Platform Pipelines instance in the console before proceeding.

2. Configuring environment settings

Update the below constants with the settings reflecting your lab environment.

  • GCP_REGION - the compute region for AI Platform Training and Prediction
  • ARTIFACT_STORE_URI - the URI of the GCS bucket created during installation of AI Platform Pipelines. The bucket name starts with the kubeflowpipelines- prefix.

In [ ]:
# Use the following command to identify the GCS bucket for metadata and pipeline storage.
!gsutil ls
  • ENDPOINT - the endpoint of your AI Platform Pipelines instance. You can find it on the AI Platform Pipelines page in the Google Cloud Console: open the SETTINGS for your instance and use the value of the host variable in the "Connect to this Kubeflow Pipelines instance from a Python client via Kubeflow Pipelines SDK" section of the SETTINGS window. The format is '....[region].pipelines.googleusercontent.com'.

In [ ]:
GCP_REGION = 'us-central1'
ARTIFACT_STORE_URI = 'gs://kubeflowpipelines-default-l2iv13wnek'
ENDPOINT = '315252b57cfb9312-dot-us-central2.pipelines.googleusercontent.com'
PROJECT_ID = !(gcloud config get-value core/project)
PROJECT_ID = PROJECT_ID[0]
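Before moving on, it can help to check the constants against the formats described above. The following is a minimal sketch; the helper name `check_settings` is hypothetical, and the rule that ENDPOINT is a bare host name without a scheme is an assumption inferred from the stated format.

```python
import re


def check_settings(artifact_store_uri, endpoint):
    """Return a list of problems found in the lab settings (empty if none)."""
    problems = []
    # The bucket created with AI Platform Pipelines uses this prefix.
    if not artifact_store_uri.startswith("gs://kubeflowpipelines-"):
        problems.append("ARTIFACT_STORE_URI should start with gs://kubeflowpipelines-")
    # Assumption: the endpoint is a host name only, with no http(s):// scheme.
    if re.match(r"^https?://", endpoint):
        problems.append("ENDPOINT should not include http:// or https://")
    if not endpoint.rstrip("/").endswith(".pipelines.googleusercontent.com"):
        problems.append("ENDPOINT should end with .pipelines.googleusercontent.com")
    return problems


# Example values taken from the settings cell above.
issues = check_settings(
    "gs://kubeflowpipelines-default-l2iv13wnek",
    "315252b57cfb9312-dot-us-central2.pipelines.googleusercontent.com",
)
print(issues or "Settings look consistent with the expected formats.")
```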

Creating the TFX CLI builder

1. Review the Dockerfile describing the TFX CLI builder.


In [ ]:
!cat tfx-cli/Dockerfile

In [ ]:
!cat tfx-cli/requirements.txt

2. Build the image and push it to your project's Container Registry.

Review the Cloud Build gcloud command line reference for builds submit. Your image should follow the format gcr.io/[PROJECT_ID]/[IMAGE_NAME]:latest. Note the source code for the tfx-cli is in the directory ./tfx-cli.


In [ ]:
IMAGE_NAME='tfx-cli'
TAG='latest'
IMAGE_URI='gcr.io/{}/{}:{}'.format(PROJECT_ID, IMAGE_NAME, TAG)

In [ ]:
!gcloud builds submit --timeout 15m --tag {IMAGE_URI} tfx-cli
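Outside a notebook, the same command can be assembled as an argument list and passed to `subprocess.run`. The sketch below only builds and prints the command; the helper name `build_submit_command` and the project ID `my-project` are hypothetical placeholders.

```python
import shlex


def build_submit_command(image_uri, source_dir, timeout="15m"):
    """Return the gcloud argument list that builds and pushes an image."""
    return [
        "gcloud", "builds", "submit",
        "--timeout", timeout,
        "--tag", image_uri,
        source_dir,
    ]


# `my-project` stands in for the value of PROJECT_ID.
cmd = build_submit_command("gcr.io/my-project/tfx-cli:latest", "tfx-cli")

# Print the shell-equivalent command. To run it for real, one option is
# subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Passing a list instead of a single string avoids shell quoting problems if a value ever contains spaces.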

Manually triggering CI/CD runs

You can manually trigger Cloud Build runs using the gcloud builds submit command.


In [ ]:
PIPELINE_NAME='tfx_covertype_continuous_training'
TAG_NAME='test'
TFX_IMAGE_NAME='lab-03-tfx-image'
DATA_ROOT_URI='gs://workshop-datasets/covertype/small'
MODEL_NAME='tfx_covertype_classifier'
PIPELINE_FOLDER='pipeline'
PIPELINE_DSL='runner.py'
RUNTIME_VERSION='2.1'
PYTHON_VERSION='3.7'
USE_KFP_SA='False'

SUBSTITUTIONS="""
_ENDPOINT={},\
_GCP_REGION={},\
_ARTIFACT_STORE_URI={},\
_TFX_IMAGE_NAME={},\
_DATA_ROOT_URI={},\
_MODEL_NAME={},\
TAG_NAME={},\
_PIPELINE_FOLDER={},\
_PIPELINE_DSL={},\
_PIPELINE_NAME={},\
_RUNTIME_VERSION={},\
_USE_KFP_SA={},\
_PYTHON_VERSION={}
""".format(ENDPOINT, 
           GCP_REGION, 
           ARTIFACT_STORE_URI, 
           TFX_IMAGE_NAME,
           DATA_ROOT_URI,
           MODEL_NAME,
           TAG_NAME, 
           PIPELINE_FOLDER,
           PIPELINE_DSL,
           PIPELINE_NAME,
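           RUNTIME_VERSION,
           # Argument order must match the placeholder order in the template:
           # _USE_KFP_SA comes before _PYTHON_VERSION.
           USE_KFP_SA,
           PYTHON_VERSION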
           ).strip()

In [ ]:
!gcloud builds submit . --config cloudbuild.yaml --substitutions {SUBSTITUTIONS}
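Positional format arguments must line up exactly with the placeholders, which is easy to get wrong with a dozen values. A possible alternative is to build the string from a dictionary so each name sits next to its value. The helper name `format_substitutions` is hypothetical, the endpoint value is a made-up example, and rejecting commas is a conservative assumption, since the list itself is comma-separated.

```python
def format_substitutions(values):
    """Join a mapping into the KEY=VALUE,KEY=VALUE form passed to --substitutions."""
    for key, value in values.items():
        # Assumption: a comma inside a value would be read as a list separator.
        if "," in str(value):
            raise ValueError(f"{key} contains a comma: {value!r}")
    return ",".join(f"{key}={value}" for key, value in values.items())


# A subset of the variables from the cell above, shown with literal values.
substitutions = format_substitutions({
    "_ENDPOINT": "my-host.pipelines.googleusercontent.com",
    "_GCP_REGION": "us-central1",
    "TAG_NAME": "test",
    "_RUNTIME_VERSION": "2.1",
    "_USE_KFP_SA": "False",
    "_PYTHON_VERSION": "3.7",
})
print(substitutions)
```

Because each value is paired with its key, reordering the entries cannot silently assign a value to the wrong variable.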

Setting up GitHub integration

In this exercise, you integrate your CI/CD workflow with GitHub using the Cloud Build GitHub App. You will set up a trigger that starts the CI/CD workflow when a new tag is applied to the GitHub repo managing the pipeline source code. You will use a fork of this repo as your source GitHub repository.

Create a fork of this repo

1. Follow the GitHub documentation to fork this repo.

2. Create a Cloud Build trigger.

Connect the fork you created in the previous step to your Google Cloud project and create a trigger following the steps in the Creating GitHub app trigger article. Use the following values on the Edit trigger form:

Field                                        Value
Name                                         [YOUR TRIGGER NAME]
Description                                  [YOUR TRIGGER DESCRIPTION]
Event                                        Tag
Source                                       [YOUR FORK]
Tag (regex)                                  .*
Build Configuration                          Cloud Build configuration file (yaml or json)
Cloud Build configuration file location      / workshops/tfx-caip-tf21/lab-03-tfx-cicd/cloudbuild.yaml
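The tag pattern .* matches every tag, so any new tag starts a build. The sketch below uses Python's `re` module to show how a stricter pattern would narrow that down. This is an approximation: Cloud Build's own regex engine and matching rules may differ, whole-tag matching is an assumption, and the version-style pattern is a hypothetical example.

```python
import re


def tag_matches(pattern, tag):
    """Return True if the whole tag matches the pattern."""
    return re.fullmatch(pattern, tag) is not None


CATCH_ALL = r".*"                   # the value used in the trigger form above
VERSION_ONLY = r"v\d+\.\d+\.\d+"    # hypothetical stricter alternative

for tag in ["test", "v1.2.3", "release-candidate"]:
    print(tag, tag_matches(CATCH_ALL, tag), tag_matches(VERSION_ONLY, tag))
```

A stricter pattern is one way to keep experimental tags from deploying a pipeline.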

Use the following values for the substitution variables:

Variable             Value
_ENDPOINT            [Your inverting proxy host]
_TFX_IMAGE_NAME      lab-03-tfx-image
_PIPELINE_NAME       tfx_covertype_continuous_training
_PIPELINE_DSL        runner.py
_DATA_ROOT_URI       gs://workshop-datasets/covertype/small
_PIPELINE_FOLDER     workshops/tfx-caip-tf21/lab-03-tfx-cicd/pipeline
_PYTHON_VERSION      3.7
_RUNTIME_VERSION     2.1
_USE_KFP_SA          False

3. Trigger the build.

To start an automated build, create a new release of the repo in GitHub. Alternatively, you can start the build by applying a tag with git and pushing it to your fork.

git tag [TAG NAME]
git push origin --tags

4. Verify the triggered build in the Cloud Build dashboard.

After the build finishes on the Cloud Build dashboard, return to AI Platform Pipelines in the console. Click OPEN PIPELINES DASHBOARD and view the newly deployed pipeline. The pipeline name combines the _PIPELINE_NAME value with the tag: with the substitution values above, the trigger creates a pipeline named tfx_covertype_continuous_training-[TAG NAME], while a trigger whose _PIPELINE_NAME is set to tfx_covertype_continuous_training_github creates one named tfx_covertype_continuous_training_github-[TAG NAME].

Next Steps

In this lab, you walked through authoring a Cloud Build CI/CD workflow that automatically builds and deploys a TFX pipeline. You also integrated your TFX workflow with GitHub by setting up a Cloud Build trigger. In the next lab, you will walk through inspection of TFX metadata and pipeline artifacts created during TFX pipeline runs.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.