Simple regression training with LightGBM through Fairing


In [ ]:
from kubeflow import fairing

# Set up Google Container Registry (GCR) locations for storing the output containers.
# You can use any Docker container registry instead of GCR.
GCP_PROJECT = fairing.cloud.gcp.guess_project_name()
DOCKER_REGISTRY = 'gcr.io/{}/fairing-job'.format(GCP_PROJECT)
BASE_IMAGE = 'gcr.io/{}/lightgbm:latest'.format(GCP_PROJECT)

Build a base image for LightGBM

You only need to build the Docker image once; later runs can reuse it. Building and pushing takes approximately 10 minutes.


In [ ]:
!docker build . -t {BASE_IMAGE}
!docker push {BASE_IMAGE}

Launch a LightGBM training task


In [ ]:
from kubeflow import fairing
from kubeflow.fairing.frameworks import lightgbm

In [ ]:
# Create a bucket to hold the trained model.
# To reuse an existing bucket, set gcs_bucket to its name and skip the `gsutil mb` call.
gcs_bucket = "gs://{}-fairing".format(GCP_PROJECT)
!gsutil mb {gcs_bucket}
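
If the cell above is re-run, `gsutil mb` reports an error because the bucket already exists. One way to make the step repeatable is to check for the bucket first. The sketch below is a hypothetical helper (`ensure_bucket` is not part of Fairing); it assumes `gsutil` is on the path and treats any non-zero exit from `gsutil ls -b` as "bucket missing", which can also hide a permissions problem.

```python
import subprocess


def ensure_bucket(bucket_url, run=subprocess.run):
    """Create bucket_url with `gsutil mb` unless `gsutil ls -b` already finds it.

    Hypothetical helper for this notebook; returns True if a bucket was created.
    The `run` argument exists only so the command runner can be swapped out.
    """
    # `gsutil ls -b <bucket>` exits with 0 when the bucket can be listed.
    probe = run(["gsutil", "ls", "-b", bucket_url], capture_output=True)
    if probe.returncode == 0:
        return False
    # Assumption: a non-zero exit means the bucket does not exist yet.
    run(["gsutil", "mb", bucket_url], check=True)
    return True
```

In the notebook this would be called as `ensure_bucket(gcs_bucket)` in place of the `!gsutil mb` line.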

In [ ]:
params = {
    'task': 'train',
    'boosting_type': 'gbdt',
    'objective': 'regression',
    'metric': {'l2', 'l1'},
    'metric_freq': 1,
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9,
    'bagging_fraction': 0.8,
    'bagging_freq': 5,
    "n_estimators": 10,
    "is_training_metric": "true",
    "valid_data": "gs://fairing-lightgbm/regression-example/regression.test",
    "train_data": "gs://fairing-lightgbm/regression-example/regression.train",
    'verbose': 1,
    "model_output": "{}/lightgbm/example/model.txt".format(gcs_bucket)
}
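
The keys in `params` are the same parameter names that the LightGBM command-line tool reads from a config file, where each line has the form `key=value` and multiple metrics are separated by commas. The sketch below shows how such a dict could be rendered in that format, which can help when checking the values before launching a job. `to_lightgbm_config` is a hypothetical helper written for illustration, not the function Fairing uses internally.

```python
def to_lightgbm_config(params):
    """Render a params dict as LightGBM CLI config text, one key=value per line.

    Illustrative only; Fairing's own serialization may differ.
    """
    lines = []
    for key, value in params.items():
        if isinstance(value, (set, frozenset)):
            # Sets are unordered, so sort them to get stable output.
            value = ",".join(sorted(str(v) for v in value))
        elif isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        elif isinstance(value, bool):
            value = "true" if value else "false"
        lines.append("{}={}".format(key, value))
    return "\n".join(lines)


# A small subset of the parameters used in this notebook.
example = {"task": "train", "metric": {"l2", "l1"}, "num_leaves": 31, "learning_rate": 0.05}
print(to_lightgbm_config(example))
```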

In [ ]:
lightgbm.execute(config=params,
      docker_registry=DOCKER_REGISTRY,
      base_image=BASE_IMAGE)

Let's look at the trained model


In [ ]:
url = params['model_output']
!gsutil cp {url} /tmp/model.txt
!head /tmp/model.txt
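
The model file is plain text: a header of `key=value` lines followed by one block per tree, each starting with `Tree=<index>`. As a rough sketch, the header can be read into a dict for a quick sanity check, for example that the objective matches the one in `params`. `read_model_header` is a hypothetical helper, and the `sample` text is made up for illustration rather than copied from an actual training run.

```python
def read_model_header(text):
    """Collect the key=value lines that appear before the first tree block."""
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Tree="):
            break  # tree definitions start here; the header is over
        if "=" in line:
            key, _, value = line.partition("=")
            header[key] = value
    return header


# Made-up sample in the shape of a LightGBM model file (not real output).
sample = """tree
version=v3
num_class=1
objective=regression
max_feature_idx=27

Tree=0
num_leaves=31
"""

header = read_model_header(sample)
print(header["objective"])
```

For the downloaded model, the same function could be applied to `open('/tmp/model.txt').read()`.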