Sample for KFServing SDK with a custom image

This is a sample for the KFServing SDK using a custom image.

The notebook shows how to use the KFServing SDK to create, get, and delete an InferenceService backed by a custom image.

Setup

  • Your ~/.kube/config should point to a cluster with KFServing installed.
  • Your cluster's Istio Ingress gateway must be network accessible.

Build the Docker image we will be using.

The goal of custom image support is to allow users to bring their own wrapped model inside a container and serve it with KFServing. Note that your container must also run a web server (e.g., Flask) to expose your model's endpoints. This example extends kfserving.KFModel, which uses the Tornado web server.
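For orientation, a KFModel-based server is typically only a few lines. The sketch below is illustrative, not the actual contents of this sample's ./model-server directory; the class name and the echo-style predict body are assumptions.

from typing import Dict

import kfserving

class CustomModel(kfserving.KFModel):
    def __init__(self, name: str):
        super().__init__(name)
        self.ready = False

    def load(self):
        # Load model artifacts here, then mark the server as ready.
        self.ready = True

    def predict(self, request: Dict) -> Dict:
        # request is the parsed JSON body; this stub just echoes the inputs.
        return {"predictions": request["instances"]}

if __name__ == "__main__":
    model = CustomModel("kfserving-custom-model")
    model.load()
    kfserving.KFServer(workers=1).start([model])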

To build and push with Docker Hub, set the DOCKER_HUB_USERNAME variable below to your Docker Hub username.


In [ ]:
# Set this to be your dockerhub username
# It will be used when building your image and when creating the InferenceService for your image
DOCKER_HUB_USERNAME = "your_docker_username"

In [ ]:
%%bash -s "$DOCKER_HUB_USERNAME"
docker build -t $1/kfserving-custom-model ./model-server

In [ ]:
%%bash -s "$DOCKER_HUB_USERNAME"
docker push $1/kfserving-custom-model

KFServing Client SDK

We will use the KFServing client SDK to create the InferenceService and deploy our custom image.


In [ ]:
from kubernetes import client
from kubernetes.client import V1Container

from kfserving import KFServingClient
from kfserving import constants
from kfserving import utils
from kfserving import V1alpha2EndpointSpec
from kfserving import V1alpha2PredictorSpec
from kfserving import V1alpha2InferenceServiceSpec
from kfserving import V1alpha2InferenceService
from kfserving import V1alpha2CustomSpec

In [ ]:
namespace = utils.get_default_target_namespace()
print(namespace)

Define InferenceService

First define the default endpoint spec, and then define the InferenceService using that endpoint spec.

To use a custom image we need V1alpha2CustomSpec, which takes a V1Container from the kubernetes client library.


In [ ]:
api_version = constants.KFSERVING_GROUP + '/' + constants.KFSERVING_VERSION

default_endpoint_spec = V1alpha2EndpointSpec(
                          predictor=V1alpha2PredictorSpec(
                              custom=V1alpha2CustomSpec(
                                  container=V1Container(
                                      name="kfserving-custom-model",
                                      image=f"{DOCKER_HUB_USERNAME}/kfserving-custom-model"))))

isvc = V1alpha2InferenceService(api_version=api_version,
                          kind=constants.KFSERVING_KIND,
                          metadata=client.V1ObjectMeta(
                              name='kfserving-custom-model', namespace=namespace),
                          spec=V1alpha2InferenceServiceSpec(default=default_endpoint_spec))

Create the InferenceService

Call the KFServingClient to create the InferenceService.


In [ ]:
KFServing = KFServingClient()
KFServing.create(isvc)

Check the InferenceService


In [ ]:
KFServing.get('kfserving-custom-model', namespace=namespace, watch=True, timeout_seconds=120)
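With watch=True, the client keeps polling and printing the InferenceService status until it reports ready or the timeout elapses.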

Run a prediction


In [ ]:
MODEL_NAME = "kfserving-custom-model"

In [ ]:
%%bash --out CLUSTER_IP
INGRESS_GATEWAY="istio-ingressgateway"
echo "$(kubectl -n istio-system get service $INGRESS_GATEWAY -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"

In [ ]:
%%bash -s "$MODEL_NAME" --out SERVICE_HOSTNAME
echo "$(kubectl get inferenceservice $1 -o jsonpath='{.status.url}' | cut -d "/" -f 3)"

In [ ]:
import requests
import json

# Load the request payload and POST it to the model's predict endpoint,
# routing through the ingress by setting the Host header to the service hostname.
with open('input.json') as json_file:
    data = json.load(json_file)
    url = f"http://{CLUSTER_IP.strip()}/v1/models/{MODEL_NAME}:predict"
    headers = {"Host": SERVICE_HOSTNAME.strip()}
    result = requests.post(url, data=json.dumps(data), headers=headers)
    print(result.content)

Delete the InferenceService


In [ ]:
KFServing.delete(MODEL_NAME, namespace=namespace)
