This is a sample for the KFServing SDK using a custom image.
The notebook shows how to use the KFServing SDK to create, get, and delete an InferenceService with a custom image.
~/.kube/config
should point to a cluster with KFServing installed.
The goal of custom image support is to allow users to bring their own wrapped model inside a container and serve it with KFServing. Please note that your container must also run a web server, e.g. Flask, to expose your model endpoints. This example extends kfserving.KFModel, which uses the Tornado web server.
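For reference, here is a minimal sketch of what the model server inside ./model-server might look like. The class name, the echo-style predict logic, and the file layout are assumptions for illustration; kfserving.KFModel and kfserving.KFServer are the SDK classes this pattern builds on.

import kfserving
from typing import Dict

class KFServingCustomModel(kfserving.KFModel):
    def __init__(self, name: str):
        super().__init__(name)
        self.name = name
        self.ready = False

    def load(self):
        # Load model weights/artifacts here, then mark the server ready.
        self.ready = True

    def predict(self, request: Dict) -> Dict:
        # "instances" is the payload key in the KFServing v1 data plane protocol.
        instances = request["instances"]
        # Replace with real inference; echoing the inputs keeps this sketch runnable.
        return {"predictions": instances}

if __name__ == "__main__":
    model = KFServingCustomModel("kfserving-custom-model")
    model.load()
    kfserving.KFServer().start([model])

The Dockerfile in ./model-server would then only need to install the kfserving package and run this script as its entrypoint.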
To build and push with Docker Hub, set the DOCKER_HUB_USERNAME
variable below to your Docker Hub username.
In [ ]:
# Set this to be your dockerhub username
# It will be used when building your image and when creating the InferenceService for your image
DOCKER_HUB_USERNAME = "your_docker_username"
In [ ]:
%%bash -s "$DOCKER_HUB_USERNAME"
docker build -t $1/kfserving-custom-model ./model-server
In [ ]:
%%bash -s "$DOCKER_HUB_USERNAME"
docker push $1/kfserving-custom-model
We will use the KFServing client SDK to create the InferenceService and deploy our custom image.
In [ ]:
from kubernetes import client
from kubernetes.client import V1Container
from kfserving import KFServingClient
from kfserving import constants
from kfserving import utils
from kfserving import V1alpha2EndpointSpec
from kfserving import V1alpha2PredictorSpec
from kfserving import V1alpha2InferenceServiceSpec
from kfserving import V1alpha2InferenceService
from kfserving import V1alpha2CustomSpec
In [ ]:
namespace = utils.get_default_target_namespace()
print(namespace)
First define the default endpoint spec, and then define the InferenceService using that endpoint spec.
To use a custom image we need V1alpha2CustomSpec, which takes a V1Container from the kubernetes library.
In [ ]:
api_version = constants.KFSERVING_GROUP + '/' + constants.KFSERVING_VERSION
default_endpoint_spec = V1alpha2EndpointSpec(
    predictor=V1alpha2PredictorSpec(
        custom=V1alpha2CustomSpec(
            container=V1Container(
                name="kfserving-custom-model",
                image=f"{DOCKER_HUB_USERNAME}/kfserving-custom-model"))))

isvc = V1alpha2InferenceService(
    api_version=api_version,
    kind=constants.KFSERVING_KIND,
    metadata=client.V1ObjectMeta(
        name='kfserving-custom-model', namespace=namespace),
    spec=V1alpha2InferenceServiceSpec(default=default_endpoint_spec))
In [ ]:
KFServing = KFServingClient()
KFServing.create(isvc)
In [ ]:
KFServing.get('kfserving-custom-model', namespace=namespace, watch=True, timeout_seconds=120)
In [ ]:
MODEL_NAME = "kfserving-custom-model"
In [ ]:
%%bash --out CLUSTER_IP
INGRESS_GATEWAY="istio-ingressgateway"
echo "$(kubectl -n istio-system get service $INGRESS_GATEWAY -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
In [ ]:
%%bash -s "$MODEL_NAME" --out SERVICE_HOSTNAME
echo "$(kubectl get inferenceservice $1 -o jsonpath='{.status.url}' | cut -d "/" -f 3)"
In [ ]:
import requests
import json

with open('input.json') as json_file:
    data = json.load(json_file)

url = f"http://{CLUSTER_IP.strip()}/v1/models/{MODEL_NAME}:predict"
headers = {"Host": SERVICE_HOSTNAME.strip()}
result = requests.post(url, data=json.dumps(data), headers=headers)
print(result.content)
In [ ]:
KFServing.delete(MODEL_NAME, namespace=namespace)