Sample for KFServing SDK

This is a sample for the KFServing SDK.

The notebook shows how to use the KFServing SDK to create, get, roll out a canary for, promote, and delete an InferenceService.


In [ ]:
from kubernetes import client

from kfserving import KFServingClient
from kfserving import constants
from kfserving import utils
from kfserving import V1alpha2EndpointSpec
from kfserving import V1alpha2PredictorSpec
from kfserving import V1alpha2TensorflowSpec
from kfserving import V1alpha2InferenceServiceSpec
from kfserving import V1alpha2InferenceService
from kubernetes.client import V1ResourceRequirements

Define the namespace the InferenceService will be deployed to. When the SDK runs inside the cluster, the function below returns the namespace it is currently running in; otherwise it falls back to the default namespace.


In [ ]:
namespace = utils.get_default_target_namespace()
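
To deploy to a specific namespace instead, it can be set explicitly. The name below is only an example; the namespace must already exist in the cluster.


In [ ]:
# Optional override with an explicit, existing namespace (example name only)
# namespace = 'kfserving-test'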

Define InferenceService

First define the default endpoint spec, and then define the InferenceService based on that endpoint spec.


In [ ]:
api_version = constants.KFSERVING_GROUP + '/' + constants.KFSERVING_VERSION
default_endpoint_spec = V1alpha2EndpointSpec(
                          predictor=V1alpha2PredictorSpec(
                            tensorflow=V1alpha2TensorflowSpec(
                              storage_uri='gs://kfserving-samples/models/tensorflow/flowers',
                              resources=V1ResourceRequirements(
                                  requests={'cpu':'100m','memory':'1Gi'},
                                  limits={'cpu':'100m', 'memory':'1Gi'}))))
    
isvc = V1alpha2InferenceService(api_version=api_version,
                          kind=constants.KFSERVING_KIND,
                          metadata=client.V1ObjectMeta(
                              name='flower-sample', namespace=namespace),
                          spec=V1alpha2InferenceServiceSpec(default=default_endpoint_spec))

Create InferenceService

Call the KFServingClient to create the InferenceService.


In [ ]:
KFServing = KFServingClient()
KFServing.create(isvc)

Check the InferenceService


In [ ]:
KFServing.get('flower-sample', namespace=namespace, watch=True, timeout_seconds=120)
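
Once the InferenceService reports Ready, the resource can also be fetched without watching; the call returns the custom resource as a dict, from which, for example, the service URL can be read. This is a minimal sketch; the exact status fields depend on the KFServing version.


In [ ]:
# Fetch the InferenceService as a dict and print the service URL from its status
isvc_resource = KFServing.get('flower-sample', namespace=namespace)
print(isvc_resource['status'].get('url'))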

Add Canary to InferenceService

First define the canary endpoint spec, then roll out 10% of the traffic to the canary version and watch the rollout process.


In [ ]:
canary_endpoint_spec = V1alpha2EndpointSpec(
                         predictor=V1alpha2PredictorSpec(
                           tensorflow=V1alpha2TensorflowSpec(
                             storage_uri='gs://kfserving-samples/models/tensorflow/flowers-2',
                             resources=V1ResourceRequirements(
                                 requests={'cpu':'100m','memory':'1Gi'},
                                 limits={'cpu':'100m', 'memory':'1Gi'}))))

KFServing.rollout_canary('flower-sample', canary=canary_endpoint_spec, percent=10,
                         namespace=namespace, watch=True, timeout_seconds=120)
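
To confirm the rollout, the canary traffic split can be read back from the resource. This is a minimal sketch assuming the v1alpha2 canaryTrafficPercent field.


In [ ]:
# Read back the canary traffic percentage from the InferenceService spec (v1alpha2 field name assumed)
isvc_resource = KFServing.get('flower-sample', namespace=namespace)
print(isvc_resource['spec'].get('canaryTrafficPercent'))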

Roll out more traffic to the canary of the InferenceService

Roll out 50% of the traffic to the canary version.


In [ ]:
KFServing.rollout_canary('flower-sample', percent=50, namespace=namespace,
                         watch=True, timeout_seconds=120)

Promote Canary to Default


In [ ]:
KFServing.promote('flower-sample', namespace=namespace, watch=True, timeout_seconds=120)
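
After the promotion, the canary spec becomes the new default and the canary section is removed. This can be verified by fetching the resource again; a minimal sketch assuming v1alpha2 field names.


In [ ]:
# Verify the promotion: the promoted storage URI is now the default, and no canary section remains
isvc_resource = KFServing.get('flower-sample', namespace=namespace)
print(isvc_resource['spec']['default']['predictor']['tensorflow']['storageUri'])
print(isvc_resource['spec'].get('canary'))  # expected to be None after promote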

Delete the InferenceService


In [ ]:
KFServing.delete('flower-sample', namespace=namespace)