Income Prediction Explanations

Census

We will use a scikit-learn classifier trained on the 1996 US Census dataset. It predicts whether income is high (>$50K) or low (<=$50K) from the Census demographic data.

The KFServing resource provides:

A pretrained scikit-learn model stored in a Google Cloud Storage bucket
A pretrained Seldon Alibi tabular explainer. Training the explainer sampled the training data and stored the categorical mapping so that results are human readable. See the Alibi docs for further details on training and setting up a model explainer for your own data.

Note: users of KFServing v0.3.0 should follow the notebook on that branch.



In [ ]:

    
!pygmentize income.yaml



In [ ]:

    
!kubectl apply -f income.yaml



In [ ]:

    
CLUSTER_IPS=!(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
CLUSTER_IP=CLUSTER_IPS[0]
print(CLUSTER_IP)



In [ ]:

    
SERVICE_HOSTNAMES=!(kubectl get inferenceservice income -o jsonpath='{.status.url}' | cut -d "/" -f 3)
SERVICE_HOSTNAME=SERVICE_HOSTNAMES[0]
print(SERVICE_HOSTNAME)
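
The two values above are what a client needs to reach the model through the Istio ingress gateway: the request is sent to the gateway IP, and the `Host` header selects the InferenceService. As a rough sketch of what a helper such as `predict` presumably assembles (the function `build_predict_request` and the example hostname and IP are hypothetical, and the `/v1/models/<name>:predict` path with an `instances` payload is assumed from the KFServing V1 data plane rather than taken from `alibi_helper`):

```python
import json


def build_predict_request(instances, model_name, service_hostname, cluster_ip):
    """Assemble URL, headers and JSON body for a V1 predict call (hypothetical helper)."""
    # The request goes to the ingress gateway IP ...
    url = f"http://{cluster_ip}/v1/models/{model_name}:predict"
    # ... and the Host header routes it to the InferenceService.
    headers = {"Host": service_hostname, "Content-Type": "application/json"}
    # Assumed V1 payload shape: a list of feature rows under "instances".
    body = json.dumps({"instances": instances})
    return url, headers, body


# Placeholder values for illustration only; use CLUSTER_IP and SERVICE_HOSTNAME in the notebook.
url, headers, body = build_predict_request(
    [[39, 7, 1]], "income", "income.default.example.com", "10.0.0.1"
)
print(url)
print(headers["Host"])
```

The request could then be sent with `requests.post(url, headers=headers, data=body)`. Nothing is sent in the sketch itself, so it runs without a cluster.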



In [ ]:

    
import sys
sys.path.append('../')  # make alibi_helper importable from the parent directory
from alibi_helper import *
from alibi.datasets import fetch_adult

# Load the Adult (Census income) dataset
adult = fetch_adult()
# Map each categorical feature index to {code: label} for human-readable output
cmap = {key: dict(enumerate(val)) for key, val in adult.category_map.items()}
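
To make the role of `cmap` concrete, here is a minimal sketch of decoding one encoded row back into labels. The category map and row are toy stand-ins, not values from the real dataset, and `decode_row` is a hypothetical function, not part of `alibi_helper`:

```python
# Toy stand-ins for adult.category_map and one encoded row (illustrative values only).
category_map = {1: ["Private", "State-gov"], 3: ["Married", "Never-Married"]}
row = [39, 1, 13, 0]

# Same mapping the cell above builds: feature index -> {code: label}.
cmap = {key: dict(enumerate(values)) for key, values in category_map.items()}


def decode_row(row, cmap):
    """Replace integer codes with labels for categorical columns; leave other columns unchanged."""
    return [cmap[i][value] if i in cmap else value for i, value in enumerate(row)]


decoded = decode_row(row, cmap)
print(decoded)  # [39, 'State-gov', 13, 'Married']
```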



In [ ]:

    
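# Rows used as the low-income and high-income examples
idxLow = 0
idxHigh = 32554
for idx in [idxLow, idxHigh]:
    # Show the decoded feature values, then the served model's prediction for that row
    show_row([getFeatures([adult.data[idx]], cmap)], adult)
    show_prediction(predict(adult.data[idx:idx+1].tolist(), "income", adult, SERVICE_HOSTNAME, CLUSTER_IP))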

Get Explanation for Low Income Prediction



In [ ]:

    
exp = explain(adult.data[idxLow:idxLow+1].tolist(),"income",SERVICE_HOSTNAME,CLUSTER_IP)



In [ ]:

    
show_anchors(exp['data']['anchor'])

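Show precision and coverage. Precision is how likely it is that predictions for instances matching the anchor features give the same result; coverage is the share of instances the anchor applies to.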



In [ ]:

    
show_bar([exp['data']['precision']],[''],"Precision")
show_bar([exp['data']['coverage']],[''],"Coverage")
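
As a simplified illustration of what these two numbers mean, the sketch below computes them empirically on a small fixed set of rows. This is an assumption-laden toy: Alibi estimates precision and coverage by sampling perturbed instances, and the function `anchor_precision_coverage` and all data here are hypothetical.

```python
def anchor_precision_coverage(rows, predictions, anchor, target):
    """Empirical precision and coverage of an anchor on a fixed dataset.

    anchor: dict mapping feature index -> required value.
    target: the prediction being explained.
    """
    # Rows where every anchor condition holds.
    matching = [
        i for i, row in enumerate(rows)
        if all(row[feature] == value for feature, value in anchor.items())
    ]
    # Coverage: fraction of all rows the anchor applies to.
    coverage = len(matching) / len(rows)
    # Precision: among matching rows, fraction with the same prediction as the target.
    precision = (
        sum(predictions[i] == target for i in matching) / len(matching)
        if matching else 0.0
    )
    return precision, coverage


# Toy data: two binary features and a model prediction per row.
rows = [[0, 1], [0, 1], [0, 0], [1, 1], [0, 1]]
predictions = [0, 0, 0, 1, 1]
anchor = {0: 0, 1: 1}  # feature 0 == 0 AND feature 1 == 1

precision, coverage = anchor_precision_coverage(rows, predictions, anchor, target=0)
print(round(precision, 3), coverage)  # 0.667 0.6
```

Three of the five rows satisfy the anchor (coverage 0.6), and two of those three share the target prediction (precision about 0.667).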



In [ ]:

    
show_feature_coverage(exp['data'])



In [ ]:

    
show_examples(exp['data'],0,adult)



In [ ]:

    
show_examples(exp['data'],0,adult,False)

Get Explanation for High Income Prediction



In [ ]:

    
exp = explain(adult.data[idxHigh:idxHigh+1].tolist(),"income", SERVICE_HOSTNAME,CLUSTER_IP)



In [ ]:

    
show_anchors(exp['data']['anchor'])

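Show precision and coverage. Precision is how likely it is that predictions for instances matching the anchor features give the same result; coverage is the share of instances the anchor applies to.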



In [ ]:

    
show_bar([exp['data']['precision']],[''],"Precision")
show_bar([exp['data']['coverage']],[''],"Coverage")



In [ ]:

    
show_feature_coverage(exp['data'])



In [ ]:

    
show_examples(exp['data'],0,adult)



In [ ]:

    
show_examples(exp['data'],0,adult,False)

Teardown



In [ ]:

    
!kubectl delete -f income.yaml


