This notebook trains a model to mimic the behavior of the COMPAS recidivism classifier and uses the SHAP library to provide a feature-importance score for each prediction the model makes. We can then analyze our COMPAS proxy model for fairness using the What-If Tool, and explore through the SHAP values how much each feature contributed to each prediction.
The specific binary classification task for this model is to determine whether a person belongs to the "Low" risk class according to COMPAS (the negative class), or to the "Medium" or "High" risk class (the positive class). We then analyze the model with the What-If Tool for its ability to predict recidivism within two years of arrest.
For ML fairness background on COMPAS, see:
A simpler version of this notebook that doesn't make use of the SHAP explainer is also available.
Copyright 2019 Google LLC. SPDX-License-Identifier: Apache-2.0
In [0]:
#@title Install What-If Tool Widget and SHAP library
!pip install --upgrade --quiet witwidget shap
In [0]:
#@title Read training dataset from CSV {display-mode: "form"}
import pandas as pd
import numpy as np
import tensorflow as tf
import witwidget
import os
import pickle
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from sklearn.utils import shuffle
df = pd.read_csv('https://storage.googleapis.com/what-if-tool-resources/computefest2019/cox-violent-parsed_filt.csv')
In [0]:
# Preprocess the data
# Filter out entries with no indication of recidivism or no COMPAS score
df = df[df['is_recid'] != -1]
df = df[df['decile_score'] != -1]
# Copy the recidivism indicator into a clearly named ground-truth column
df['recidivism_within_2_years'] = df['is_recid']
# Create a binary label from the COMPAS score text: 0 for 'Low' risk,
# 1 for 'Medium' or 'High' risk. This is the label our proxy model will learn.
df['COMPASS_determination'] = np.where(df['score_text'] == 'Low', 0, 1)
# One-hot encode the categorical sex and race columns
df = pd.get_dummies(df, columns=['sex', 'race'])
# Get list of all columns from the dataset we will use for model input or output.
input_features = ['sex_Female', 'sex_Male', 'age', 'race_African-American', 'race_Caucasian', 'race_Hispanic', 'race_Native American', 'race_Other', 'priors_count', 'juv_fel_count', 'juv_misd_count', 'juv_other_count']
to_keep = input_features + ['recidivism_within_2_years', 'COMPASS_determination']
to_remove = [col for col in df.columns if col not in to_keep]
df = df.drop(columns=to_remove)
input_columns = df.columns.tolist()
labels = df['COMPASS_determination']
df.head()
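As a quick sanity check (not part of the original notebook), you can look at how the two label columns are distributed after this preprocessing:
In [0]:
# Optional check: class balance of the COMPAS-derived label and the
# 2-year recidivism ground truth.
print(df['COMPASS_determination'].value_counts(normalize=True))
print(df['recidivism_within_2_years'].value_counts(normalize=True))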
In [0]:
# Create the data structures needed for training and testing.
# The training data doesn't contain the column we are predicting,
# 'COMPASS_determination', or the column we are using for evaluation of our
# trained model, 'recidivism_within_2_years'.
df_for_training = df.drop(columns=['COMPASS_determination', 'recidivism_within_2_years'])
train_size = int(len(df_for_training) * 0.8)
train_data = df_for_training[:train_size]
train_labels = labels[:train_size]
test_data_with_labels = df[train_size:]
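To confirm the 80/20 split, you can print the resulting shapes (a trivial check, not in the original notebook):
In [0]:
# Optional check: sizes of the training inputs, training labels, and held-out rows.
print(train_data.shape, train_labels.shape, test_data_with_labels.shape)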
In [0]:
# Create the model
# This is the size of the array we'll be feeding into our model for each example
input_size = len(train_data.iloc[0])
model = Sequential()
model.add(Dense(200, input_shape=(input_size,), activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(25, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', optimizer='adam')
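If you want to inspect the layer shapes and parameter counts before training, you can print a summary (optional, not in the original notebook):
In [0]:
# Optional: show the architecture of the proxy model defined above.
model.summary()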
In [0]:
# Train the model
model.fit(train_data.values, train_labels.values, epochs=4, batch_size=32, validation_split=0.1)
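Since the goal of this model is to mimic COMPAS rather than predict recidivism directly, one useful check (a rough sketch, not part of the original notebook) is how often its thresholded predictions agree with the COMPAS-derived label on the held-out rows:
In [0]:
# Optional sketch (assumption: added for illustration): agreement between the
# proxy model (at a 0.5 threshold) and the COMPAS-derived label on the test rows.
test_inputs = test_data_with_labels[df_for_training.columns].values
test_compas = test_data_with_labels['COMPASS_determination'].values
proxy_preds = (model.predict(test_inputs)[:, 0] > 0.5).astype(int)
print('Agreement with COMPAS on test rows: %.3f' % (proxy_preds == test_compas).mean())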
In [0]:
# Create a SHAP DeepExplainer, using a subset of our training data as the
# background dataset for computing expectations.
import shap
explainer = shap.DeepExplainer(model, train_data.values[:200])
In [0]:
# Explain predictions of the model on the first 5 examples from our training set
# to test the SHAP explainer.
shap_values = explainer.shap_values(train_data.values[:5])
shap_values
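To see those attributions graphically outside of WIT, you can pass them to one of SHAP's plotting helpers (an optional sketch, not part of the original notebook):
In [0]:
# Optional: visualize per-feature attributions for the first 5 training examples.
# DeepExplainer returns a list with one array per model output; this model has a
# single sigmoid output, so we take element 0.
shap.summary_plot(shap_values[0], train_data.iloc[:5],
                  feature_names=train_data.columns.tolist())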
In [0]:
#@title Show model results and SHAP values in WIT
from witwidget.notebook.visualization import WitWidget, WitConfigBuilder
num_datapoints = 1000 #@param {type: "number"}
# Indices of the label columns, which WIT passes along with each example but
# which must be stripped out before the data is fed to the model.
columns_not_for_model_input = [
    test_data_with_labels.columns.get_loc("recidivism_within_2_years"),
    test_data_with_labels.columns.get_loc("COMPASS_determination")
]
# Return model predictions and SHAP values for each inference.
def custom_predict_with_shap(examples_to_infer):
  # Delete the label columns, which are not used as model inputs.
  model_inputs = np.delete(
      np.array(examples_to_infer), columns_not_for_model_input, axis=1).tolist()
  # Get the raw score from the model's single sigmoid output and convert it
  # into a [P(negative class), P(positive class)] pair for each example.
  preds = model.predict(model_inputs)
  preds = [[1 - pred[0], pred[0]] for pred in preds]
  # Get the SHAP values from the explainer and create a map of feature name
  # to SHAP value for each example passed to the model.
  shap_output = explainer.shap_values(np.array(model_inputs))[0]
  attributions = []
  for shap_vals in shap_output:
    attrs = {}
    for i, col in enumerate(df_for_training.columns):
      attrs[col] = shap_vals[i]
    attributions.append(attrs)
  ret = {'predictions': preds, 'attributions': attributions}
  return ret
examples_for_shap_wit = test_data_with_labels.values.tolist()
column_names = test_data_with_labels.columns.tolist()
config_builder = WitConfigBuilder(
    examples_for_shap_wit[:num_datapoints],
    feature_names=column_names).set_custom_predict_fn(
        custom_predict_with_shap).set_target_feature('recidivism_within_2_years')
ww = WitWidget(config_builder, height=800)
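Before interacting with the widget, you can sanity-check the custom prediction function directly on a couple of test rows (a hypothetical check, not part of the original notebook):
In [0]:
# Optional check: each example should yield a [P(negative), P(positive)] pair
# and a feature-name -> SHAP-value map.
sample_output = custom_predict_with_shap(examples_for_shap_wit[:2])
print(sample_output['predictions'])
print(sample_output['attributions'][0])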
In "Performance and Fairness" tab, slice the dataset by different features (such as race or sex)
In the "Performance + Fairness" tab, change the cost ratio so that you can optimize the threshold based off of a non-symmetric cost of false positives vs false negatives. Then click the "optimize threshold" button and see the effect on the confusion matrix.