The Fairness design pattern provides techniques for ensuring that model predictions are fair and equitable for different groups of users and scenarios. Evaluating your entire end-to-end ML workflow – from data collection to model deployment – through a fairness lens is essential to building successful, high-quality models.
In [3]:
# If you're running on Colab, you'll need to install the What-if Tool package and authenticate
# If you're on Cloud AI Platform Notebooks, you'll need to install XGBoost on the TF instance
def pip_install(module):
  !pip install {module} --quiet

try:
  import google.colab
  IN_COLAB = True
except:
  IN_COLAB = False

if IN_COLAB:
  pip_install('witwidget')
  from google.colab import auth
  auth.authenticate_user()
else:
  pip_install('xgboost')
|████████████████████████████████| 2.3MB 2.8MB/s
In [4]:
import pandas as pd
import xgboost as xgb
import numpy as np
import collections
import witwidget
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.utils import shuffle
from witwidget.notebook.visualization import WitWidget, WitConfigBuilder
In this section we'll:
1. Download a subset of the public HMDA mortgage dataset (in this pre-processed subset, categorical codes have already been mapped to readable values, e.g. agency code 1 becomes Office of the Comptroller of the Currency (OCC))
2. Explore the data with the What-If Tool before training a model
3. Train an XGBoost model to predict whether a mortgage application will be approved
4. Use the What-If Tool to evaluate the trained model's performance and fairness
In [ ]:
# Use a small subset of the data since the original dataset is too big for Colab (2.5GB)
# Data source: https://www.ffiec.gov/hmda/hmdaflat.htm
!gsutil cp gs://mortgage_dataset_files/mortgage-small.csv .
Copying gs://mortgage_dataset_files/mortgage-small.csv...
/ [1 files][330.8 MiB/330.8 MiB]
Operation completed over 1 objects/330.8 MiB.
In [ ]:
# Set column dtypes for Pandas
COLUMN_NAMES = collections.OrderedDict({
  'as_of_year': np.int16,
  'agency_code': 'category',
  'loan_type': 'category',
  'property_type': 'category',
  'loan_purpose': 'category',
  'occupancy': np.int8,
  'loan_amt_thousands': np.float64,
  'preapproval': 'category',
  'county_code': np.float64,
  'applicant_income_thousands': np.float64,
  'purchaser_type': 'category',
  'hoepa_status': 'category',
  'lien_status': 'category',
  'population': np.float64,
  'ffiec_median_fam_income': np.float64,
  'tract_to_msa_income_pct': np.float64,
  'num_owner_occupied_units': np.float64,
  'num_1_to_4_family_units': np.float64,
  'approved': np.int8
})
In [ ]:
# Load data into Pandas
data = pd.read_csv(
  'mortgage-small.csv',
  index_col=False,
  dtype=COLUMN_NAMES
)
data = data.dropna()
data = shuffle(data, random_state=2)
data.head()
Out[ ]:
First five rows of the dataset (data.head()), shown transposed with one column per example row:

| column | 310650 | 630129 | 715484 | 887708 | 719598 |
| --- | --- | --- | --- | --- | --- |
| as_of_year | 2016 | 2016 | 2016 | 2016 | 2016 |
| agency_code | Consumer Financial Protection Bureau (CFPB) | Department of Housing and Urban Development (HUD) | Federal Deposit Insurance Corporation (FDIC) | Office of the Comptroller of the Currency (OCC) | National Credit Union Administration (NCUA) |
| loan_type | Conventional (any loan other than FHA, VA, FSA... | Conventional (any loan other than FHA, VA, FSA... | Conventional (any loan other than FHA, VA, FSA... | Conventional (any loan other than FHA, VA, FSA... | Conventional (any loan other than FHA, VA, FSA... |
| property_type | One to four-family (other than manufactured ho... | One to four-family (other than manufactured ho... | One to four-family (other than manufactured ho... | One to four-family (other than manufactured ho... | One to four-family (other than manufactured ho... |
| loan_purpose | Refinancing | Home purchase | Refinancing | Refinancing | Refinancing |
| occupancy | 1 | 1 | 2 | 1 | 1 |
| loan_amt_thousands | 110.0 | 480.0 | 240.0 | 76.0 | 100.0 |
| preapproval | Not applicable | Not applicable | Not applicable | Not applicable | Not applicable |
| county_code | 119.0 | 33.0 | 59.0 | 65.0 | 127.0 |
| applicant_income_thousands | 55.0 | 270.0 | 96.0 | 85.0 | 70.0 |
| purchaser_type | Freddie Mac (FHLMC) | Loan was not originated or was not sold in cal... | Commercial bank, savings bank or savings assoc... | Loan was not originated or was not sold in cal... | Loan was not originated or was not sold in cal... |
| hoepa_status | Not a HOEPA loan | Not a HOEPA loan | Not a HOEPA loan | Not a HOEPA loan | Not a HOEPA loan |
| lien_status | Secured by a first lien | Secured by a first lien | Secured by a first lien | Secured by a subordinate lien | Secured by a first lien |
| population | 5930.0 | 4791.0 | 3439.0 | 3952.0 | 2422.0 |
| ffiec_median_fam_income | 64100.0 | 90300.0 | 105700.0 | 61300.0 | 46400.0 |
| tract_to_msa_income_pct | 98.81 | 144.06 | 104.62 | 90.93 | 88.37 |
| num_owner_occupied_units | 1305.0 | 1420.0 | 853.0 | 1272.0 | 650.0 |
| num_1_to_4_family_units | 1631.0 | 1450.0 | 1076.0 | 1666.0 | 1006.0 |
| approved | 1 | 0 | 1 | 1 | 1 |
The What-If Tool can be used before you've built or trained a model by passing it a dataset directly. After running the cell below, change the "Color By" dropdown to approved
to see how the data is distributed by our label class.
You can also experiment with further slicing the data. For example, try changing the "Binning | Y-Axis" dropdown to loan_type
to visualize the percentage of applications approved for each loan type in the dataset.
In [ ]:
# Show WIT before training model by passing it only a dataset
config_builder = (WitConfigBuilder(data[:1000].values.tolist(), data.columns.tolist()))
WitWidget(config_builder, height=800)
Out[ ]:
<witwidget.notebook.colab.wit.WitWidget at 0x7fc706ea2208>
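If you'd like to sanity-check what the binning view shows outside the widget, a minimal pandas sketch (using the data frame loaded above, which still contains the approved column at this point) is to look at the approval rate and example count per loan purpose and loan type:

# Sketch: approval rate and example count for each loan purpose / loan type in the sample
for col in ['loan_purpose', 'loan_type']:
  print(data.groupby(col)['approved'].agg(['mean', 'count']), '\n')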
Based on our What-If Tool analysis, we'll limit the dataset to only include loans for home purchases or refinancing since we don't have quite enough data on other loans.
In [ ]:
data = data[data['loan_purpose'].isin(['Home purchase', 'Refinancing'])]
In [ ]:
# Label preprocessing
labels = data['approved'].values
# See the distribution of approved / denied classes (0: denied, 1: approved)
print(data['approved'].value_counts())
1 623613
0 303387
Name: approved, dtype: int64
In [ ]:
data = data.drop(columns=['approved'])
For XGBoost, all model inputs need to be numeric, so we'll use the Pandas get_dummies
method to convert each categorical column into a set of boolean indicator columns.
In [ ]:
# Convert categorical columns to dummy columns
dummy_columns = list(data.dtypes[data.dtypes == 'category'].index)
data = pd.get_dummies(data, columns=dummy_columns)
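As a toy illustration of what this does (a standalone sketch, separate from the notebook's data flow), get_dummies expands a single categorical column into one indicator column per category value:

# Standalone example: each category value becomes its own indicator column
toy = pd.DataFrame({'loan_purpose': ['Home purchase', 'Refinancing', 'Home purchase']})
# Produces loan_purpose_Home purchase and loan_purpose_Refinancing columns
# (0/1 or True/False indicators, depending on your pandas version)
print(pd.get_dummies(toy, columns=['loan_purpose']))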
In [ ]:
# Preview the data
data.head()
Out[ ]:
First five rows after one-hot encoding (data.head()), shown transposed with one column per example row:

| column | 310650 | 630129 | 715484 | 887708 | 719598 |
| --- | --- | --- | --- | --- | --- |
| as_of_year | 2016 | 2016 | 2016 | 2016 | 2016 |
| occupancy | 1 | 1 | 2 | 1 | 1 |
| loan_amt_thousands | 110.0 | 480.0 | 240.0 | 76.0 | 100.0 |
| county_code | 119.0 | 33.0 | 59.0 | 65.0 | 127.0 |
| applicant_income_thousands | 55.0 | 270.0 | 96.0 | 85.0 | 70.0 |
| population | 5930.0 | 4791.0 | 3439.0 | 3952.0 | 2422.0 |
| ffiec_median_fam_income | 64100.0 | 90300.0 | 105700.0 | 61300.0 | 46400.0 |
| tract_to_msa_income_pct | 98.81 | 144.06 | 104.62 | 90.93 | 88.37 |
| num_owner_occupied_units | 1305.0 | 1420.0 | 853.0 | 1272.0 | 650.0 |
| num_1_to_4_family_units | 1631.0 | 1450.0 | 1076.0 | 1666.0 | 1006.0 |
| agency_code_Consumer Financial Protection Bureau (CFPB) | 1 | 0 | 0 | 0 | 0 |
| agency_code_Department of Housing and Urban Development (HUD) | 0 | 1 | 0 | 0 | 0 |
| agency_code_Federal Deposit Insurance Corporation (FDIC) | 0 | 0 | 1 | 0 | 0 |
| agency_code_Federal Reserve System (FRS) | 0 | 0 | 0 | 0 | 0 |
| agency_code_National Credit Union Administration (NCUA) | 0 | 0 | 0 | 0 | 1 |
| agency_code_Office of the Comptroller of the Currency (OCC) | 0 | 0 | 0 | 1 | 0 |
| loan_type_Conventional (any loan other than FHA, VA, FSA, or RHS loans) | 1 | 1 | 1 | 1 | 1 |
| loan_type_FHA-insured (Federal Housing Administration) | 0 | 0 | 0 | 0 | 0 |
| loan_type_FSA/RHS (Farm Service Agency or Rural Housing Service) | 0 | 0 | 0 | 0 | 0 |
| loan_type_VA-guaranteed (Veterans Administration) | 0 | 0 | 0 | 0 | 0 |
| property_type_Manufactured housing | 0 | 0 | 0 | 0 | 0 |
| property_type_One to four-family (other than manufactured housing) | 1 | 1 | 1 | 1 | 1 |
| loan_purpose_Home improvement | 0 | 0 | 0 | 0 | 0 |
| loan_purpose_Home purchase | 0 | 1 | 0 | 0 | 0 |
| loan_purpose_Refinancing | 1 | 0 | 1 | 1 | 1 |
| preapproval_Not applicable | 1 | 1 | 1 | 1 | 1 |
| preapproval_Preapproval was not requested | 0 | 0 | 0 | 0 | 0 |
| preapproval_Preapproval was requested | 0 | 0 | 0 | 0 | 0 |
| purchaser_type_Affiliate institution | 0 | 0 | 0 | 0 | 0 |
| purchaser_type_Commercial bank, savings bank or savings association | 0 | 0 | 1 | 0 | 0 |
| purchaser_type_Fannie Mae (FNMA) | 0 | 0 | 0 | 0 | 0 |
| purchaser_type_Farmer Mac (FAMC) | 0 | 0 | 0 | 0 | 0 |
| purchaser_type_Freddie Mac (FHLMC) | 1 | 0 | 0 | 0 | 0 |
| purchaser_type_Ginnie Mae (GNMA) | 0 | 0 | 0 | 0 | 0 |
| purchaser_type_Life insurance company, credit union, mortgage bank, or finance company | 0 | 0 | 0 | 0 | 0 |
| purchaser_type_Loan was not originated or was not sold in calendar year covered by register | 0 | 1 | 0 | 1 | 1 |
| purchaser_type_Other type of purchaser | 0 | 0 | 0 | 0 | 0 |
| purchaser_type_Private securitization | 0 | 0 | 0 | 0 | 0 |
| hoepa_status_HOEPA loan | 0 | 0 | 0 | 0 | 0 |
| hoepa_status_Not a HOEPA loan | 1 | 1 | 1 | 1 | 1 |
| lien_status_Not applicable (purchased loans) | 0 | 0 | 0 | 0 | 0 |
| lien_status_Not secured by a lien | 0 | 0 | 0 | 0 | 0 |
| lien_status_Secured by a first lien | 1 | 1 | 1 | 0 | 1 |
| lien_status_Secured by a subordinate lien | 0 | 0 | 0 | 1 | 0 |
In [ ]:
# Split the data into train / test sets
x, y = data, labels
x_train, x_test, y_train, y_test = train_test_split(x, y)
In [ ]:
x_train = x_train.astype(float)
x_test = x_test.astype(float)
In [ ]:
# Train the model, this will take a few minutes to run
bst = xgb.XGBClassifier(
  objective='binary:logistic'
)
bst.fit(x_train, y_train)
Out[ ]:
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, gamma=0,
learning_rate=0.1, max_delta_step=0, max_depth=3,
min_child_weight=1, missing=None, n_estimators=100, n_jobs=1,
nthread=None, objective='binary:logistic', random_state=0,
reg_alpha=0, reg_lambda=1, scale_pos_weight=1, seed=None,
silent=None, subsample=1, verbosity=1)
In [ ]:
# Get predictions on the test set and print the accuracy score
y_pred = bst.predict(x_test)
acc = accuracy_score(y_test, y_pred.round())
print(acc, '\n')
0.881967637540453
In [ ]:
# Print a confusion matrix
print('Confusion matrix:')
cm = confusion_matrix(y_test, y_pred.round())
cm = cm / cm.astype(float).sum(axis=1)
print(cm)
Confusion matrix:
[[0.86590297 0.06469772]
[0.22857749 0.88971835]]
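Note that the division above broadcasts the row sums across columns, so the printed rows don't each sum to 1. If you're on scikit-learn 0.22 or newer, you can ask for a properly row-normalized matrix directly; a minimal sketch:

# Each row (true class) sums to 1: the rate of each predicted outcome per ground-truth class
cm_normalized = confusion_matrix(y_test, y_pred.round(), normalize='true')
print(cm_normalized)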
In [ ]:
# Format a subset of the test data to send to the What-if Tool for visualization
# Append ground truth label value to the test data
# This is the number of examples you want to display in the What-if Tool
num_wit_examples = 1000
test_examples = np.hstack((x_test[:num_wit_examples].values,y_test[:num_wit_examples].reshape(-1,1)))
In [ ]:
# Create a What-if Tool visualization, it may take a minute to load
# See the cell below this for exploration ideas
# This prediction adjustment function is needed as this xgboost model's
# prediction returns just a score for the positive class of the binary
# classification, whereas the What-If Tool expects a list of scores for each
# class (in this case, both the negative class and the positive class).
def custom_fn(examples):
  df = pd.DataFrame(examples, columns=x_train.columns.tolist())
  preds = bst.predict_proba(df)
  return preds

config_builder = (WitConfigBuilder(test_examples.tolist(), data.columns.tolist() + ['mortgage_status'])
  .set_custom_predict_fn(custom_fn)
  .set_target_feature('mortgage_status')
  .set_label_vocab(['denied', 'approved']))
WitWidget(config_builder, height=800)
Out[ ]:
<witwidget.notebook.colab.wit.WitWidget at 0x7fc706f187b8>
Here are a few ways to explore the model's results in the What-If Tool:
* Individual data points: the default view shows all datapoints from the test set, colored by their ground truth label (approved or denied).
* Binning data: create separate graphs for individual features using the binning dropdowns.
* Exploring overall performance: click on the "Performance & Fairness" tab to view overall performance statistics on the model's results on the provided dataset, including confusion matrices, PR curves, and ROC curves. The sketch after this list shows how to compute similar per-slice numbers in code.
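As a rough sketch of the kind of per-slice comparison the Performance & Fairness tab makes (assuming x_test, y_test, and y_pred from the cells above, and slicing on the one-hot loan_purpose columns), you can compute the positive-prediction rate and true positive rate per slice; equal opportunity asks whether the true positive rate is similar across slices:

# Per-slice metrics (sketch): positive-prediction rate (demographic-parity style view)
# and true positive rate (equal-opportunity style view) for each loan_purpose slice
for col in ['loan_purpose_Home purchase', 'loan_purpose_Refinancing']:
  mask = (x_test[col] == 1).values
  y_true_slice, y_pred_slice = y_test[mask], y_pred[mask]
  pos_rate = y_pred_slice.mean()
  tpr = y_pred_slice[y_true_slice == 1].mean()
  print(f'{col}: n={mask.sum()}, positive rate={pos_rate:.3f}, TPR={tpr:.3f}')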
Copyright 2020 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License
Content source: GoogleCloudPlatform/ml-design-patterns