Licensed under the Apache License, Version 2.0 (the 'License'); you may not use this file except in compliance with the License. You may obtain a copy of the License at

https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an 'AS IS' BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

This colab contains TensorFlow code for implementing the constrained optimization methods presented in the paper:

Harikrishna Narasimhan, Andrew Cotter, Maya Gupta, Serena Wang, 'Pairwise Fairness for Ranking and Regression', AAAI 2020. [link]

First, let's install and import the relevant libraries.


In [0]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import random
import sys
from sklearn import model_selection
import tensorflow as tf

In [4]:
!pip install git+https://github.com/google-research/tensorflow_constrained_optimization


Collecting git+https://github.com/google-research/tensorflow_constrained_optimization
  Cloning https://github.com/google-research/tensorflow_constrained_optimization to /tmp/pip-req-build-qdt4wk1d
  Running command git clone -q https://github.com/google-research/tensorflow_constrained_optimization /tmp/pip-req-build-qdt4wk1d
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from tfco-nightly==0.3.dev20200613) (1.18.5)
Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from tfco-nightly==0.3.dev20200613) (1.4.1)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from tfco-nightly==0.3.dev20200613) (1.12.0)
Requirement already satisfied: tensorflow>=1.14 in /usr/local/lib/python3.6/dist-packages (from tfco-nightly==0.3.dev20200613) (2.2.0)
Requirement already satisfied: tensorflow-estimator<2.3.0,>=2.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (2.2.0)
Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (0.9.0)
Requirement already satisfied: google-pasta>=0.1.8 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (0.2.0)
Requirement already satisfied: keras-preprocessing>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.1.2)
Requirement already satisfied: tensorboard<2.3.0,>=2.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (2.2.2)
Requirement already satisfied: protobuf>=3.8.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (3.10.0)
Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (3.2.1)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.1.0)
Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.29.0)
Requirement already satisfied: wheel>=0.26; python_version >= "3" in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (0.34.2)
Requirement already satisfied: h5py<2.11.0,>=2.10.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (2.10.0)
Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.12.1)
Requirement already satisfied: astunparse==1.6.3 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.6.3)
Requirement already satisfied: gast==0.3.3 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (0.3.3)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.6.0.post3)
Requirement already satisfied: requests<3,>=2.21.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (2.23.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (3.2.2)
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (47.1.1)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.0.1)
Requirement already satisfied: google-auth<2,>=1.6.3 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.7.2)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (0.4.1)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (2.9)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.24.3)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (2020.4.5.1)
Requirement already satisfied: importlib-metadata; python_version < "3.8" in /usr/local/lib/python3.6/dist-packages (from markdown>=2.6.8->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.6.0)
Requirement already satisfied: rsa<4.1,>=3.1.4 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (4.0)
Requirement already satisfied: cachetools<3.2,>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (3.1.1)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (0.2.8)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (1.3.0)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.6/dist-packages (from importlib-metadata; python_version < "3.8"->markdown>=2.6.8->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (3.1.0)
Requirement already satisfied: pyasn1>=0.1.3 in /usr/local/lib/python3.6/dist-packages (from rsa<4.1,>=3.1.4->google-auth<2,>=1.6.3->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (0.4.8)
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.6/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.3.0,>=2.2.0->tensorflow>=1.14->tfco-nightly==0.3.dev20200613) (3.1.0)
Building wheels for collected packages: tfco-nightly
  Building wheel for tfco-nightly (setup.py) ... done
  Created wheel for tfco-nightly: filename=tfco_nightly-0.3.dev20200613-cp36-none-any.whl size=148305 sha256=20125b69ff217dcf5a0dab72f9de0760456a7319ba6837b2dcbef6344aa52526
  Stored in directory: /tmp/pip-ephem-wheel-cache-syjh4pkj/wheels/c9/b3/c3/78e0691949466af462380554286105216cd95a9ae7cf08ee78
Successfully built tfco-nightly
Installing collected packages: tfco-nightly
Successfully installed tfco-nightly-0.3.dev20200613

In [0]:
import tensorflow_constrained_optimization as tfco

Pairwise Regression Fairness

We will be training a linear scoring function $f(x) = w^\top x$ where $x \in \mathbb{R}^d$ is the input feature vector. Our goal is to train the regression model subject to pairwise fairness constraints.

Specifically, for the regression model $f$, we denote:

  • $sqerr(f)$ as the squared error for model $f$. $$ sqerr(f) = \mathbf{E}\big[\big(f(x) - y\big)^2\big] $$
  • $err_{i,j}(f)$ as the pairwise error over example pairs where the higher label example is from group $i$, and the lower label example is from group $j$.
$$ err_{i, j}(f) = \mathbf{E}\big[\mathbb{I}\big(f(x) < f(x')\big) \,\big|\, y > y',~ grp(x) = i, ~grp(x') = j\big] $$


We then wish to solve the following constrained problem: $$min_f\; sqerr(f)$$ $$\text{ s.t. } |err_{i,j}(f) - err_{k,\ell}(f)| \leq \epsilon \;\;\; \forall ((i,j), (k,\ell)) \in \mathcal{G},$$

where $\mathcal{G}$ contains the pairs we are interested in constraining.

Load Communities & Crime Data

We will use the benchmark Communities and Crimes dataset from the UCI Machine Learning repository for our illustration. This dataset contains various demographic and racial distribution details (aggregated from census and law enforcement data sources) about different communities in the US, along with the per capita crime rate in each commmunity. Our goal is to predict the crime rate for a community, a regression problem. We consider communities where the percentage of black population is above the 70-th percentile as the protected group.


In [0]:
# We will divide the data into 25 minibatches and refer to them as 'queries'.
num_queries = 25

# List of column names in the dataset.
column_names = ["state", "county", "community", "communityname", "fold", "population", "householdsize", "racepctblack", "racePctWhite", "racePctAsian", "racePctHisp", "agePct12t21", "agePct12t29", "agePct16t24", "agePct65up", "numbUrban", "pctUrban", "medIncome", "pctWWage", "pctWFarmSelf", "pctWInvInc", "pctWSocSec", "pctWPubAsst", "pctWRetire", "medFamInc", "perCapInc", "whitePerCap", "blackPerCap", "indianPerCap", "AsianPerCap", "OtherPerCap", "HispPerCap", "NumUnderPov", "PctPopUnderPov", "PctLess9thGrade", "PctNotHSGrad", "PctBSorMore", "PctUnemployed", "PctEmploy", "PctEmplManu", "PctEmplProfServ", "PctOccupManu", "PctOccupMgmtProf", "MalePctDivorce", "MalePctNevMarr", "FemalePctDiv", "TotalPctDiv", "PersPerFam", "PctFam2Par", "PctKids2Par", "PctYoungKids2Par", "PctTeen2Par", "PctWorkMomYoungKids", "PctWorkMom", "NumIlleg", "PctIlleg", "NumImmig", "PctImmigRecent", "PctImmigRec5", "PctImmigRec8", "PctImmigRec10", "PctRecentImmig", "PctRecImmig5", "PctRecImmig8", "PctRecImmig10", "PctSpeakEnglOnly", "PctNotSpeakEnglWell", "PctLargHouseFam", "PctLargHouseOccup", "PersPerOccupHous", "PersPerOwnOccHous", "PersPerRentOccHous", "PctPersOwnOccup", "PctPersDenseHous", "PctHousLess3BR", "MedNumBR", "HousVacant", "PctHousOccup", "PctHousOwnOcc", "PctVacantBoarded", "PctVacMore6Mos", "MedYrHousBuilt", "PctHousNoPhone", "PctWOFullPlumb", "OwnOccLowQuart", "OwnOccMedVal", "OwnOccHiQuart", "RentLowQ", "RentMedian", "RentHighQ", "MedRent", "MedRentPctHousInc", "MedOwnCostPctInc", "MedOwnCostPctIncNoMtg", "NumInShelters", "NumStreet", "PctForeignBorn", "PctBornSameState", "PctSameHouse85", "PctSameCity85", "PctSameState85", "LemasSwornFT", "LemasSwFTPerPop", "LemasSwFTFieldOps", "LemasSwFTFieldPerPop", "LemasTotalReq", "LemasTotReqPerPop", "PolicReqPerOffic", "PolicPerPop", "RacialMatchCommPol", "PctPolicWhite", "PctPolicBlack", "PctPolicHisp", "PctPolicAsian", "PctPolicMinor", "OfficAssgnDrugUnits", "NumKindsDrugsSeiz", "PolicAveOTWorked", "LandArea", "PopDens", "PctUsePubTrans", "PolicCars", "PolicOperBudg", "LemasPctPolicOnPatr", "LemasGangUnitDeploy", "LemasPctOfficDrugUn", "PolicBudgPerPop", "ViolentCrimesPerPop"]

dataset_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/communities/communities.data"

# Read dataset from the UCI web repository and assign column names.
data_df = pd.read_csv(dataset_url, sep=",", names=column_names,
                      na_values="?")

# Make sure that there are no missing values in the "ViolentCrimesPerPop" column.
assert(not data_df["ViolentCrimesPerPop"].isna().any())

# Real-valued label: "ViolentCrimesPerPop".
labels_df = data_df["ViolentCrimesPerPop"]

# Now that we have assigned binary labels, 
# we drop the "ViolentCrimesPerPop" column from the data frame.
data_df.drop(columns="ViolentCrimesPerPop", inplace=True)

# Group features.
race_black_70_percentile = data_df["racepctblack"].quantile(q=0.7)
groups_df = (data_df["racepctblack"] >= race_black_70_percentile)

# Drop categorical features.
data_df.drop(columns=["state", "county", "community", "communityname", "fold"],
             inplace=True)

# Handle missing features.
feature_names = data_df.columns
for feature_name in feature_names:  
    missing_rows = data_df[feature_name].isna()  # Which rows have missing values?
    if missing_rows.any():  # Check if at least one row has a missing value.
        data_df[feature_name].fillna(0.0, inplace=True)  # Fill NaN with 0.
        missing_rows.rename(feature_name + "_is_missing", inplace=True)
        data_df = data_df.join(missing_rows)  # Append boolean "is_missing" feature.

labels = labels_df.values.astype(np.float32)
groups = groups_df.values.astype(np.float32)
features = data_df.values.astype(np.float32)

# Set random seed so that the results are reproducible.
np.random.seed(123456)

# We randomly divide the examples into 'num_queries' queries.
queries = np.random.randint(0, num_queries, size=features.shape[0])

# Train and test indices.
train_indices, test_indices = model_selection.train_test_split(
    range(features.shape[0]), test_size=0.4)

# Train features, labels and protected groups.
train_set = {
  'features': features[train_indices, :],
  'labels': labels[train_indices],
  'groups': groups[train_indices],
  'queries': queries[train_indices],
  'dimension': features.shape[-1],
  'num_queries': num_queries
}

# Test features, labels and protected groups.
test_set = {
  'features': features[test_indices, :],
  'labels': labels[test_indices],
  'groups': groups[test_indices],
  'queries': queries[test_indices],
  'dimension': features.shape[-1],
  'num_queries': num_queries
}

Evaluation Metrics

We will need functions to convert labeled data into paired data.


In [0]:
def pair_high_low_docs(data):
  # Returns a DataFrame of pairs of larger-smaller labeled regression examples
  # given in DataFrame.
  # For all pairs of docs, and remove rows that are not needed.
  pos_docs = data.copy()
  neg_docs = data.copy()

  # Include a merge key.
  pos_docs.insert(0, "merge_key", 0)
  neg_docs.insert(0, "merge_key", 0)

  # Merge docs and drop merge key and label column.
  pairs = pos_docs.merge(neg_docs, on="merge_key", how="outer",
                         suffixes=("_pos", "_neg"))

  # Only retain rows where label_pos > label_neg.
  pairs = pairs[pairs.label_pos > pairs.label_neg]

  # Drop merge_key.
  pairs.drop(columns=["merge_key"], inplace=True)
  return pairs


def convert_labeled_to_paired_data(data_dict, index=None):
  # Forms pairs of examples from each batch/query.

  # Converts data arrays to pandas DataFrame with required column names and
  # makes a call to convert_df_to_pairs and returns a dictionary.
  features = data_dict['features']
  labels = data_dict['labels']
  groups = data_dict['groups']
  queries = data_dict['queries']

  if index is not None:
    data_df = pd.DataFrame(features[queries == index, :])
    data_df = data_df.assign(label=pd.DataFrame(labels[queries == index]))
    data_df = data_df.assign(group=pd.DataFrame(groups[queries == index]))
    data_df = data_df.assign(query_id=pd.DataFrame(queries[queries == index]))
  else:
    data_df = pd.DataFrame(features)
    data_df = data_df.assign(label=pd.DataFrame(labels))
    data_df = data_df.assign(group=pd.DataFrame(groups))
    data_df = data_df.assign(query_id=pd.DataFrame(queries))

  # Forms pairs of positive-negative docs for each query in given DataFrame
  # if the DataFrame has a query_id column. Otherise forms pairs from all rows
  # of the DataFrame.
  data_pairs = data_df.groupby('query_id').apply(pair_high_low_docs)

  # Create groups ndarray.
  pos_groups = data_pairs['group_pos'].values.reshape(-1, 1)
  neg_groups = data_pairs['group_neg'].values.reshape(-1, 1)
  group_pairs = np.concatenate((pos_groups, neg_groups), axis=1)

  # Create queries ndarray.
  query_pairs = data_pairs['query_id_pos'].values.reshape(-1,)

  # Create features ndarray.
  feature_names = data_df.columns
  feature_names = feature_names.drop(['query_id', 'label'])
  feature_names = feature_names.drop(['group'])

  pos_features = data_pairs[[str(s) + '_pos' for s in feature_names]].values
  pos_features = pos_features.reshape(-1, 1, len(feature_names))

  neg_features = data_pairs[[str(s) + '_neg' for s in feature_names]].values
  neg_features = neg_features.reshape(-1, 1, len(feature_names))

  feature_pairs = np.concatenate((pos_features, neg_features), axis=1)

  # Paired data dict.
  paired_data = {
      'feature_pairs': feature_pairs, 
      'group_pairs': group_pairs, 
      'query_pairs': query_pairs,
      'features': features,
      'labels': labels,
      'queries': queries,
      'dimension': data_dict['dimension'],
      'num_queries': data_dict['num_queries']
  }

  return paired_data

We will also need functions to evaluate the pairwise error rates for a linear model.


In [0]:
def get_mask(groups, pos_group, neg_group=None):
  # Returns a boolean mask selecting positive-negative document pairs where 
  # the protected group for  the positive document is pos_group and 
  # the protected group for the negative document (if specified) is neg_group.
  # Repeat group membership positive docs as many times as negative docs.
  mask_pos = groups[:, 0] == pos_group
  
  if neg_group is None:
    return mask_pos
  else:
    mask_neg = groups[:, 1] == neg_group
    return mask_pos & mask_neg


def mean_squared_error(model, dataset):
  # Returns mean squared error for Keras model on dataset.
  scores = model.predict(dataset['features'])
  labels = dataset['labels']
  return np.mean((scores - labels) ** 2)


def group_error_rate(model, dataset, pos_group, neg_group=None):
  # Returns error rate for Keras model on data set, considering only document 
  # pairs where the protected group for the positive document is pos_group, and  
  # the protected group for the negative document (if specified) is neg_group.
  d = dataset['dimension']
  scores0 = model.predict(dataset['feature_pairs'][:, 0, :].reshape(-1, d))
  scores1 = model.predict(dataset['feature_pairs'][:, 1, :].reshape(-1, d))
  mask = get_mask(dataset['group_pairs'], pos_group, neg_group)
  diff = scores0 - scores1
  diff = diff[mask > 0].reshape((-1))
  return np.mean(diff < 0)

Create Linear Model

We then write a function to create the linear scoring model.


In [0]:
def create_scoring_model(feature_pairs, features, dimension):
  # Returns a linear Keras scoring model, and returns a nullary function 
  # returning predictions on the features.

  # Linear scoring model with no hidden layers.
  layers = []
  # Input layer takes `dimension` inputs.
  layers.append(tf.keras.Input(shape=(dimension,)))
  layers.append(tf.keras.layers.Dense(1)) 
  scoring_model = tf.keras.Sequential(layers)

  # Create a nullary function that returns applies the linear model to the 
  # features and returns the tensor with the prediction differences on pairs.
  def prediction_diffs():
    scores0 = scoring_model(feature_pairs()[:, 0, :].reshape(-1, dimension))
    scores1 = scoring_model(feature_pairs()[:, 1, :].reshape(-1, dimension))
    return scores0 - scores1
      
  # Create a nullary function that returns the predictions on individual 
  # examples.
  predictions = lambda: scoring_model(features())

  return scoring_model, prediction_diffs, predictions

Formulate Optimization Problem

We are ready to formulate the constrained optimization problem using the TFCO library.


In [0]:
def group_mask_fn(groups, pos_group, neg_group=None):
  # Returns a nullary function returning group mask.
  group_mask = lambda: np.reshape(
      get_mask(groups(), pos_group, neg_group), (-1))
  return group_mask


def formulate_problem(
    feature_pairs, group_pairs, features, labels, dimension, 
    constraint_groups=[], constraint_slack=None):
  # Formulates a constrained problem that optimizes the squared error for a linear
  # model on the specified dataset, subject to pairwise fairness constraints 
  # specified by the constraint_groups and the constraint_slack.
  # 
  # Args:
  #   feature_pairs: Nullary function returning paired features
  #   group_pairs: Nullary function returning paired groups
  #   features: Nullary function returning features
  #   labels: Nullary function returning labels
  #   dimension: Input dimension for scoring model
  #   constraint_groups: List containing tuples of the form 
  #     ((pos_group0, neg_group0), (pos_group1, neg_group1)), specifying the 
  #     group memberships for the document pairs to compare in the constraints.
  #   constraint_slack: slackness '\epsilon' allowed in the constraints.
  # Returns:
  #   A RateMinimizationProblem object, and a Keras scoring model.

  # Create linear scoring model: we get back a Keras model and a nullary  
  # function returning predictions on the features.
  scoring_model, prediction_diffs, predictions = create_scoring_model(
      feature_pairs, features, dimension)
  
  # Context for the optimization objective.
  context = tfco.rate_context(prediction_diffs)

  # Squared loss objective.
  squared_loss = lambda: tf.reduce_mean((predictions() - labels()) ** 2)
  
  # Constraint set.
  constraint_set = []
  
  # Context for the constraints.
  for ((pos_group0, neg_group0), (pos_group1, neg_group1)) in constraint_groups:
    # Context for group 0.
    group_mask0 = group_mask_fn(group_pairs, pos_group0, neg_group0)
    context_group0 = context.subset(group_mask0)

    # Context for group 1.
    group_mask1 = group_mask_fn(group_pairs, pos_group1, neg_group1)
    context_group1 = context.subset(group_mask1)

    # Add constraints to constraint set.
    constraint_set.append(
        tfco.negative_prediction_rate(context_group0) <= (
            tfco.negative_prediction_rate(context_group1) + constraint_slack))
    constraint_set.append(
        tfco.negative_prediction_rate(context_group1) <= (
            tfco.negative_prediction_rate(context_group0) + constraint_slack))
  
  # Formulate constrained minimization problem.
  problem = tfco.RateMinimizationProblem(
      tfco.wrap_rate(squared_loss), constraint_set)
  
  return problem, scoring_model

Train Model

The following function then trains the linear model by solving the above constrained optimization problem. We first provide a training function with minibatch gradient updates. There are three types of pairwise fairness criterion we handle (specified by 'constraint_type'), and assign the (pos_group, neg_group) pairs to compare accordingly.


In [0]:
def train_model(train_set, params):
  # Trains the model with stochastic updates (one query per updates).
  #
  # Args:
  #   train_set: Dictionary of "paired" training data.
  #   params: Dictionary of hyper-paramters for training.
  #
  # Returns:
  #   Trained model, list of objectives, list of group constraint violations.

  # Set random seed for reproducibility.
  random.seed(333333)
  np.random.seed(121212)
  tf.random.set_seed(212121)

  # Set up problem and model.
  if params['constrained']:
    # Constrained optimization.
    if params['constraint_type'] == 'marginal_equal_opportunity':
      constraint_groups = [((0, None), (1, None))]
    elif params['constraint_type'] == 'cross_group_equal_opportunity':
      constraint_groups = [((0, 1), (1, 0))]
    else:
      constraint_groups = [((0, 1), (1, 0)), ((0, 0), (1, 1))]
  else:
    # Unconstrained optimization.
    constraint_groups = []

  # Dictionary that will hold batch features pairs, group pairs and labels for 
  # current batch. We include one query per-batch. 
  paired_batch = {}
  batch_index = 0  # Index of current query.

  # Data functions.
  feature_pairs = lambda: paired_batch['feature_pairs']
  group_pairs = lambda: paired_batch['group_pairs'] 
  features = lambda: paired_batch['features'] 
  labels = lambda: paired_batch['labels'] 

  # Create scoring model and constrained optimization problem.
  problem, scoring_model = formulate_problem(
      feature_pairs, group_pairs, features, labels, train_set['dimension'],
      constraint_groups, params['constraint_slack'])
  
  # Create a loss function for the problem.
  lagrangian_loss, update_ops, multipliers_variables = (
      tfco.create_lagrangian_loss(problem, dual_scale=params['dual_scale']))

  # Create optimizer
  optimizer = tf.keras.optimizers.Adagrad(learning_rate=params['learning_rate'])
  
  # List of trainable variables.
  var_list = (
      scoring_model.trainable_weights + problem.trainable_variables + 
      [multipliers_variables])
  
  # List of objectives, group constraint violations.
  # violations, and snapshot of models during course of training.
  objectives = []
  group_violations = []
  models = []

  feature_pair_batches = train_set['feature_pairs']
  group_pair_batches = train_set['group_pairs']
  query_pairs = train_set['query_pairs']  
  feature_batches = train_set['features']
  label_batches = train_set['labels']
  queries = train_set['queries']  

  print()
  # Run loops * iterations_per_loop full batch iterations.
  for ii in range(params['loops']):
    for jj in range(params['iterations_per_loop']):
      # Populate paired_batch dict with all pairs for current query. The batch
      # index is the same as the current query index.
      paired_batch = {
          'feature_pairs': feature_pair_batches[query_pairs == batch_index],
          'group_pairs': group_pair_batches[query_pairs == batch_index],
          'features': feature_batches[queries == batch_index],
          'labels': label_batches[queries == batch_index]
      }

      # Optimize loss.
      update_ops()
      optimizer.minimize(lagrangian_loss, var_list=var_list)

      # Update batch_index, and cycle back once last query is reached.
      batch_index = (batch_index + 1) % train_set['num_queries']
    
    # Snap shot current model.
    model_copy = tf.keras.models.clone_model(scoring_model)
    model_copy.set_weights(scoring_model.get_weights())
    models.append(model_copy)

    # Evaluate metrics for snapshotted model. 
    error, gerr, group_viol = evaluate_results(
        scoring_model, train_set, params)
    objectives.append(error)
    group_violations.append(
        [x - params['constraint_slack'] for x in group_viol])

    sys.stdout.write(
        '\r Loop %d: error = %.3f, max constraint violation = %.3f' % 
        (ii, objectives[-1], max(group_violations[-1])))
  print()
  
  if params['constrained']:
    # Find model iterate that trades-off between objective and group violations.
    best_index = tfco.find_best_candidate_index(
        np.array(objectives), np.array(group_violations), rank_objectives=False)
  else:
    # Find model iterate that achieves lowest objective.
    best_index = np.argmin(objectives)

  return models[best_index]

Summarize and Plot Results

Having trained a model, we will need functions to summarize the various evaluation metrics.


In [0]:
def evaluate_results(model, test_set, params):
  # Returns sqaured error, group error rates, group-level constraint violations.
  if params['constraint_type'] == 'marginal_equal_opportunity':
    g0_error = group_error_rate(model, test_set, 0)
    g1_error = group_error_rate(model, test_set, 1)
    group_violations = [g0_error - g1_error, g1_error - g0_error]
    return (mean_squared_error(model, test_set), [g0_error, g1_error], 
            group_violations)
  else:
    g00_error = group_error_rate(model, test_set, 0, 0)
    g01_error = group_error_rate(model, test_set, 0, 1)
    g10_error = group_error_rate(model, test_set, 1, 1)
    g11_error = group_error_rate(model, test_set, 1, 1)
    group_violations_offdiag = [g01_error - g10_error, g10_error - g01_error]
    group_violations_diag = [g00_error - g11_error, g11_error - g00_error]

    if params['constraint_type'] == 'cross_group_equal_opportunity':
      return (mean_squared_error(model, test_set), 
              [[g00_error, g01_error], [g10_error, g11_error]], 
              group_violations_offdiag)
    else:
      return (mean_squared_error(model, test_set), 
              [[g00_error, g01_error], [g10_error, g11_error]], 
              group_violations_offdiag + group_violations_diag)
    

def display_results(
    model, test_set, params, method, error_type, show_header=False):
  # Prints evaluation results for model on test data.
  error, group_error, diffs = evaluate_results(model, test_set, params)

  if params['constraint_type'] == 'marginal_equal_opportunity':
    if show_header:
      print('\nMethod\t\t\tError\t\tMSE\t\tGroup 0\t\tGroup 1\t\tDiff')
    print('%s\t%s\t\t%.3f\t\t%.3f\t\t%.3f\t\t%.3f' % (
        method, error_type, error, group_error[0], group_error[1], 
        np.max(diffs)))
  elif params['constraint_type'] == 'cross_group_equal_opportunity':
    if show_header:
      print('\nMethod\t\t\tError\t\tMSE\t\tGroup 0/1\tGroup 1/0\tDiff')
    print('%s\t%s\t\t%.3f\t\t%.3f\t\t%.3f\t\t%.3f' % (
        method, error_type, error, group_error[0][1], group_error[1][0], 
        np.max(diffs)))
  else:
    if show_header:
      print('\nMethod\t\t\tError\t\MSE\t\tGroup 0/1\tGroup 1/0\t' +
            'Group 0/0\tGroup 1/1\tDiff')
    print('%s\t%s\t\t%.3f\t\t%.3f\t\t%.3f\t\t%.3f\t\t%.3f\t\t%.3f' % (
        method, error_type, error, group_error[0][1], group_error[1][0], 
        group_error[0][0], group_error[1][1], np.max(diffs)))

Experimental Results

We now run experiments with two types of pairwise fairness criteria: (1) marginal_equal_opportunity and (2) pairwise equal opportunity. In each case, we compare an unconstrained model trained to optimize just the squared error and a constrained model trained with pairwise fairness constraints.


In [0]:
# Convert train/test set to paired data for later evaluation.
paired_train_set = convert_labeled_to_paired_data(train_set)
paired_test_set = convert_labeled_to_paired_data(test_set)

(1) Marginal Equal Opportunity

For a scoring model $f: \mathbb{R}^d \rightarrow \mathbb{R}$, recall:

  • $sqerr(f)$ as the squared error for scoring function $f$.

and we additionally define:

  • $err_i(f)$ as the row-marginal pairwise error over example pairs where the higher label example is from group $i$, and the lower label is from either groups
$$ err_i(f) = \mathbf{E}\big[\mathbb{I}\big(f(x) < f(x')\big) \,\big|\, y > y',~ grp(x) = i\big] $$

The constrained optimization problem we solve constraints the row-marginal pairwise errors to be similar:

$$min_f\;sqerr(f)$$$$\text{s.t. }\;|err_0(f) - err_1(f)| \leq 0.02$$

In [33]:
# Model hyper-parameters.
model_params = {
    'loops': 10, 
    'iterations_per_loop': 250, 
    'learning_rate': 0.1,
    'constraint_type': 'marginal_equal_opportunity', 
    'constraint_slack': 0.02,
    'dual_scale': 1.0}

# Unconstrained optimization.
model_params['constrained'] = False
model_unc  = train_model(paired_train_set, model_params)
display_results(model_unc, paired_train_set, model_params, 'Unconstrained     ', 
                'Train', show_header=True)
display_results(model_unc, paired_test_set, model_params,  'Unconstrained     ', 
                'Test')

# Constrained optimization with TFCO.
model_params['constrained'] = True
model_con  = train_model(paired_train_set, model_params)
display_results(model_con, paired_train_set, model_params, 'Constrained     ', 
                'Train', show_header=True)
display_results(model_con, paired_test_set, model_params, 'Constrained     ', 
                'Test')


 Loop 9: error = 0.057, max constraint violation = 0.041

Method			Error		MSE		Group 0		Group 1		Diff
Unconstrained     	Train		0.057		0.496		0.435		0.061
Unconstrained     	Test		0.054		0.478		0.443		0.035

 Loop 9: error = 0.057, max constraint violation = -0.018

Method			Error		MSE		Group 0		Group 1		Diff
Constrained     	Train		0.057		0.483		0.485		0.002
Constrained     	Test		0.054		0.466		0.486		0.019

(2) Pairwise Equal Opportunity

Recall that we denote $err_{i,j}(f)$ as the pairwise error over example pairs where the higher label example is from group $i$, and the lower label example is from group $j$. $$ err_{i, j}(f) ~=~ \mathbf{E}\big[\mathbb{I}\big(f(x) < f(x')\big) \,\big|\, y > y',~ grp(x) = i, ~grp(x') = j\big] $$

We first constrain only the cross-group errors, highlighted below.


Negative
Group 0 Group 1
Positive Group 0 $err_{0,0}$ $\mathbf{err_{0,1}}$
Group 1 $\mathbf{err_{1,0}}$ $err_{1,1}$

The optimization problem we solve constraints the cross-group pairwise errors to be similar:

$$min_f\; sqerr(f)$$$$\text{s.t. }\;\; |err_{0,1}(f) - err_{1,0}(f)| \leq 0.02$$

In [34]:
# Model hyper-parameters.
model_params = {
    'loops': 10, 
    'iterations_per_loop': 250, 
    'learning_rate': 0.1,
    'constraint_type': 'cross_group_equal_opportunity', 
    'constraint_slack': 0.02,
    'dual_scale': 1.0}

# Unconstrained optimization.
model_params['constrained'] = False
model_unc  = train_model(paired_train_set, model_params)
display_results(model_unc, paired_train_set, model_params, 'Unconstrained     ', 
                'Train', show_header=True)
display_results(model_unc, paired_test_set, model_params,  'Unconstrained     ', 
                'Test')

# Constrained optimization with TFCO.
model_params['constrained'] = True
model_con  = train_model(paired_train_set, model_params)
display_results(model_con, paired_train_set, model_params, 'Constrained     ', 
                'Train', show_header=True)
display_results(model_con, paired_test_set, model_params, 'Constrained     ', 
                'Test')


 Loop 9: error = 0.057, max constraint violation = 0.071

Method			Error		MSE		Group 0/1	Group 1/0	Diff
Unconstrained     	Train		0.057		0.529		0.438		0.091
Unconstrained     	Test		0.054		0.516		0.446		0.070

 Loop 9: error = 0.057, max constraint violation = 0.013

Method			Error		MSE		Group 0/1	Group 1/0	Diff
Constrained     	Train		0.058		0.484		0.457		0.027
Constrained     	Test		0.055		0.478		0.476		0.003

In [0]: