Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This colab contains TensorFlow code for implementing the constrained optimization methods presented in the paper:
Harikrishna Narasimhan, Andrew Cotter, Maya Gupta, Serena Wang, "Pairwise Fairness for Ranking and Regression", AAAI 2020. [link]
First, let's install and import the relevant libraries.
In [0]:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
We will need the TensorFlow Constrained Optimization (TFCO) library.
In [0]:
!pip install git+https://github.com/google-research/tensorflow_constrained_optimization
In [0]:
import tensorflow_constrained_optimization as tfco
We will be training a linear ranking model $f(x) = w^\top x$ where $x \in \mathbb{R}^d$ is a set of features for a query-document pair (the protected attribute is not a feature). Our goal is to train the model such that it accurately ranks the positive documents in a query above the negative ones.
Specifically, for the ranking model $f$, we denote:
We then wish to solve the following constrained problem: $$min_f\; err(f)$$ $$\text{ s.t. } |err_{i,j}(f) - err_{k,\ell}(f)| \leq \epsilon \;\;\; \forall ((i,j), (k,\ell)) \in \mathcal{G},$$
where $\mathcal{G}$ contains the pairs we are interested in constraining.
We begin by generating a synthetic ranking dataset.
Setting:
Generative process:
For convenience, the first document for each query is labeled positive, and remaining $n-1$ documents are labeled negative.
For each query-document pair:
In [0]:
def create_dataset(num_queries, num_docs):
# Create a synthetic 2-dimensional training dataset with 1000 queries,
# with 1 positive document and 10 negative document each
# and with two protected groups for each document.
num_posdocs = 1
num_negdocs = num_docs - 1
dimension = 2
num_groups = 2
tot_pairs = num_queries * num_posdocs * num_negdocs
# Conditional distributions are Gaussian: Conditioned on the label and the
# protected group, the feature distribution is a Gaussian.
# Positive documents, Group 0
mu_10 = [1,0]
sigma_10 = np.array([[1, 0], [0, 1]])
# Positive documents, Group 1
mu_11 = [-1.5, 0.75]
sigma_11 = np.array([[1, 0], [0, 1]]) * 0.5
# Negative documents, Group 0
mu_00 = [-1,-1]
sigma_00 = np.array([[1, 0], [0, 1]])
# Negative documents, Group 1
mu_01 = [-2,-1]
sigma_01 = np.array([[1, 0], [0, 1]])
# Generate positive documents
posdocs_groups = (np.random.rand(num_queries, num_posdocs) <= 0.1) * 1
posdocs_mask = np.dstack([posdocs_groups] * dimension)
posdocs0 = np.random.multivariate_normal(mu_10, sigma_10,
size=(num_queries, num_posdocs))
posdocs1 = np.random.multivariate_normal(mu_11, sigma_11,
size=(num_queries, num_posdocs))
posdocs = (1 - posdocs_mask) * posdocs0 + posdocs_mask * posdocs1
# Generate negative documents
negdocs_groups = (np.random.rand(num_queries, num_negdocs) <= 0.1) * 1
negdocs_mask = np.dstack([negdocs_groups] * dimension)
negdocs0 = np.random.multivariate_normal(mu_00, sigma_00,
size=(num_queries, num_negdocs))
negdocs1 = np.random.multivariate_normal(mu_01, sigma_01,
size=(num_queries, num_negdocs))
negdocs = (1 - negdocs_mask) * negdocs0 + negdocs_mask * negdocs1
# Concatenate positive and negative documents for each query
# (along axis 1, where documents are arranged)
features = np.concatenate((posdocs, negdocs), axis=1)
# Concatenate the associated labels:
# (for each query, first num_posdocs documents are positive, remaining negative)
poslabels = np.tile([1], reps=(num_queries, num_posdocs))
neglabels = np.tile([-1], reps=(num_queries, num_negdocs))
labels = np.concatenate((poslabels, neglabels), axis=1)
# Concatenate the protected groups
groups = np.concatenate((posdocs_groups, negdocs_groups), axis=1)
dataset = {
'features': features,
'labels': labels,
'groups': groups,
'num_queries': num_queries,
'num_posdocs': num_posdocs,
'num_negdocs': num_negdocs,
'dimension': dimension,
'num_groups': num_groups,
'tot_pairs': tot_pairs
}
return dataset
Next, let's write functions to plot the data and the linear decision boundary.
In [21]:
def plot_data(dataset, ax=None):
# Plot data set.
features = dataset["features"]
labels = dataset["labels"]
groups = dataset["groups"]
# Create axes if not specified
if not ax:
_, ax = plt.subplots(1, 1, figsize=(4.0, 4.0))
ax.set_xlabel("Feature 0")
ax.set_ylabel("Feature 1")
# Plot positive points in group 0
data = features[(labels==1) & (groups==0), :]
ax.plot(data[:,0], data[:,1], 'bx', label="Pos, Group 0")
# Plot positive points in group 1
data = features[(labels==1) & (groups==1), :]
ax.plot(data[:,0], data[:,1], 'bo', label="Pos, Group 1")
# Plot negative points in group 0
data = features[(labels==-1) & (groups==0), :]
ax.plot(data[:,0], data[:,1], 'rx', label="Neg, Group 0")
# Plot negative points in group 1
data = features[(labels==-1) & (groups==1), :]
ax.plot(data[:,0], data[:,1], 'ro', label="Neg, Group 1")
ax.legend(loc = "upper right")
def plot_model(weights, ax, x_range, y_range, fmt):
# Plot model decision boundary.
ax.plot([x_range[0], x_range[1]],
[-x_range[0] * weights[0] / weights[1],
-x_range[1] * weights[0] / weights[1]],
fmt)
ax.set_ylim(y_range)
# Sample data set.
dataset = create_dataset(num_queries=500, num_docs=10)
plot_data(dataset)
We will also need functions to evaluate the pairwise error rates for a linear model.
In [0]:
def get_mask(dataset, pos_group, neg_group=None):
# Returns a boolean mask selecting positive-negative document pairs where
# the protected group for the positive document is pos_group and
# the protected group for the negative document (if specified) is neg_group.
groups = dataset['groups']
num_negdocs = dataset['num_negdocs']
# Repeat group membership positive docs as many times as negative docs.
mask_pos = groups[:,0] == pos_group
mask_pos_rep = np.repeat(mask_pos.reshape(-1,1), num_negdocs, axis=1)
if neg_group is None:
return mask_pos_rep
else:
mask_neg = groups[:,1:] == neg_group
return mask_pos_rep & mask_neg
def error_rate(model, dataset):
# Returns error rate of model on dataset.
features = dataset["features"]
num_negdocs = dataset['num_negdocs']
scores = np.matmul(features, model)
pos_scores_rep = np.repeat(scores[:, 0].reshape(-1,1), num_negdocs, axis=1)
neg_scores = scores[:, 1:]
diff = pos_scores_rep - neg_scores
return np.mean(diff.reshape((-1)) <= 0)
def group_error_rate(model, dataset, pos_group, neg_group=None):
# Returns error rate of model on data set, considering only document pairs,
# where the protected group for the positive document is pos_group, and the
# protected group for the negative document (if specified) is neg_group.
features = dataset['features']
num_negdocs = dataset['num_negdocs']
scores = np.matmul(features, model)
pos_scores_rep = np.repeat(scores[:, 0].reshape(-1,1), num_negdocs, axis=1)
neg_scores = scores[:, 1:]
mask = get_mask(dataset, pos_group, neg_group)
diff = pos_scores_rep - neg_scores
diff = diff * mask
return np.sum(diff.reshape((-1)) < 0) * 1.0 / np.sum(mask)
We then write a function to create the linear ranking model in Tensorflow.
In [0]:
def group_tensors(predictions, dataset, pos_group, neg_group=None):
# Returns predictions and labels for document-pairs where the protected group
# for the positive document is pos_group, and the protected group for the
# negative document (if specified) is neg_group.
mask = get_mask(dataset, pos_group, neg_group)
mask = np.reshape(mask, (-1))
group_labels = lambda: tf.constant(np.ones(np.sum(mask)), dtype=tf.float32)
group_predictions = lambda: tf.boolean_mask(predictions(), mask)
return group_predictions, group_labels
def linear_model(dataset):
# Creates a linear ranking model, and returns a nullary function returning
# predictions on the dataset, and the model weights.
# Create variables containing the model parameters.
weights = tf.Variable(tf.ones(dataset["dimension"], dtype=tf.float32),
name="weights")
# Create a constant tensor containing the features.
features_tensor = tf.constant(dataset["features"], dtype=tf.float32)
# Create a nullary function that returns applies the linear model to the
# features and returns the tensor with the predictions.
def predictions():
predicted_scores = tf.tensordot(features_tensor, weights, axes=(2, 0))
# Compute ranking errors and flatten tensor.
pos_scores = tf.slice(predicted_scores, begin=[0,0],
size=[-1, dataset["num_posdocs"]])
neg_scores = tf.slice(predicted_scores, begin=[0, dataset["num_posdocs"]],
size=[-1,-1])
pos_scores_rep = tf.tile(pos_scores, multiples=(1, dataset["num_negdocs"]))
predictions_tensor = tf.reshape(pos_scores_rep - neg_scores, shape=[-1])
return predictions_tensor
return predictions, weights
We are ready to formulate the constrained optimization problem using the TFCO library.
In [0]:
def formulate_problem(dataset, constraint_groups=[], constraint_slack=None):
# Returns a RateMinimizationProblem object and the linear model weights.
#
# Formulates a constrained problem that optimizes the error rate for a linear
# model on the specified dataset, subject to pairwise fairness constraints
# specified by the constraint_groups and the constraint_slack.
#
# Args:
# dataset: Dataset dictionary returned by create_dataset()
# constraint_groups: List containing tuples of the form
# ((pos_group0, neg_group0), (pos_group1, neg_group1)), specifying the
# group memberships for the document pairs to compare in the constraints.
# constraint_slack: slackness "\epsilon" allowed in the constraints.
# Create linear model: we get back a nullary function returning the
# predictions on the dataset, and the model weights.
predictions, weights = linear_model(dataset)
# Create a nullary function returning a constant tensor with the labels.
labels = lambda: tf.constant(np.ones(dataset["tot_pairs"]), dtype=tf.float32)
# Context for the optimization objective.
context = tfco.rate_context(predictions, labels)
# Constraint set.
constraint_set = []
# Context for the constraints.
for ((pos_group0, neg_group0), (pos_group1, neg_group1)) in constraint_groups:
# Context for group 0.
group0_predictions, group0_labels = group_tensors(
predictions, dataset, pos_group0, neg_group0)
context_group0 = tfco.rate_context(group0_predictions, group0_labels)
# Context for group 1.
group1_predictions, group1_labels = group_tensors(
predictions, dataset, pos_group1, neg_group1)
context_group1 = tfco.rate_context(group1_predictions, group1_labels)
# Add constraints to constraint set.
constraint_set.append(
tfco.false_negative_rate(context_group0) <= (
tfco.false_negative_rate(context_group1) + constraint_slack)
)
constraint_set.append(
tfco.false_negative_rate(context_group1) <= (
tfco.false_negative_rate(context_group0) + constraint_slack)
)
# Formulate constrained minimization problem.
problem = tfco.RateMinimizationProblem(
tfco.error_rate(context), constraint_set)
return problem, weights
The following function then trains the linear model by solving the above constrained optimization problem. We will handle three type of pairwise fairness constraints.
In [0]:
# Pairwise fairness constraint types.
MARGINAL_EQUAL_OPPORTUNITY = 0
CROSS_GROUP_EQUAL_OPPORTUNITY = 1
CROSS_AND_WITHIN_GROUP_EQUAL_OPPORTUNITY = 2
In [0]:
def train_model(train_set, params):
# Trains the model and returns the trained model weights, objective and
# maximum constraint violation values.
# Set up problem and model.
if params['flag_constrained']:
# Constrained optimization.
if params['constraint_type'] == MARGINAL_EQUAL_OPPORTUNITY:
constraint_groups = [((0, None), (1, None))]
elif params['constraint_type'] == CROSS_GROUP_EQUAL_OPPORTUNITY:
constraint_groups = [((0, 1), (1, 0))]
else:
constraint_groups = [((0, 1), (1, 0)), ((0, 0), (1, 1))]
else:
# Unconstrained optimization.
constraint_groups = []
problem, weights = formulate_problem(
train_set, constraint_groups, params["constraint_bound"])
# Set up optimization problem.
optimizer = tfco.ProxyLagrangianOptimizerV2(
tf.keras.optimizers.Adagrad(learning_rate=params["learning_rate"]),
num_constraints=problem.num_constraints)
# List of trainable variables.
var_list = (
[weights] + problem.trainable_variables + optimizer.trainable_variables())
# List of objectives and constraint violations during course of training.
objectives = []
constraints = []
# Run loops * iterations_per_loop full batch iterations.
for ii in xrange(params['loops']):
for ii in xrange(params['iterations_per_loop']):
optimizer.minimize(problem, var_list=var_list)
objectives.append(problem.objective())
if params['flag_constrained']:
constraints.append(max(problem.constraints()))
return weights.numpy(), objectives, constraints
Having trained a model, we will need functions to summarize the various evaluation metrics and plot the trained decision boundary
In [0]:
def evaluate_results(model, test_set, params):
# Returns overall and group error rates for model on test set.
if params['constraint_type'] == MARGINAL_EQUAL_OPPORTUNITY:
return (error_rate(model, test_set),
[group_error_rate(model, test_set, 0),
group_error_rate(model, test_set, 1)])
else:
return (error_rate(model, test_set),
[[group_error_rate(model, test_set, 0, 0),
group_error_rate(model, test_set, 0, 1)],
[group_error_rate(model, test_set, 1, 0),
group_error_rate(model, test_set, 1, 1)]])
def display_results(model, objectives, constraints, test_set, params, ax):
# Prints evaluation results for model on test data
# and plots its decision boundary
# Evaluate model on test set and print results.
error, group_error = evaluate_results(model, test_set, params)
if params['constraint_type'] == MARGINAL_EQUAL_OPPORTUNITY:
if params['flag_constrained']:
print 'Constrained', '\t \t',
else:
print 'Test Error\t \t', 'Overall', '\t', 'Group 0', '\t', 'Group 1',
print '\t', 'Diff'
print 'Unconstrained', '\t \t',
print("%.3f" % (error_rate(model, test_set)) + '\t\t' +
"%.3f" % group_error[0] + '\t\t' + "%.3f" % group_error[1] + '\t\t'
"%.3f" % abs(group_error[0] - group_error[1]))
elif params['constraint_type'] == CROSS_GROUP_EQUAL_OPPORTUNITY:
if params['flag_constrained']:
print 'Constrained', '\t \t',
else:
print 'Test Error\t \t', 'Overall', '\t', 'Group 0/1', '\t', 'Group 1/0',
print '\t', 'Diff'
print 'Unconstrained', '\t \t',
print("%.3f" % error + '\t\t' +
"%.3f" % group_error[0][1] + '\t\t' +
"%.3f" % group_error[1][0] + '\t\t' +
"%.3f" % abs(group_error[0][1] - group_error[1][0]))
else:
if params['flag_constrained']:
print 'Constrained', '\t \t',
else:
print 'Test Error\t \t', 'Overall', '\t', 'Group 0/1', '\t', 'Group 1/0',
print '\t', 'Diff', '\t',
print '\t', 'Group 0/0', '\t', 'Group 1/1', '\t', 'Diff'
print 'Unconstrained', '\t \t',
print("%.3f" % error + '\t\t' +
"%.3f" % group_error[0][1] + '\t\t' +
"%.3f" % group_error[1][0] + '\t\t' +
"%.3f" % abs(group_error[0][1] - group_error[1][0]) + '\t\t' +
"%.3f" % group_error[0][0] + '\t\t' +
"%.3f" % group_error[1][1] + '\t\t' +
"%.3f" % abs(group_error[0][0] - group_error[1][1]))
# Plot decision boundary and progress of training objective/constraint viol.
if params['flag_constrained']:
ax[0].set_title("Model: Constrained")
else:
ax[0].set_title("Model: Unconstrained")
features = train_set['features']
plot_data(train_set, ax[0])
plot_model(model, ax[0],
[features[:, :, 0].min(), features[:, :, 0].max()],
[features[:, :, 1].min(), features[:, :, 1].max()],
"k--")
if params['flag_constrained']:
ax[1].set_title("Objective (Hinge)")
ax[1].set_xlabel("Number of epochs")
objective_curve, = ax[1].plot(range(params['loops']), objectives)
ax[2].set_title("Constraint Violation")
ax[2].set_xlabel("Number of epochs")
constraint_curve, = ax[2].plot(range(params['loops']), constraints)
We now run experiments with two types of pairwise fairness criteria: (1) marginal equal opportunity and (2) pairwise equal opportunity. In each case, we compare an unconstrained model trained to optimize the error rate and a constrained model trained with pairwise fairness constraints.
For a ranking model $f: \mathbb{R}^d \rightarrow \mathbb{R}$, recall:
and we additionally define:
The constrained optimization problem we solve constraints the row-marginal pairwise errors to be similar:
$$min_f\;err(f)$$$$\text{s.t. }\;|err_0(f) - err_1(f)| \leq 0.01$$
In [28]:
np.random.seed(121212)
train_set = create_dataset(num_queries=500, num_docs=10)
test_set = create_dataset(num_queries=500, num_docs=10)
model_params = {
'loops': 25,
'iterations_per_loop': 10,
'learning_rate': 1.0,
'constraint_type': MARGINAL_EQUAL_OPPORTUNITY,
'constraint_bound': 0.01}
# Plot training stats, data and model.
ff, ax = plt.subplots(1, 4, figsize=(16.0, 3.5))
# Unconstrained optimization.
model_params['flag_constrained'] = False
model, objectives, constraints = train_model(train_set, model_params)
display_results(model, objectives, constraints, test_set, model_params, [ax[0]])
# Constrained optimization.
model_params['flag_constrained'] = True
model, objectives, constraints = train_model(train_set, model_params)
display_results(model, objectives, constraints, test_set, model_params, ax[1:])
ff.tight_layout()
Recall that we denote $err_{i,j}(f)$ as the ranking error over positive-negative document pairs where the pos. document is from group $i$, and the neg. document is from group $j$. $$ err_{i, j}(f) ~=~ \mathbf{E}\big[\mathbb{I}\big(f(x) < f(x')\big) \,\big|\, y = 1,~ y' = 0,~ grp(x) = i, ~grp(x') = j\big] $$
We first constrain only the cross-group errors, highlighted below.
Negative | |||
Group 0 | Group 1 | ||
Positive | Group 0 | $err_{0,0}$ | $\mathbf{err_{0,1}}$ |
Group 1 | $\mathbf{err_{1,0}}$ | $err_{1,1}$ |
The optimization problem we solve constraints the cross-group pairwise errors to be similar:
$$min_f\; err(f)$$$$\text{s.t. }\;\; |err_{0,1}(f) - err_{1,0}(f)| \leq 0.01$$
In [29]:
model_params = {
'loops': 50,
'iterations_per_loop': 10,
'learning_rate': 1.0,
'constraint_type': CROSS_GROUP_EQUAL_OPPORTUNITY,
'constraint_bound': 0.01}
# Plot training stats, data and model
ff, ax = plt.subplots(1, 4, figsize=(16.0, 3.5))
model_params['flag_constrained'] = False
model, objectives, constraints = train_model(train_set, model_params)
display_results(model, objectives, constraints, test_set, model_params, [ax[0]])
model_params['flag_constrained'] = True
model, objectives, constraints = train_model(train_set, model_params)
display_results(model, objectives, constraints, test_set, model_params, ax[1:])
ff.tight_layout()
We next constrain both the cross-group errors to be similar, and the within-group errors to be similar:
Negative | |||
Group 0 | Group 1 | ||
Positive | Group 0 | $\mathbf{err_{0,0}}$ | $\mathbf{err_{0,1}}$ |
Group 1 | $\mathbf{err_{1,0}}$ | $\mathbf{err_{1,1}}$ |
The constrained optimization problem we wish to solve is given by: $$min_f \;err(f)$$ $$\text{s.t. }\;\;|err_{0,1}(f) - err_{1,0}(f)| \leq 0.01 \;\;\; |err_{0,0}(f) - err_{1,1}(f)| \leq 0.01 $$
In [30]:
model_params = {
'loops': 250,
'iterations_per_loop': 10,
'learning_rate': 0.2,
'constraint_type': CROSS_AND_WITHIN_GROUP_EQUAL_OPPORTUNITY,
'constraint_bound': 0.01
}
# Plot training stats, data and model
ff, ax = plt.subplots(1, 4, figsize=(16.0, 3.5))
model_params['flag_constrained'] = False
model, objectives, constraints = train_model(train_set, model_params)
display_results(model, objectives, constraints, test_set, model_params, [ax[0]])
model_params['flag_constrained'] = True
model, objectives, constraints = train_model(train_set, model_params)
display_results(model, objectives, constraints, test_set, model_params, ax[1:])
ff.tight_layout()
The experimental results show that unconstrained training performs poorly on the pairwise fairness metrics, while the constrained optimization methods perform yield significantly lower fairness violations at the cost of a higher overall error rate. The constrained model ends up substatially lowering the error rates associated with the minorty group (group 1), at the cost of a slightly higher error rate for the majority group (group 0).
Also, note that the quality of the learned ranking function depends on the slope of the hyperplane and is unaffected by its intercept. The hyperplane learned by the unconstrained approach ranks the majority examples (the group marked with x) well, but is not accurate at ranking the minority examples (the group marked with o).