Copyright 2018 Google LLC.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Author: ldixon@google.com, jetpack@google.com, sorenj@google.com, nthain@google.com, lucyvasserman@google.com
Click to run this Colab interactively on colab.research.google.com.
This notebook demonstrates Pinned AUC as an unintended model bias metric for Conversation AI Wikipedia models.
See the paper Measuring and Mitigating Unintended Bias in Text Classification for background, detailed explanation, and experimental results.
Also see https://developers.google.com/machine-learning/fairness-overview for more info on Google's Machine Learning Fairness work.
We start by loading some libraries that we will use and customizing the visualization parameters.
In [0]:
!pip install -U -q git+https://github.com/conversationai/unintended-ml-bias-analysis
!pip install -U -q pandas==0.22.0
In [0]:
from unintended_ml_bias import model_bias_analysis
In [0]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import pandas as pd
import seaborn as sns
import pkg_resources
In [0]:
# Red colormap for shading result tables: darker red = larger value.
cm = sns.light_palette("red", as_cmap=True)
"Model Families" allows the results to capture training variance by grouping different training versions of each model together. model_families is a list of lists, each sub-list ("model_family") contains the names of different training versions of the same model.
We compare 3 versions each of three different models, "wiki_cnn", "wiki_debias_random", and "wiki_debias".
In [0]:
model_families = [
    ['wiki_cnn_v3_100', 'wiki_cnn_v3_101', 'wiki_cnn_v3_102'],
    ['wiki_debias_cnn_v3_100', 'wiki_debias_cnn_v3_101', 'wiki_debias_cnn_v3_102'],
    ['wiki_debias_random_cnn_v3_100', 'wiki_debias_random_cnn_v3_101', 'wiki_debias_random_cnn_v3_102'],
]
In [0]:
# Read the scored data into a DataFrame.
df = pd.read_csv(pkg_resources.resource_stream("unintended_ml_bias", "eval_datasets/bias_madlibs_77k_scored.csv"))
In [0]:
# Add columns for each subgroup.
terms = [line.strip() for line in pkg_resources.resource_stream("unintended_ml_bias", "bias_madlibs_data/adjectives_people.txt")]
model_bias_analysis.add_subgroup_columns_from_text(df, 'text', terms)
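For reference, these subgroup columns are simple text matches. Here is a hedged sketch of what this step produces; the library helper may match terms more carefully (e.g. on word boundaries), and add_subgroup_columns is a hypothetical stand-in, not the library function.
In [0]:
# Hypothetical approximation of add_subgroup_columns_from_text: one
# boolean column per identity term, True when the term appears in the text.
def add_subgroup_columns(frame, text_col, subgroup_terms):
  for term in subgroup_terms:
    frame[term] = frame[text_col].str.contains(term, case=False)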
At this point, our scored data is in the DataFrame df, with columns:
- text: the text of the example
- label: True if the example was labeled toxic
- one score column per model version (e.g. wiki_cnn_v3_100) holding that model's score
- one boolean column per subgroup term, True if the text contains that term
You can run the analysis below on any data in this format. Subgroup labels can be generated via words in the text as done above, or come from human labels if you have them.
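For example, a tiny DataFrame in this format could look like the following; the model name and all values here are purely illustrative.
In [0]:
example_df = pd.DataFrame({
    'text': ['some gay text', 'some text'],
    'label': [False, True],        # True if the example is toxic
    'my_model_v1': [0.2, 0.9],     # one score column per model version
    'gay': [True, False],          # one boolean column per subgroup term
})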
Pinned AUC measures the extent of unintended bias in a real-valued scoring function by measuring each subgroup's divergence from the general distribution.
Let $D$ represent the full data set, $D_g$ the set of examples in subgroup $g$, and $s(X)$ a sample drawn from $X$. Then:
$$ \text{Pinned dataset for group } g = pD_g = s(D_g) + s(D), \quad |s(D_g)| = |s(D)| $$
$$ \text{Pinned AUC for group } g = pAUC_g = AUC(pD_g) $$
$$ \text{Pinned AUC Squared Equality Difference} = \sum_{g \in G} (AUC - pAUC_g)^2 $$
The table below shows the pinned AUC squared equality difference for each model family. Lower scores (lighter red) represent more similarity between each group's pinned AUC, which means less unintended bias.
On this set, the wiki_debias_cnn model demonstrates the least unintended bias.
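To make the metric concrete, here is a minimal sketch of computing one model's pinned AUC per subgroup and the squared equality difference. It approximates what per_subgroup_auc_diff_from_overall computes below; the library's actual sampling and its aggregation across model versions may differ.
In [0]:
from sklearn import metrics

def pinned_auc(frame, subgroup_col, label_col, score_col, seed=0):
  # Pinned dataset pD_g: all subgroup examples plus an equal-size sample
  # of the full data, so that |s(D_g)| = |s(D)|.
  subgroup = frame[frame[subgroup_col]]
  background = frame.sample(len(subgroup), random_state=seed)
  pinned = pd.concat([subgroup, background])
  return metrics.roc_auc_score(pinned[label_col], pinned[score_col])

def squared_equality_difference(frame, subgroup_cols, label_col, score_col):
  overall_auc = metrics.roc_auc_score(frame[label_col], frame[score_col])
  return sum((overall_auc - pinned_auc(frame, g, label_col, score_col)) ** 2
             for g in subgroup_cols)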
In [0]:
eq_diff = model_bias_analysis.per_subgroup_auc_diff_from_overall(df, terms, model_families, squared_error=True)
# sort to guarantee deterministic output
eq_diff.sort_values(by=['model_family'], inplace=True)
eq_diff.reset_index(drop=True, inplace=True)
eq_diff.style.background_gradient(cmap=cm)
Out[0]:
In [0]:
pinned_auc_results = model_bias_analysis.per_subgroup_aucs(df, terms, model_families, 'label')
for family in model_families:
  name = model_bias_analysis.model_family_name(family)
  model_bias_analysis.per_subgroup_scatterplots(
      pinned_auc_results,
      'subgroup',
      name + '_aucs',
      name + ' Pinned AUC',
      y_lim=(0.8, 1.0))
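Each call above draws one scatterplot of pinned AUCs per subgroup for a model family. For reference, here is a rough matplotlib equivalent, assuming pinned_auc_results has a 'subgroup' column and a '<family>_aucs' column holding the list of per-version AUCs; those column names are inferred from the call above, and sketch_scatterplot is a hypothetical stand-in.
In [0]:
import matplotlib.pyplot as plt

def sketch_scatterplot(results, aucs_col, title):
  # One x position per subgroup; one point per model version.
  fig, ax = plt.subplots(figsize=(12, 4))
  for i, (_, row) in enumerate(results.iterrows()):
    aucs = row[aucs_col]
    ax.scatter([i] * len(aucs), aucs, color='C0')
  ax.set_xticks(range(len(results)))
  ax.set_xticklabels(results['subgroup'], rotation=90)
  ax.set_ylim(0.8, 1.0)
  ax.set_ylabel('Pinned AUC')
  ax.set_title(title)
  plt.show()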