
Algorithms for mitigating unfairness in supervised machine learning

Project description


fairlearn

The fairlearn project seeks to enable anyone involved in the development of artificial intelligence (AI) systems to assess their system's fairness and mitigate the observed unfairness. The fairlearn repository contains a Python package and Jupyter notebooks with usage examples.

Current release

  • The current stable release is available at fairlearn v0.4.0.

  • Our current version differs substantially from version 0.2 or earlier. Users of these older versions should visit our onboarding guide.

What we mean by fairness

An AI system can behave unfairly for a variety of reasons. In fairlearn, we define whether an AI system is behaving unfairly in terms of its impact on people – i.e., in terms of harms. We focus on two kinds of harms:

  • Allocation harms. These harms can occur when AI systems extend or withhold opportunities, resources, or information. Some of the key applications are in hiring, school admissions, and lending.

  • Quality-of-service harms. Quality of service refers to whether a system works as well for one person as it does for another, even if no opportunities, resources, or information are extended or withheld.

We follow the approach known as group fairness, which asks: Which groups of individuals are at risk of experiencing harms? The relevant groups need to be specified by the data scientist and are application specific.

Group fairness is formalized by a set of constraints, which require that some aspect (or aspects) of the AI system's behavior be comparable across the groups. The fairlearn package enables assessment and mitigation of unfairness under several common definitions. To learn more about our definitions of fairness, please visit our terminology page.

Note: Fairness is fundamentally a sociotechnical challenge. Many aspects of fairness, such as justice and due process, are not captured by quantitative fairness metrics. Furthermore, there are many quantitative fairness metrics, and they cannot all be satisfied simultaneously. Our goal is to enable humans to assess different mitigation strategies and then make trade-offs appropriate to their scenario.

Overview of fairlearn

The fairlearn package contains the following algorithms for mitigating unfairness in binary classification and regression:

| algorithm | description | classification/regression | sensitive features | supported fairness definitions |
| --- | --- | --- | --- | --- |
| `fairlearn.reductions.ExponentiatedGradient` | Black-box approach to fair classification described in *A Reductions Approach to Fair Classification* | binary classification | categorical | DP, EO |
| `fairlearn.reductions.GridSearch` | Black-box approach described in Section 3.4 of *A Reductions Approach to Fair Classification* | binary classification | binary | DP, EO |
| `fairlearn.reductions.GridSearch` | Black-box approach that implements a grid-search variant of the algorithm described in Section 5 of *Fair Regression: Quantitative Definitions and Reduction-based Algorithms* | regression | binary | BGL |
| `fairlearn.postprocessing.ThresholdOptimizer` | Postprocessing algorithm based on the paper *Equality of Opportunity in Supervised Learning*. Takes as input an existing classifier and the sensitive feature, and derives a monotone transformation of the classifier's prediction to enforce the specified parity constraints. | binary classification | categorical | DP, EO |
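To give a feel for the postprocessing approach, here is a deliberately simplified, from-scratch sketch (not fairlearn's implementation, and all names are made up for the example): given scores from an existing classifier, pick one threshold per group so that selection rates come close to a common target, which is the intuition behind enforcing demographic parity by thresholding.

```python
# Toy sketch of per-group threshold postprocessing for demographic parity.
# This is NOT fairlearn's ThresholdOptimizer; it is an illustration of the idea.

def selection_rate(scores, threshold):
    """Fraction of examples predicted positive at the given threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

def per_group_thresholds(scores_by_group, target_rate, candidates):
    """For each group, choose the candidate threshold whose selection rate
    is closest to the shared target rate (a crude demographic-parity heuristic)."""
    thresholds = {}
    for group, scores in scores_by_group.items():
        thresholds[group] = min(
            candidates,
            key=lambda t: abs(selection_rate(scores, t) - target_rate),
        )
    return thresholds

# Example: group "a" receives systematically higher scores than group "b",
# so a single shared threshold would select them at very different rates.
scores_by_group = {
    "a": [0.9, 0.8, 0.7, 0.6, 0.3],
    "b": [0.6, 0.5, 0.4, 0.3, 0.1],
}
candidates = [i / 10 for i in range(11)]
thresholds = per_group_thresholds(scores_by_group, target_rate=0.4, candidates=candidates)
for group, t in thresholds.items():
    print(group, t, selection_rate(scores_by_group[group], t))
```

With these inputs the search picks a higher threshold for group "a" (0.8) than for group "b" (0.5), equalizing both selection rates at 0.4. The real algorithm is more careful (it may randomize between thresholds and supports equalized odds as well), but the group-dependent thresholding is the core idea.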

Note: DP refers to demographic parity, EO to equalized odds, and BGL to bounded group loss. For more information on these and other terms used in this repository, please refer to the terminology page. To request additional algorithms or fairness definitions, please open a new issue.
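To make the abbreviations concrete, the sketch below computes the demographic-parity and equalized-odds gaps from scratch for binary predictions and a binary sensitive feature. The function names are our own for this example, not fairlearn API calls:

```python
# Illustrative (non-fairlearn) computation of two group-fairness gaps.

def demographic_parity_difference(y_pred, groups):
    """Largest gap across groups in the selection rate P(h = 1 | group)."""
    rates = {}
    for g in set(groups):
        preds = [p for p, gi in zip(y_pred, groups) if gi == g]
        rates[g] = sum(preds) / len(preds)
    return max(rates.values()) - min(rates.values())

def equalized_odds_difference(y_true, y_pred, groups):
    """Largest gap across groups in either the true-positive rate
    (conditioning on y_true == 1) or the false-positive rate (y_true == 0)."""
    gaps = []
    for label in (0, 1):
        rates = {}
        for g in set(groups):
            preds = [p for p, t, gi in zip(y_pred, y_true, groups)
                     if gi == g and t == label]
            rates[g] = sum(preds) / len(preds)
        gaps.append(max(rates.values()) - min(rates.values()))
    return max(gaps)

y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(y_pred, groups))          # 0.5 (rates: a=0.25, b=0.75)
print(equalized_odds_difference(y_true, y_pred, groups))      # 0.5 (TPR gap and FPR gap both 0.5)
```

A gap of 0 would mean the constraint is exactly satisfied; mitigation algorithms aim to drive these gaps down, typically at some cost in overall accuracy.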

Install fairlearn

The package can be installed via

pip install fairlearn

or you can clone the repository locally via

git clone git@github.com:fairlearn/fairlearn.git

If you clone from git and wish to use the Fairness dashboard, you will need to install Yarn, and then do the following:

cd fairlearn/widget/js
yarn install
yarn build:all
rm -rf dist
rm -rf lib
rm -rf node_modules

These commands only need to be run when you want the latest version of the dashboard (after pulling from our GitHub repository).

To verify that the cloned repository works (the pip package does not include the tests), run

pip install -r requirements.txt
python -m pytest -s ./test/unit

Onboarding guide for users of version 0.2 or earlier

Up to version 0.2, fairlearn contained only the exponentiated gradient method. The fairlearn repository now has a more comprehensive scope and aims to incorporate other methods as specified above. The same exponentiated gradient technique is now the class fairlearn.reductions.ExponentiatedGradient. While in the past exponentiated gradient was invoked via

import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.classred import expgrad
from fairlearn.moments import DP

estimator = LogisticRegression()  # or any other estimator
exponentiated_gradient_result = expgrad(X, sensitive_features, y, estimator, constraints=DP())
positive_probabilities = exponentiated_gradient_result.best_classifier(X)
# expgrad returns a randomized classifier; draw one concrete 0/1 prediction per example
randomized_predictions = (positive_probabilities >= np.random.rand(len(positive_probabilities))) * 1

the equivalent operation is now

from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

estimator = LogisticRegression()  # or any other estimator
exponentiated_gradient = ExponentiatedGradient(estimator, constraints=DemographicParity())
exponentiated_gradient.fit(X, y, sensitive_features=sensitive_features)
randomized_predictions = exponentiated_gradient.predict(X)

Please open a new issue if you encounter any problems.

Usage

For common usage, refer to the Jupyter notebooks and our API guide.

Contributing

To contribute please check our contributing guide.

Maintainers

The fairlearn project is maintained by:

  • @MiroDudik
  • @riedgar-ms
  • @rihorn2
  • @romanlutz
  • @bethz

For a full list of contributors, refer to the authors page.

Issues

Regular (non-security) issues

Please submit a report through GitHub issues. A maintainer will respond promptly as follows:

  • bug: triage as bug and provide estimated timeline based on severity
  • feature request: triage as feature request and provide estimated timeline
  • question or discussion: triage as question and either respond or notify/identify a suitable expert to respond

Maintainers will try to link duplicate issues when possible.

Reporting security issues

Please take a look at our guidelines for reporting security issues.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fairlearn-0.4.0.tar.gz (10.7 MB)

Uploaded Source

Built Distribution

fairlearn-0.4.0-py3-none-any.whl (21.6 MB)

Uploaded Python 3

File details

Details for the file fairlearn-0.4.0.tar.gz.

File metadata

  • Download URL: fairlearn-0.4.0.tar.gz
  • Upload date:
  • Size: 10.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.6.2 requests-toolbelt/0.9.1 tqdm/4.40.1 CPython/3.6.9

File hashes

Hashes for fairlearn-0.4.0.tar.gz:

  • SHA256: 032feab1a2dcf0a942a74c397ebc5fbeb2364cfda8a93ac537d91a3a03c5ca63
  • MD5: 57fb6812f9b95cfd7d9a20e110e21371
  • BLAKE2b-256: 3bdfeddb47dcb6828e0ef70aaf97959a25c298bc74971e4663a58ca0ce8fdbcc

See more details on using hashes here.

File details

Details for the file fairlearn-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: fairlearn-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 21.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.6.2 requests-toolbelt/0.9.1 tqdm/4.40.1 CPython/3.6.9

File hashes

Hashes for fairlearn-0.4.0-py3-none-any.whl:

  • SHA256: f60db80475f1a2fa3197788cd89d6edf2ce9cba10839bdcf1dbcfb3cdda70bca
  • MD5: 384911092716938bb3c86d37ed2a5cfc
  • BLAKE2b-256: 3b04eccc94001eca76b1b55e1e73dcba0be0912640fc4817549ea21e4a114035

See more details on using hashes here.
