Algorithms for mitigating unfairness in supervised machine learning

fairlearn

The fairlearn project seeks to enable anyone involved in the development of artificial intelligence (AI) systems to assess their system's fairness and mitigate any observed unfairness. The fairlearn repository contains a Python package and Jupyter notebooks with usage examples.

Current release

  • The current stable release is available at fairlearn v0.3.0.

  • Our current version differs substantially from version 0.2 or earlier. Users of these older versions should visit our onboarding guide.

What we mean by fairness

An AI system can behave unfairly for a variety of reasons. In fairlearn, we define whether an AI system is behaving unfairly in terms of its impact on people – i.e., in terms of harms. We focus on two kinds of harms:

  • Allocation harms. These harms can occur when AI systems extend or withhold opportunities, resources, or information. Some of the key applications are in hiring, school admissions, and lending.

  • Quality-of-service harms. Quality of service refers to whether a system works as well for one person as it does for another, even if no opportunities, resources, or information are extended or withheld.

We follow the approach known as group fairness, which asks: Which groups of individuals are at risk of experiencing harms? The relevant groups need to be specified by the data scientist and are application-specific.

Group fairness is formalized by a set of constraints, which require that some aspect (or aspects) of the AI system's behavior be comparable across the groups. The fairlearn package enables assessment and mitigation of unfairness under several common definitions. To learn more about our definitions of fairness, please visit our terminology page.
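
As a rough illustration of what "comparable across the groups" can mean, the sketch below computes the selection rate (fraction of positive predictions) per group and the largest gap between groups, which is one common way to quantify demographic parity. The helper names `selection_rate_by_group` and `demographic_parity_gap` are hypothetical and written in plain Python for this example; they are not part of the fairlearn API.

```python
from collections import defaultdict


def selection_rate_by_group(y_pred, groups):
    """Return the fraction of positive predictions (1s) within each group."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(y_pred, groups):
    """Largest difference in selection rate between any two groups."""
    rates = selection_rate_by_group(y_pred, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (made up for illustration): eight predictions, two groups.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rate_by_group(y_pred, groups)
gap = demographic_parity_gap(y_pred, groups)
print(rates)  # selection rate per group
print(gap)    # 0 would mean identical selection rates
```

A gap of zero means every group is selected at the same rate; how large a gap is acceptable depends on the application.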

Note: Fairness is fundamentally a sociotechnical challenge. Many aspects of fairness, such as justice and due process, are not captured by quantitative fairness metrics. Furthermore, there are many quantitative fairness metrics which cannot all be satisfied simultaneously. Our goal is to enable humans to assess different mitigation strategies and then make trade-offs appropriate to their scenario.

Overview of fairlearn

The fairlearn package contains the following algorithms for mitigating unfairness in binary classification and regression:

| algorithm | description | classification/regression | sensitive features | supported fairness definitions |
| --- | --- | --- | --- | --- |
| fairlearn.reductions.ExponentiatedGradient | Black-box approach to fair classification described in A Reductions Approach to Fair Classification | binary classification | categorical | DP, EO |
| fairlearn.reductions.GridSearch | Black-box approach described in Section 3.4 of A Reductions Approach to Fair Classification | binary classification | binary | DP, EO |
| fairlearn.reductions.GridSearch | Black-box approach that implements a grid-search variant of the algorithm described in Section 5 of Fair Regression: Quantitative Definitions and Reduction-based Algorithms | regression | binary | BGL |
| fairlearn.postprocessing.ThresholdOptimizer | Postprocessing algorithm based on the paper Equality of Opportunity in Supervised Learning. This technique takes as input an existing classifier and the sensitive feature, and derives a monotone transformation of the classifier's prediction to enforce the specified parity constraints. | binary classification | categorical | DP, EO |

Note: DP refers to demographic parity, EO to equalized odds, and BGL to bounded group loss. For more information on these and other terms we use in this repository please refer to the terminology page. To request additional algorithms or fairness definitions, please open a new issue.
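
To make two of these abbreviations more concrete, here is a minimal sketch of how equalized odds and bounded group loss could be checked by hand. It assumes binary 0/1 labels for equalized odds and squared error as the loss for bounded group loss. The function names are hypothetical helpers for this example, not fairlearn functions, and the data is made up.

```python
def rates_by_group(y_true, y_pred, groups):
    """Return {group: (true_positive_rate, false_positive_rate)}.

    Assumes every group contains at least one positive and one negative label.
    """
    counts = {}
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts.setdefault(g, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if t == 1:
            c["tp" if p == 1 else "fn"] += 1
        else:
            c["fp" if p == 1 else "tn"] += 1
    return {
        g: (c["tp"] / (c["tp"] + c["fn"]), c["fp"] / (c["fp"] + c["tn"]))
        for g, c in counts.items()
    }


def equalized_odds_gap(y_true, y_pred, groups):
    """Larger of the between-group gaps in TPR and in FPR."""
    rates = rates_by_group(y_true, y_pred, groups)
    tprs = [tpr for tpr, _ in rates.values()]
    fprs = [fpr for _, fpr in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


def satisfies_bounded_group_loss(y_true, y_pred, groups, bound):
    """True if every group's mean squared error is at most `bound`."""
    sums, sizes = {}, {}
    for t, p, g in zip(y_true, y_pred, groups):
        sums[g] = sums.get(g, 0.0) + (t - p) ** 2
        sizes[g] = sizes.get(g, 0) + 1
    return all(sums[g] / sizes[g] <= bound for g in sums)


# Classification toy data: group "a" is predicted perfectly, group "b" is not.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
eo_rates = rates_by_group(y_true, y_pred, groups)
eo_gap = equalized_odds_gap(y_true, y_pred, groups)

# Regression toy data for the bounded group loss check.
reg_true = [1.0, 2.0, 3.0, 4.0]
reg_pred = [1.5, 2.5, 3.0, 5.0]
reg_groups = ["a", "a", "b", "b"]
within_bound = satisfies_bounded_group_loss(reg_true, reg_pred, reg_groups, bound=0.5)
```

Equalized odds compares error rates across groups, whereas bounded group loss only asks that no group's average loss exceed a chosen bound.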

Install fairlearn

The package can be installed via

pip install fairlearn

or you can clone the repository locally via

git clone git@github.com:fairlearn/fairlearn.git

To verify that the cloned repository works (the pip package does not include the tests), run

pip install -r requirements.txt
python -m pytest -s ./test/unit
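
Independently of the test suite, a quick way to confirm that the package is visible to the current Python environment is to ask the import system whether it can locate it. This is a small sketch using only the standard library; the helper name `is_installed` is made up for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the import system can locate a top-level package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("fairlearn"):
    print("fairlearn is available in this environment")
else:
    print("fairlearn was not found; try `pip install fairlearn`")
```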

Onboarding guide for users of version 0.2 or earlier

Up to version 0.2, fairlearn contained only the exponentiated gradient method. The fairlearn repository now has a more comprehensive scope and aims to incorporate other methods as specified above. The same exponentiated gradient technique is now the class fairlearn.reductions.ExponentiatedGradient. While in the past exponentiated gradient was invoked via

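import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.classred import expgrad
from fairlearn.moments import DP

# X, y, and sensitive_features are assumed to be loaded already
estimator = LogisticRegression()  # or any other estimator
exponentiated_gradient_result = expgrad(X, sensitive_features, y, estimator, constraints=DP())
# probability of the positive class for each row of X
positive_probabilities = exponentiated_gradient_result.best_classifier(X)
# sample 0/1 labels by comparing each probability with uniform noise
randomized_predictions = (positive_probabilities >= np.random.rand(len(positive_probabilities))) * 1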

the equivalent operation is now

from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

# X, y, and sensitive_features are assumed to be loaded already
estimator = LogisticRegression()  # or any other estimator
exponentiated_gradient = ExponentiatedGradient(estimator, constraints=DemographicParity())
exponentiated_gradient.fit(X, y, sensitive_features=sensitive_features)
randomized_predictions = exponentiated_gradient.predict(X)

Please open a new issue if you encounter any problems.
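
The version 0.2 snippet above samples labels explicitly by comparing each positive-class probability with a uniform random number. The sketch below reproduces only that sampling step in plain Python, to show why the resulting predictions are randomized. The function name `sample_predictions` and the probabilities are made up for illustration, and nothing here describes the internals of fairlearn.

```python
import random


def sample_predictions(positive_probabilities, rng):
    """Draw one 0/1 label per example; label 1 occurs with probability p.

    A uniform draw u in [0, 1) satisfies u <= p with probability p, except
    for the negligible chance that u is exactly 0.0 when p is 0.0.
    """
    return [int(p >= rng.random()) for p in positive_probabilities]


rng = random.Random(0)  # seeded so the example is repeatable

# Probabilities of 0.0 and 1.0 behave like deterministic predictions.
extremes = sample_predictions([0.0, 1.0, 1.0], rng)

# For intermediate probabilities, the label varies from draw to draw, but
# the long-run frequency of 1s approaches the probability itself.
draws = sample_predictions([0.3] * 10000, rng)
frequency = sum(draws) / len(draws)
print(extremes, round(frequency, 2))
```

Because the labels are sampled, two calls on the same inputs can disagree for examples whose probability is strictly between 0 and 1, which is worth keeping in mind when comparing runs.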

Usage

For common usage, refer to the Jupyter notebooks and our API guide.

Contributing

To contribute please check our contributing guide.

Maintainers

The fairlearn project is maintained by:

  • @MiroDudik
  • @riedgar-ms
  • @rihorn2
  • @romanlutz
  • @bethz

Issues

Regular (non-security) issues

Please submit a report through GitHub issues. A maintainer will respond promptly as follows:

  • bug: triage as bug and provide estimated timeline based on severity
  • feature request: triage as feature request and provide estimated timeline
  • question or discussion: triage as question and either respond or notify/identify a suitable expert to respond

Maintainers will try to link duplicate issues when possible.

Reporting security issues

Please take a look at our guidelines for reporting security issues.
