Algorithms for mitigating unfairness in supervised machine learning
fairlearn
The fairlearn project seeks to enable anyone involved in the development of artificial intelligence (AI) systems to assess their system's fairness and mitigate any observed unfairness. The fairlearn repository contains a Python package and Jupyter notebooks with usage examples.
- Current release
- What we mean by fairness
- Overview of fairlearn
- Install fairlearn
- Usage
- Contributing
- Maintainers
- Issues
Current release
- The current stable release is available at fairlearn v0.4.4.
- Our current version differs substantially from version 0.2 or earlier. Users of these older versions should visit our onboarding guide.
What we mean by fairness
An AI system can behave unfairly for a variety of reasons. In fairlearn, we define whether an AI system is behaving unfairly in terms of its impact on people – i.e., in terms of harms. We focus on two kinds of harms:
- Allocation harms. These harms can occur when AI systems extend or withhold opportunities, resources, or information. Some of the key applications are in hiring, school admissions, and lending.
- Quality-of-service harms. Quality of service refers to whether a system works as well for one person as it does for another, even if no opportunities, resources, or information are extended or withheld.
We follow the approach known as group fairness, which asks: Which groups of individuals are at risk of experiencing harms? The relevant groups need to be specified by the data scientist and are application-specific.
Group fairness is formalized by a set of constraints, which require that some aspect (or aspects) of the AI system's behavior be comparable across the groups. The fairlearn package enables assessment and mitigation of unfairness under several common definitions. To learn more about our definitions of fairness, please visit our terminology page.
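As a rough illustration of what "comparable across the groups" can mean, the sketch below computes per-group selection rates and their largest gap, which is the quantity that demographic parity asks to be small. It uses only the standard library; the helper names (`selection_rates`, `demographic_parity_gap`) and the toy data are hypothetical and are not part of the fairlearn API.

```python
from collections import defaultdict


def selection_rates(y_pred, sensitive_features):
    """Return the fraction of positive (1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, sensitive_features):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(y_pred, sensitive_features):
    """Largest difference in selection rate between any two groups."""
    rates = selection_rates(y_pred, sensitive_features)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = opportunity extended, 0 = withheld.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(y_pred, groups)
gap = demographic_parity_gap(y_pred, groups)
print(rates)
print(gap)
```

A gap of zero would mean every group is selected at the same rate; how large a gap is acceptable depends on the application.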
Note: Fairness is fundamentally a sociotechnical challenge. Many aspects of fairness, such as justice and due process, are not captured by quantitative fairness metrics. Furthermore, there are many quantitative fairness metrics which cannot all be satisfied simultaneously. Our goal is to enable humans to assess different mitigation strategies and then make trade-offs appropriate to their scenario.
Overview of fairlearn
The fairlearn package contains the following algorithms for mitigating unfairness in binary classification and regression:
algorithm | description | classification/regression | sensitive features | supported fairness definitions |
---|---|---|---|---|
fairlearn.reductions.ExponentiatedGradient | Black-box approach to fair classification described in A Reductions Approach to Fair Classification | binary classification | categorical | DP, EO |
fairlearn.reductions.GridSearch | Black-box approach described in Section 3.4 of A Reductions Approach to Fair Classification | binary classification | binary | DP, EO |
fairlearn.reductions.GridSearch | Black-box approach that implements a grid-search variant of the algorithm described in Section 5 of Fair Regression: Quantitative Definitions and Reduction-based Algorithms | regression | binary | BGL |
fairlearn.postprocessing.ThresholdOptimizer | Postprocessing algorithm based on the paper Equality of Opportunity in Supervised Learning. This technique takes as input an existing classifier and the sensitive feature, and derives a monotone transformation of the classifier's prediction to enforce the specified parity constraints. | binary classification | categorical | DP, EO |
Note: DP refers to demographic parity, EO to equalized odds, and BGL to bounded group loss. For more information on these and other terms we use in this repository please refer to the terminology page. To request additional algorithms or fairness definitions, please open a new issue.
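To make the EO and BGL abbreviations more concrete, here is a minimal sketch of the quantities they constrain: equalized odds compares true and false positive rates across groups, and bounded group loss asks that each group's loss stay below a chosen bound. The function names and data are hypothetical and standard-library only; this is not how fairlearn computes these values internally.

```python
def group_rates(y_true, y_pred, sensitive_features):
    """Per-group true positive rate (TPR) and false positive rate (FPR).

    Assumes every group contains at least one positive and one negative label.
    """
    stats = {}
    for group in sorted(set(sensitive_features)):
        rows = [(t, p) for t, p, g in zip(y_true, y_pred, sensitive_features)
                if g == group]
        pos = [p for t, p in rows if t == 1]  # predictions where the label is 1
        neg = [p for t, p in rows if t == 0]  # predictions where the label is 0
        stats[group] = {"tpr": sum(pos) / len(pos), "fpr": sum(neg) / len(neg)}
    return stats


def equalized_odds_gap(y_true, y_pred, sensitive_features):
    """Larger of the TPR gap and the FPR gap across groups."""
    stats = group_rates(y_true, y_pred, sensitive_features)
    tprs = [s["tpr"] for s in stats.values()]
    fprs = [s["fpr"] for s in stats.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


def worst_group_loss(y_true, y_pred, sensitive_features):
    """Largest per-group mean squared error; BGL bounds this value."""
    losses = {}
    for group in set(sensitive_features):
        errors = [(t - p) ** 2 for t, p, g in zip(y_true, y_pred, sensitive_features)
                  if g == group]
        losses[group] = sum(errors) / len(errors)
    return max(losses.values())


# Hypothetical binary classification data for the equalized odds check.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
eo_gap = equalized_odds_gap(y_true, y_pred, groups)

# Hypothetical regression data for the bounded group loss check.
worst = worst_group_loss([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 2.0, 6.0],
                         ["a", "a", "b", "b"])
print(eo_gap, worst)
```

In this toy data group "a" has a higher TPR and a higher FPR than group "b", so both gaps are nonzero, and group "b" carries all of the regression error.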
Install fairlearn
The package can be installed via
pip install fairlearn
or, optionally, with a full feature set by adding extras, e.g. pip install fairlearn[customplots]. Alternatively, you can clone the repository locally via
git clone git@github.com:fairlearn/fairlearn.git
To verify that the cloned repository works (the pip package does not include the tests), run
pip install -r requirements.txt
python -m pytest -s ./test/unit
Onboarding guide for users of version 0.2 or earlier
Up to version 0.2, fairlearn contained only the exponentiated gradient method. The fairlearn repository now has a more comprehensive scope and aims to incorporate other methods as specified above. The same exponentiated gradient technique is now the class fairlearn.reductions.ExponentiatedGradient. While in the past exponentiated gradient was invoked via
import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.classred import expgrad
from fairlearn.moments import DP
estimator = LogisticRegression() # or any other estimator
exponentiated_gradient_result = expgrad(X, sensitive_features, y, estimator, constraints=DP())
# probability of the positive class for each row of X
positive_probabilities = exponentiated_gradient_result.best_classifier(X)
# draw a random 0/1 label per row according to that probability
randomized_predictions = (positive_probabilities >= np.random.rand(len(positive_probabilities))) * 1
the equivalent operation is now
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
estimator = LogisticRegression() # or any other estimator
exponentiated_gradient = ExponentiatedGradient(estimator, constraints=DemographicParity())
exponentiated_gradient.fit(X, y, sensitive_features=sensitive_features)
randomized_predictions = exponentiated_gradient.predict(X)
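Both snippets end with randomized predictions. The old-style snippet makes the randomization explicit by comparing each positive-class probability against a uniform random draw. The sketch below isolates that one step with the standard library so its behavior is easy to check; `randomize_predictions` and the probabilities are hypothetical and only illustrate the comparison, not fairlearn's internals.

```python
import random


def randomize_predictions(positive_probabilities, rng):
    """Draw a 0/1 label per example: 1 when the probability is at least
    a uniform draw from [0, 1), else 0."""
    return [int(p >= rng.random()) for p in positive_probabilities]


rng = random.Random(0)  # seeded so the run is repeatable

# Hypothetical positive-class probabilities for four examples.
probabilities = [0.0, 1.0, 0.3, 0.8]
predictions = randomize_predictions(probabilities, rng)
print(predictions)

# Over many draws, the share of 1s approaches the underlying probability.
draws = randomize_predictions([0.3] * 10000, rng)
share_of_ones = sum(draws) / len(draws)
print(share_of_ones)
```

Because the labels are sampled, repeated calls on the same input can return different predictions unless the random generator is seeded.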
Please open a new issue if you encounter any problems.
Usage
For common usage, refer to the Jupyter notebooks and our API guide.
Contributing
To contribute, please check our contributing guide.
Maintainers
The fairlearn project is maintained by:
- @MiroDudik
- @riedgar-ms
- @rihorn2
- @romanlutz
For a full list of contributors, refer to the authors page.
Issues
Regular (non-security) issues
Please submit a report through GitHub issues. A maintainer will respond promptly as follows:
- bug: triage as bug and provide estimated timeline based on severity
- feature request: triage as feature request and provide estimated timeline
- question or discussion: triage as question and either respond or notify/identify a suitable expert to respond
Maintainers will try to link duplicate issues when possible.
Reporting security issues
Please take a look at our guidelines for reporting security issues.