A Python library for drift detection in Machine Learning problems

Project description

Frouros is a Python library for drift detection in Machine Learning problems.

It provides a set of drift detection algorithms covering supervised, unsupervised and semi-supervised settings. It is designed to integrate easily with the scikit-learn library. This integration allows Frouros to be used in machine learning pipelines, to implement new drift detection algorithms, and to benchmark detectors against each other.

Quickstart

As a quick example, we can generate samples from two bivariate normal distributions and apply an unsupervised method such as MMD (Maximum Mean Discrepancy). This method tests whether the two samples come from the same distribution. If they come from different distributions, there is covariate drift.

from sklearn.gaussian_process.kernels import RBF
import numpy as np
from frouros.unsupervised.distance_based import MMD

np.random.seed(31)
# X samples from a normal distribution with mean = [1. 1.] and cov = [[2. 0.][0. 2.]]
x_mean = np.ones(2)
x_cov = 2*np.eye(2)
# Y samples from a normal distribution with mean = [0. 0.] and cov = [[2. 1.][1. 2.]]
y_mean = np.zeros(2)
y_cov = np.eye(2) + 1

num_samples = 200
X_ref = np.random.multivariate_normal(x_mean, x_cov, num_samples)
X_test = np.random.multivariate_normal(y_mean, y_cov, num_samples)

alpha = 0.01  # significance level for the hypothesis test

detector = MMD(num_permutations=1000, kernel=RBF(length_scale=1.0), random_state=31)
detector.fit(X=X_ref)
detector.transform(X=X_test)
mmd, p_value = detector.distance

# A p-value below alpha means drift is detected: we reject H0
# (H0: both samples come from the same distribution).
print(p_value < alpha)  # True

More advanced examples can be found here.
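
For intuition about what the quickstart is testing, the following is a minimal NumPy sketch of a permutation-based MMD test. It is not Frouros's implementation: the function names (`rbf_kernel`, `mmd2_from_kernel`, `mmd_permutation_test`) are hypothetical, the estimator used is the simple biased one, and the data only mirrors the two distributions from the quickstart.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * length_scale**2))


def mmd2_from_kernel(k, idx_x, idx_y):
    """Biased squared-MMD estimate from a precomputed kernel matrix."""
    k_xx = k[np.ix_(idx_x, idx_x)].mean()
    k_yy = k[np.ix_(idx_y, idx_y)].mean()
    k_xy = k[np.ix_(idx_x, idx_y)].mean()
    return k_xx + k_yy - 2.0 * k_xy


def mmd_permutation_test(x, y, num_permutations=200, seed=0):
    """Return the observed squared MMD and a permutation p-value."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    k = rbf_kernel(pooled, pooled)  # computed once, reused for every shuffle
    n = len(x)
    idx = np.arange(len(pooled))
    observed = mmd2_from_kernel(k, idx[:n], idx[n:])
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(idx)  # random relabelling under H0
        if mmd2_from_kernel(k, perm[:n], perm[n:]) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic as one of the permutations.
    p_value = (exceed + 1) / (num_permutations + 1)
    return observed, p_value


rng = np.random.default_rng(31)
x_ref = rng.multivariate_normal(np.ones(2), 2 * np.eye(2), 200)
x_test = rng.multivariate_normal(np.zeros(2), np.eye(2) + 1, 200)

mmd2, p_value = mmd_permutation_test(x_ref, x_test)
print(f"MMD^2 = {mmd2:.4f}, p-value = {p_value:.4f}")
```

The kernel matrix is computed once on the pooled data, so each permutation only re-indexes it. A small p-value indicates that the observed discrepancy is unlikely if both samples share one distribution.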

Installation

Frouros supports Python 3.8, 3.9 and 3.10. It can be installed via pip:

pip install frouros

There is also the option to use PyTorch models with the help of skorch:

pip install frouros[pytorch]

Drift detection methods

The currently supported methods are listed in the following table. They are divided into three main categories depending on the type of drift they can detect and how they detect it.

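| Type            | Subtype              | Method                   |
|-----------------|----------------------|--------------------------|
| Supervised      | CUSUM Based          | CUSUM                    |
| Supervised      | CUSUM Based          | Geometric Moving Average |
| Supervised      | CUSUM Based          | Page Hinkley             |
| Supervised      | DDM Based            | DDM                      |
| Supervised      | DDM Based            | ECDD-WT                  |
| Supervised      | DDM Based            | EDDM                     |
| Supervised      | DDM Based            | HDDM-A                   |
| Supervised      | DDM Based            | HDDM-W                   |
| Supervised      | DDM Based            | RDDM                     |
| Supervised      | DDM Based            | STEPD                    |
| Supervised      | Window Based         | ADWIN                    |
| Supervised      | Window Based         | KSWIN                    |
| Semi-supervised | Margin Density Based | MD3-SVM                  |
| Semi-supervised | Margin Density Based | MD3-RS                   |
| Unsupervised    | Distance Based       | EMD                      |
| Unsupervised    | Distance Based       | Histogram Intersection   |
| Unsupervised    | Distance Based       | JS                       |
| Unsupervised    | Distance Based       | KL                       |
| Unsupervised    | Distance Based       | MMD                      |
| Unsupervised    | Distance Based       | PSI                      |
| Unsupervised    | Statistical Test     | Chi-Square               |
| Unsupervised    | Statistical Test     | CVM                      |
| Unsupervised    | Statistical Test     | KS                       |
| Unsupervised    | Statistical Test     | Welch's T-test           |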
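
To give a feel for how a supervised detector from the list above operates on a stream of prediction errors, here is a minimal standard-library sketch of the Page-Hinkley test. It is an illustration only and does not use or reproduce Frouros's API: the class name `PageHinkleySketch`, its parameters and their default values are assumptions made for this example.

```python
import random


class PageHinkleySketch:
    """Illustrative Page-Hinkley test for an increase in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm threshold (often called lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.n = 0
        self.mean = 0.0
        self.cum_sum = 0.0
        self.min_cum_sum = 0.0

    def update(self, value):
        """Consume one value and return True if drift is flagged."""
        self.n += 1
        self.mean += (value - self.mean) / self.n  # running mean
        self.cum_sum += value - self.mean - self.delta
        self.min_cum_sum = min(self.min_cum_sum, self.cum_sum)
        if self.n < self.min_samples:
            return False
        # Alarm when the cumulative sum rises far above its historical minimum.
        return self.cum_sum - self.min_cum_sum > self.threshold


rng = random.Random(31)
# Simulated 0/1 prediction errors: 10% error rate, then 50% from sample 500 on.
errors = [int(rng.random() < 0.1) for _ in range(500)]
errors += [int(rng.random() < 0.5) for _ in range(500)]

detector = PageHinkleySketch()
detected_at = None
for i, error in enumerate(errors):
    if detector.update(error):
        detected_at = i
        break

print("drift flagged at sample:", detected_at)
```

In a supervised setting the stream fed to such a detector is typically the model's per-sample error, which is why ground-truth labels are needed. A larger `threshold` lowers the false-alarm rate at the cost of a longer detection delay.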

Datasets

Some well-known datasets and synthetic generators are provided and listed in the following table.

| Type      | Dataset |
|-----------|---------|
| Real      | Elec2   |
| Synthetic | SEA     |
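
As an illustration of what a synthetic generator with concept drift looks like, the sketch below follows the commonly cited definition of the SEA concepts: three features drawn uniformly from [0, 10], a label that depends only on whether the first two features sum to at most a threshold, and optional label noise. The function `sea_stream` and its parameters are hypothetical and are not Frouros's interface; the generator shipped with the library may differ in its details.

```python
import random

# Thresholds of the four classic SEA concepts (assumed from the usual definition).
SEA_THRESHOLDS = (8.0, 9.0, 7.0, 9.5)


def sea_stream(num_samples, concept, noise=0.1, seed=0):
    """Yield (features, label) pairs for a single SEA concept."""
    rng = random.Random(seed)
    theta = SEA_THRESHOLDS[concept]
    for _ in range(num_samples):
        features = [rng.uniform(0.0, 10.0) for _ in range(3)]
        # Only the first two features matter; the third is irrelevant.
        label = int(features[0] + features[1] <= theta)
        if rng.random() < noise:
            label = 1 - label  # flip the label to simulate class noise
        yield features, label


# Concatenating two concepts produces an abrupt drift at sample 500.
stream = list(sea_stream(500, concept=0, seed=1))
stream += list(sea_stream(500, concept=2, seed=2))

print("samples:", len(stream))
print("first sample:", stream[0])
```

A stream built this way can be used to check whether a detector raises an alarm shortly after the concept changes.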

Download files

Source Distribution: frouros-0.1.0.tar.gz (49.6 kB)

Built Distribution: frouros-0.1.0-py3-none-any.whl (79.8 kB, Python 3)
