Implementation of the PQMass two-sample test from Lemos et al. 2024
Project description
PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation
Install
Just do:
pip install pqm
Usage
This is the main use case:
from pqm import pqm_pvalue, pqm_chi2
import numpy as np

# Two samples drawn from the same 10-dimensional standard normal
x_sample = np.random.normal(size=(500, 10))
y_sample = np.random.normal(size=(400, 10))

# To get p-values from PQMass
pvalues = pqm_pvalue(x_sample, y_sample, num_refs=100, bootstrap=50)
print(np.mean(pvalues), np.std(pvalues))

# To get chi^2 from PQMass
chi2_stat, dof = pqm_chi2(x_sample, y_sample, num_refs=100, bootstrap=50)
print(np.mean(chi2_stat), np.std(chi2_stat))
print(np.unique(dof))  # This should equal num_refs - 1; if it does not, we suggest you use pqm_pvalue
If your two samples are drawn from the same distribution, then the p-values should be uniformly distributed on (0, 1). A very small p-value (e.g., 1e-6) therefore means the null hypothesis is rejected: it is very unlikely that the two samples are drawn from the same distribution.
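One way to check the uniformity claim above is to measure how far the empirical distribution of the p-values sits from Uniform(0, 1). The sketch below computes the Kolmogorov-Smirnov distance in plain Python. The helper name `ks_distance_from_uniform` is hypothetical and not part of pqm, and the two lists are synthetic stand-ins for the array returned by `pqm_pvalue`.

```python
import random


def ks_distance_from_uniform(pvalues):
    """Kolmogorov-Smirnov distance between the empirical CDF of
    `pvalues` and the Uniform(0, 1) CDF. Hypothetical helper, not pqm API."""
    p = sorted(float(v) for v in pvalues)
    n = len(p)
    if n == 0:
        raise ValueError("need at least one p-value")
    # Largest gap where the empirical CDF is above / below the uniform CDF.
    d_plus = max((i + 1) / n - v for i, v in enumerate(p))
    d_minus = max(v - i / n for i, v in enumerate(p))
    return max(d_plus, d_minus)


# Synthetic stand-ins for the output of pqm_pvalue (assumption for illustration).
rng = random.Random(0)
uniform_like = [rng.random() for _ in range(2000)]       # what the null should look like
near_zero = [1e-6 * rng.random() for _ in range(2000)]   # what a clear rejection looks like

d_uniform = ks_distance_from_uniform(uniform_like)
d_small = ks_distance_from_uniform(near_zero)
print(d_uniform, d_small)
```

A distance near 0 is consistent with uniform p-values, while a distance near 1 means the p-values pile up at one end of the interval. The same function accepts a NumPy array, so the `pvalues` array from the usage example could be passed in directly.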
For the chi² metric, if your two sets of samples come from the same distribution, the histogram of your chi² values should follow the chi² distribution with DoF degrees of freedom. This distribution peaks at DoF - 2 and has mean DoF and standard deviation √(2 * DoF). If your histogram shifts to the right of the expected chi² distribution, it suggests that the samples are out of distribution. Conversely, if the histogram shifts to the left, it indicates potential duplication or memorization (particularly relevant for generative models).
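The reference values above are easy to compute for a given DoF, and the mean of the observed chi² values can then be compared against them. This is a minimal sketch: `chi2_reference` and `shift_in_std_units` are hypothetical helpers, not part of pqm, and the shift they report is only a rough indicator of direction, not a formal test.

```python
import math


def chi2_reference(dof):
    """Mode, mean and standard deviation of a chi^2 distribution
    with `dof` degrees of freedom. Hypothetical helper, not pqm API."""
    if dof <= 0:
        raise ValueError("dof must be positive")
    return {"mode": max(dof - 2, 0), "mean": dof, "std": math.sqrt(2 * dof)}


def shift_in_std_units(chi2_values, dof):
    """Offset of the mean observed chi^2 from the expected mean,
    in units of the chi^2 standard deviation sqrt(2 * dof).
    Positive means shifted right, negative means shifted left."""
    values = [float(v) for v in chi2_values]
    if not values:
        raise ValueError("need at least one chi^2 value")
    mean = sum(values) / len(values)
    return (mean - dof) / math.sqrt(2 * dof)


# With num_refs = 100, the expected DoF is num_refs - 1 = 99.
ref = chi2_reference(99)
print(ref)
```

With the output of the usage example, `shift_in_std_units(chi2_stat, 99)` would give a value near zero when the samples agree, a clearly positive value when the histogram sits to the right, and a clearly negative value when it sits to the left.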
Note that the chi² metric has limitations if you have only a few samples. One solution is to use bootstrapping; another is to use pqm_pvalue instead. We leave it to the user to identify the best solution for their problem.
Developing
If you're a developer then:
git clone git@github.com:Ciela-Institute/PQM.git
cd PQM
git checkout -b my-new-branch
pip install -e .