Implementation of the PQMass two sample test from Lemos et al. 2024

PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

Implementation of the PQMass two sample test from Lemos et al. 2024.

Install

Just do:

pip install pqm

Usage

This is the main use case:

from pqm import pqm_pvalue, pqm_chi2
import numpy as np

x_sample = np.random.normal(size = (500, 10))
y_sample = np.random.normal(size = (400, 10))

# To get pvalues from PQMass
pvalues = pqm_pvalue(x_sample, y_sample, num_refs = 100, bootstrap = 50)
print(np.mean(pvalues), np.std(pvalues))

# To get chi^2 from PQMass
chi2_stat, dof = pqm_chi2(x_sample, y_sample, num_refs = 100, bootstrap = 50)
print(np.mean(chi2_stat), np.std(chi2_stat))
print(np.unique(dof)) # This should equal num_refs - 1; if it does not, we suggest you use pqm_pvalue

If your two samples are drawn from the same distribution, then the p-value should be uniformly distributed on (0, 1). This means that if you get a very small value (e.g., 1e-6), you can reject the null hypothesis: the two samples are very unlikely to be drawn from the same distribution.
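As a rough illustration of the uniformity statement above, the sketch below measures how far a collection of p-values is from Uniform(0, 1) using a hand-written Kolmogorov-Smirnov distance. It uses only the Python standard library; the helper names `ks_uniform_statistic` and `looks_uniform` are hypothetical and not part of pqm, and the 1.36/sqrt(n) cutoff is the usual asymptotic 5% critical value, which assumes independent p-values. P-values from repeated runs on the same two samples are not strictly independent, so treat this as a sanity check rather than a formal test.

```python
import math
import random


def ks_uniform_statistic(pvalues):
    """Largest gap between the empirical CDF of `pvalues` and the Uniform(0, 1) CDF."""
    ordered = sorted(pvalues)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one p-value")
    distance = 0.0
    for i, p in enumerate(ordered, start=1):
        # The empirical CDF jumps from (i - 1) / n to i / n at p; the uniform CDF equals p.
        distance = max(distance, i / n - p, p - (i - 1) / n)
    return distance


def looks_uniform(pvalues, coefficient=1.36):
    """Rough check at about the 5% level (asymptotic cutoff; an assumption of this sketch)."""
    return ks_uniform_statistic(pvalues) <= coefficient / math.sqrt(len(pvalues))


# Stand-in data: in practice `pvalues` would be the output of pqm_pvalue.
rng = random.Random(0)
uniform_like = [rng.random() for _ in range(200)]
tiny = [rng.random() * 1e-3 for _ in range(200)]
print(ks_uniform_statistic(uniform_like), looks_uniform(uniform_like))
print(ks_uniform_statistic(tiny), looks_uniform(tiny))
```

A distance close to 1, as in the second case, means almost all p-values are piled up near 0, which is the situation described above where the null hypothesis is rejected.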

For the chi² metric, if your two sets of samples come from the same distribution, the histogram of your chi² values should follow the chi² distribution with DoF degrees of freedom. The peak of this distribution will be at DoF - 2, and the standard deviation will be √(2 * DoF). If your histogram shifts to the right of the expected chi² distribution, it suggests that the samples are out of distribution. Conversely, if the histogram shifts to the left, it indicates potential duplication or memorization (particularly relevant for generative models).
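The reference values quoted above can be computed directly, and a simple summary of "shifted right" versus "shifted left" is the offset of the sample mean from DoF (the mean of a chi² distribution) in units of its standard error. The sketch below is a minimal illustration using only the standard library; `chi2_reference` and `mean_shift` are hypothetical helper names, not part of pqm, and the standard error √(2 * DoF / n) assumes the n chi² values are independent, which is only approximately true for repeated runs on the same samples.

```python
import math
import random
import statistics


def chi2_reference(dof):
    """Mode, mean and standard deviation of a chi² distribution with `dof` degrees of freedom."""
    return {"mode": max(dof - 2, 0), "mean": dof, "std": math.sqrt(2 * dof)}


def mean_shift(chi2_values, dof):
    """Offset of the sample mean from `dof`, in units of the standard error of the mean.

    Positive values mean the histogram sits to the right of the expected
    distribution, negative values mean it sits to the left.
    """
    n = len(chi2_values)
    standard_error = math.sqrt(2 * dof / n)
    return (statistics.fmean(chi2_values) - dof) / standard_error


# Stand-in data: chi²(dof) draws generated as Gamma(shape=dof/2, scale=2).
# In practice `chi2_values` would be the first output of pqm_chi2.
dof = 99  # num_refs - 1 for num_refs = 100
rng = random.Random(0)
chi2_values = [rng.gammavariate(dof / 2, 2.0) for _ in range(500)]

print(chi2_reference(dof))
print(mean_shift(chi2_values, dof))
```

A shift of a few standard errors in either direction is a hint to look at the full histogram; the sign tells you which of the two failure modes described above is the more likely one.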

Note that the chi² metric has limitations if you have few samples. One solution is to use bootstrapping. Another is to use pqm_pvalue. We leave it to the user to identify the best solution for their problem.

Developing

If you're a developer then:

git clone git@github.com:Ciela-Institute/PQM.git
cd PQM
git checkout -b my-new-branch
pip install -e .
