
Scikit-HEP project hepstats package: statistics tools and utilities


Installation

Install hepstats like any other Python package:

pip install hepstats

or similar (use e.g. virtualenv if you wish).

Getting Started

The hepstats package includes the modeling and hypotests submodules. This is a quick user guide to each submodule. The binder examples are also a good way to get started.

modeling

The modeling submodule includes the Bayesian Blocks algorithm, which can be used to improve the binning of histograms. The visual improvement can be dramatic and, more importantly, the algorithm produces histograms that accurately represent the underlying distribution while being robust to statistical fluctuations. Here is a small example of the algorithm applied to data sampled from a Laplace distribution, compared to a histogram of the same sample with a fine binning.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from hepstats.modeling import bayesian_blocks

>>> data = np.random.laplace(size=10000)
>>> blocks = bayesian_blocks(data)

>>> plt.hist(data, bins=1000, label='Fine Binning', density=True, alpha=0.6)
>>> plt.hist(data, bins=blocks, label='Bayesian Blocks', histtype='step', density=True, linewidth=2)
>>> plt.legend(loc=2)

Figure: Bayesian Blocks example
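The bin edges returned by bayesian_blocks generally have unequal widths, which is why both histograms above are drawn with density=True: each count is divided by the total count times the bin width, so the two binnings can be compared on the same scale. The following is a minimal, dependency-free sketch of that normalisation. The function name density_histogram and the edge values are hypothetical and chosen only for illustration; they are not part of hepstats.

```python
from bisect import bisect_right


def density_histogram(data, edges):
    """Return per-bin densities for sorted, possibly unequal-width bin edges."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        if edges[0] <= x <= edges[-1]:
            # bisect_right locates the bin; a value equal to the last edge
            # is folded into the last bin
            i = min(bisect_right(edges, x) - 1, len(counts) - 1)
            counts[i] += 1
    total = sum(counts)
    # Dividing by the bin width makes the histogram integrate to 1
    return [
        c / (total * (hi - lo))
        for c, lo, hi in zip(counts, edges[:-1], edges[1:])
    ]


# Hypothetical block edges and data, for illustration only
edges = [0.0, 1.0, 1.5, 4.0]
data = [0.2, 0.7, 1.1, 1.2, 1.3, 1.4, 2.0, 3.5]

densities = density_histogram(data, edges)
area = sum(d * (hi - lo) for d, lo, hi in zip(densities, edges[:-1], edges[1:]))
```

Without the division by the bin width, wide blocks in sparsely populated regions would appear artificially tall next to the fine binning.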

hypotests

This submodule provides tools for hypothesis tests, such as discovery tests and the computation of upper limits or confidence intervals. hepstats needs a fitting backend, such as zfit, to perform these computations. Any fitting library can be used if its API is compatible with hepstats (see api checks).
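The outcome of a discovery test is commonly quoted as a p-value under the background-only hypothesis together with the equivalent one-sided Gaussian significance. As a rough sketch of that conversion, using only the Python standard library (the helper names significance and p_value are hypothetical, and this is not a description of how hepstats computes its results internally):

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def significance(p):
    """One-sided Gaussian significance Z for a p-value p (0 < p < 1)."""
    # Equivalent to Phi^{-1}(1 - p), written to avoid cancellation for tiny p
    return -_STD_NORMAL.inv_cdf(p)


def p_value(z):
    """One-sided p-value for a Gaussian significance z."""
    return _STD_NORMAL.cdf(-z)


p_5sigma = p_value(5.0)            # the conventional discovery threshold
z_roundtrip = significance(p_5sigma)
```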

The following example computes an upper limit on the yield of a Gaussian signal with known mean and sigma on top of an exponential background. The fitting backend used is the zfit package.

>>> import numpy as np
>>> import zfit
>>> from zfit.loss import ExtendedUnbinnedNLL
>>> from zfit.minimize import Minuit

>>> bounds = (0.1, 3.0)
>>> obs = zfit.Space('x', limits=bounds)

>>> bkg = np.random.exponential(0.5, 300)
>>> peak = np.random.normal(1.2, 0.1, 10)
>>> data = np.concatenate((bkg, peak))
>>> data = data[(data > bounds[0]) & (data < bounds[1])]
>>> N = data.size
>>> data = zfit.Data.from_numpy(obs=obs, array=data)

>>> lambda_ = zfit.Parameter("lambda", -2.0, -4.0, -1.0)
>>> Nsig = zfit.Parameter("Nsig", 1., -20., N)
>>> Nbkg = zfit.Parameter("Nbkg", N, 0., N*1.1)
>>> signal = Nsig * zfit.pdf.Gauss(obs=obs, mu=1.2, sigma=0.1)
>>> background = Nbkg * zfit.pdf.Exponential(obs=obs, lambda_=lambda_)
>>> loss = ExtendedUnbinnedNLL(model=signal + background, data=data)

>>> from hepstats.hypotests.calculators import AsymptoticCalculator
>>> from hepstats.hypotests import UpperLimit
>>> from hepstats.hypotests.parameters import POI, POIarray

>>> calculator = AsymptoticCalculator(loss, Minuit())
>>> poinull = POIarray(Nsig, np.linspace(0.0, 25, 20))
>>> poialt = POI(Nsig, 0)
>>> ul = UpperLimit(calculator, poinull, poialt)
>>> ul.upperlimit(alpha=0.05, CLs=True)

Observed upper limit: Nsig = 15.725784747406346
Expected upper limit: Nsig = 11.927442041887158
Expected upper limit +1 sigma: Nsig = 16.596396280677116
Expected upper limit -1 sigma: Nsig = 8.592750403611896
Expected upper limit +2 sigma: Nsig = 22.24864429383046
Expected upper limit -2 sigma: Nsig = 6.400549971360598

Figure: upper limit example
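To make the meaning of alpha=0.05 and CLs=True more concrete, here is a minimal sketch of a CLs upper limit in a much simpler setting. It swaps the asymptotic, unbinned-likelihood calculation used above for a single-bin Poisson counting experiment with known background, so it does not reproduce the numbers above. All function names here (poisson_cdf, cls, upper_limit) are hypothetical and not part of hepstats.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for N ~ Poisson(mu)."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def cls(s, n_obs, b):
    """CLs = CL_{s+b} / CL_b for a single-bin counting experiment."""
    return poisson_cdf(n_obs, s + b) / poisson_cdf(n_obs, b)


def upper_limit(n_obs, b, alpha=0.05, s_max=100.0, tol=1e-9):
    """Bisect for the signal yield s at which CLs falls to alpha."""
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls(mid, n_obs, b) > alpha:
            lo = mid  # this yield is not yet excluded
        else:
            hi = mid  # this yield is excluded at level alpha
    return 0.5 * (lo + hi)


# With zero observed events CLs reduces to exp(-s), so the limit is -ln(alpha)
limit_zero = upper_limit(n_obs=0, b=0.0)
# Hypothetical example: 5 observed events over an expected background of 3
limit_example = upper_limit(n_obs=5, b=3.0)
```

Dividing by CL_b is what distinguishes CLs from a plain p-value threshold: it prevents excluding signal yields to which the experiment has little sensitivity when the data fluctuate below the expected background.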
