hepstats: statistics tools and utilities

Installation

Install hepstats like any other Python package:

pip install hepstats

or similar (use --user, virtualenv, etc. if you wish).

Getting Started

The hepstats module includes the modeling and hypotests submodules. This is a quick user guide to each submodule. The Binder examples are also a good way to get started.

modeling

The modeling submodule includes the Bayesian Blocks algorithm, which can be used to improve the binning of histograms. The visual improvement can be dramatic and, more importantly, the algorithm produces histograms that accurately represent the underlying distribution while being robust to statistical fluctuations. Here is a small example of the algorithm applied to data sampled from a Laplace distribution, compared to a finely binned histogram of the same sample.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from hepstats.modeling import bayesian_blocks

>>> data = np.random.laplace(size=10000)
>>> blocks = bayesian_blocks(data)

>>> plt.hist(data, bins=1000, label='Fine Binning', density=True, alpha=0.6)
>>> plt.hist(data, bins=blocks, label='Bayesian Blocks', histtype='step', density=True, linewidth=2)
>>> plt.legend(loc=2)

[Figure: Bayesian Blocks example]
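Because the array returned by `bayesian_blocks` is passed to `bins=`, the resulting bins generally have unequal widths. That is why the example sets `density=True`: each bar height is the count divided by the total count and the bin width, so wide and narrow bins stay comparable. The sketch below illustrates this with NumPy only. The `edges` array is a hand-picked, hypothetical stand-in for the output of `bayesian_blocks`, not something the algorithm produced.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.laplace(size=10000)

# Hypothetical variable-width bin edges, standing in for bayesian_blocks(data).
edges = np.array([-6.0, -3.0, -1.5, -0.5, 0.0, 0.5, 1.5, 3.0, 6.0])

counts, _ = np.histogram(data, bins=edges)
density, _ = np.histogram(data, bins=edges, density=True)
widths = np.diff(edges)

# With density=True, density = counts / (counts.sum() * widths),
# so the histogram integrates to 1 over the binned range.
area = float(np.sum(density * widths))
```

Without `density=True`, a wide bin in the tails could appear taller than a narrow bin near the peak simply because it covers more of the axis.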

hypotests

This submodule provides tools for hypothesis tests, such as discovery tests and the computation of upper limits or confidence intervals. hepstats needs a fitting backend, such as zfit, to perform the computations. Any fitting library can be used if its API is compatible with hepstats (see api checks).
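As a rough illustration of what a discovery test quantifies, independent of hepstats and of any fitting backend, consider a single-bin counting experiment with a known expected background. In the asymptotic regime, the significance of an excess is commonly taken as the square root of the likelihood-ratio test statistic q0. The function name and the numbers below are hypothetical; this is a simplified sketch, not the hepstats API, which works from a full likelihood fit instead.

```python
import math


def counting_significance(n_obs, b_expected):
    """Asymptotic significance of an excess in a one-bin Poisson counting
    experiment with known background (illustrative helper, not hepstats)."""
    if n_obs <= b_expected:
        # No excess over background: the one-sided test statistic is zero.
        return 0.0
    # q0 = 2 * [ln L(best fit) - ln L(background only)] for a Poisson count.
    q0 = 2.0 * (n_obs * math.log(n_obs / b_expected) - (n_obs - b_expected))
    return math.sqrt(q0)


z_excess = counting_significance(125, 100.0)  # excess of 25 events
z_none = counting_significance(100, 100.0)    # no excess
```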

We give here a simple example of a discovery test, using zfit as the backend, for a Gaussian signal with known mean and sigma over an exponential background.

>>> import numpy as np
>>> import zfit
>>> from zfit.loss import ExtendedUnbinnedNLL
>>> from zfit.minimize import Minuit

>>> bounds = (0.1, 3.0)
>>> obs = zfit.Space('x', limits=bounds)

>>> bkg = np.random.exponential(0.5, 300)
>>> peak = np.random.normal(1.2, 0.1, 25)
>>> data = np.concatenate((bkg, peak))
>>> data = data[(data > bounds[0]) & (data < bounds[1])]
>>> N = data.size
>>> data = zfit.Data.from_numpy(obs=obs, array=data)

>>> lambda_ = zfit.Parameter("lambda", -2.0, -4.0, -1.0)
>>> Nsig = zfit.Parameter("Ns", 20., -20., N)
>>> Nbkg = zfit.Parameter("Nbkg", N, 0., N*1.1)
>>> signal = Nsig * zfit.pdf.Gauss(obs=obs, mu=1.2, sigma=0.1)
>>> background = Nbkg * zfit.pdf.Exponential(obs=obs, lambda_=lambda_)
>>> loss = ExtendedUnbinnedNLL(model=signal + background, data=data)

>>> from hepstats.hypotests.calculators import AsymptoticCalculator
>>> from hepstats.hypotests import Discovery
>>> from hepstats.hypotests.parameters import POI

>>> calculator = AsymptoticCalculator(loss, Minuit())
>>> poinull = POI(Nsig, 0)
>>> discovery_test = Discovery(calculator, [poinull])
>>> discovery_test.result()

p_value for the Null hypothesis = 0.0007571045424956679
Significance (in units of sigma) = 3.1719464825102244

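The discovery test prints the p-value of the null hypothesis (no signal) and the significance with which that hypothesis is rejected. Because the toy data are generated randomly without a fixed seed, the exact numbers vary from run to run.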
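The two printed numbers carry the same information. Assuming the usual one-sided convention, in which the significance Z is the standard normal quantile at 1 - p, one can be converted into the other with the standard library alone. The source does not state which convention hepstats uses, although the values printed above appear consistent with this one; the helper names below are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def significance_from_pvalue(p_value):
    """One-sided significance Z such that P(X > Z) = p_value for X ~ N(0, 1)."""
    return _STD_NORMAL.inv_cdf(1.0 - p_value)


def pvalue_from_significance(z):
    """One-sided p-value corresponding to a significance of z sigma."""
    return 1.0 - _STD_NORMAL.cdf(z)


# Values taken from the output shown above.
z = significance_from_pvalue(0.0007571045424956679)
p = pvalue_from_significance(3.1719464825102244)
```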
