Python implementation of the BumpHunter algorithm used by the HEP community.

Project description

pyBumpHunter

This is a Python version of the BumpHunter algorithm (see arXiv:1101.0390, G. Choudalakis), designed to find localized excesses (or deficits) of events in a 1D or 2D distribution.

The main BumpHunter function scans a data distribution with sliding windows of variable width and calculates, in each window, the p-value of the data with respect to a given background distribution. The minimum p-value obtained over all windows is the local p-value. To account for the "look-elsewhere effect", a global p-value is calculated by performing background-only pseudo-experiments.
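
The scan described above can be sketched in a few lines of NumPy and SciPy. This is a minimal illustration written for this description, not the pyBumpHunter implementation: the function names (`min_local_pvalue`, `global_pvalue`), the window widths, the Poisson p-value for excesses only, and the toy histograms are all assumptions made for the example.

```python
import numpy as np
from scipy.special import gammainc


def min_local_pvalue(data_hist, bkg_hist, width_min=1, width_max=3):
    """Scan all windows (widths in bins) and return the smallest p-value.

    Hypothetical helper: returns (p_value, (first_bin, width)).
    """
    best_p, best_window = 1.0, None
    n_bins = len(data_hist)
    for width in range(width_min, width_max + 1):
        for start in range(n_bins - width + 1):
            d = data_hist[start:start + width].sum()
            b = bkg_hist[start:start + width].sum()
            # Excess only: P(n >= d) for a Poisson mean b equals the
            # regularized lower incomplete gamma function P(d, b).
            p = float(gammainc(d, b)) if d > b else 1.0
            if p < best_p:
                best_p, best_window = p, (start, width)
    return best_p, best_window


def global_pvalue(data_hist, bkg_hist, n_pseudo=200, seed=0, **scan_kwargs):
    """Estimate the global p-value with background-only pseudo-experiments."""
    rng = np.random.default_rng(seed)
    p_obs, window = min_local_pvalue(data_hist, bkg_hist, **scan_kwargs)
    # Each pseudo-experiment is a Poisson fluctuation of the background.
    p_pseudo = np.array([
        min_local_pvalue(rng.poisson(bkg_hist), bkg_hist, **scan_kwargs)[0]
        for _ in range(n_pseudo)
    ])
    # Fraction of pseudo-experiments at least as extreme as the data.
    return p_obs, window, float(np.mean(p_pseudo <= p_obs))


# Toy input (assumed): flat background with an excess in bins 7 and 8.
bkg = np.full(20, 100.0)
data = bkg.copy()
data[7:9] += 60
p_local, window, p_global = global_pvalue(data, bkg)
print("window (first bin, width):", window)
print("local p-value:", p_local, "global p-value:", p_global)
```

Comparing minimum p-values directly is equivalent to comparing a test statistic such as t = -ln(p_min), since the logarithm is monotonic. With a finite number of pseudo-experiments, a global p-value of 0 only means that it lies below the resolution 1/n_pseudo.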

The BumpHunter algorithm can also perform signal injection tests, in which progressively more signal is injected into toy data until a given global signal significance is reached (signal injection is not yet available in 2D).
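
The idea of a signal injection test can be illustrated with the sketch below. It is a simplified stand-in, not the package's own procedure: the helper names (`min_pvalue`, `injection_test`), the step size, the number of toys per step, the use of the median global p-value, and the toy background and signal shape are all assumptions chosen for the example.

```python
import numpy as np
from scipy.special import gammainc
from scipy.stats import norm


def min_pvalue(hist, bkg, widths=(1, 2, 3)):
    """Smallest Poisson excess p-value over all sliding windows."""
    best = 1.0
    for w in widths:
        kernel = np.ones(w)
        d = np.convolve(hist, kernel, mode="valid")  # data counts per window
        b = np.convolve(bkg, kernel, mode="valid")   # expected background
        # np.maximum keeps gammainc away from a = 0; those windows have
        # d <= b and get p = 1 from np.where anyway.
        p = np.where(d > b, gammainc(np.maximum(d, 1.0), b), 1.0)
        best = min(best, float(p.min()))
    return best


def injection_test(bkg, signal_shape, target_sigma=2.0, step=10,
                   max_events=1000, n_pseudo=500, n_toys=20, seed=1):
    """Increase the injected signal until the global significance is reached.

    Returns (n_signal_events, significance) or None if never reached.
    """
    rng = np.random.default_rng(seed)
    # Background-only reference distribution of the minimum p-value.
    p_bkg = np.array([min_pvalue(rng.poisson(bkg), bkg)
                      for _ in range(n_pseudo)])
    for n_sig in range(0, max_events + 1, step):
        p_glob = []
        for _ in range(n_toys):
            toy = rng.poisson(bkg + n_sig * signal_shape)
            p_glob.append(np.mean(p_bkg <= min_pvalue(toy, bkg)))
        # p-values below 1/n_pseudo cannot be resolved, so floor them.
        p_med = max(float(np.median(p_glob)), 1.0 / n_pseudo)
        sigma = float(norm.isf(p_med))  # one-sided Gaussian significance
        if sigma >= target_sigma:
            return n_sig, sigma
    return None


# Toy input (assumed): flat background, signal shared by bins 9 and 10.
bkg = np.full(20, 100.0)
signal_shape = np.zeros(20)
signal_shape[9:11] = 0.5
result = injection_test(bkg, signal_shape)
print("injected events and global significance:", result)
```

The reachable significance is limited by the number of pseudo-experiments: with 500 of them the smallest resolvable global p-value is 0.002, about 2.9 sigma, which is why the target in this sketch is set to 2 sigma.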

Content

  • pyBumpHunter : The pyBumpHunter package
  • example : Folder containing a set of example scripts and notebooks
  • example/results : Folder containing the outputs of example scripts
  • test : Folder containing the testing scripts (based on pytest)
  • data/data.root : Toy data used in the examples and tests
  • data/gen_data.C : Code used to generate the toy data with ROOT

Dependencies

Requires Python >= 3.6

pyBumpHunter depends on the following Python libraries:

  • numpy
  • scipy
  • matplotlib

pyBumpHunter wiki

Examples

The examples provided in example.py and test.ipynb require the uproot package in order to read the data from a ROOT file.

The data provided in the example consists of three histograms: a steeply falling 'background' distribution over the x-axis range [0, 20], a 'signal' with a Gaussian shape centered on a value of 5.5, and a 'data' distribution sampled from the background and signal distributions, with a signal fraction of 0.15%. The data file is produced by running gen_data.C in ROOT.
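
For readers without ROOT, a comparable toy dataset can be generated with NumPy. The sketch below is not a port of gen_data.C: the exponential background shape, its scale, the Gaussian width, the sample sizes, and the binning are assumptions; only the [0, 20] range, the signal position of 5.5, and the 0.15% signal fraction come from the description above.

```python
import numpy as np


def sample_background(rng, n, x_max=20.0, scale=4.0):
    """Draw n values from an exponential truncated to [0, x_max).

    The exponential shape and its scale are assumed for illustration.
    """
    u = rng.random(n)
    # Inverse CDF of the truncated exponential distribution.
    return -scale * np.log1p(-u * (1.0 - np.exp(-x_max / scale)))


def generate_toy_data(n_total=100_000, sig_frac=0.0015, sig_mean=5.5,
                      sig_width=0.35, seed=42):
    """Return (data, background reference sample, number of signal events)."""
    rng = np.random.default_rng(seed)
    n_sig = int(round(sig_frac * n_total))
    bkg_ref = sample_background(rng, n_total)        # reference background
    data_bkg = sample_background(rng, n_total - n_sig)
    data_sig = rng.normal(sig_mean, sig_width, n_sig)  # Gaussian signal
    data = np.concatenate([data_bkg, data_sig])
    return data, bkg_ref, n_sig


data, bkg_ref, n_sig = generate_toy_data()
data_hist, edges = np.histogram(data, bins=40, range=(0.0, 20.0))
bkg_hist, _ = np.histogram(bkg_ref, bins=40, range=(0.0, 20.0))
print("signal events in data:", n_sig, "out of", data.size)
```

The two histograms produced this way have the same binning, so they can be compared window by window.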

To run the example script, type python3 example.py in a terminal.

You can also open the example notebook with Jupyter or Binder.

Example outputs (figures):

  • Bump hunting
  • Tomography scan
  • Test statistics and global p-value

See the wiki for a detailed overview of all the features offered by pyBumpHunter.

To do list

  • Run BH on 2D histograms

Authors and contributors

Louis Vaslin (main developer), Julien Donini

Thanks to Samuel Calvet for his help in cross-checking and validating pyBumpHunter against the (internal) C++ version of BumpHunter developed by the ATLAS collaboration.

Download files

Source Distribution

pyBumpHunter-0.3.1.tar.gz (2.1 MB)

Uploaded Source

Built Distribution

pyBumpHunter-0.3.1-py2.py3-none-any.whl (29.9 kB)

Uploaded Python 2 Python 3

File details

Details for the file pyBumpHunter-0.3.1.tar.gz.

File metadata

  • Download URL: pyBumpHunter-0.3.1.tar.gz
  • Size: 2.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.9

File hashes

Hashes for pyBumpHunter-0.3.1.tar.gz:

  • SHA256: 9b45e51560e85376e535a5f1e574e4e46d99bc4c2dc1aaf74e8d345d36919956
  • MD5: 0b10ac6eaadae841ea812f1b4a7adc05
  • BLAKE2b-256: 98f2b9d7cfeff97649ee893e1449a44e93c7268c13196b1330630ed1509d1f8c
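
A downloaded archive can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch: the helper name `sha256_of` and the assumption that the archive sits in the current directory are illustrative.

```python
import hashlib
import os

# SHA256 digest listed above for the source distribution.
EXPECTED_SHA256 = (
    "9b45e51560e85376e535a5f1e574e4e46d99bc4c2dc1aaf74e8d345d36919956"
)


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed location of the downloaded file; nothing runs if it is absent.
archive = "pyBumpHunter-0.3.1.tar.gz"
if os.path.exists(archive):
    print("hash matches:", sha256_of(archive) == EXPECTED_SHA256)
```

The same function works for the wheel by swapping in its file name and digest.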

File details

Details for the file pyBumpHunter-0.3.1-py2.py3-none-any.whl.

File metadata

  • Download URL: pyBumpHunter-0.3.1-py2.py3-none-any.whl
  • Size: 29.9 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.9

File hashes

Hashes for pyBumpHunter-0.3.1-py2.py3-none-any.whl:

  • SHA256: fca3be4ed8acc4a9bc39b9c3be1768ce46cbad2c52f1eb6b45125bf613a49bfd
  • MD5: 8b7ceec6d052a91ce2e9f09d760e3e93
  • BLAKE2b-256: a8ea0f6d4b884982087051f5af481f1ae459587be76798dd9c9f9465651a2eaa
