Skip to main content

Fast histogramming in Python build on pybind11 and OpenMP.

Project description

pygram11

Documentation Status Actions Status builds.sr.ht status PyPI version Conda Forge

Simple and fast histogramming in Python accelerated with OpenMP (with help from pybind11).

pygram11 provides fast functions for calculating histograms (and the variance in each bin). The API is very simple; documentation can be found here (you'll also find some benchmarks there).

Installing

Using pygram11 only requires NumPy. To build and install from source you'll just need a compiler with support for C++11 and OpenMP.

From PyPI

Binary wheels are provided for Linux and macOS. They can be installed from PyPI via pip.

pip install pygram11

From conda-forge

For installation via the conda package manager pygram11 is part of conda-forge.

conda install pygram11 -c conda-forge

Please note that on macOS the OpenMP libraries from LLVM (libomp) and Intel (libiomp) may clash if your conda environment includes the Intel Math Kernel Library (MKL) package distributed by Anaconda. You may need to install the nomkl package to prevent the clash (Intel MKL accelerates many linear algebra operations, but does not impact pygram11):

conda install nomkl ## sometimes necessary fix (macOS only)

From Source

All you need is a C++11 compiler and OpenMP. If you are using a relatively modern GCC release on Linux then you probably don't have to worry about the OpenMP dependency. If you are on macOS, you can install libomp from Homebrew. With those dependencies met, simply run:

pip install git+https://github.com/douglasdavis/pygram11.git@master

In Action

A histogram (with fixed bin width) of weighted data in one dimension:

>>> x = np.random.randn(10000)
>>> w = np.random.uniform(0.8, 1.2, 10000)
>>> h, err = pygram11.histogram(x, bins=40, range=(-4, 4), weights=w)

A histogram with fixed bin width which saves the under and overflow in the first and last bins:

>>> x = np.random.randn(1000000)
>>> h, err = pygram11.histogram(x, bins=20, range=(-3, 3), flow=True)

A histogram in two dimensions with variable width bins:

>>> x = np.random.randn(10000)
>>> y = np.random.randn(10000)
>>> xbins = [-2.0, -1.0, -0.5, 1.5, 2.0]
>>> ybins = [-3.0, -1.5, -0.1, 0.8, 2.0]
>>> h, err = pygram11.histogram2d(x, y, bins=[xbins, ybins])

Histogramming multiple weight variations for the same data, then putting the result in a DataFrame (the input pandas DataFrame will be interpreted as a NumPy array):

>>> weights = pd.DataFrame({"weight_a" : np.abs(np.random.randn(10000)),
...                         "weight_b" : np.random.uniform(0.5, 0.8, 10000),
...                         "weight_c" : np.random.rand(10000)})
>>> data = np.random.randn(10000)
>>> count, err = pygram11.histogram(data, bins=20, range=(-3, 3), weights=weights, flow=True)
>>> count_df = pd.DataFrame(count, columns=weights.columns)
>>> err_df = pd.DataFrame(err, columns=weights.columns)

I also wrote a blog post with some simple examples.

Other Libraries

  • There is an effort to develop an object oriented histogramming library for Python called boost-histogram. This library will be feature complete w.r.t. everything a physicist needs with histograms.
  • Simple and fast histogramming in Python using the NumPy C API: fast-histogram (no variance or overflow support).
  • If you want to calculate histograms on a GPU in Python, check out cupy.histogram. They only have 1D histograms (no weights or overflow).

If there is something you'd like to see in pygram11, please open an issue or pull request.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pygram11-0.10.1.dev0.tar.gz (804.8 kB view details)

Uploaded Source

File details

Details for the file pygram11-0.10.1.dev0.tar.gz.

File metadata

  • Download URL: pygram11-0.10.1.dev0.tar.gz
  • Upload date:
  • Size: 804.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0.post20200814 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.5

File hashes

Hashes for pygram11-0.10.1.dev0.tar.gz
Algorithm Hash digest
SHA256 5c36a1d0f0485286f741f9f1f573bbb97691833b2704396e72f152d571c238c1
MD5 b2858646fcfa31543147d8ef0689b7c1
BLAKE2b-256 ed29008764b2ed11eac13968db1090728aeed939f93b87e6eccee3aa2f102f56

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page