Histogramming tools on CUDA.
cuda-histogram
cuda-histogram is a histogram filling package for GPUs. The package tries to follow UHI and keeps its API similar to boost-histogram and hist.
Main features of cuda-histogram:
- Implements a subset of the features of boost-histogram using CuPy (see API documentation for a complete list):
  - Axes
    - Regular and Variable axes
      - edges()
      - centers()
      - index(...)
    - ...
  - Histogram
    - fill(..., weight=...) (including NaN flow)
    - simple indexing with slicing (see example below)
    - values(flow=...)
    - variance(flow=...)
- Allows users to detach the generated GPU histogram to CPU -
  - to_boost() - converts to boost-histogram.Histogram
  - to_hist() - converts to hist.Hist
Near future goals for the package -
- Implement support for Categorical axes (exists internally but needs refactoring to match boost-histogram's API)
- Improve indexing (__getitem__) to exactly match boost-histogram's API
Installation
cuda-histogram is available on PyPI as well as on conda. The library can be installed using pip -
pip install cuda-histogram
or using conda -
conda install -c conda-forge cuda-histogram
Usage
The intended workflow is to create a cuda-histogram, fill it with values on the GPU, and convert the filled histogram to a boost-histogram or hist object to access the full UHI functionality.
Creating a histogram
import cuda_histogram
import cupy as cp
ax1 = cuda_histogram.axis.Regular(10, 0, 1)
ax2 = cuda_histogram.axis.Variable([0, 2, 3, 6])
h = cuda_histogram.Hist(ax1, ax2)
>>> ax1, ax2, h
(Regular(10, 0, 1), Variable([0. 2. 3. 6.]), Hist(Regular(10, 0, 1), Variable([0. 2. 3. 6.])))
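To make the two axis types above concrete, here is a minimal pure-Python sketch of what the edges and centers of a Regular and a Variable axis look like. The helper names `regular_edges` and `bin_centers` are hypothetical and are not part of cuda-histogram; the sketch assumes flow bins are excluded and runs without a GPU.

```python
def regular_edges(bins, start, stop):
    """Edges of an axis with `bins` equal-width bins on [start, stop)."""
    width = (stop - start) / bins
    # bins + 1 edges delimit bins intervals
    return [start + i * width for i in range(bins + 1)]


def bin_centers(edges):
    """Midpoint of every pair of neighbouring edges."""
    return [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]


# Mirrors Regular(10, 0, 1): 11 edges, 10 centers
reg_edges = regular_edges(10, 0, 1)
reg_centers = bin_centers(reg_edges)

# Mirrors Variable([0, 2, 3, 6]): the edges are given explicitly
var_centers = bin_centers([0, 2, 3, 6])
print(len(reg_edges), len(reg_centers), var_centers)
```

A Variable axis only differs in that its edges are supplied by the user rather than computed from a bin count and a range.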
Filling a histogram
Differences in API (from boost-histogram) -
- Has an additional NaN flow
- Accepts only CuPy arrays
h.fill(cp.random.normal(size=1_000_000), cp.random.normal(size=1_000_000)) # set weight=... for weighted fills
>>> h.values(), type(h.values()) # set flow=True for flow bins (underflow, overflow, nanflow)
(array([[28532., 1238., 64.],
[29603., 1399., 61.],
[30543., 1341., 78.],
[31478., 1420., 98.],
[32692., 1477., 92.],
[32874., 1441., 96.],
[33584., 1515., 88.],
[34304., 1490., 114.],
[34887., 1598., 116.],
[35341., 1472., 103.]]), <class 'cupy.ndarray'>)
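The idea of an additional NaN flow can be illustrated with a small CPU-only sketch of a one-dimensional weighted fill. This is not the library's implementation: the function name `fill_1d` is hypothetical, and the storage layout (underflow first, then the regular bins, then overflow, then nanflow) is an assumption made for illustration.

```python
import math


def fill_1d(values, bins, start, stop, weights=None):
    """Count values into `bins` regular bins plus three flow slots.

    Assumed layout: [underflow, bin 1 .. bin `bins`, overflow, nanflow].
    """
    counts = [0.0] * (bins + 3)
    if weights is None:
        weights = [1.0] * len(values)
    for x, w in zip(values, weights):
        if math.isnan(x):
            slot = bins + 2  # NaN entries get their own flow slot
        elif x < start:
            slot = 0  # underflow
        elif x >= stop:
            slot = bins + 1  # overflow (upper edge is exclusive)
        else:
            k = int((x - start) / (stop - start) * bins)
            slot = 1 + min(k, bins - 1)  # clamp against float rounding
        counts[slot] += w
    return counts


counts = fill_1d([-1.0, 0.05, 0.5, 0.5, 2.0, float("nan")], 10, 0, 1)
print(counts)
```

Keeping NaN entries in a separate slot means they are neither silently dropped nor mixed into the overflow count.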
Indexing axes and histograms
Differences in API (from boost-histogram) -
- underflow is indexed as 0 and not -1
- ax[...] will return a cuda_histogram.Interval object
- No interpolation is performed
- Hist indices should be in the range of bin edges, instead of integers
>>> ax1.index(0.5)
array([6])
>>> ax1.index(-1)
array([0])
>>> ax1[0]
<Interval ((-inf, 0.0)) instance at 0x1c905208790>
>>> h[0, 0], type(h[0, 0])
(Hist(Regular(1, 0.0, 0.1), Variable([0. 2.])), <class 'cuda_histogram.hist.Hist'>)
>>> h[0, 0].values(), type(h[0, 0].values())
(array([[28532.]]), <class 'cupy.ndarray'>)
>>> h[0, :].values(), type(h[0, :].values())
(array([[28532., 1238., 64.]]), <class 'cupy.ndarray'>)
>>> h[0.2, :].values(), type(h[0.2, :].values()) # indices in range of bin edges
(array([[30543., 1341., 78.]]), <class 'cupy.ndarray'>)
>>> h[:, 1:2].values(), type(h[:, 1:2].values()) # no interpolation
C:\Users\Saransh\Saransh_softwares\OpenSource\Python\cuda-histogram\src\cuda_histogram\axis\__init__.py:580: RuntimeWarning: Reducing along axis Variable([0. 2. 3. 6.]): requested start 1 between bin boundaries, no interpolation is performed
warnings.warn(
(array([[28532.],
[29603.],
[30543.],
[31478.],
[32692.],
[32874.],
[33584.],
[34304.],
[34887.],
[35341.]]), <class 'cupy.ndarray'>)
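The index convention shown above (underflow at 0, regular bins starting at 1) can be reproduced on the CPU with a binary search over the bin edges. This is a sketch only: `index_with_flow` is a hypothetical helper, not a cuda-histogram function, and it assumes half-open bins with an exclusive upper edge. NaN handling is left out.

```python
from bisect import bisect_right


def index_with_flow(x, edges):
    """Return 0 for underflow, 1..n for regular bins, n + 1 for overflow."""
    # bisect_right counts how many edges are <= x, which is exactly
    # the bin number under this convention.
    return bisect_right(edges, x)


regular = [i / 10 for i in range(11)]  # edges of Regular(10, 0, 1)
variable = [0, 2, 3, 6]                # edges of Variable([0, 2, 3, 6])

print(index_with_flow(0.5, regular))   # 6, matching ax1.index(0.5) above
print(index_with_flow(-1, regular))    # 0, matching ax1.index(-1) above
print(index_with_flow(2.5, variable))  # 2
```

Under this convention a value equal to the last edge lands in the overflow slot, because the upper edge of the last bin is treated as exclusive.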
Converting to CPU
All the existing functionalities of boost-histogram and Hist can be used on the converted histogram.
h.to_boost()
>>> h.to_boost().values(), type(h.to_boost().values())
(array([[28532., 1238., 64.],
[29603., 1399., 61.],
[30543., 1341., 78.],
[31478., 1420., 98.],
[32692., 1477., 92.],
[32874., 1441., 96.],
[33584., 1515., 88.],
[34304., 1490., 114.],
[34887., 1598., 116.],
[35341., 1472., 103.]]), <class 'numpy.ndarray'>)
h.to_hist()
>>> h.to_hist().values(), type(h.to_hist().values())
(array([[28532., 1238., 64.],
[29603., 1399., 61.],
[30543., 1341., 78.],
[31478., 1420., 98.],
[32692., 1477., 92.],
[32874., 1441., 96.],
[33584., 1515., 88.],
[34304., 1490., 114.],
[34887., 1598., 116.],
[35341., 1472., 103.]]), <class 'numpy.ndarray'>)
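The fill-on-GPU, then detach-to-CPU workflow can be wrapped in a small helper. The sketch below relies only on the fill(...), to_boost(), and to_hist() methods described above. Both `fill_then_detach` and the `FakeGpuHist` stand-in are hypothetical names introduced here so that the example runs without a GPU; in real use the stand-in would be replaced by a cuda_histogram.Hist and the batches by tuples of CuPy arrays.

```python
class FakeGpuHist:
    """Hypothetical stand-in for a GPU histogram; it only counts entries."""

    def __init__(self):
        self.n_filled = 0

    def fill(self, *arrays, weight=None):
        self.n_filled += len(arrays[0])

    def to_boost(self):
        return ("boost", self.n_filled)

    def to_hist(self):
        return ("hist", self.n_filled)


def fill_then_detach(gpu_hist, batches, target="boost"):
    """Fill `gpu_hist` batch by batch, then convert it once at the end."""
    for batch in batches:
        gpu_hist.fill(*batch)  # one positional array per axis
    if target == "boost":
        return gpu_hist.to_boost()
    if target == "hist":
        return gpu_hist.to_hist()
    raise ValueError("target must be 'boost' or 'hist'")


batches = [([0.1, 0.2], [1.0, 2.5]), ([0.3], [4.0])]
result = fill_then_detach(FakeGpuHist(), batches)
print(result)
```

Converting once after all fills keeps the accumulation on the device and limits the detach step to a single call.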
Getting help
- cuda-histogram's code is hosted on GitHub.
- If something is not working the way it should, or if you want to request a new feature, create a new issue on GitHub.
- To discuss something related to cuda-histogram, use the discussions tab on GitHub.
Contributing
Contributions of any kind are welcome! See CONTRIBUTING.md for information on setting up a development environment.
Acknowledgements
This library was primarily developed by Lindsey Gray, Saransh Chopra, and Jim Pivarski.
Support for this work was provided by the National Science Foundation cooperative agreement OAC-1836650 and PHY-2323298 (IRIS-HEP). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
Hashes for cuda_histogram-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 03f18b43accac6f1376edbcb8aa63f470733bab1ebc695b0cf1a195097c47c8f
MD5 | 0d5238ebd304ca6092780fae9e1049fc
BLAKE2b-256 | 0a7f04e9827485f12079bc19ecdf4e58105693588c0ee0663f15a592e6f86e4c