Faster loops for NumPy using multithreading and other tricks
PNumPy
Parallel NumPy seamlessly speeds up NumPy for large arrays (64K+ elements) with no change required to your existing NumPy code.
PNumPy supports Linux, Windows, and macOS for NumPy >= 1.18 on Python 3.6, 3.7, 3.8, and 3.9.
This first release speeds up NumPy binary and unary ufuncs such as add, multiply, isnan, abs, sin, log, sum, min and many more. Sped up functions also include: sort, argsort, lexsort, arange, boolean indexing, and fancy indexing. In the near future we will speed up: astype, where, putmask, and searchsorted.
Other packages that use numpy, such as scikit-learn or pandas, will also be sped up for large arrays.
Installation
pip install pnumpy
You can also install the latest development versions with
pip install https://github.com/Quansight/pnumpy/archive/main.zip
Documentation
See the full documentation
To use the project:
import pnumpy as pn
Parallel NumPy speeds up NumPy silently under the hood. To see some benchmarks yourself run
pn.benchmark()
To get a partial list of functions sped up run
pn.atop_info()
To disable or enable pnumpy run
pn.disable()
pn.enable()
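To check the effect on your own workload rather than relying on the built-in benchmark, one option is to time the same call with PNumPy disabled and then enabled. The sketch below is an illustration, not part of PNumPy: `best_time` is a hypothetical helper, the array size and the choice of `np.sin` are arbitrary, and the comparison is skipped when NumPy or PNumPy is not installed. The only PNumPy calls used are `pn.disable()` and `pn.enable()`.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (in seconds) of `repeats` calls to fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


try:
    import numpy as np
    import pnumpy as pn
except ImportError:
    np = pn = None  # the comparison below is skipped without both packages

if pn is not None:
    a = np.arange(1_000_000, dtype=np.float64)  # well above 64K elements

    pn.disable()
    t_plain = best_time(lambda: np.sin(a))

    pn.enable()
    t_parallel = best_time(lambda: np.sin(a))

    print(f"plain NumPy: {t_plain:.4f}s  with PNumPy: {t_parallel:.4f}s")
```

Taking the best of several runs reduces noise from other processes; results will vary with CPU, array size, and dtype.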
Additional Functionality
PNumPy provides additional routines, such as pn.recarray_to_colmajor, which converts a NumPy record array to a column-major array in parallel and is useful for DataFrames. Another is pn.lexsort32, which performs an indirect sort using np.int32 indices instead of np.int64, consuming half the memory and running faster.
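To see why a 32-bit index halves the memory of an indirect sort, the sketch below builds one sort order in plain Python and packs it once as 32-bit and once as 64-bit integers. It does not call `pn.lexsort32`; `lexsort_indices` is a hypothetical stand-in that follows the `np.lexsort` convention of treating the last key as the primary one.

```python
import struct


def lexsort_indices(keys):
    """Indirect sort: return row indices ordered by keys; the last key is primary."""
    n = len(keys[0])
    return sorted(range(n), key=lambda i: tuple(k[i] for k in reversed(keys)))


first = [3, 1, 2, 1]  # secondary key
last = [0, 1, 0, 0]   # primary key
order = lexsort_indices([first, last])

# The same index list stored as fixed-width 32-bit and 64-bit integers.
as_int32 = struct.pack(f"={len(order)}i", *order)
as_int64 = struct.pack(f"={len(order)}q", *order)

print(order)                         # [3, 2, 0, 1]
print(len(as_int32), len(as_int64))  # 16 32
```

The trade-off is range: a signed 32-bit index can only address arrays with fewer than 2**31 rows.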
Threading
PNumPy uses a combination of threads and 256-bit vector intrinsics to speed up calculations. By default, most operations use only 3 additional worker threads alongside the main Python thread, for a total of 4. Large arrays are divided into 16K chunks, and threads are assigned to maintain cache coherency. More threads are dynamically deployed for more CPU-intensive problems such as np.sin. Users can customize threading. For example, 4 threads working together can quadruple the effective L2 cache size.
To cap the number of additional worker threads at 3, run
pn.thread_setworkers(3)
To disable or re-enable threading run
pn.thread_disable()
pn.thread_enable()
To disable or re-enable just the atop engine run
pn.atop_disable()
pn.atop_enable()
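The chunking scheme described in this section can be pictured with a small standard-library sketch: split a large buffer into 16K-element chunks and hand the chunks to a pool of 3 worker threads. This is only a model of the idea, not PNumPy's engine. `CHUNK`, `chunk_bounds`, and `parallel_sum` are hypothetical names, the sketch assumes "16K" means 16,384 elements, and the main thread here only waits instead of taking part in the work. Pure-Python threads will not show a real speed-up because of the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK = 16 * 1024  # assumed chunk size, in elements


def chunk_bounds(n, chunk=CHUNK):
    """Return (start, stop) pairs that cover range(n) in chunk-sized pieces."""
    return [(start, min(start + chunk, n)) for start in range(0, n, chunk)]


def parallel_sum(data, workers=3):
    """Sum `data` by reducing each chunk on a worker thread, then combining."""
    bounds = chunk_bounds(len(data))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda b: sum(data[b[0]:b[1]]), bounds)
        return sum(partials)


data = list(range(100_000))
print(len(chunk_bounds(len(data))))     # 7
print(parallel_sum(data) == sum(data))  # True
```

Keeping each chunk small means the slice a thread is working on can stay in that core's cache while it is processed.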
FAQ
Q: If I type np.sort(a) where a is an array, will it be sped up?
A: Yes. If len(a) > 65536 and pnumpy has been imported, the sort is sped up automatically.
Q: How is sort sped up?
A: PNumPy uses additional threads to divide up the sorting job. For example, it might perform an 8-way quicksort followed by a 4-way mergesort.
Q: How are scikit-learn and pandas sped up?
A: PNumPy's vector loops and threads will speed up any package that uses large NumPy arrays.
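The sort answer above mentions splitting the job across threads and merging the results. A minimal standard-library sketch of that divide-then-merge pattern follows. It is not PNumPy's implementation: `parallel_sort` and `parts` are hypothetical, each piece is sorted with Python's built-in `sorted` rather than a quicksort, and a single k-way merge (`heapq.merge`) stands in for the staged mergesort.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(values, parts=8):
    """Sort `values` by sorting up to `parts` slices on threads, then merging."""
    if not values:
        return []
    step = -(-len(values) // parts)  # ceiling division
    slices = [values[i:i + step] for i in range(0, len(values), step)]
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        sorted_runs = list(pool.map(sorted, slices))
    # Each run is already sorted, so a k-way merge yields the full order.
    return list(heapq.merge(*sorted_runs))


print(parallel_sort([5, 3, 9, 1, 7, 2, 8, 6, 4, 0], parts=4))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The pattern works because sorting the slices is independent work, and merging already-sorted runs takes only a linear pass over the data.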
Development
To run all the tests:
python -m pip install pytest
python -m pytest tests