Python wrapper for the C-Blosc2 library.
A Python wrapper for the extremely fast Blosc2 compression library
- Author: The Blosc development team
What it is
Blosc (http://blosc.org) is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() OS call.
Blosc works well for compressing numerical arrays that contain data with relatively low entropy, such as sparse data, time series, grids with regularly spaced values, etc.
python-blosc2 is a Python package that wraps C-Blosc2, the newest version of the Blosc compressor. python-blosc2 already reproduces the API of python-blosc, so the former can be used as a drop-in replacement for the latter. However, there are a few exceptions to full compatibility, which are listed here: https://github.com/Blosc/python-blosc2/blob/main/RELEASE_NOTES.md#changes-from-python-blosc-to-python-blosc2
In addition, python-blosc2 aims to leverage the new C-Blosc2 API so as to support super-chunks, serialization and all the features introduced in C-Blosc2. This is a work in progress and will be done incrementally in future releases.
Note: python-blosc2 is meant to be backward compatible with python-blosc data. That means that it can read data generated with python-blosc, but the opposite is not true (i.e. there is no forward compatibility).
Installing
Blosc now provides Python wheels for the main operating systems (Windows, macOS and Linux) and platforms. You can install the binary packages from PyPI using pip:
pip install blosc2
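To confirm that the installation succeeded, one option is to query the installed package metadata from Python. The helper below is a hypothetical convenience function (installed_version is not part of python-blosc2) that uses only the standard library:

```python
from importlib import metadata


def installed_version(package="blosc2"):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("blosc2 is not installed in this environment")
else:
    print(f"blosc2 {version} is installed")
```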
Documentation
The documentation is here:
https://blosc.org/python-blosc2/python-blosc2.html
Also, some examples are available on:
Building
python-blosc2 ships with the Blosc sources and can be built with:
git clone https://github.com/Blosc/python-blosc2/
cd python-blosc2
git submodule update --init --recursive
python -m pip install -r requirements.txt
python setup.py build_ext --inplace
That's all. You can now proceed to the Testing section.
Testing
After compiling, you can quickly check that the package is sane by running the doctests in blosc/test.py:
python -m pip install -r requirements-tests.txt
python -m pytest (add -v for verbose mode)
Benchmarking
If curious, you may want to run a small benchmark that compares a plain NumPy array copy against compression through different compressors in your Blosc build:
PYTHONPATH=. python bench/pack_compress.py
Just to whet your appetite, here are some speed figures for an Intel box (i9-10940X CPU @ 3.30GHz, 14 cores) running Ubuntu 22.04. In particular, see how performance for pack_array2/unpack_array2 has improved vs the previous version (labeled as pack_array/unpack_array):
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
python-blosc2 version: 0.3.3.dev0
Blosc version: 2.4.2.dev ($Date:: 2022-09-16 #$)
Compressors available: ['blosclz', 'lz4', 'lz4hc', 'zlib', 'zstd']
Compressor library versions:
  BLOSCLZ: 2.5.1
  LZ4: 1.9.4
  LZ4HC: 1.9.4
  ZLIB: 1.2.11.zlib-ng
  ZSTD: 1.5.2
Python version: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21) [GCC 10.3.0]
Platform: Linux-5.15.0-41-generic-x86_64 (#44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022)
Linux dist: Ubuntu 22.04 LTS
Processor: x86_64
Byte-ordering: little
Detected cores: 14.0
Number of threads to use by default: 8
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Creating NumPy arrays with 10**8 int64/float64 elements:
  Time for copying array with np.copy: 0.394 s (3.79 GB/s))

*** the arange linear distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress: 0.051/0.101 s (29.08/14.80 GB/s)) cr: 444.3x
  Time for pack_array/unpack_array: 0.600/0.764 s (2.49/1.95 GB/s)) cr: 442.3x
  Time for pack_array2/unpack_array2: 0.059/0.158 s (25.28/9.44 GB/s)) cr: 444.2x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress: 0.059/0.116 s (25.07/12.82 GB/s)) cr: 279.2x
  Time for pack_array/unpack_array: 0.615/0.758 s (2.42/1.97 GB/s)) cr: 277.9x
  Time for pack_array2/unpack_array2: 0.058/0.160 s (25.52/9.31 GB/s)) cr: 279.2x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress: 0.193/0.085 s (7.71/17.45 GB/s)) cr: 155.9x
  Time for pack_array/unpack_array: 0.786/0.754 s (1.89/1.98 GB/s)) cr: 155.4x
  Time for pack_array2/unpack_array2: 0.218/0.165 s (6.84/9.02 GB/s)) cr: 155.9x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress: 0.250/0.141 s (5.96/10.55 GB/s)) cr: 273.8x
  Time for pack_array/unpack_array: 0.799/0.845 s (1.87/1.76 GB/s)) cr: 273.2x
  Time for pack_array2/unpack_array2: 0.261/0.243 s (5.71/6.13 GB/s)) cr: 273.8x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress: 0.189/0.079 s (7.89/18.92 GB/s)) cr: 644.9x
  Time for pack_array/unpack_array: 0.725/0.770 s (2.06/1.94 GB/s)) cr: 630.9x
  Time for pack_array2/unpack_array2: 0.206/0.143 s (7.25/10.39 GB/s)) cr: 644.8x

*** the linspace linear distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress: 0.091/0.113 s (16.34/13.21 GB/s)) cr: 50.1x
  Time for pack_array/unpack_array: 0.623/0.751 s (2.39/1.98 GB/s)) cr: 50.0x
  Time for pack_array2/unpack_array2: 0.124/0.163 s (11.98/9.12 GB/s)) cr: 50.1x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress: 0.077/0.114 s (19.33/13.12 GB/s)) cr: 55.7x
  Time for pack_array/unpack_array: 0.624/0.740 s (2.39/2.01 GB/s)) cr: 55.8x
  Time for pack_array2/unpack_array2: 0.098/0.190 s (15.19/7.83 GB/s)) cr: 55.7x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress: 0.352/0.075 s (4.23/19.98 GB/s)) cr: 53.6x
  Time for pack_array/unpack_array: 0.918/0.781 s (1.62/1.91 GB/s)) cr: 53.6x
  Time for pack_array2/unpack_array2: 0.389/0.139 s (3.83/10.72 GB/s)) cr: 53.6x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress: 0.395/0.148 s (3.77/10.08 GB/s)) cr: 50.4x
  Time for pack_array/unpack_array: 0.940/0.824 s (1.59/1.81 GB/s)) cr: 50.4x
  Time for pack_array2/unpack_array2: 0.433/0.252 s (3.44/5.92 GB/s)) cr: 50.4x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress: 0.402/0.098 s (3.71/15.22 GB/s)) cr: 74.7x
  Time for pack_array/unpack_array: 0.949/0.782 s (1.57/1.91 GB/s)) cr: 74.7x
  Time for pack_array2/unpack_array2: 0.426/0.175 s (3.50/8.49 GB/s)) cr: 74.7x

*** the random distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress: 0.240/0.119 s (6.22/12.48 GB/s)) cr: 4.0x
  Time for pack_array/unpack_array: 0.794/0.767 s (1.88/1.94 GB/s)) cr: 4.0x
  Time for pack_array2/unpack_array2: 0.578/0.162 s (2.58/9.20 GB/s)) cr: 4.0x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress: 0.250/0.114 s (5.97/13.11 GB/s)) cr: 4.0x
  Time for pack_array/unpack_array: 0.794/0.767 s (1.88/1.94 GB/s)) cr: 4.0x
  Time for pack_array2/unpack_array2: 0.590/0.161 s (2.53/9.24 GB/s)) cr: 4.0x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress: 1.102/0.088 s (1.35/17.01 GB/s)) cr: 4.0x
  Time for pack_array/unpack_array: 1.690/0.758 s (0.88/1.97 GB/s)) cr: 4.0x
  Time for pack_array2/unpack_array2: 1.445/0.178 s (1.03/8.38 GB/s)) cr: 4.0x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress: 1.258/0.210 s (1.18/7.11 GB/s)) cr: 4.7x
  Time for pack_array/unpack_array: 1.822/0.898 s (0.82/1.66 GB/s)) cr: 4.7x
  Time for pack_array2/unpack_array2: 1.549/0.355 s (0.96/4.20 GB/s)) cr: 4.7x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress: 1.653/0.098 s (0.90/15.21 GB/s)) cr: 4.4x
  Time for pack_array/unpack_array: 2.206/0.796 s (0.68/1.87 GB/s)) cr: 4.4x
  Time for pack_array2/unpack_array2: 2.077/0.179 s (0.72/8.30 GB/s)) cr: 4.4x
As can be seen, it is perfectly possible for python-blosc2 to go faster than a plain memcpy(). More interestingly, you can easily choose the codecs and filters that best suit your datasets, and persist and transmit them faster while using less memory.
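To put the reported figures in perspective, the short sketch below computes speedups from the timings listed in the benchmark output for the arange distribution with Codec.BLOSCLZ. The numbers are copied from that output; the speedup helper is only an illustration.

```python
# Timings in seconds, copied from the benchmark output
# (arange distribution, Codec.BLOSCLZ, compression side only).
copy_time = 0.394
timings = {
    "compress": 0.051,
    "pack_array": 0.600,
    "pack_array2": 0.059,
}


def speedup(baseline, candidate):
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


# A ratio above 1 means the operation beat the plain np.copy.
vs_copy = {name: speedup(copy_time, t) for name, t in timings.items()}
pack2_vs_pack = speedup(timings["pack_array"], timings["pack_array2"])

for name, ratio in vs_copy.items():
    print(f"{name}: {ratio:.1f}x relative to np.copy")
print(f"pack_array2 vs pack_array: {pack2_vs_pack:.1f}x")
```

For this data set, compress and pack_array2 both finish several times faster than the copy, while the older pack_array is slower than the copy; pack_array2 is roughly ten times faster than pack_array.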
Start using compression in your data workflows and feel the experience of doing more with less.
License
The software is licensed under a 3-Clause BSD license. A copy of the python-blosc2 license can be found in LICENSE. A copy of all licenses can be found in LICENSES/.
Mailing list
Discussion about this module is welcome in the Blosc list:
Please follow @Blosc2 to stay informed about the latest developments.
Enjoy data!