Python wrapper for the C-Blosc2 library.

A Python wrapper for the extremely fast Blosc2 compression library

Author:

The Blosc development team

Contact:

blosc@blosc.org

GitHub:

https://github.com/Blosc/python-blosc2

Code of Conduct:

Contributor Covenant

What it is

Blosc (http://blosc.org) is a high-performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() call.

Blosc works well for compressing numerical arrays that contain data with relatively low entropy, such as sparse data, time series, grids with regularly spaced values, etc.
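To make the low-entropy point concrete, here is a minimal sketch that compares compression ratios for regularly spaced integers and for random bytes. It uses zlib from the Python standard library as a stand-in codec, not Blosc itself, so the absolute numbers will differ from what Blosc achieves; only the contrast between the two inputs is the point. The helper name `ratio` is made up for this illustration.

```python
import os
import struct
import zlib

n = 100_000

# Regularly spaced 64-bit integers: low entropy, many repeated byte patterns.
regular = struct.pack(f"<{n}q", *range(n))

# Random bytes of the same length: high entropy, essentially incompressible.
random_bytes = os.urandom(len(regular))


def ratio(buf: bytes) -> float:
    """Return uncompressed size divided by zlib-compressed size."""
    return len(buf) / len(zlib.compress(buf))


regular_cr = ratio(regular)
random_cr = ratio(random_bytes)

print(f"regularly spaced values: {regular_cr:.1f}x")
print(f"random bytes:            {random_cr:.2f}x")
```

The regularly spaced input should compress several times over, while the random input stays at roughly 1x.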

python-blosc2 is a Python package that wraps C-Blosc2, the newest version of the Blosc compressor. python-blosc2 already reproduces the API of python-blosc, so the former can be used as a drop-in replacement for the latter. However, there are a few exceptions to full compatibility, which are listed here: https://github.com/Blosc/python-blosc2/blob/main/RELEASE_NOTES.md#changes-from-python-blosc-to-python-blosc2

In addition, python-blosc2 aims to leverage the new C-Blosc2 API so as to support super-chunks, serialization and all the other features introduced in C-Blosc2. This is a work in progress and will be done incrementally in future releases.

Note: python-blosc2 is meant to be backward compatible with python-blosc data. That means that it can read data generated with python-blosc, but the opposite is not true (i.e. there is no forward compatibility).

Installing

Blosc now offers Python wheels for the main operating systems (Windows, macOS and Linux) and platforms. You can install binary packages from PyPI using pip:

pip install blosc2

Documentation

The documentation is here:

https://blosc.org/python-blosc2/python-blosc2.html

Also, some examples are available on:

https://github.com/Blosc/python-blosc2/tree/main/examples

Building

python-blosc2 bundles the C-Blosc2 sources and can be built with:

git clone https://github.com/Blosc/python-blosc2/
cd python-blosc2
git submodule update --init --recursive
python -m pip install -r requirements.txt
python setup.py build_ext --inplace

That’s all. You can now proceed to the Testing section.

Testing

After compiling, you can quickly check that the package is sane by running the test suite with pytest:

python -m pip install -r requirements-tests.txt
python -m pytest  # add -v for verbose mode

Benchmarking

If curious, you may want to run a small benchmark that compares a plain NumPy array copy against compression through different compressors in your Blosc build:

PYTHONPATH=. python bench/pack_compress.py

Just to whet your appetite, here are some speed figures for an Intel box (i9-10940X CPU @ 3.30GHz, 14 cores) running Ubuntu 22.04. In particular, see how performance for pack_array2/unpack_array2 has improved vs the previous version (labeled as pack_array/unpack_array):

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
python-blosc2 version: 0.3.3.dev0
Blosc version: 2.4.2.dev ($Date:: 2022-09-16 #$)
Compressors available: ['blosclz', 'lz4', 'lz4hc', 'zlib', 'zstd']
Compressor library versions:
  BLOSCLZ: 2.5.1
  LZ4: 1.9.4
  LZ4HC: 1.9.4
  ZLIB: 1.2.11.zlib-ng
  ZSTD: 1.5.2
Python version: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21)
[GCC 10.3.0]
Platform: Linux-5.15.0-41-generic-x86_64 (#44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022)
Linux dist: Ubuntu 22.04 LTS
Processor: x86_64
Byte-ordering: little
Detected cores: 14.0
Number of threads to use by default: 8
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Creating NumPy arrays with 10**8 int64/float64 elements:
  Time for copying array with np.copy:                  0.394 s (3.79 GB/s))


*** the arange linear distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress:         0.051/0.101 s (29.08/14.80 GB/s))   cr: 444.3x
  Time for pack_array/unpack_array:     0.600/0.764 s (2.49/1.95 GB/s))     cr: 442.3x
  Time for pack_array2/unpack_array2:   0.059/0.158 s (25.28/9.44 GB/s))    cr: 444.2x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress:         0.059/0.116 s (25.07/12.82 GB/s))   cr: 279.2x
  Time for pack_array/unpack_array:     0.615/0.758 s (2.42/1.97 GB/s))     cr: 277.9x
  Time for pack_array2/unpack_array2:   0.058/0.160 s (25.52/9.31 GB/s))    cr: 279.2x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress:         0.193/0.085 s (7.71/17.45 GB/s))    cr: 155.9x
  Time for pack_array/unpack_array:     0.786/0.754 s (1.89/1.98 GB/s))     cr: 155.4x
  Time for pack_array2/unpack_array2:   0.218/0.165 s (6.84/9.02 GB/s))     cr: 155.9x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress:         0.250/0.141 s (5.96/10.55 GB/s))    cr: 273.8x
  Time for pack_array/unpack_array:     0.799/0.845 s (1.87/1.76 GB/s))     cr: 273.2x
  Time for pack_array2/unpack_array2:   0.261/0.243 s (5.71/6.13 GB/s))     cr: 273.8x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress:         0.189/0.079 s (7.89/18.92 GB/s))    cr: 644.9x
  Time for pack_array/unpack_array:     0.725/0.770 s (2.06/1.94 GB/s))     cr: 630.9x
  Time for pack_array2/unpack_array2:   0.206/0.143 s (7.25/10.39 GB/s))    cr: 644.8x

*** the linspace linear distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress:         0.091/0.113 s (16.34/13.21 GB/s))   cr:  50.1x
  Time for pack_array/unpack_array:     0.623/0.751 s (2.39/1.98 GB/s))     cr:  50.0x
  Time for pack_array2/unpack_array2:   0.124/0.163 s (11.98/9.12 GB/s))    cr:  50.1x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress:         0.077/0.114 s (19.33/13.12 GB/s))   cr:  55.7x
  Time for pack_array/unpack_array:     0.624/0.740 s (2.39/2.01 GB/s))     cr:  55.8x
  Time for pack_array2/unpack_array2:   0.098/0.190 s (15.19/7.83 GB/s))    cr:  55.7x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress:         0.352/0.075 s (4.23/19.98 GB/s))    cr:  53.6x
  Time for pack_array/unpack_array:     0.918/0.781 s (1.62/1.91 GB/s))     cr:  53.6x
  Time for pack_array2/unpack_array2:   0.389/0.139 s (3.83/10.72 GB/s))    cr:  53.6x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress:         0.395/0.148 s (3.77/10.08 GB/s))    cr:  50.4x
  Time for pack_array/unpack_array:     0.940/0.824 s (1.59/1.81 GB/s))     cr:  50.4x
  Time for pack_array2/unpack_array2:   0.433/0.252 s (3.44/5.92 GB/s))     cr:  50.4x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress:         0.402/0.098 s (3.71/15.22 GB/s))    cr:  74.7x
  Time for pack_array/unpack_array:     0.949/0.782 s (1.57/1.91 GB/s))     cr:  74.7x
  Time for pack_array2/unpack_array2:   0.426/0.175 s (3.50/8.49 GB/s))     cr:  74.7x

*** the random distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress:         0.240/0.119 s (6.22/12.48 GB/s))    cr:   4.0x
  Time for pack_array/unpack_array:     0.794/0.767 s (1.88/1.94 GB/s))     cr:   4.0x
  Time for pack_array2/unpack_array2:   0.578/0.162 s (2.58/9.20 GB/s))     cr:   4.0x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress:         0.250/0.114 s (5.97/13.11 GB/s))    cr:   4.0x
  Time for pack_array/unpack_array:     0.794/0.767 s (1.88/1.94 GB/s))     cr:   4.0x
  Time for pack_array2/unpack_array2:   0.590/0.161 s (2.53/9.24 GB/s))     cr:   4.0x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress:         1.102/0.088 s (1.35/17.01 GB/s))    cr:   4.0x
  Time for pack_array/unpack_array:     1.690/0.758 s (0.88/1.97 GB/s))     cr:   4.0x
  Time for pack_array2/unpack_array2:   1.445/0.178 s (1.03/8.38 GB/s))     cr:   4.0x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress:         1.258/0.210 s (1.18/7.11 GB/s))     cr:   4.7x
  Time for pack_array/unpack_array:     1.822/0.898 s (0.82/1.66 GB/s))     cr:   4.7x
  Time for pack_array2/unpack_array2:   1.549/0.355 s (0.96/4.20 GB/s))     cr:   4.7x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress:         1.653/0.098 s (0.90/15.21 GB/s))    cr:   4.4x
  Time for pack_array/unpack_array:     2.206/0.796 s (0.68/1.87 GB/s))     cr:   4.4x
  Time for pack_array2/unpack_array2:   2.077/0.179 s (0.72/8.30 GB/s))     cr:   4.4x

As can be seen, it is perfectly possible for python-blosc2 to go faster than a plain memcpy(). But more interestingly, you can easily choose the codecs and filters that best suit your datasets, and persist and transmit them faster while using less memory.

Start using compression in your data workflows and feel the experience of doing more with less.

License

The software is licensed under a 3-Clause BSD license. A copy of the python-blosc2 license can be found in LICENSE. A copy of all licenses can be found in LICENSES/.

Mailing list

Discussion about this module is welcome in the Blosc list:

blosc@googlegroups.com

http://groups.google.es/group/blosc

Twitter

Please follow @Blosc2 to get informed about the latest developments.


Enjoy data!
