Python wrapper for the C-Blosc2 library.

A Python wrapper for the extremely fast Blosc2 compression library

Author:

The Blosc development team

Contact:

blosc@blosc.org

Github:

https://github.com/Blosc/python-blosc2

Code of Conduct:

Contributor Covenant

What it is

Blosc (http://blosc.org) is a high-performance compressor optimized for binary data. It is designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() call.

Blosc works well for compressing numerical arrays that contain data with relatively low entropy, such as sparse data, time series, grids with regularly spaced values, etc.

python-blosc2 is a Python package that wraps C-Blosc2, the newest version of the Blosc compressor. python-blosc2 currently reproduces the API of python-blosc, so the former can be used as a drop-in replacement for the latter. However, there are a few exceptions to full compatibility, which are listed here: https://github.com/Blosc/python-blosc2/blob/main/RELEASE_NOTES.md#changes-from-python-blosc-to-python-blosc2

In addition, python-blosc2 aims to leverage the new C-Blosc2 API to support super-chunks, serialization and all the other features introduced in C-Blosc2. This is a work in progress and will be done incrementally in future releases.

Note: python-blosc2 is meant to be backward compatible with python-blosc data. That means that it can read data generated with python-blosc, but the opposite is not true (i.e. there is no forward compatibility).

Installing

Blosc now offers Python wheels for the main operating systems (Windows, macOS and Linux) and platforms. You can install binary packages from PyPI using pip:

pip install blosc2
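After installing, one quick sanity check is to ask Python's packaging metadata which version was picked up. The sketch below uses only the standard library (Python 3.8 or later); the helper name `installed_version` is hypothetical and not part of blosc2.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("blosc2")
if version is None:
    print("blosc2 is not installed in this environment")
else:
    print(f"blosc2 {version} is installed")
```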

Documentation

The documentation is here:

https://blosc.org/python-blosc2/python-blosc2.html

Also, some examples are available on:

https://github.com/Blosc/python-blosc2/tree/main/examples

Building

python-blosc2 bundles the C-Blosc2 sources and can be built with:

git clone https://github.com/Blosc/python-blosc2/
cd python-blosc2
git submodule update --init --recursive
python -m pip install -r requirements.txt
python setup.py build_ext --inplace

That’s all. You can now proceed to the Testing section.

Testing

After compiling, you can quickly check that the package is sane by running the test suite with pytest:

python -m pip install -r requirements-tests.txt
python -m pytest  # add -v for verbose mode

Benchmarking

If curious, you may want to run a small benchmark that compares a plain NumPy array copy against compression through different compressors in your Blosc build:

PYTHONPATH=. python bench/pack_compress.py

Just to whet your appetite, here are some speed figures for an Intel box (i9-10940X CPU @ 3.30GHz, 14 cores) running Ubuntu 22.04. In particular, see how performance for pack_array2/unpack_array2 has improved vs the previous version (labeled as pack_array/unpack_array):

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
python-blosc2 version: 0.3.3.dev0
Blosc version: 2.4.2.dev ($Date:: 2022-09-16 #$)
Compressors available: ['blosclz', 'lz4', 'lz4hc', 'zlib', 'zstd']
Compressor library versions:
  BLOSCLZ: 2.5.1
  LZ4: 1.9.4
  LZ4HC: 1.9.4
  ZLIB: 1.2.11.zlib-ng
  ZSTD: 1.5.2
Python version: 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:56:21)
[GCC 10.3.0]
Platform: Linux-5.15.0-41-generic-x86_64 (#44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022)
Linux dist: Ubuntu 22.04 LTS
Processor: x86_64
Byte-ordering: little
Detected cores: 14.0
Number of threads to use by default: 8
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Creating NumPy arrays with 10**8 int64/float64 elements:
  Time for copying array with np.copy:                  0.394 s (3.79 GB/s))


*** the arange linear distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress:         0.051/0.101 s (29.08/14.80 GB/s))   cr: 444.3x
  Time for pack_array/unpack_array:     0.600/0.764 s (2.49/1.95 GB/s))     cr: 442.3x
  Time for pack_array2/unpack_array2:   0.059/0.158 s (25.28/9.44 GB/s))    cr: 444.2x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress:         0.059/0.116 s (25.07/12.82 GB/s))   cr: 279.2x
  Time for pack_array/unpack_array:     0.615/0.758 s (2.42/1.97 GB/s))     cr: 277.9x
  Time for pack_array2/unpack_array2:   0.058/0.160 s (25.52/9.31 GB/s))    cr: 279.2x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress:         0.193/0.085 s (7.71/17.45 GB/s))    cr: 155.9x
  Time for pack_array/unpack_array:     0.786/0.754 s (1.89/1.98 GB/s))     cr: 155.4x
  Time for pack_array2/unpack_array2:   0.218/0.165 s (6.84/9.02 GB/s))     cr: 155.9x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress:         0.250/0.141 s (5.96/10.55 GB/s))    cr: 273.8x
  Time for pack_array/unpack_array:     0.799/0.845 s (1.87/1.76 GB/s))     cr: 273.2x
  Time for pack_array2/unpack_array2:   0.261/0.243 s (5.71/6.13 GB/s))     cr: 273.8x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress:         0.189/0.079 s (7.89/18.92 GB/s))    cr: 644.9x
  Time for pack_array/unpack_array:     0.725/0.770 s (2.06/1.94 GB/s))     cr: 630.9x
  Time for pack_array2/unpack_array2:   0.206/0.143 s (7.25/10.39 GB/s))    cr: 644.8x

*** the linspace linear distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress:         0.091/0.113 s (16.34/13.21 GB/s))   cr:  50.1x
  Time for pack_array/unpack_array:     0.623/0.751 s (2.39/1.98 GB/s))     cr:  50.0x
  Time for pack_array2/unpack_array2:   0.124/0.163 s (11.98/9.12 GB/s))    cr:  50.1x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress:         0.077/0.114 s (19.33/13.12 GB/s))   cr:  55.7x
  Time for pack_array/unpack_array:     0.624/0.740 s (2.39/2.01 GB/s))     cr:  55.8x
  Time for pack_array2/unpack_array2:   0.098/0.190 s (15.19/7.83 GB/s))    cr:  55.7x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress:         0.352/0.075 s (4.23/19.98 GB/s))    cr:  53.6x
  Time for pack_array/unpack_array:     0.918/0.781 s (1.62/1.91 GB/s))     cr:  53.6x
  Time for pack_array2/unpack_array2:   0.389/0.139 s (3.83/10.72 GB/s))    cr:  53.6x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress:         0.395/0.148 s (3.77/10.08 GB/s))    cr:  50.4x
  Time for pack_array/unpack_array:     0.940/0.824 s (1.59/1.81 GB/s))     cr:  50.4x
  Time for pack_array2/unpack_array2:   0.433/0.252 s (3.44/5.92 GB/s))     cr:  50.4x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress:         0.402/0.098 s (3.71/15.22 GB/s))    cr:  74.7x
  Time for pack_array/unpack_array:     0.949/0.782 s (1.57/1.91 GB/s))     cr:  74.7x
  Time for pack_array2/unpack_array2:   0.426/0.175 s (3.50/8.49 GB/s))     cr:  74.7x

*** the random distribution ***
Using *** Codec.BLOSCLZ *** compressor:
  Time for compress/decompress:         0.240/0.119 s (6.22/12.48 GB/s))    cr:   4.0x
  Time for pack_array/unpack_array:     0.794/0.767 s (1.88/1.94 GB/s))     cr:   4.0x
  Time for pack_array2/unpack_array2:   0.578/0.162 s (2.58/9.20 GB/s))     cr:   4.0x
Using *** Codec.LZ4 *** compressor:
  Time for compress/decompress:         0.250/0.114 s (5.97/13.11 GB/s))    cr:   4.0x
  Time for pack_array/unpack_array:     0.794/0.767 s (1.88/1.94 GB/s))     cr:   4.0x
  Time for pack_array2/unpack_array2:   0.590/0.161 s (2.53/9.24 GB/s))     cr:   4.0x
Using *** Codec.LZ4HC *** compressor:
  Time for compress/decompress:         1.102/0.088 s (1.35/17.01 GB/s))    cr:   4.0x
  Time for pack_array/unpack_array:     1.690/0.758 s (0.88/1.97 GB/s))     cr:   4.0x
  Time for pack_array2/unpack_array2:   1.445/0.178 s (1.03/8.38 GB/s))     cr:   4.0x
Using *** Codec.ZLIB *** compressor:
  Time for compress/decompress:         1.258/0.210 s (1.18/7.11 GB/s))     cr:   4.7x
  Time for pack_array/unpack_array:     1.822/0.898 s (0.82/1.66 GB/s))     cr:   4.7x
  Time for pack_array2/unpack_array2:   1.549/0.355 s (0.96/4.20 GB/s))     cr:   4.7x
Using *** Codec.ZSTD *** compressor:
  Time for compress/decompress:         1.653/0.098 s (0.90/15.21 GB/s))    cr:   4.4x
  Time for pack_array/unpack_array:     2.206/0.796 s (0.68/1.87 GB/s))     cr:   4.4x
  Time for pack_array2/unpack_array2:   2.077/0.179 s (0.72/8.30 GB/s))     cr:   4.4x
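The improvement of pack_array2/unpack_array2 over pack_array/unpack_array can be read off the output above as a ratio of timings. The sketch below does this for the two BLOSCLZ lines of the arange case, copied verbatim from the output; the regular expression and the `parse_times` helper are illustrative assumptions about the line layout, not part of the benchmark.

```python
import re

# Two lines copied from the BLOSCLZ / arange section of the output above.
SAMPLE = """\
  Time for pack_array/unpack_array:     0.600/0.764 s (2.49/1.95 GB/s))     cr: 442.3x
  Time for pack_array2/unpack_array2:   0.059/0.158 s (25.28/9.44 GB/s))    cr: 444.2x
"""

# Captures the two operation names and their two timings in seconds.
LINE = re.compile(r"Time for (\w+)/(\w+):\s+([\d.]+)/([\d.]+) s")


def parse_times(text):
    """Map each operation name to its reported time in seconds."""
    times = {}
    for match in LINE.finditer(text):
        times[match.group(1)] = float(match.group(3))
        times[match.group(2)] = float(match.group(4))
    return times


times = parse_times(SAMPLE)
pack_speedup = times["pack_array"] / times["pack_array2"]
unpack_speedup = times["unpack_array"] / times["unpack_array2"]
print(f"pack: {pack_speedup:.1f}x faster, unpack: {unpack_speedup:.1f}x faster")
# prints "pack: 10.2x faster, unpack: 4.8x faster"
```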

As can be seen, it is perfectly possible for python-blosc2 to go faster than a plain memcpy(). More interestingly, you can easily choose the codecs and filters that best suit your datasets, and persist and transmit them faster while using less memory.

Start using compression in your data workflows and feel the experience of doing more with less.

License

The software is licensed under a 3-Clause BSD license. A copy of the python-blosc2 license can be found in LICENSE. A copy of all licenses can be found in LICENSES/.

Mailing list

Discussion about this module is welcome in the Blosc list:

blosc@googlegroups.com

http://groups.google.es/group/blosc

Twitter

Please follow @Blosc2 to stay informed about the latest developments.


Enjoy data!

Download files

Download the file for your platform.

Source Distribution

blosc2-0.4.0.tar.gz (2.5 MB view details)

Uploaded Source

Built Distributions

blosc2-0.4.0-cp310-cp310-win_amd64.whl (1.9 MB view details)

Uploaded CPython 3.10 Windows x86-64

blosc2-0.4.0-cp310-cp310-musllinux_1_1_x86_64.whl (4.0 MB view details)

Uploaded CPython 3.10 musllinux: musl 1.1+ x86-64

blosc2-0.4.0-cp310-cp310-musllinux_1_1_i686.whl (4.0 MB view details)

Uploaded CPython 3.10 musllinux: musl 1.1+ i686

blosc2-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.9 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

blosc2-0.4.0-cp310-cp310-macosx_10_9_x86_64.whl (3.9 MB view details)

Uploaded CPython 3.10 macOS 10.9+ x86-64

blosc2-0.4.0-cp39-cp39-win_amd64.whl (1.9 MB view details)

Uploaded CPython 3.9 Windows x86-64

blosc2-0.4.0-cp39-cp39-musllinux_1_1_x86_64.whl (4.0 MB view details)

Uploaded CPython 3.9 musllinux: musl 1.1+ x86-64

blosc2-0.4.0-cp39-cp39-musllinux_1_1_i686.whl (4.0 MB view details)

Uploaded CPython 3.9 musllinux: musl 1.1+ i686

blosc2-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.9 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ x86-64

blosc2-0.4.0-cp39-cp39-macosx_10_9_x86_64.whl (3.9 MB view details)

Uploaded CPython 3.9 macOS 10.9+ x86-64

blosc2-0.4.0-cp38-cp38-win_amd64.whl (1.9 MB view details)

Uploaded CPython 3.8 Windows x86-64

blosc2-0.4.0-cp38-cp38-musllinux_1_1_x86_64.whl (4.0 MB view details)

Uploaded CPython 3.8 musllinux: musl 1.1+ x86-64

blosc2-0.4.0-cp38-cp38-musllinux_1_1_i686.whl (4.0 MB view details)

Uploaded CPython 3.8 musllinux: musl 1.1+ i686

blosc2-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.9 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ x86-64

blosc2-0.4.0-cp38-cp38-macosx_10_9_x86_64.whl (3.9 MB view details)

Uploaded CPython 3.8 macOS 10.9+ x86-64

File details

Details for the file blosc2-0.4.0.tar.gz.

File metadata

  • Download URL: blosc2-0.4.0.tar.gz
  • Upload date:
  • Size: 2.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.14

File hashes

Hashes for blosc2-0.4.0.tar.gz
Algorithm Hash digest
SHA256 c6dae94d9f232e14e9650641b2a8cc55ec63e5003f703e16b12ede02bdf24ea0
MD5 04daa71d8c12b64328a3a093bf0db642
BLAKE2b-256 222218c546239a23909af190cdeb334b6945fed22d8c1a09567dca1221096c5e

See more details on using hashes here.
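A downloaded file can be checked against the SHA256 digest listed above with the standard library alone. In the sketch below, the function names and the local file path in the usage note are hypothetical; only the digest is taken from this page.

```python
import hashlib

# SHA256 digest listed above for blosc2-0.4.0.tar.gz.
EXPECTED_SHA256 = "c6dae94d9f232e14e9650641b2a8cc55ec63e5003f703e16b12ede02bdf24ea0"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected one."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("blosc2-0.4.0.tar.gz")` should return True for an intact download saved under that name in the current directory.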

File details

Details for the file blosc2-0.4.0-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: blosc2-0.4.0-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.14

File hashes

Hashes for blosc2-0.4.0-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 84038aa2a107e9696522fe00161ce4fc81682bc18aad26e5247c36f9ce6f76c4
MD5 5c86fc85f5fa2a192fc32189168fb2dc
BLAKE2b-256 f1a8aa0abbbb9ed572fdff99af57b198d131e17da468e442e41c87953cbe9736

File details

Details for the file blosc2-0.4.0-cp310-cp310-musllinux_1_1_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp310-cp310-musllinux_1_1_x86_64.whl
Algorithm Hash digest
SHA256 62ebfcdb1fd3638a0337a9fe050fa644a815d726b6b313eef6025bb6ba858f83
MD5 fe50a742a12ed005c4852c408771cc0d
BLAKE2b-256 4afc06c27d0a7116c6fdc44415c6842002e2cadfa501671ac93726d514a96ca2

File details

Details for the file blosc2-0.4.0-cp310-cp310-musllinux_1_1_i686.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp310-cp310-musllinux_1_1_i686.whl
Algorithm Hash digest
SHA256 e069eb8c23d04adfce27def84b09f7e3a1b44d1002c9048292d16cddd548f6f6
MD5 e37524f88737fb49a602553cf891a80f
BLAKE2b-256 9c2c6c7547ccaf44fa57316fedc3e90ff46c8594a987d70d1196475fd82a2080

File details

Details for the file blosc2-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 61914ae241bdaaae03114230650ee1af010b86554dd8752fb399fdc16554a772
MD5 830af10b77d04718e8c655d3fd1d8b14
BLAKE2b-256 5ce857ee844be910b2731741317f725104ce3234cb08bbf068fc790891233dc4

File details

Details for the file blosc2-0.4.0-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 c1753564f21db425ff65407b6d5b8ba2d2896a9c28df83923e8dc214487efa2f
MD5 47affb163fb4be8efcad18e9547677a9
BLAKE2b-256 fbf617d24c568585cda8892f8451d2c1a99a17faa705a4e218fe7f964d3600f4

File details

Details for the file blosc2-0.4.0-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: blosc2-0.4.0-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.14

File hashes

Hashes for blosc2-0.4.0-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 bb7f23371475d09225bf40b3b61e674cf4ec514d6e5b5b369118803b88778505
MD5 4de647f10ec4774cdc41169e7707ef7c
BLAKE2b-256 9a424fe29430459558629ac97903dd1cc18a0b6915633cab367871c966dee764

File details

Details for the file blosc2-0.4.0-cp39-cp39-musllinux_1_1_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp39-cp39-musllinux_1_1_x86_64.whl
Algorithm Hash digest
SHA256 b356431527ff486f8d0c6e7e3961745c139e1c38eb37c00d167d4916c6997f40
MD5 8fc8bcbeeeb90c5e19b1f4b3315689b5
BLAKE2b-256 1774e38aec8705d6ef2f546780617468755c944e5a1b92d20ba3ecccd6397526

File details

Details for the file blosc2-0.4.0-cp39-cp39-musllinux_1_1_i686.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp39-cp39-musllinux_1_1_i686.whl
Algorithm Hash digest
SHA256 65283be07c3823d1b8897ae64ea20f31a92fdbe1f63726a09539c4562af93925
MD5 cb8ce29b731ab94493d68b9a7704bfdf
BLAKE2b-256 c668c2dad5103e661a183c363bb9f678ec70967fefb375603adbdbc779945656

File details

Details for the file blosc2-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 772878d29a1f3e8217d8af90cffdb0eff1f926da89aea3dac685d44b06dd6822
MD5 9179027af11cee1d655b6c1cad3b5ca6
BLAKE2b-256 fc73f7fdb08a940a6c07100bff7f040b73bff2d32f0629d23c069de84c89fbc2

File details

Details for the file blosc2-0.4.0-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 b14cce0d96ea7b95f4b1beb6cf0c8468a2d31956a2d7f7fdc33b106611aacda9
MD5 1d6e12741d928465f1cbd4f5fac1fed2
BLAKE2b-256 d73b2df2a40b3d541d4452485cb36e177fbcac158920c5b5424eb7f95a0c6cef

File details

Details for the file blosc2-0.4.0-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: blosc2-0.4.0-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.14

File hashes

Hashes for blosc2-0.4.0-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 e6784692b55e7837dd835ad7c5b4a86d08e6e1e3bea47866c095bd466f8cbfc7
MD5 b6cd7f8be9a58f42ed015bc95f369002
BLAKE2b-256 fbc00ab0a1fe3265bb9975ac5d33f4a94e7ae508e42ca164895732112982ee4b

File details

Details for the file blosc2-0.4.0-cp38-cp38-musllinux_1_1_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp38-cp38-musllinux_1_1_x86_64.whl
Algorithm Hash digest
SHA256 e7fd764af49cdfff976854a02462668abbace7d226b1dd1b0f52d777e9961cee
MD5 22293b0e20e86fb5f293b6b5aa9a7f1e
BLAKE2b-256 edd088a42134ffe92b14cbef02dc150316ad285d1ce51936572bd48a8edf27d1

File details

Details for the file blosc2-0.4.0-cp38-cp38-musllinux_1_1_i686.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp38-cp38-musllinux_1_1_i686.whl
Algorithm Hash digest
SHA256 51967e4caf155a4945dd016cf31eae312c6a49e5ce55e2917f75c20087c60650
MD5 667f20eb9dd713dafb8b572e19b6d4de
BLAKE2b-256 69def60142820b90976c547f0c27876fe66b6e8bd055d228cedac303b6eb21bf

File details

Details for the file blosc2-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c888e156c79dd995ba1c6bb6832fb66557c50d94d89f0d7d41fb004af0dbd3ec
MD5 4f15886fa51041e77bfcd07940af7b55
BLAKE2b-256 03cee030739672a02add97b9ea43861c34d261a47f92ffd1f7bc3fba161a2c86

File details

Details for the file blosc2-0.4.0-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

File hashes

Hashes for blosc2-0.4.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 95cf09a73c0ee8c3696659e71361d9e6816b8d80e258534f3184cab96644e040
MD5 8416cf868702cd0a2d7097e08d7c0667
BLAKE2b-256 9799a540d1fc672c55124767171f58553cd60531257b5f4d1e0510a0f363f96c
