simdjson bindings for python

pysimdjson

Python bindings for the simdjson project, a SIMD-accelerated JSON parser.

Bindings are currently tested on OS X, Linux, and Windows for Python 3.4+.

Installation

If binary wheels are available for your platform, you can install from pip with no further requirements:

pip install pysimdjson

The project is self-contained, and has no additional dependencies. If binary wheels are not available for your platform, or you want to build from source for the best performance, you'll need a C++11-capable compiler to compile the sources:

pip install 'pysimdjson[dev]' --no-binary :all:

Development and Testing

This project comes with a full test suite. To install development and testing dependencies, use:

pip install -e ".[dev]"

To also install 3rd party JSON libraries used for running benchmarks, use:

pip install -e ".[benchmark]"

To run the tests, just type pytest. To also run the benchmarks, use pytest --runslow.

To test properly on Windows, you need both a recent version of Visual Studio (VS) and VS2015, patch 3. Older versions of CPython required portable C/C++ extensions to be built with the same version of VS as the interpreter. Use the Developer Command Prompt to switch between versions easily.

How It Works

This project uses pybind11 to generate the low-level bindings on top of the simdjson project. You can use it just like the built-in json module, or use the simdjson-specific API for much better performance.

import simdjson
doc = simdjson.loads('{"hello": "world"}')
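
Because loads() mirrors the built-in json module, one common pattern is to import pysimdjson when it is installed and fall back to the standard library otherwise. The sketch below assumes only that simdjson.loads accepts the same string input as json.loads, as in the example above; the alias json_impl is a name chosen for illustration.

```python
import json

try:
    import simdjson as json_impl  # SIMD-accelerated parser, if installed
except ImportError:
    json_impl = json  # standard-library fallback with the same loads() call

doc = json_impl.loads('{"hello": "world"}')
print(doc["hello"])
```

Code written this way keeps working on platforms where no wheel or compiler is available, at the cost of the speedup.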

Making things faster

pysimdjson provides an API compatible with the built-in json module for convenience, and this API is pretty fast (beating or tying all other Python JSON libraries). However, it also provides a simdjson-specific API that can perform significantly better.

Don't load the entire document

95% of the time spent loading a JSON document into Python is spent in the creation of Python objects, not the actual parsing of the document. You can avoid all of this overhead by ignoring parts of the document you don't want.

pysimdjson supports this in two ways: JSON pointers via at(), or proxies for objects and lists.

import simdjson
parser = simdjson.Parser()
doc = parser.parse(b'{"res": [{"name": "first"}, {"name": "second"}]}')

For our sample above, we really just want the second entry in res; we don't care about anything else. We can do this two ways:

assert doc['res'][1]['name'] == 'second' # True
assert doc.at('res/1/name') == 'second' # True

Both of these approaches will be much faster than using load() or loads(), since they avoid loading the parts of the document we don't care about.
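
To make the path syntax in the at() call concrete, here is a plain-Python sketch of how a slash-separated path such as res/1/name maps onto nested subscripts. The helper resolve_path is hypothetical and is not part of pysimdjson. It runs on ordinary dicts and lists from the standard json module, so it illustrates only the lookup semantics, not the performance benefit, and it ignores escaping rules for keys that themselves contain a slash.

```python
import json


def resolve_path(node, path):
    """Walk a slash-separated path through nested dicts and lists.

    Hypothetical helper for illustration; not pysimdjson's implementation.
    """
    for part in path.split("/"):
        if isinstance(node, list):
            node = node[int(part)]  # list segments are integer indexes
        else:
            node = node[part]       # object segments are keys
    return node


data = json.loads('{"res": [{"name": "first"}, {"name": "second"}]}')

# The path form and the subscript form select the same value.
assert resolve_path(data, "res/1/name") == data["res"][1]["name"] == "second"
```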

Re-use the parser

One of the easiest performance gains if you're working on many documents is to re-use the parser.

import simdjson
parser = simdjson.Parser()

for i in range(0, 100):
    doc = parser.parse(b'{"a": "b"}')

This will drastically reduce the number of allocations being made, as it will reuse the existing buffer when possible. If it's too small, it'll grow to fit.
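
A slightly fuller sketch of the same idea: one parser handles a batch of documents, and each loop iteration copies the value it needs into a plain Python object before the next parse. The sample payloads and the json.loads fallback are assumptions added so the sketch runs even where pysimdjson is not installed. Extracting values immediately is a cautious choice based on the assumption that a previously returned document should not be relied on once the parser's buffer is reused.

```python
import json

try:
    import simdjson
    parser = simdjson.Parser()  # created once, reused for every document
    parse = parser.parse
except ImportError:
    parse = json.loads  # fallback so the sketch runs without pysimdjson

# Hypothetical sample documents.
payloads = [b'{"id": 1}', b'{"id": 2}', b'{"id": 3}']

ids = []
for raw in payloads:
    # Pull out the plain Python value right away instead of keeping the
    # document around, since the next parse() reuses the same buffer.
    ids.append(parse(raw)["id"])

print(ids)
```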

Performance Considerations

The actual parsing of a document is a small fraction (~5%) of the total time spent bringing a JSON document into CPython. However, even in the case of bringing the entire document into Python, pysimdjson will almost always be faster than or equivalent to other high-speed Python libraries.

There are two things to keep in mind when trying to get the best performance:

  1. Do you really need the entire document? If you have a JSON document with thousands of keys but just need to check if the "published" key is True, use the JSON pointer interface to pull only a single field into Python.
  2. There is significant overhead in calling a C++ function from Python. Minimizing the number of function calls can offer significant speedups in some use cases.

Download files

Source Distribution

pysimdjson-2.0.8.tar.gz (204.8 kB view details)

Uploaded Source

Built Distributions

pysimdjson-2.0.8-pp36-pypy36_pp73-macosx_10_9_x86_64.whl (173.1 kB view details)

Uploaded PyPy macOS 10.9+ x86-64

pysimdjson-2.0.8-cp38-cp38-win_amd64.whl (140.6 kB view details)

Uploaded CPython 3.8 Windows x86-64

pysimdjson-2.0.8-cp38-cp38-macosx_10_14_x86_64.whl (194.1 kB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

pysimdjson-2.0.8-cp37-cp37m-win_amd64.whl (140.2 kB view details)

Uploaded CPython 3.7m Windows x86-64

pysimdjson-2.0.8-cp37-cp37m-macosx_10_14_x86_64.whl (190.0 kB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

pysimdjson-2.0.8-cp36-cp36m-macosx_10_14_x86_64.whl (190.0 kB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

pysimdjson-2.0.8-cp35-cp35m-macosx_10_14_x86_64.whl (190.0 kB view details)

Uploaded CPython 3.5m macOS 10.14+ x86-64

File details

Details for the file pysimdjson-2.0.8.tar.gz.

File metadata

  • Download URL: pysimdjson-2.0.8.tar.gz
  • Upload date:
  • Size: 204.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.6.11

File hashes

Hashes for pysimdjson-2.0.8.tar.gz
Algorithm Hash digest
SHA256 87f4dcab837d83db5b45393b30501e9e86a7f1595524631892e242a5ce9c9499
MD5 35ed82dd8df8240e0f3c4907f150b0de
BLAKE2b-256 18d0e414836ecfb3325cef42fbbb76195267219d2d26124cb1b13edee614b010

See more details on using hashes here.

File details

Details for the file pysimdjson-2.0.8-pp36-pypy36_pp73-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: pysimdjson-2.0.8-pp36-pypy36_pp73-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 173.1 kB
  • Tags: PyPy, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 PyPy/7.3.1

File hashes

Hashes for pysimdjson-2.0.8-pp36-pypy36_pp73-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 892faa00058ba0b9575307e7b41501c2d255abf37eef5aea4a4e878d16d18e58
MD5 42e3e5ed27d16c06bc65c44dd97aea76
BLAKE2b-256 6ba2c93cd2e2be029088e4c33a92c838095ac9e1d33882042376397a78a232f5

File details

Details for the file pysimdjson-2.0.8-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: pysimdjson-2.0.8-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 140.6 kB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.8.3

File hashes

Hashes for pysimdjson-2.0.8-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 1f98f57141cf9030da24b54616a21a158773b975f05b17e40a4db76c12137ba6
MD5 7718dbcfdc6744fb80cf6977827c2d70
BLAKE2b-256 7048158adf95d270dc4d8f992dfc0eeff064cd9df529998abfc99439ab133440

File details

Details for the file pysimdjson-2.0.8-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: pysimdjson-2.0.8-cp38-cp38-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 194.1 kB
  • Tags: CPython 3.8, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.8.3

File hashes

Hashes for pysimdjson-2.0.8-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 4a621945026445ade0a07a955b515de0ae9e61a13f100a579143a9e7a77ac173
MD5 1548936472d0c2dc6432a49c7f0355a3
BLAKE2b-256 5c56b79bf7eaf7dba78d882a8c86a12e9c0d79e7c5e7a2cff5db64b375598ada

File details

Details for the file pysimdjson-2.0.8-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: pysimdjson-2.0.8-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 140.2 kB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.7.8

File hashes

Hashes for pysimdjson-2.0.8-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 9e10bd0cf6ed88cea7fecf7c94f93df087afbdb4a70bd3a37becae55bbabf234
MD5 6a272d34477fa72fd481f8647d553be2
BLAKE2b-256 f7306eb997b23dc8fb1790f6f07841c2740a22e31f0258613e2092e08f6c73de

File details

Details for the file pysimdjson-2.0.8-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: pysimdjson-2.0.8-cp37-cp37m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 190.0 kB
  • Tags: CPython 3.7m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.7.8

File hashes

Hashes for pysimdjson-2.0.8-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 9f9b361fa65f6873c1470221daae1db0db0af7ef5e0f2abe7a7e8ff82033c8fa
MD5 970aeb1925555728f020e43020e51f8b
BLAKE2b-256 b0fb41c2ebb2186d0e9786106606609f541e448921881bd11d1530d9cc045c9d

File details

Details for the file pysimdjson-2.0.8-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: pysimdjson-2.0.8-cp36-cp36m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 190.0 kB
  • Tags: CPython 3.6m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.6.11

File hashes

Hashes for pysimdjson-2.0.8-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 1039f67cec93ce61ce98770c5002c24082cff5ac893b2d5993609013b6e0217c
MD5 e945399ceac531bab54444d9ccbd3220
BLAKE2b-256 f653cca56c451ea54fa0865a3b16fbd25c3fbdfdf29eb820819ca3fc0fec302b

File details

Details for the file pysimdjson-2.0.8-cp35-cp35m-macosx_10_14_x86_64.whl.

File metadata

  • Download URL: pysimdjson-2.0.8-cp35-cp35m-macosx_10_14_x86_64.whl
  • Upload date:
  • Size: 190.0 kB
  • Tags: CPython 3.5m, macOS 10.14+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.15.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.0 CPython/3.5.9

File hashes

Hashes for pysimdjson-2.0.8-cp35-cp35m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 03eeaf58b92988a7768f225af14e04a6f0d7604792fb82be5e1f06d1b190e584
MD5 888ddb9bc3da3cfe459ccad7a65096dd
BLAKE2b-256 6fa829673ef986a401bb4b63542cf6c81b91014fffbad9c3d36d757bf5d42ba1
