simdjson bindings for python

pysimdjson

Python bindings for the simdjson project, a SIMD-accelerated JSON parser.

Bindings are currently tested on OS X, Linux, and Windows for Python 3.4+.

Installation

If binary wheels are available for your platform, you can install from pip with no further requirements:

pip install pysimdjson

The project is self-contained, and has no additional dependencies. If binary wheels are not available for your platform, or you want to build from source for the best performance, you'll need a C++11-capable compiler to compile the sources:

pip install 'pysimdjson[dev]' --no-binary :all:

Development and Testing

This project comes with a full test suite. To install development and testing dependencies, use:

pip install -e ".[dev]"

To also install 3rd party JSON libraries used for running benchmarks, use:

pip install -e ".[benchmark]"

To run the tests, just type pytest. To also run the benchmarks, use pytest --runslow.

To properly test on Windows, you need both a recent version of Visual Studio (VS) and VS2015, patch 3. Older versions of CPython required portable C/C++ extensions to be built with the same version of VS as the interpreter. Use the Developer Command Prompt to switch between versions.

How It Works

This project uses pybind11 to generate the low-level bindings on top of the simdjson project. You can use it just like the built-in json module, or use the simdjson-specific API for much better performance.

import simdjson
doc = simdjson.loads('{"hello": "world"}')
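Because loads() mirrors the built-in json module, one common pattern is an optional import that falls back to the standard library when pysimdjson is not installed. The following is a minimal sketch under that assumption; the alias json_impl is a name chosen for this example, and the sketch relies only on loads().

```python
try:
    import simdjson as json_impl  # third-party: pip install pysimdjson
except ImportError:
    import json as json_impl  # standard-library fallback

# Only loads() is used here, which both modules provide.
doc = json_impl.loads('{"hello": "world"}')
print(doc["hello"])
```

Code written this way keeps working on platforms where no wheel or compiler is available, at the cost of losing the simdjson-specific API described below.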

Making things faster

pysimdjson provides an API compatible with the built-in json module for convenience, and this API is pretty fast (beating or tying all other Python JSON libraries). However, it also provides a simdjson-specific API that can perform significantly better.

Don't load the entire document

95% of the time spent loading a JSON document into Python is spent in the creation of Python objects, not the actual parsing of the document. You can avoid all of this overhead by ignoring parts of the document you don't want.

pysimdjson supports this in two ways: JSON pointers via at(), or proxies for objects and lists.

import simdjson
parser = simdjson.Parser()
doc = parser.parse(b'{"res": [{"name": "first"}, {"name": "second"}]}')

For our sample above, we really just want the second entry in res; we don't care about anything else. We can do this in two ways:

assert doc['res'][1]['name'] == 'second' # True
assert doc.at('res/1/name') == 'second' # True

Both of these approaches will be much faster than using load() or loads(), since they avoid loading the parts of the document we don't care about.
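To make the path syntax concrete, here is a small pure-Python sketch of how a path such as res/1/name maps onto nested objects and lists. The helper resolve_path is hypothetical and written only for illustration: it is not how pysimdjson implements at(), it ignores any escaping rules, and it works on a fully loaded document, so it has none of the performance benefit described above.

```python
import json

def resolve_path(value, path):
    """Walk `path` (segments separated by '/') through nested dicts and lists."""
    for segment in path.split('/'):
        if isinstance(value, list):
            value = value[int(segment)]  # list segments are integer indexes
        else:
            value = value[segment]  # object segments are keys
    return value

# Standard-library json is used here so the sketch runs without pysimdjson.
doc = json.loads('{"res": [{"name": "first"}, {"name": "second"}]}')
print(resolve_path(doc, 'res/1/name'))  # prints "second"
```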

Re-use the parser

One of the easiest performance gains if you're working on many documents is to re-use the parser.

import simdjson
parser = simdjson.Parser()

for i in range(0, 100):
    doc = parser.parse(b'{"a": "b"}')

This will drastically reduce the number of allocations being made, as it will reuse the existing buffer when possible. If it's too small, it'll grow to fit.
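Whether re-use pays off for a given workload is easiest to confirm by measuring. The sketch below is one possible timing harness, not part of pysimdjson: best_time is a hypothetical helper, and the baseline uses the standard json module so the example runs without the package installed. The pysimdjson comparison is shown in comments and assumes the Parser API used in this section.

```python
import json
import timeit

def best_time(parse, payloads, repeat=5):
    """Return the best time in seconds to parse every payload once."""
    def run():
        for payload in payloads:
            parse(payload)
    return min(timeit.repeat(run, number=1, repeat=repeat))

payloads = [b'{"a": "b"}'] * 1000

# Baseline with the standard library.
baseline = best_time(json.loads, payloads)
print(f"json.loads baseline: {baseline:.6f}s")

# With pysimdjson installed, the two strategies could be compared like this:
#   import simdjson
#   parser = simdjson.Parser()
#   reused = best_time(parser.parse, payloads)
#   fresh = best_time(lambda p: simdjson.Parser().parse(p), payloads)
```

Taking the minimum over several repeats reduces noise from other processes; the absolute numbers will vary by machine and document size.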

Performance Considerations

The actual parsing of a document is a small fraction (~5%) of the total time spent bringing a JSON document into CPython. However, even when bringing the entire document into Python, pysimdjson will almost always be as fast as or faster than other high-speed Python libraries.

There are two things to keep in mind when trying to get the best performance:

  1. Do you really need the entire document? If you have a JSON document with thousands of keys but just need to check if the "published" key is True, use the JSON pointer interface to pull only a single field into Python.
  2. There is significant overhead in calling a C++ function from Python. Minimizing the number of function calls can offer significant speedups in some use cases.
