simdjson bindings for python

pysimdjson

Python bindings for the simdjson project, a SIMD-accelerated JSON parser.

Bindings are currently tested on OS X, Linux, and Windows for Python 3.4+.

Installation

If binary wheels are available for your platform, you can install from pip with no further requirements:

pip install pysimdjson

The project is self-contained, and has no additional dependencies. If binary wheels are not available for your platform, or you want to build from source for the best performance, you'll need a C++11-capable compiler to compile the sources:

pip install 'pysimdjson[dev]' --no-binary :all:

Development and Testing

This project comes with a full test suite. To install development and testing dependencies, use:

pip install -e ".[dev]"

To also install 3rd party JSON libraries used for running benchmarks, use:

pip install -e ".[benchmark]"

To run the tests, just type pytest. To also run the benchmarks, use pytest --runslow.

To properly test on Windows, you need both a recent version of Visual Studio (VS) and VS2015, patch 3. Older versions of CPython required portable C/C++ extensions to be built with the same version of VS as the interpreter. Use the Developer Command Prompt to switch between versions easily.

How It Works

This project uses pybind11 to generate the low-level bindings on top of the simdjson project. You can use it just like the built-in json module, or use the simdjson-specific API for much better performance.

import simdjson
doc = simdjson.loads('{"hello": "world"}')
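Because loads() mirrors the built-in json module, one common way to adopt it is an optional import with a fallback. The following is a minimal sketch of that pattern, not something prescribed by this project; the alias fast_json and the sample document are assumptions made for illustration.

```python
import json

try:
    # Installed by the pysimdjson package.
    import simdjson as fast_json
except ImportError:
    # Fall back to the standard library when pysimdjson is unavailable.
    fast_json = json

text = '{"hello": "world", "values": [1, 2, 3]}'

# Both modules expose loads() with the same basic call shape.
doc = fast_json.loads(text)

# Whichever module was picked, the result should match the built-in parser.
assert doc == json.loads(text)
print(doc["hello"])
```

Code written this way keeps working on platforms where no wheel or compiler is available, at the cost of the slower parser.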

Making things faster

pysimdjson provides an API compatible with the built-in json module for convenience, and this API is pretty fast (beating or tying all other Python JSON libraries). However, it also provides a simdjson-specific API that can perform significantly better.

Don't load the entire document

95% of the time spent loading a JSON document into Python is spent in the creation of Python objects, not the actual parsing of the document. You can avoid all of this overhead by ignoring parts of the document you don't want.

pysimdjson supports this in two ways: JSON pointers via at(), or proxies for objects and lists.

import simdjson
parser = simdjson.Parser()
doc = parser.parse(b'{"res": [{"name": "first"}, {"name": "second"}]}')

For our sample above, we really just want the name of the second entry in res; we don't care about anything else. We can do this two ways:

assert doc['res'][1]['name'] == 'second' # True
assert doc.at('res/1/name') == 'second' # True

Both of these approaches will be much faster than using load() or loads(), since they avoid loading the parts of the document we don't care about.

Re-use the parser

One of the easiest performance gains if you're working on many documents is to re-use the parser.

import simdjson
parser = simdjson.Parser()

for i in range(0, 100):
    doc = parser.parse(b'{"a": "b"}')

This will drastically reduce the number of allocations being made, as it will reuse the existing buffer when possible. If it's too small, it'll grow to fit.

Performance Considerations

The actual parsing of a document is a small fraction (~5%) of the total time spent bringing a JSON document into CPython. However, even when bringing the entire document into Python, pysimdjson will almost always be faster than or equivalent to other high-speed Python libraries.

There are two things to keep in mind when trying to get the best performance:

  1. Do you really need the entire document? If you have a JSON document with thousands of keys but just need to check if the "published" key is True, use the JSON pointer interface to pull only a single field into Python.
  2. There is significant overhead in calling a C++ function from Python. Minimizing the number of function calls can offer significant speedups in some use cases.
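To check how these considerations play out on your own data, it helps to time full-document loads directly. The sketch below is illustrative rather than part of the project's benchmark suite: make_document and time_loader are hypothetical helpers, the synthetic document is an assumption, and the simdjson timing only runs if the package is importable.

```python
import json
import timeit

try:
    import simdjson  # installed by the pysimdjson package
except ImportError:
    simdjson = None


def make_document(n_keys=1000):
    """Build a synthetic JSON string with many keys plus a "published" flag."""
    payload = {"key{}".format(i): i for i in range(n_keys)}
    payload["published"] = True
    return json.dumps(payload)


def time_loader(loads, text, repeat=200):
    """Return the average seconds per call of loads(text)."""
    return timeit.timeit(lambda: loads(text), number=repeat) / repeat


text = make_document()

# Always time the standard library as a baseline.
results = {"json": time_loader(json.loads, text)}
if simdjson is not None:
    results["simdjson"] = time_loader(simdjson.loads, text)

for name, seconds in sorted(results.items()):
    print("{:>10}: {:.1f} us per call".format(name, seconds * 1e6))
```

Numbers from a micro-benchmark like this vary with document shape and hardware, so treat them as a rough guide and re-run with documents that resemble your real workload.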
