simdjson bindings for python

pysimdjson

Python bindings for the simdjson project, a SIMD-accelerated JSON parser. If SIMD instructions are unavailable, a fallback parser is used, making pysimdjson safe to use anywhere.

Bindings are currently tested on OS X, Linux, and Windows for Python versions 3.5 to 3.9.

🎉 Installation

If binary wheels are available for your platform, you can install from pip with no further requirements:

pip install pysimdjson

Binary wheels are available for x86_64 on the following:

Platform  py3.5  py3.6  py3.7  py3.8  pypy3
OS X      y      y      y      y      y
Windows   x      x      y      y      x
Linux     y      y      y      y      x

If binary wheels are not available for your platform, you'll need a C++11-capable compiler to compile the sources:

pip install 'pysimdjson[dev]' --no-binary :all:

Both simdjson and pysimdjson support FreeBSD and Linux on ARM when built from source.

⚗ Development and Testing

This project comes with a full test suite. To install development and testing dependencies, use:

pip install -e ".[dev]"

To also install 3rd party JSON libraries used for running benchmarks, use:

pip install -e ".[benchmark]"

To run the tests, just type pytest. To also run the benchmarks, use pytest --runslow.

To properly test on Windows, you need both a recent version of Visual Studio (VS) as well as VS2015, patch 3. Older versions of CPython required portable C/C++ extensions to be built with the same version of VS as the interpreter. Use the Developer Command Prompt to easily switch between versions.

How It Works

This project uses pybind11 to generate the low-level bindings on top of the simdjson project. You can use it just like the built-in json module, or use the simdjson-specific API for much better performance.

import simdjson
doc = simdjson.loads('{"hello": "world"}')
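Because loads() mirrors the built-in json module, one common pattern is to treat pysimdjson as an optional dependency. The sketch below is only an illustration: the alias json_impl is a name chosen for this example, and the fallback to the standard library is an assumption about how you might want your own code to behave when pysimdjson is not installed.

```python
import json as stdlib_json

try:
    # Use pysimdjson when it is installed.
    import simdjson as json_impl
except ImportError:
    # Otherwise fall back to the standard library, which exposes the same loads().
    json_impl = stdlib_json

doc = json_impl.loads('{"hello": "world"}')
print(doc["hello"])
```

Code written against json_impl only relies on the calls the two modules share, so it keeps working whichever import succeeded.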

🚀 Making things faster

pysimdjson provides an API compatible with the built-in json module for convenience, and this API is pretty fast (beating or tying all other Python JSON libraries). However, it also provides a simdjson-specific API that can perform significantly better.

Don't load the entire document

95% of the time spent loading a JSON document into Python is spent in the creation of Python objects, not the actual parsing of the document. You can avoid all of this overhead by ignoring parts of the document you don't want.

pysimdjson supports this in two ways - the use of JSON pointers via at(), or proxies for objects and lists.

import simdjson
parser = simdjson.Parser()
doc = parser.parse(b'{"res": [{"name": "first"}, {"name": "second"}]}')

For our sample above, we really just want the name of the second entry in res; we don't care about anything else. We can do this in two ways:

assert doc['res'][1]['name'] == 'second' # True
assert doc.at('res/1/name') == 'second' # True

Both of these approaches will be much faster than using load() or loads(), since they avoid creating Python objects for the parts of the document we don't care about.
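To make the path syntax concrete, here is a minimal pure-Python sketch of how a slash-separated path such as 'res/1/name' corresponds to a chain of nested lookups. The helper walk_path is hypothetical and operates on ordinary dicts and lists produced by the standard json module. It is not how pysimdjson implements at(), which resolves the path without building those intermediate Python objects.

```python
import json


def walk_path(value, path):
    """Follow a slash-separated path through nested dicts and lists."""
    for token in path.split("/"):
        if isinstance(value, list):
            value = value[int(token)]  # list steps are integer indexes
        else:
            value = value[token]  # object steps are key lookups
    return value


doc = json.loads('{"res": [{"name": "first"}, {"name": "second"}]}')

# The path and the chained subscripts name the same element.
assert walk_path(doc, "res/1/name") == doc["res"][1]["name"]
```

The difference in pysimdjson is one of cost rather than meaning: the same element is reached, but only that element is converted into a Python object.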

Re-use the parser.

One of the easiest performance gains if you're working on many documents is to re-use the parser.

import simdjson
parser = simdjson.Parser()

for i in range(0, 100):
    doc = parser.parse(b'{"a": "b"}')

This will drastically reduce the number of allocations being made, as it will reuse the existing buffer when possible. If it's too small, it'll grow to fit.
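As a slightly fuller sketch of the same idea, the function below pulls one field out of a batch of documents with a single Parser. The name collect_field and the sample inputs are made up for this example, and the fallback to json.loads is an assumption added only so the snippet still runs where pysimdjson is not installed.

```python
import json

try:
    import simdjson
except ImportError:  # pysimdjson not installed; use the standard library instead
    simdjson = None


def collect_field(raw_docs, key):
    """Return document[key] for each raw JSON document in raw_docs."""
    if simdjson is None:
        return [json.loads(raw)[key] for raw in raw_docs]
    parser = simdjson.Parser()  # created once, reused for every document
    # Read the value immediately instead of keeping each parsed document around.
    return [parser.parse(raw)[key] for raw in raw_docs]


raw_docs = [b'{"a": "b"}', b'{"a": "c"}', b'{"a": "d"}']
print(collect_field(raw_docs, "a"))
```

Creating the parser outside the loop is the whole point: each call to parse() can then reuse the buffer left over from the previous document.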

📈 Benchmarks

pysimdjson compares well against most libraries for the default load() and loads(), which create full Python objects immediately.

pysimdjson performs significantly better when only part of the document is of interest. For each test file we show the time taken to completely deserialize the document into Python objects, as well as the time to get the deepest key in each file. The second approach avoids all unnecessary object creation.

jsonexamples/canada.json deserialization

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{canada} 11.06660 32.72010 0.00572 50.49244
orjson-{canada} 12.70790 26.67180 0.00424 55.58212
simplejson-{canada} 38.48230 50.24570 0.00447 22.64854
rapidjson-{canada} 39.67800 57.52180 0.00476 21.76358
json-{canada} 42.54410 54.52020 0.00345 20.03848

jsonexamples/canada.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{canada} 3.38700 10.86230 0.00101 222.97853
orjson-{canada} 13.34780 33.54840 0.00652 44.09761
simplejson-{canada} 38.01830 69.71770 0.00843 20.47350
rapidjson-{canada} 39.95230 59.47640 0.00667 20.16637
json-{canada} 42.18320 63.19890 0.00686 19.37501

jsonexamples/twitter.json deserialization

Name Min (μs) Max (μs) StdDev Ops
orjson-{twitter} 2.25720 8.80010 0.00090 395.26644
✨ simdjson-{twitter} 2.54290 17.11490 0.00189 250.52593
simplejson-{twitter} 3.35020 9.28810 0.00094 272.00414
rapidjson-{twitter} 4.39350 10.72390 0.00083 208.02216
json-{twitter} 5.24810 11.10900 0.00086 177.25899

jsonexamples/twitter.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{twitter} 0.35660 3.52600 0.00020 2099.88860
orjson-{twitter} 2.26630 9.70870 0.00104 384.04157
simplejson-{twitter} 3.36320 11.10170 0.00120 263.55812
rapidjson-{twitter} 4.40140 12.07110 0.00123 203.94305
json-{twitter} 5.21540 15.52110 0.00112 178.51096

jsonexamples/github_events.json deserialization

Name Min (μs) Max (μs) StdDev Ops
orjson-{github_events} 0.17800 0.84640 0.00004 5232.35967
✨ simdjson-{github_events} 0.20090 2.23790 0.00009 3685.60740
json-{github_events} 0.28770 1.01060 0.00005 3247.96256
simplejson-{github_events} 0.30560 1.19760 0.00003 3126.57352
rapidjson-{github_events} 0.33170 0.67080 0.00003 2860.73395

jsonexamples/github_events.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{github_events} 0.04050 0.33410 0.00001 21166.03658
orjson-{github_events} 0.17970 0.53880 0.00002 5235.09246
json-{github_events} 0.29140 0.98010 0.00004 3262.97633
simplejson-{github_events} 0.30800 2.07340 0.00006 3087.47964
rapidjson-{github_events} 0.33660 0.66480 0.00002 2799.91656

jsonexamples/citm_catalog.json deserialization

Name Min (μs) Max (μs) StdDev Ops
orjson-{citm_catalog} 5.42280 25.94490 0.00322 132.97562
✨ simdjson-{citm_catalog} 6.24880 23.46540 0.00487 97.93701
json-{citm_catalog} 9.20710 17.70010 0.00271 88.13338
simplejson-{citm_catalog} 9.96980 20.16560 0.00314 81.37851
rapidjson-{citm_catalog} 11.77450 41.98760 0.00442 70.71896

jsonexamples/citm_catalog.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{citm_catalog} 0.87760 3.74490 0.00026 980.67148
orjson-{citm_catalog} 5.42820 18.30850 0.00412 123.67493
json-{citm_catalog} 9.08970 23.72150 0.00399 85.87864
simplejson-{citm_catalog} 9.82740 24.80090 0.00447 79.48858
rapidjson-{citm_catalog} 11.71590 28.64550 0.00490 67.48895

jsonexamples/mesh.json deserialization

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{mesh} 2.61070 12.71490 0.00158 295.08199
orjson-{mesh} 2.93950 11.48570 0.00127 292.54539
json-{mesh} 5.60170 18.03550 0.00153 158.64945
rapidjson-{mesh} 7.21990 21.84580 0.00201 123.34775
simplejson-{mesh} 8.35430 16.78820 0.00182 106.41059

jsonexamples/mesh.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{mesh} 1.01200 2.90980 0.00019 909.77466
orjson-{mesh} 2.87630 10.38550 0.00133 299.85643
json-{mesh} 5.61810 14.77090 0.00139 163.70380
rapidjson-{mesh} 7.12080 19.14110 0.00210 119.10425
simplejson-{mesh} 8.33190 24.92470 0.00315 96.34057
