simdjson bindings for python

pysimdjson

Python bindings for the simdjson project, a SIMD-accelerated JSON parser. If SIMD instructions are unavailable, a fallback parser is used, making pysimdjson safe to use anywhere.

Bindings are currently tested on OS X, Linux, and Windows for Python versions 3.5 to 3.9.

🎉 Installation

If binary wheels are available for your platform, you can install from pip with no further requirements:

pip install pysimdjson

Binary wheels are available for x86_64 on the following:

          py3.5  py3.6  py3.7  py3.8  pypy3
OS X      y      y      y      y      y
Windows   x      x      y      y      x
Linux     y      y      y      y      x

If binary wheels are not available for your platform, you'll need a C++11-capable compiler to compile the sources:

pip install 'pysimdjson[dev]' --no-binary :all:

Both simdjson and pysimdjson support FreeBSD and Linux on ARM when built from source.

⚗ Development and Testing

This project comes with a full test suite. To install development and testing dependencies, use:

pip install -e ".[dev]"

To also install 3rd party JSON libraries used for running benchmarks, use:

pip install -e ".[benchmark]"

To run the tests, just type pytest. To also run the benchmarks, use pytest --runslow.

To properly test on Windows, you need both a recent version of Visual Studio (VS) and VS2015, patch 3. Older versions of CPython required portable C/C++ extensions to be built with the same version of VS as the interpreter. Use the Developer Command Prompt to easily switch between versions.

How It Works

This project uses pybind11 to generate the low-level bindings on top of the simdjson project. You can use it just like the built-in json module, or use the simdjson-specific API for much better performance.

import simdjson
doc = simdjson.loads('{"hello": "world"}')
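Because loads() mirrors the built-in json module, one common way to adopt it is as an optional drop-in replacement. The following is a minimal sketch of that pattern, not something prescribed by this project; the alias json_lib is a hypothetical name chosen for illustration.

```python
# Prefer pysimdjson when it is installed, otherwise fall back to the
# standard library. "json_lib" is a hypothetical alias for this sketch.
try:
    import simdjson as json_lib
except ImportError:
    import json as json_lib

# Both modules expose loads() with the same basic call shape.
doc = json_lib.loads('{"hello": "world"}')
print(doc["hello"])  # world
```

Code written against the alias keeps working on platforms where pysimdjson cannot be installed, at the cost of losing the simdjson-specific API described in the next section.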

🚀 Making things faster

pysimdjson provides an API compatible with the built-in json module for convenience, and this API is pretty fast (beating or tying all other Python JSON libraries). However, it also provides a simdjson-specific API that can perform significantly better.

Don't load the entire document

95% of the time spent loading a JSON document into Python is spent in the creation of Python objects, not the actual parsing of the document. You can avoid all of this overhead by ignoring parts of the document you don't want.

pysimdjson supports this in two ways: JSON pointers via at(), or proxies for objects and lists.

import simdjson
parser = simdjson.Parser()
doc = parser.parse(b'{"res": [{"name": "first"}, {"name": "second"}]}')

For our sample above, we really just want the name of the second entry in res; we don't care about anything else. We can do this two ways:

assert doc['res'][1]['name'] == 'second' # True
assert doc.at('res/1/name') == 'second' # True

Both of these approaches will be much faster than using load() or loads(), since they avoid loading the parts of the document we don't care about.
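To make the path syntax concrete, here is a pure-Python illustration of which element a slash-separated path such as 'res/1/name' selects. It is only a sketch of the lookup semantics: resolve_path is a hypothetical helper, not part of pysimdjson, and it walks a document that the built-in json module has already loaded in full, so it has none of the speed benefit of at().

```python
import json


def resolve_path(value, path):
    """Walk a slash-separated path through plain dicts and lists.

    Hypothetical helper for illustration only; it is not pysimdjson's
    implementation of at().
    """
    for token in path.split("/"):
        if isinstance(value, list):
            # List positions are written as decimal indexes in the path.
            value = value[int(token)]
        else:
            # Anything else is treated as an object key.
            value = value[token]
    return value


doc = json.loads('{"res": [{"name": "first"}, {"name": "second"}]}')
print(resolve_path(doc, "res/1/name"))  # second
```

Each path segment steps one level deeper: the key res, then index 1 of that list, then the key name.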

Re-use the parser.

One of the easiest performance gains if you're working on many documents is to re-use the parser.

import simdjson
parser = simdjson.Parser()

for i in range(0, 100):
    doc = parser.parse(b'{"a": "b"}')

This will drastically reduce the number of allocations being made, as it will reuse the existing buffer when possible. If it's too small, it'll grow to fit.
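In practice, re-using a parser usually means pulling a value out of each document before moving on to the next one. The sketch below assumes, as with the underlying simdjson parser, that a parsed document is only safe to use until the parser is used again, so each value is copied out straight away. extract_names is a hypothetical helper, and the fallback to the built-in json module is there only so the sketch still runs where pysimdjson is not installed.

```python
import json

try:
    import simdjson  # provided by pysimdjson
except ImportError:
    simdjson = None  # fall back to the standard library below


def extract_names(raw_documents):
    """Return the "name" field of each raw JSON document (hypothetical helper)."""
    # One parser for the whole batch, so its buffer can be reused.
    parser = simdjson.Parser() if simdjson is not None else None
    names = []
    for raw in raw_documents:
        if parser is not None:
            # Assumption: copy the value out before the next parse() call.
            names.append(str(parser.parse(raw)["name"]))
        else:
            names.append(json.loads(raw)["name"])
    return names


names = extract_names([b'{"name": "first"}', b'{"name": "second"}'])
print(names)  # ['first', 'second']
```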

📈 Benchmarks

pysimdjson compares well against most libraries for the default load() and loads(), which create full Python objects immediately.

pysimdjson performs significantly better when only part of the document is of interest. For each test file we show the time taken to completely deserialize the document into Python objects, as well as the time to get the deepest key in each file. The second approach avoids all unnecessary object creation.

jsonexamples/canada.json deserialization

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{canada} 10.61630 27.12380 0.00442 58.42790
orjson-{canada} 11.97230 29.95960 0.00469 56.21902
ujson-{canada} 19.12120 60.73670 0.01320 26.66618
simplejson-{canada} 39.64180 59.80270 0.00535 20.51313
rapidjson-{canada} 40.57460 78.20690 0.01444 17.10311
json-{canada} 42.95370 62.18130 0.00470 20.21549

jsonexamples/canada.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{canada} 3.38440 7.60380 0.00071 255.69203
ujson-{canada} 11.10420 34.35320 0.00742 49.72907
orjson-{canada} 12.92510 45.33800 0.00745 41.44936
simplejson-{canada} 38.92410 64.06250 0.00856 19.70330
rapidjson-{canada} 41.22570 66.68340 0.00756 19.22791
json-{canada} 43.08250 64.75990 0.00661 18.15876

jsonexamples/twitter.json deserialization

Name Min (μs) Max (μs) StdDev Ops
orjson-{twitter} 2.29380 8.67020 0.00094 372.10773
✨ simdjson-{twitter} 2.49010 22.30540 0.00198 281.95565
ujson-{twitter} 2.74350 12.06470 0.00105 317.20009
simplejson-{twitter} 3.35320 19.56840 0.00202 217.32882
rapidjson-{twitter} 4.32850 13.21370 0.00119 194.83892
json-{twitter} 5.27190 11.25140 0.00117 167.84380

jsonexamples/twitter.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{twitter} 0.35740 2.01060 0.00009 2423.86485
orjson-{twitter} 2.29750 11.01000 0.00105 366.48762
ujson-{twitter} 2.76260 14.13210 0.00143 285.69895
simplejson-{twitter} 3.35340 13.34750 0.00118 257.05624
rapidjson-{twitter} 4.31330 12.43220 0.00141 192.75979
json-{twitter} 5.23560 13.85480 0.00126 168.04882

jsonexamples/github_events.json deserialization

Name Min (μs) Max (μs) StdDev Ops
orjson-{github_events} 0.17850 0.62230 0.00002 5331.74983
✨ simdjson-{github_events} 0.19760 2.36700 0.00009 3905.95971
ujson-{github_events} 0.25860 0.67530 0.00003 3642.89767
json-{github_events} 0.28910 1.09600 0.00009 2924.08415
simplejson-{github_events} 0.30620 1.29520 0.00005 3007.32539
rapidjson-{github_events} 0.33290 1.15310 0.00006 2654.55940

jsonexamples/github_events.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{github_events} 0.03950 3.31210 0.00005 15973.82108
orjson-{github_events} 0.18030 0.65220 0.00005 4911.43253
ujson-{github_events} 0.26070 0.96760 0.00005 3549.92113
json-{github_events} 0.29040 1.54090 0.00007 3047.37921
simplejson-{github_events} 0.30920 0.98670 0.00008 2953.84031
rapidjson-{github_events} 0.33390 1.56730 0.00010 2461.45389

jsonexamples/citm_catalog.json deserialization

Name Min (μs) Max (μs) StdDev Ops
orjson-{citm_catalog} 5.24950 18.22640 0.00323 129.49044
✨ simdjson-{citm_catalog} 6.05650 29.15550 0.00584 70.17580
ujson-{citm_catalog} 6.24130 18.69410 0.00373 109.60956
json-{citm_catalog} 9.10930 26.54630 0.00414 76.55235
simplejson-{citm_catalog} 13.69630 28.63450 0.00401 57.28718
rapidjson-{citm_catalog} 21.78300 65.30240 0.01055 28.63350

jsonexamples/citm_catalog.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{citm_catalog} 0.87070 2.86480 0.00019 1056.22226
orjson-{citm_catalog} 5.40520 26.24650 0.00551 102.43563
ujson-{citm_catalog} 6.38280 26.49210 0.00562 96.65066
json-{citm_catalog} 9.16770 29.45910 0.00498 76.90314
simplejson-{citm_catalog} 13.66750 30.54480 0.00471 57.54416
rapidjson-{citm_catalog} 19.16620 49.23040 0.00714 36.04769

jsonexamples/mesh.json deserialization

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{mesh} 2.60850 17.85500 0.00189 276.39681
ujson-{mesh} 2.80000 11.36520 0.00148 297.40696
orjson-{mesh} 2.87780 14.34770 0.00156 272.06333
json-{mesh} 5.69520 22.03140 0.00282 132.44125
rapidjson-{mesh} 7.28240 24.61470 0.00249 113.59051
simplejson-{mesh} 8.37720 18.80480 0.00201 104.81092

jsonexamples/mesh.json deepest key

Name Min (μs) Max (μs) StdDev Ops
✨ simdjson-{mesh} 1.01600 12.12980 0.00067 619.16472
ujson-{mesh} 2.75500 14.19920 0.00166 309.06497
orjson-{mesh} 2.84420 24.41680 0.00248 245.50994
json-{mesh} 5.63860 14.53620 0.00160 154.31889
rapidjson-{mesh} 7.11940 18.68600 0.00208 117.20282
simplejson-{mesh} 8.27930 19.76000 0.00207 106.66946
