simdjson bindings for python

pysimdjson

Quick-and-dirty Python bindings for simdjson, written to see whether going down this path yields parse-time improvements in real-world applications.

These bindings are currently tested only on OS X. They should work everywhere simdjson does, although you'll probably have to tweak your build flags.

Example

import pysimdjson

# loads() parses the JSON document itself, so read the file's bytes first.
with open('sample.json', 'rb') as fin:
    doc = pysimdjson.loads(fin.read())
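
Because the bindings may not build or import on every machine, it can help to fall back to the standard library. The following is only a minimal sketch: it assumes pysimdjson.loads accepts the same bytes input as json.loads and returns equivalent Python objects, and the helper name parse_json is hypothetical, not part of pysimdjson.

```python
import json

try:
    import pysimdjson  # optional; may be missing or fail to build
    _loads = pysimdjson.loads
except ImportError:
    _loads = json.loads  # standard-library fallback


def parse_json(data):
    """Parse JSON bytes with pysimdjson when available, else with json.

    Hypothetical helper for illustration; not part of pysimdjson.
    """
    return _loads(data)


doc = parse_json(b'{"name": "simdjson", "tags": ["fast", "json"]}')
print(doc["tags"])
```

Callers only ever see parse_json, so swapping the backend does not touch the rest of the application.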

AVX2

simdjson requires AVX2 support to function. Check to see if your OS/processor supports it:

OS X: sysctl -a | grep machdep.cpu.leaf7_features
Linux: grep avx2 /proc/cpuinfo
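
The same check can be scripted. This is a rough sketch rather than anything shipped with pysimdjson: the function names flags_include_avx2 and cpu_supports_avx2 are made up for illustration, and it simply looks for an AVX2 token in the same sources the commands above read.

```python
import platform
import subprocess


def flags_include_avx2(text):
    """Return True if a CPU flags/features string contains an AVX2 token."""
    return "avx2" in text.lower().split()


def cpu_supports_avx2():
    """Best-effort AVX2 check; returns None when the platform is unknown."""
    system = platform.system()
    try:
        if system == "Linux":
            with open("/proc/cpuinfo") as f:
                return flags_include_avx2(f.read())
        if system == "Darwin":
            # Same key the sysctl command above greps for.
            result = subprocess.run(
                ["sysctl", "-n", "machdep.cpu.leaf7_features"],
                capture_output=True, text=True, check=False,
            )
            return flags_include_avx2(result.stdout)
    except OSError:
        return None
    return None


print("AVX2 supported:", cpu_supports_avx2())
```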

Early Benchmark

Comparing the built-in json module on Python 3.7 to pysimdjson.

File                              json time             pysimdjson time
jsonexamples/apache_builds.json   0.09916733999999999   0.074089268
jsonexamples/canada.json          5.305393378           1.6547515810000002
jsonexamples/citm_catalog.json    1.3718639709999998    1.0438697340000003
jsonexamples/github_events.json   0.04840242700000097   0.034239397999998644
jsonexamples/gsoc-2018.json       1.5382746889999996    0.9597240750000005
jsonexamples/instruments.json     0.24350973299999978   0.13639699600000021
jsonexamples/marine_ik.json       4.505123285000002     2.8965093270000004
jsonexamples/mesh.json            1.0325923849999974    0.38916503499999777
jsonexamples/mesh.pretty.json     1.7129034710000006    0.46509220500000126
jsonexamples/numbers.json         0.16577519699999854   0.04843887400000213
jsonexamples/random.json          0.6930746310000018    0.6175370539999996
jsonexamples/twitter.json         0.6069602610000011    0.41049074900000093
jsonexamples/twitterescaped.json  0.7587005720000022    0.41576198399999953
jsonexamples/update-center.json   0.5577604210000011    0.4961777420000004
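
The script that produced these timings is not shown here. One way to run a similar comparison is with timeit, sketched below. The helper name time_parser, the repeat count, and the small in-memory sample are all assumptions made so the sketch runs anywhere; for meaningful numbers, substitute the bytes of a real file such as jsonexamples/twitter.json.

```python
import json
import timeit


def time_parser(loads, data, number=100):
    """Total time in seconds to call loads(data) `number` times."""
    return timeit.timeit(lambda: loads(data), number=number)


# Small synthetic document (assumption); replace with real file contents.
sample = json.dumps({"values": list(range(1000)), "name": "x" * 100}).encode()

parsers = {"json": json.loads}
try:
    import pysimdjson  # optional; skipped if the bindings are not installed
    parsers["pysimdjson"] = pysimdjson.loads
except ImportError:
    pass

for name, loads in parsers.items():
    print(f"{name}: {time_parser(loads, sample):.6f} s")
```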

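The overhead of constructing the Python dict is immense, as is decoding UTF-8 strings into Python str objects. JSON files that are 99% strings and dicts will see little improvement (the smallest speedups in the table above are about 1.1x), but all others see significant improvement: number-heavy files such as canada.json and numbers.json parse more than 3x faster.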
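
To make the spread concrete, the speedup for a file is its json time divided by its pysimdjson time. The sketch below computes that ratio for four rows copied from the benchmark table above; the function name speedup is illustrative, and the time unit is left as given in the table.

```python
# (json time, pysimdjson time) pairs copied from the benchmark table.
timings = {
    "canada.json": (5.305393378, 1.6547515810000002),
    "numbers.json": (0.16577519699999854, 0.04843887400000213),
    "random.json": (0.6930746310000018, 0.6175370539999996),
    "update-center.json": (0.5577604210000011, 0.4961777420000004),
}


def speedup(json_time, simd_time):
    """How many times faster pysimdjson was than the json module."""
    return json_time / simd_time


for name, (json_time, simd_time) in timings.items():
    print(f"{name}: {speedup(json_time, simd_time):.2f}x")
```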

Providing an API for iteration without converting the entire document into a python object would yield significant improvements when you only care about part of the document.

