simdjson bindings for Python
pysimdjson
Quick-n'-dirty Python bindings for simdjson, just to see if going down this path might yield some parse-time improvements in real-world applications.
These bindings are currently only tested on OS X, but should work everywhere simdjson does, although you'll probably have to tweak your build flags.
Installation
There are binary wheels available for OS X 10.12. On other platforms you'll need a C++11-capable compiler.
pip install pysimdjson
or from source:
git clone https://github.com/TkTech/pysimdjson.git
cd pysimdjson
python setup.py install
Example
import pysimdjson

with open('sample.json', 'rb') as fin:
    doc = pysimdjson.loads(fin.read())
AVX2
simdjson requires AVX2 support to function. Check to see if your OS/processor supports it:
- OS X:
sysctl -a | grep machdep.cpu.leaf7_features
- Linux:
grep avx2 /proc/cpuinfo
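The Linux check above can also be done from Python before importing the bindings. This is only a sketch: the helper names `has_avx2` and `linux_has_avx2` are made up for illustration and are not part of pysimdjson, and it assumes the usual `/proc/cpuinfo` layout where CPU features appear on a `flags` line.

```python
def has_avx2(cpuinfo_text):
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Only look at the CPU feature flags, not e.g. the model name.
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


def linux_has_avx2(path="/proc/cpuinfo"):
    """Hypothetical helper: read the given file and report AVX2 support."""
    try:
        with open(path) as fin:
            return has_avx2(fin.read())
    except OSError:
        # File missing or unreadable (e.g. not running on Linux).
        return False
```

Calling `linux_has_avx2()` on a non-Linux machine simply returns `False`, so treat a negative result there as "unknown" rather than "unsupported".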
Early Benchmark
Comparing the built-in json module's loads on py3.7 to pysimdjson's loads.
File | json time | pysimdjson time |
---|---|---|
jsonexamples/apache_builds.json | 0.09916733999999999 | 0.074089268 |
jsonexamples/canada.json | 5.305393378 | 1.6547515810000002 |
jsonexamples/citm_catalog.json | 1.3718639709999998 | 1.0438697340000003 |
jsonexamples/github_events.json | 0.04840242700000097 | 0.034239397999998644 |
jsonexamples/gsoc-2018.json | 1.5382746889999996 | 0.9597240750000005 |
jsonexamples/instruments.json | 0.24350973299999978 | 0.13639699600000021 |
jsonexamples/marine_ik.json | 4.505123285000002 | 2.8965093270000004 |
jsonexamples/mesh.json | 1.0325923849999974 | 0.38916503499999777 |
jsonexamples/mesh.pretty.json | 1.7129034710000006 | 0.46509220500000126 |
jsonexamples/numbers.json | 0.16577519699999854 | 0.04843887400000213 |
jsonexamples/random.json | 0.6930746310000018 | 0.6175370539999996 |
jsonexamples/twitter.json | 0.6069602610000011 | 0.41049074900000093 |
jsonexamples/twitterescaped.json | 0.7587005720000022 | 0.41576198399999953 |
jsonexamples/update-center.json | 0.5577604210000011 | 0.4961777420000004 |
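This README does not say how the timings were collected. A comparison along these lines could be sketched with the standard library's timeit. The `time_loads` helper, the repeat count, and the in-memory sample document are all assumptions made for illustration, not the script that produced the table.

```python
import json
import timeit


def time_loads(loads, payload, number=100):
    """Return the total seconds taken to call loads(payload) `number` times."""
    return timeit.timeit(lambda: loads(payload), number=number)


# A small in-memory document stands in for a file from jsonexamples/.
payload = json.dumps({"values": list(range(1000)), "name": "sample"}).encode("utf-8")

parsers = {"json": json.loads}
try:
    import pysimdjson  # only available if the bindings are installed
    parsers["pysimdjson"] = pysimdjson.loads
except ImportError:
    pass  # fall back to timing the built-in module alone

results = {name: time_loads(fn, payload) for name, fn in parsers.items()}
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f}s")
```

To time a real file instead, read it in binary mode once and pass the bytes as `payload`, so that disk I/O stays out of the measurement.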
Creating Python objects is expensive, as is decoding a C const char * to a CPython unicode string. On examples that are mostly strings and dicts, pysimdjson doesn't offer much improvement when using loads, since the entire document will be converted to Python objects.
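Dividing the json time by the pysimdjson time for a few rows of the table makes the spread visible: canada.json comes out at roughly 3.2x, while random.json and update-center.json stay close to 1.1x. The snippet below only redoes that arithmetic on numbers copied from the table. Which files are string- and dict-heavy is not stated in the table itself, so reading the low ratios that way is an inference from the paragraph above.

```python
# (json seconds, pysimdjson seconds), copied from the benchmark table.
timings = {
    "canada.json": (5.305393378, 1.6547515810000002),
    "mesh.pretty.json": (1.7129034710000006, 0.46509220500000126),
    "random.json": (0.6930746310000018, 0.6175370539999996),
    "update-center.json": (0.5577604210000011, 0.4961777420000004),
}

# Speedup = how many times faster pysimdjson's loads was than json's loads.
speedups = {name: json_s / simd_s for name, (json_s, simd_s) in timings.items()}

# Print the largest speedups first.
for name, ratio in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}x")
```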
File details
Details for the file pysimdjson-1.0.1.tar.gz.
File metadata
- Download URL: pysimdjson-1.0.1.tar.gz
- Upload date:
- Size: 40.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.0
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 3bc9fe85c4002e82c0ffe135296e8a7d9adf80ea7b9827c6654e41476aa807e1 |
MD5 | e38c4d1b3ad350519d53d54807bf1977 |
BLAKE2b-256 | 733292d35d606398a00ab37fa068bd89310b54d5a3c2085f48bfa80857741b51 |
File details
Details for the file pysimdjson-1.0.1-cp37-cp37m-macosx_10_12_x86_64.whl.
File metadata
- Download URL: pysimdjson-1.0.1-cp37-cp37m-macosx_10_12_x86_64.whl
- Upload date:
- Size: 104.6 kB
- Tags: CPython 3.7m, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.0
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | fcdd26bf43c2a1ca212b986ee988052af19cf8d9314c443ed26ed2f3a4b6a223 |
MD5 | d760dd3787d48b0dd48974882dec0768 |
BLAKE2b-256 | 5d288cc08d38829a2fdfa5615cf8748d8817a4958a6d675bb97b7b98f9042ce7 |