Skip to main content

Fast MUTF-8 encoder & decoder

Project description

Tests

mutf-8

This package contains simple pure-python as well as C encoders and decoders for the MUTF-8 character encoding. In most cases, you can also parse the even-rarer CESU-8.

These days, you'll most likely encounter MUTF-8 when working on files or protocols related to the JVM. Strings in a Java .class file are encoded using MUTF-8, strings passed by the JNI, as well as strings exported by the object serializer.

This library was extracted from Lawu, a Python library for working with JVM class files.

📈 Benchmarks

The C extension is significantly faster - often 20x to 40x faster.

MUTF-8 Decoding

Name Min (μs) Max (μs) StdDev Ops
cmutf8-decode_modified_utf8 0.00009 0.00080 0.00000 9957678.56358
pymutf8-decode_modified_utf8 0.00190 0.06040 0.00000 450455.96019

MUTF-8 Encoding

Name Min (μs) Max (μs) StdDev Ops
cmutf8-encode_modified_utf8 0.00008 0.00151 0.00000 11897361.05101
pymutf8-encode_modified_utf8 0.00180 0.16650 0.00000 474390.98091

C Extension

The C extension is optional. If a binary package is not available, or a C compiler is not present, the pure-python version will be used instead. If you want to ensure you're using the C version, import it directly:

from mutf8.cmutf8 import decode_modified_utf8

decode_modified_utf(b'\xED\xA1\x80\xED\xB0\x80')

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mutf8-1.0.0.tar.gz (5.1 kB view details)

Uploaded Source

File details

Details for the file mutf8-1.0.0.tar.gz.

File metadata

  • Download URL: mutf8-1.0.0.tar.gz
  • Upload date:
  • Size: 5.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.6.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.8.0

File hashes

Hashes for mutf8-1.0.0.tar.gz
Algorithm Hash digest
SHA256 31dd3ab425a994e830d8ce4f2a6ac02f971ea6ff0c554303c8f7c9a415022251
MD5 ccdecf9d72a0e13c426a08e79a8e05e6
BLAKE2b-256 5891951f0922839ce0d1a0fac6e3fb3b947c507d82dcaefbd993b64ece43e40e

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page