Fast and memory efficient DAWG (DAFSA) for Python

Project description

DAWG

This package provides DAWG(DAFSA)-based dictionary-like read-only objects for Python (2.x and 3.x).

String data in a DAWG may take 200x less memory than in a standard Python dict, while raw lookup speed is comparable; a DAWG also provides fast advanced methods such as prefix search.
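To illustrate where the memory savings come from, here is a minimal pure-Python sketch of the idea behind a DAWG: build a trie, then merge states whose outgoing structure is identical so that shared suffixes are stored once. This is not how the package is implemented (it wraps the dawgdic C++ library); the names `Node`, `build_trie`, `minimize`, `count_nodes` and `prefixes` are hypothetical helpers made up for this example.

```python
class Node:
    """One automaton state: outgoing edges plus an end-of-key flag."""
    __slots__ = ("edges", "final")

    def __init__(self):
        self.edges = {}
        self.final = False


def build_trie(keys):
    """Build a plain trie (prefixes shared, suffixes not)."""
    root = Node()
    for key in keys:
        node = root
        for ch in key:
            node = node.edges.setdefault(ch, Node())
        node.final = True
    return root


def minimize(node, registry):
    """Merge equivalent subtrees bottom-up and return the canonical node."""
    for ch, child in list(node.edges.items()):
        node.edges[ch] = minimize(child, registry)
    # Children are already canonical, so their ids identify them uniquely.
    signature = (
        node.final,
        tuple(sorted((ch, id(child)) for ch, child in node.edges.items())),
    )
    return registry.setdefault(signature, node)


def count_nodes(root):
    """Count distinct states reachable from root."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)


def prefixes(root, key):
    """Return all stored keys that are prefixes of `key`."""
    found, node = [], root
    for i, ch in enumerate(key):
        node = node.edges.get(ch)
        if node is None:
            break
        if node.final:
            found.append(key[:i + 1])
    return found


words = ["tap", "taps", "top", "tops"]
root = build_trie(words)
print(count_nodes(root))          # 8 states in the plain trie
root = minimize(root, {})
print(count_nodes(root))          # 5 states once shared suffixes are merged
print(prefixes(root, "tapster"))  # ['tap', 'taps']
```

The sketch keeps Python objects for every state, so it saves nothing in practice; the point is only that suffix sharing shrinks the number of states, which a compact C++ representation can then store efficiently.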

License

Wrapper code is licensed under MIT License. Bundled dawgdic C++ library is licensed under BSD license. Bundled libb64 is Public Domain.

Changes

0.7.8 (2015-04-18)

  • extra type annotations are added to make the code a bit faster;

  • mercurial mirror at bitbucket is dropped;

  • wrapper is rebuilt with Cython 0.22.

0.7.7 (2014-11-19)

  • DAWG.b_prefixes method for avoiding utf8 encoding/decoding (thanks Ikuya Yamada);

  • wrapper is rebuilt with Cython 0.21.1.

0.7.6 (2014-08-10)

  • Wrapper is rebuilt with Cython 0.20.2 to fix some issues.

0.7.5 (2014-06-05)

  • Switched to setuptools;

  • some wheels are uploaded to pypi.

0.7.4 (2014-05-29)

  • Fixed a bug in DAWG building: input should be sorted according to its binary representation.
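As a rough illustration of what "sorted according to its binary representation" means, the sketch below orders text keys by their encoded bytes rather than by a text-level rule. It assumes keys are encoded as UTF-8 (the changelog does not name the encoding), and `sort_by_bytes` is a hypothetical helper, not part of the package.

```python
def sort_by_bytes(keys):
    """Order text keys by their UTF-8 byte sequences (assumed encoding)."""
    return sorted(keys, key=lambda k: k.encode("utf-8"))


keys = ["b", "a", "é", "B"]

# A text-level ordering (here: case-insensitive) puts "B" next to "b".
case_insensitive = sorted(keys, key=str.lower)

# Byte order puts the uppercase ASCII letter first and the non-ASCII key last.
bytewise = sort_by_bytes(keys)

print(case_insensitive)  # ['a', 'b', 'B', 'é']
print(bytewise)          # ['B', 'a', 'b', 'é']
```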

0.7.3 (2014-05-29)

  • Wrapper is rebuilt with Cython 0.21dev;

  • Python 3.4 compatibility is verified.

0.7.2 (2013-10-03)

0.7.1 (2013-05-25)

  • Extension is rebuilt with Cython 0.19.1;

  • fixed segfault that happened on lookup from incorrectly loaded DAWG (thanks Alex Moiseenko).

0.7 (2013-04-05)

  • IntCompletionDAWG

0.6.1 (2013-03-23)

  • Installation issues in environments with LC_ALL=C are fixed;

  • PyPy is officially unsupported now (use DAWG-Python with PyPy).

0.6 (2013-03-22)

  • many thread-safety bugs are fixed (at the cost of slowing the library down).

0.5.5 (2013-02-19)

  • fix installation under PyPy (note: DAWG is slow under PyPy and may have bugs).

0.5.4 (2013-02-14)

  • small tweaks for docstrings;

  • the extension is rebuilt using Cython 0.18.

0.5.3 (2013-01-03)

  • small improvements to .compile_replaces method;

  • benchmarks for .similar_items method;

  • the extension is rebuilt with Cython pre-0.18; this made .prefixes and .iterprefixes methods faster (up to 6x in some cases).

0.5.2 (2013-01-02)

  • tests are included in source distribution;

  • benchmark results in README were nonrepresentative because of my broken (slow) Python 3.2 install;

  • installation is fixed under Python 3.x with LC_ALL=C (thanks Jakub Wilk).

0.5.1 (2012-10-11)

  • better error reporting while building DAWGs;

  • __contains__ is fixed for keys with zero bytes;

  • dawg.Error exception class;

  • building of BytesDAWG and RecordDAWG fails instead of producing incorrect results if some of the keys have unsupported characters.

0.5 (2012-10-08)

The storage scheme of BytesDAWG and RecordDAWG is changed in this release in order to provide the alphabetical ordering of items.

This is a backwards-incompatible release. To read a BytesDAWG or RecordDAWG created with previous versions of DAWG, use the payload_separator constructor argument:

>>> from dawg import BytesDAWG
>>> BytesDAWG(payload_separator=b'\xff').load('old.dawg')

0.4.1 (2012-10-01)

  • Segfaults with empty DAWGs are fixed by updating dawgdic to latest svn.

0.4 (2012-09-26)

  • iterkeys, iteritems and iterprefixes methods (thanks Dan Blanchard).

0.3.2 (2012-09-24)

  • prefixes method for finding all prefixes of a given key.

0.3.1 (2012-09-20)

  • bundled dawgdic C++ library is updated to the latest version.

0.3 (2012-09-13)

  • similar_keys, similar_items and similar_item_values methods for more permissive lookups (they may be useful e.g. for umlaut handling);

  • load method returns self;

  • Python 3.3 support.

0.2 (2012-09-08)

Greatly improved memory usage for DAWGs loaded with the load() method.

There is currently a bug somewhere in the wrapper, so DAWGs loaded with the read() method and unpickled DAWGs use 3x-4x more memory than DAWGs loaded with the load() method. load() is fixed in this release, but the other methods are not.

0.1 (2012-09-08)

Initial release.

Download files

Source Distribution

DAWG-0.7.8.tar.gz (255.8 kB)

Uploaded Source

Built Distributions

DAWG-0.7.8-cp36-cp36m-macosx_10_12_x86_64.whl (142.1 kB)

Uploaded CPython 3.6m macOS 10.12+ x86-64

DAWG-0.7.8-cp35-cp35m-macosx_10_11_x86_64.whl (147.2 kB)

Uploaded CPython 3.5m macOS 10.11+ x86-64

DAWG-0.7.8-cp34-cp34m-macosx_10_10_x86_64.whl (151.3 kB)

Uploaded CPython 3.4m macOS 10.10+ x86-64

DAWG-0.7.8-cp27-none-macosx_10_9_x86_64.whl (146.4 kB)

Uploaded CPython 2.7 macOS 10.9+ x86-64

File details

Details for the file DAWG-0.7.8.tar.gz.

File metadata

  • Download URL: DAWG-0.7.8.tar.gz
  • Upload date:
  • Size: 255.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for DAWG-0.7.8.tar.gz
Algorithm Hash digest
SHA256 30d5da3e48b8cbe5ec94c5a202d2962780d3895ba0883123e6788565f71b2953
MD5 51e91616c9a4db9931bc944c1e012ee2
BLAKE2b-256 29c0d8d967bcaa0b572f9dc1d878bbf5a7bfd5afa2102a5ae426731f6ce3bc26

File details

Details for the file DAWG-0.7.8-cp36-cp36m-macosx_10_12_x86_64.whl.

File hashes

Hashes for DAWG-0.7.8-cp36-cp36m-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 3a5ea13d5a424542d1a7fa908db974e712be90ccdd86cec9e24c6b20794f5f5e
MD5 b9c2a5a22c9d581f9906a0562fef0f65
BLAKE2b-256 874ae2933c2e02abe8034ea7e61f0694d5d170b2facf5f8e68d91f89f133da65

File details

Details for the file DAWG-0.7.8-cp35-cp35m-macosx_10_11_x86_64.whl.

File hashes

Hashes for DAWG-0.7.8-cp35-cp35m-macosx_10_11_x86_64.whl
Algorithm Hash digest
SHA256 402659e3044a5fb79dadefeaabb15ba9c0ef56c844bb4bcde6b102afbf4788f8
MD5 6a558b550481fdd3123d5650491e9f3e
BLAKE2b-256 bb150aa44dc0d70450a3364e8899e2aacca65379ad28b8c9770b08921cc83a7f

File details

Details for the file DAWG-0.7.8-cp34-cp34m-macosx_10_10_x86_64.whl.

File hashes

Hashes for DAWG-0.7.8-cp34-cp34m-macosx_10_10_x86_64.whl
Algorithm Hash digest
SHA256 b1f9c72bb3eca530f78fcf82f2d60ff41298f10e1c9f018b402af0ecbe246171
MD5 57a38ee7a52781972c620a26d48decd2
BLAKE2b-256 e499d35b9459c15988d869ff04b698e59ccf10e4f17c498ab03eab165b7ec762

File details

Details for the file DAWG-0.7.8-cp27-none-macosx_10_9_x86_64.whl.

File hashes

Hashes for DAWG-0.7.8-cp27-none-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 7accbfe484a353e1f02a947f84f817846f30738d1170d4e855f536d5708632a3
MD5 e2ef2a4e6fb4861cb856df1b2f984385
BLAKE2b-256 82f0b5c567db487355a8d14fed1b2eb5af9209b52faf175b4111f1f6924d9365
