Skip to main content

Extensions to standard Python's heapq for performance applications

Project description

Fast CPython extensions to Python standard library with focus on performance.

This library provides CPython native extensions to mimic some of the well known built-in types. The implementation relies on enforced protocol - all the objects and abstract data types are implemented in C/C++ to provide highly effective manipulation.

Extended dict - fext.ExtDict

The implementation of fext.ExtDict was introduce to provide a data structure capable of limiting number of items stored in a dictionary (a hash table).

https://raw.githubusercontent.com/thoth-station/fext/master/fig/fext_extdict.png

fext.ExtDict is implemented as a standard Python dict and supports most dict methods. To restrict number of items stored in the dictionary, fext.edict.ExtDict uses a min-heap queue that stores key-value pairs (key to the dictionary with it’s value). When an item is inserted into the dictionary, fext.ExtDict checks its size and possibly performs pushpop-like operation on the heap to limit number of items stored.

The comparision is done solely on the actual value. Keys are stored in min-heap queue together with values as pairs to guarantee O(1) time when removing an item from the fext.edictExtDict.

Extended heapq - fext.ExtHeapQueue

The extended heap queue acts as a min-heap queue from the standard Python library. It uses a hash table for storing information about indexes (where values sit in the min-heap queue) to optimize removals from the heap.

https://raw.githubusercontent.com/thoth-station/fext/master/fig/fext_extheapq.png

Using fext in a C++ project

The design of this library allows you to use sources in your C++ project as well. The eheapq.hpp file defines the extended heap queue and edict.hpp the extended dictionary. Python files then act as a bindings to their respective Python interfaces. Mind the API design for the templated classes - it was meant to be used with pointers to objects (so avoid possible copy constructors).

Building the extensions

To build extensions, install the following packages (Fedora):

dnf install -y python3-devel g++ gcc

Now you can build extensions:

python3 setup.py build

If you would like to produce binaries with debug information:

CFLAGS='-Wall -O0 -ggdb' python3 setup.py build

Check sections below for more info on testing the C/C++ parts of extensions.

Reference count and memory leak checks

You can find Makefile in the Git repo. This repo defines targets to perform leak checks and reference count checks. Note they use different Python interpreters so make sure you do not mix virtual environments when running the tests.

make check

Developing the extension

First, prepare your environment:

dnf install -y make
make deps

To develop or adjust sources, simply change sources and verify your change is accepted by the test suite:

make check

The check target will run the actual test suite (see also make test). Besides it, the test suite will be executed two more times to check test suite and its behaviour with respect to Python object reference counting (python3-debug dependency will be automatically installed with the provided make deps). This part of the test suite can be executed using make check-refcount. The last part of the test suite runs valgrind against the test suite - you can explicitly trigger this part by calling make check-leaks.

Mind make-refcount and make check-leaks will take some time given the checks and processing that is done on the background. To verify your changes more iterativelly, make test should do the trick (don’t forget to do make check after that though).

To clean up your environment, perform:

make clean

Building and releasing

First, let’s run a containerized environment (make sure you are in the root of this repo):

podman run --rm --workdir /home --entrypoint bash -it --volume ./:/home:Z centos:8

The following commands (run in the container stated above) will install all the necessary tools:

dnf install -y make
make deps

Once tests pass, clean the environment:

make clean

Now we should be ready to produce bdist and sdist distribution for PyPI:

python3 setup.py bdist
python3 setup.py sdist

Finally, upload artifacts to PyPI:

twine upload dist/*

Usage

These data structures were designed for Thoth’s adviser - for data kept in resolver’s internal state as well as in the reinforcement learning part.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fext-0.1.0.tar.gz (12.6 kB view details)

Uploaded Source

Built Distribution

fext-0.1.0-cp36-cp36m-manylinux2014_x86_64.whl (211.6 kB view details)

Uploaded CPython 3.6m

File details

Details for the file fext-0.1.0.tar.gz.

File metadata

  • Download URL: fext-0.1.0.tar.gz
  • Upload date:
  • Size: 12.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.0.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.6.8

File hashes

Hashes for fext-0.1.0.tar.gz
Algorithm Hash digest
SHA256 ceeb550afd601fae4926a3a126da2a7230ce29077f0d3b98127f95d738dbbb56
MD5 d5d5ce2b2409487136b41b493a36fc55
BLAKE2b-256 5c4a8207f953fd461ca4eeaf60c641916bf8819d5639b96622377934e94f26ec

See more details on using hashes here.

File details

Details for the file fext-0.1.0-cp36-cp36m-manylinux2014_x86_64.whl.

File metadata

  • Download URL: fext-0.1.0-cp36-cp36m-manylinux2014_x86_64.whl
  • Upload date:
  • Size: 211.6 kB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.0.0 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.6.8

File hashes

Hashes for fext-0.1.0-cp36-cp36m-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 f4cc87cfca7a3f40da3d32c94c79fabbfa352906550440848e9ded2363bc95df
MD5 6f82b696ce5e236ac21a5244f09186c8
BLAKE2b-256 15a742489f6af612ca829e9c89216055dda09acc2401896ba2b831754a87fe9b

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page