Simple on-disk dictionary
A dictionary that spills to disk.
Chest acts like a dictionary, but it can write its contents to disk. This is useful in two situations:
Chest can hold datasets that are larger than memory
Chest persists and so can be saved and loaded for later use
LICENSE
New BSD. See License
Install
chest is on the Python Package Index (PyPI):
pip install chest
Example
>>> from chest import Chest
>>> c = Chest()
>>> # Acts like a normal dictionary
>>> c['x'] = [1, 2, 3]
>>> c['x']
[1, 2, 3]
>>> # Data persists to local files
>>> c.flush()
>>> import os
>>> os.listdir(c.path)
['.keys', 'x']
>>> # These files hold pickled results
>>> import pickle
>>> pickle.load(open(os.path.join(c.path, 'x'), 'rb'))
[1, 2, 3]
>>> # Though one normally accesses these files with chest itself
>>> c2 = Chest(path=c.path)
>>> c2.keys()
['x']
>>> c2['x']
[1, 2, 3]
>>> # Chest is configurable, so one can use json instead of pickle
>>> import json
>>> c = Chest(path='my-chest', dump=json.dump, load=json.load)
>>> c['x'] = [1, 2, 3]
>>> c.flush()
>>> json.load(open(os.path.join('my-chest', 'x')))
[1, 2, 3]
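To make the on-disk layout above concrete, here is a minimal sketch using only the standard library. TinyChest is a hypothetical stand-in written for illustration, not chest's actual implementation. It assumes Python 3, keys that are strings safe to use as file names, and a pickled list of keys as the contents of the .keys file (the real format of that file is not described here).

```python
import os
import pickle
import tempfile


class TinyChest:
    """Hypothetical stand-in for illustration; not chest's real code."""

    def __init__(self, path=None):
        self.path = path or tempfile.mkdtemp()
        os.makedirs(self.path, exist_ok=True)
        self.inmem = {}       # values currently held in memory
        self._keys = set()    # every known key, in memory or on disk
        keys_file = os.path.join(self.path, '.keys')
        if os.path.exists(keys_file):
            # Reopening an existing directory: recover the key index.
            with open(keys_file, 'rb') as f:
                self._keys = set(pickle.load(f))

    def _filename(self, key):
        # Assumption: one file per key, named after the key itself.
        return os.path.join(self.path, key)

    def __setitem__(self, key, value):
        self.inmem[key] = value
        self._keys.add(key)

    def __getitem__(self, key):
        if key not in self._keys:
            raise KeyError(key)
        if key not in self.inmem:
            # Load the value from its file on first access.
            with open(self._filename(key), 'rb') as f:
                self.inmem[key] = pickle.load(f)
        return self.inmem[key]

    def keys(self):
        return sorted(self._keys)

    def flush(self):
        # Write each in-memory value to its own file, then the key index.
        for key, value in self.inmem.items():
            with open(self._filename(key), 'wb') as f:
                pickle.dump(value, f)
        with open(os.path.join(self.path, '.keys'), 'wb') as f:
            pickle.dump(sorted(self._keys), f)


c = TinyChest()
c['x'] = [1, 2, 3]
c.flush()
print(sorted(os.listdir(c.path)))   # ['.keys', 'x']

# A second instance pointed at the same directory sees the same data.
c2 = TinyChest(path=c.path)
print(c2.keys(), c2['x'])           # ['x'] [1, 2, 3]
```

The sketch leaves out everything that makes chest useful for large data, such as deciding when to move values out of memory. It only shows why a directory of per-key files plus a key index is enough to reload a dictionary later.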
Known Failings
Chest was designed to hold a moderate number of largish numpy arrays. It doesn’t handle the use case of very many small key-value pairs (though it could with a small effort). In particular, chest has the following deficiencies:
It determines what values to spill to disk by size. The largest values are spilled. This can be improved by better determination of size (see the nbytes function) and consideration of time-of-use (like an LRU mechanism).
Spill conditions are checked after every action. Spill conditions often involve an n log(n) sorting process. This could be improved by tracking and efficiently updating the size of all values iteratively.
Chest is not multi-process safe. We should institute a file lock at least around the .keys file.
Chest does not support mutation of variables on disk.
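The first two deficiencies can be illustrated with a small sketch. It is not chest's code: largest_first and LRUBudget are hypothetical names, sizes are passed in as plain byte counts rather than measured, and the LRU variant is just one way the time-of-use idea could look. The first function re-sorts on every check, as described above; the class keeps a running total and a use order so that each update is cheap.

```python
from collections import OrderedDict


def largest_first(sizes, budget):
    """Return keys to spill, biggest first, until the rest fits in budget."""
    total = sum(sizes.values())
    spill = []
    # Sorting on every check is the n log(n) cost mentioned above.
    for key in sorted(sizes, key=sizes.get, reverse=True):
        if total <= budget:
            break
        spill.append(key)
        total -= sizes[key]
    return spill


class LRUBudget:
    """Track sizes incrementally; evict the least recently used key first."""

    def __init__(self, budget):
        self.budget = budget
        self.sizes = OrderedDict()   # key -> size, least recently used first
        self.total = 0               # running total, updated on each write

    def touch(self, key, size=None):
        """Record a read (size=None) of a known key, or a write of size bytes."""
        if size is not None:
            self.total += size - self.sizes.get(key, 0)
            self.sizes[key] = size
        self.sizes.move_to_end(key)  # mark as most recently used

    def evict(self):
        """Pop least recently used keys until the total fits the budget."""
        spilled = []
        while self.total > self.budget and self.sizes:
            key, size = self.sizes.popitem(last=False)
            self.total -= size
            spilled.append(key)
        return spilled


sizes = {'a': 100, 'b': 4000, 'c': 2500}
print(largest_first(sizes, budget=5000))   # ['b']

lru = LRUBudget(budget=5000)
for key, size in sizes.items():
    lru.touch(key, size)
lru.touch('b')                             # 'b' was just read
print(lru.evict())                         # ['a', 'c']
```

With the same data, the size-based policy spills the large value that was just read, while the LRU policy keeps it in memory and spills the two older values instead. Neither is strictly better; a real policy would probably weigh both size and recency.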
Dependencies
Chest supports Python 2.6+ and Python 3.2+ with a common codebase. It is pure Python and requires no dependencies beyond the standard library.
It is, in short, a lightweight dependency.