Saves and loads to the cache a transformed version of a source object.

flexcache

A robust and extensible package to cache on disk the results of expensive calculations.

Consider an expensive function parse that takes a path and returns a parsed version:

>>> content = parse("source.txt")

It would be nice to cache this result automatically and persistently, and this is where flexcache comes in.

First, we create a DiskCache object:

>>> from flexcache import DiskCacheByMTime
>>> dc = DiskCacheByMTime(cache_folder="/my/cache/folder")

and then load the content through it:

>>> content, basename = dc.load("source.txt", converter=parse)

If this is the first call, the cached result is not yet available, so parse will be called on source.txt and its output will be saved and returned. On later calls, the cached result will be loaded and returned.

When the source is changed, the DiskCache detects that the cached file is older and calls parse again, storing and returning the new result.
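The behavior described above can be sketched with the standard library alone. This is a minimal illustration of modification-time invalidation, not flexcache's actual implementation: the function name load_by_mtime, the SHA-1 naming and the .pickle suffix are all assumptions made for the example.

```python
import hashlib
import pickle
from pathlib import Path


def load_by_mtime(source, converter, cache_folder):
    """Return (content, basename), converting again only when the cache is stale."""
    source = Path(source).resolve()
    cache_folder = Path(cache_folder)
    cache_folder.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: hash of the resolved source path.
    basename = hashlib.sha1(str(source).encode("utf-8")).hexdigest()
    cached = cache_folder / (basename + ".pickle")

    # Cache hit: the cached file exists and is newer than the source.
    if cached.exists() and cached.stat().st_mtime_ns > source.stat().st_mtime_ns:
        with cached.open("rb") as fh:
            return pickle.load(fh), basename

    # Cache miss or stale entry: convert again and store the new result.
    content = converter(source)
    with cached.open("wb") as fh:
        pickle.dump(content, fh)
    return content, basename
```

Note that the comparison is strict, so a cached file with exactly the same timestamp as the source is treated as stale in this sketch.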

In certain cases you would rather detect that the file has changed by hashing the file. Simply use DiskCacheByHash instead of DiskCacheByMTime.
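Hashing ties the cache to the bytes of the file rather than to its timestamp, so copying or touching a file without changing it does not invalidate the cache. A minimal sketch of such a content digest follows; the helper name content_digest and the choice of SHA-1 are assumptions for illustration and may differ from what DiskCacheByHash does internally.

```python
import hashlib
from pathlib import Path


def content_digest(path, chunk_size=65536):
    """Return a hex digest of the file's bytes, reading the file in chunks."""
    hasher = hashlib.sha1()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks so large files are never loaded at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Two files with identical content produce the same digest, whatever their names or modification times.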

Cached files are saved using the pickle protocol, and each has a companion json file with the header content.
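As a rough picture of that on-disk layout, the sketch below writes a pickle file and a JSON file that share one basename. The function name save_with_header, the file suffixes and the header fields are hypothetical; the files flexcache actually writes may be named and structured differently.

```python
import json
import pickle
from pathlib import Path


def save_with_header(cache_folder, basename, content, header):
    """Write content as a pickle and header (a dict) as a JSON file next to it."""
    cache_folder = Path(cache_folder)
    cache_folder.mkdir(parents=True, exist_ok=True)
    # The converted object goes into the pickle file.
    with (cache_folder / f"{basename}.pickle").open("wb") as fh:
        pickle.dump(content, fh)
    # The human-readable header goes into the companion JSON file.
    with (cache_folder / f"{basename}.json").open("w", encoding="utf-8") as fh:
        json.dump(header, fh, indent=2)
```

Keeping the header in JSON means it can be inspected without unpickling the cached object.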

This idea is completely flexible and applies not only to parsers. In flexcache we say there are two types of objects: the source object and the converted object. The conversion function maps the former into the latter. The cache stores the latter by looking at a customizable aspect of the former.

Building your own caching logic

In certain cases you would like to customize how caching and invalidation are done.

You can achieve this by subclassing DiskCache.

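>>> import pathlib
>>> from dataclasses import dataclass
>>> from flexcache import DiskCache, NameByPath, InvalidateByExist, BasicPythonHeader
>>> class MyDiskCache(DiskCache):
...
...     # Mix one Naming, one Invalidate and one Header class.
...     @dataclass(frozen=True)
...     class MyHeader(NameByPath, InvalidateByExist, BasicPythonHeader):
...         pass
...
...     # Use MyHeader whenever the source object is a pathlib.Path.
...     _header_classes = {pathlib.Path: MyHeader}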

Here we create a custom Header class and use it to handle pathlib.Path objects. You can even have multiple headers registered in the same class to handle different source object types.

We provide a convenient set of mixable classes to achieve almost any behavior. These are divided into three categories, and you must choose at least one from each.

Headers

These classes store the information that will be saved alongside the cached file.

  • BaseHeader: source object and identifier of the converter function.

  • BasicPythonHeader: source object and identifier of the converter function, platform, Python implementation, Python version.

Invalidate

These classes define how the cache decides whether the cached converted object is still an up-to-date representation of the source object.

  • InvalidateByExist: the cached file must exist.

  • InvalidateByPathMTime: the cached file exists and is newer than the source object (which has to be pathlib.Path).

  • InvalidateByMultiPathsMtime: the cached file exists and is newer than each path in the source object (which has to be tuple[pathlib.Path]).
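The multi-path rule in the list above can be sketched as a single check over modification times. This is an illustration written with the standard library only; the function name is_cache_valid is hypothetical and the code is not taken from flexcache.

```python
from pathlib import Path


def is_cache_valid(cached, sources):
    """Return True if cached exists and is newer than every path in sources."""
    cached = Path(cached)
    if not cached.exists():
        return False
    cached_mtime = cached.stat().st_mtime_ns
    # One source newer than the cached file is enough to invalidate it.
    return all(cached_mtime > Path(src).stat().st_mtime_ns for src in sources)
```

With a single-element tuple this reduces to the single-path rule, and with an empty tuple it reduces to the existence check.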

Naming

These classes define how the name is generated. The basename of the cache file is a hash hexdigest built by feeding the hash function a collection of values determined by the Header object.

  • NameByFields: all fields except the source_object.

  • NameByPath: resolved path of the source object (which has to be pathlib.Path).

  • NameByMultiPaths: resolved path of each path in the source object (which has to be tuple[pathlib.Path]), sorted in ascending order.

  • NameByFileContent: the bytes content of the file referred to by the source object (which has to be pathlib.Path).

  • NameByHashIter: the values in the source object (which has to be tuple[str]), sorted in ascending order.

  • NameByObj: the pickled version of the source object (which has to be picklable), using the highest available protocol. This also adds pickle_protocol to the header.
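To make the naming idea concrete, here is a minimal sketch of building a basename by feeding a collection of values into one hash. The function names, the use of SHA-1 and the UTF-8 text encoding of each value are assumptions for illustration, not flexcache's actual code.

```python
import hashlib


def basename_from_values(values):
    """Feed each value, encoded as UTF-8 text, into one hash and return its hexdigest."""
    hasher = hashlib.sha1()
    for value in values:
        hasher.update(str(value).encode("utf-8"))
    return hasher.hexdigest()


def name_by_hash_iter(strings):
    """Sort first so the order of the source tuple does not affect the name."""
    return basename_from_values(sorted(strings))
```

Sorting before hashing is what makes ("b", "a") and ("a", "b") map to the same cache file in this sketch.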

You can mix and match as you see fit, and of course, you can make your own.

Finally, you can also avoid saving the header by setting the _store_header class attribute to False.


This project was started as a part of Pint, the Python units package.

See AUTHORS for a list of the maintainers.

To review an ordered list of notable changes for each version of the project, see CHANGES.
