Easily dump Python objects to files, and then load them back.
Vlermv makes it easy to save Python objects to files with meaningful identifiers. The package Vlermv provides two interfaces.
- Vlermv
vlermv.Vlermv is a dictionary interface.
- Vlermv Cache
vlermv.cache is a decorator that you can use for caching the output of a function.
Using Vlermv
Vlermv provides a dictionary-like object that is associated with a particular directory on your computer.
    from vlermv import Vlermv
    vlermv = Vlermv('/tmp/a-directory')
The keys correspond to files, and the values get pickled to the files.
    vlermv['filename'] = range(100)

    import pickle
    range(100) == pickle.load(open('/tmp/a-directory/filename', 'rb'))
You can also read and delete things.
    # Read
    range(100) == vlermv['filename']

    # Delete
    del vlermv['filename']
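To make the idea concrete, here is a minimal sketch of a dictionary-like object that pickles each value to its own file. It is not Vlermv's implementation: the class name `PickleDir` is hypothetical, it handles only flat string keys, and it uses only the standard library.

```python
import os
import pickle
import tempfile
from collections.abc import MutableMapping


class PickleDir(MutableMapping):
    """Hypothetical stand-in: one pickle file per key, flat string keys only."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.directory, key)

    def __setitem__(self, key, value):
        with open(self._path(key), 'wb') as fp:
            pickle.dump(value, fp)

    def __getitem__(self, key):
        try:
            with open(self._path(key), 'rb') as fp:
                return pickle.load(fp)
        except FileNotFoundError:
            raise KeyError(key)

    def __delitem__(self, key):
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            raise KeyError(key)

    def __iter__(self):
        return iter(sorted(os.listdir(self.directory)))

    def __len__(self):
        return len(os.listdir(self.directory))


store = PickleDir(tempfile.mkdtemp())
store['filename'] = range(100)
print(store['filename'] == range(100))  # True
del store['filename']
print('filename' in store)  # False
```

Because the sketch subclasses `MutableMapping`, it inherits `keys`, `values`, `items` and `update` for free.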
The coolest part is that the key gets interpreted in a fancy way. Aside from strings and string-like objects, you can use iterables of strings; all of these indices refer to the file /tmp/a-directory/foo/bar/baz:
    vlermv[('foo','bar','baz')]
    vlermv[['foo','bar','baz']]
If you pass a relative path to a file, it will be broken up as you’d expect; that is, strings get split on slashes and backslashes.
    vlermv['foo/bar/baz']
    vlermv['foo\\bar\\baz']
Note well: Specifying an absolute path won’t save things outside the vlermv directory.
    vlermv['/foo/bar/baz']       # -> foo, bar, baz
    vlermv['C:\\foo\\bar\\baz']  # -> c, foo, bar, baz (lowercase "c")
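The splitting rules for path-like keys can be approximated as follows. This is only a sketch of the behaviour described here, not Vlermv's own code; the function name `split_path_key` is made up for illustration, and treating a leading drive letter as a lowercase directory name is inferred from the `C:` example.

```python
import re


def split_path_key(key):
    """Split a string key on slashes and backslashes (illustrative only)."""
    # Empty components (from a leading slash or doubled separators) are dropped,
    # so an absolute path cannot escape the storage directory.
    parts = [part for part in re.split(r'[/\\]+', key) if part]
    # Assumption: a leading Windows drive such as "C:" becomes the directory "c".
    if parts and re.fullmatch(r'[A-Za-z]:', parts[0]):
        parts[0] = parts[0][0].lower()
    return parts


print(split_path_key('foo/bar/baz'))          # ['foo', 'bar', 'baz']
print(split_path_key('/foo/bar/baz'))         # ['foo', 'bar', 'baz']
print(split_path_key('C:\\foo\\bar\\baz'))    # ['c', 'foo', 'bar', 'baz']
```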
If you pass a URL, it will also get broken up in a reasonable way.
    # /tmp/a-directory/http/thomaslevine.com/!/?foo=bar#baz
    vlermv['http://thomaslevine.com/!/?foo=bar#baz']

    # /tmp/a-directory/thomaslevine.com/!?foo=bar#baz
    vlermv['thomaslevine.com/!?foo=bar#baz']
Dates and datetimes get converted to YYYY-MM-DD format.
    import datetime

    # /tmp/a-directory/2014-02-26
    vlermv[datetime.date(2014,2,26)]
    vlermv[datetime.datetime(2014,2,26,13,6,42)]
And you can mix these formats!
    # /tmp/a-directory/http/thomaslevine.com/open-data/2014-02-26
    vlermv[('http://thomaslevine.com/open-data', datetime.date(2014,2,26))]
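One way to picture the date handling and the mixing of key types is the sketch below. It is an assumption about how such normalisation could work, not Vlermv's actual code; `part_to_str` and `flatten_key` are hypothetical names, and URL and path splitting is deliberately left out.

```python
import datetime


def part_to_str(part):
    """Render one key component as text; dates and datetimes become YYYY-MM-DD."""
    if isinstance(part, datetime.datetime):
        return part.date().isoformat()  # the time of day is discarded
    if isinstance(part, datetime.date):
        return part.isoformat()
    return str(part)


def flatten_key(key):
    """Accept a single component or an iterable of components."""
    # datetime.datetime is a subclass of datetime.date, so one check covers both.
    if isinstance(key, (str, datetime.date)):
        key = (key,)
    return [part_to_str(part) for part in key]


print(flatten_key(datetime.datetime(2014, 2, 26, 13, 6, 42)))  # ['2014-02-26']
print(flatten_key(('open-data', datetime.date(2014, 2, 26))))  # ['open-data', '2014-02-26']
```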
It also has typical dictionary methods like keys, values, items, and update.
Using Vlermv Cache
A function receives input, does something, and then returns output. If you decorate a function with Vlermv Cache, it caches the output; if you call the function again with the same input, it loads the output from the cache instead of doing what it would normally do.
The simplest usage is to decorate the function with @vlermv.cache(). For example,
    import vlermv

    @vlermv.cache()
    def is_prime(number):
        for n in range(2, number):
            if number % n == 0:
                return False
        return True
Now you can call is_prime as if it’s a normal function, and if you call it twice, the second call will load from the cache.
Some fancier uses are discussed below.
Non-default directory
If you pass no arguments to cache, as in the example above, the cache will be stored in a directory named after the function. To set a different directory, pass it as an argument.
    @vlermv.cache('~/.primes')
    def is_prime(number):
        for n in range(2, number):
            if number % n == 0:
                return False
        return True
I recommend storing your caches in dotted directories under your home directory, as you see above.
Configuration
The kwargs get passed to vlermv.Vlermv, so you can do fun things like changing the serialization function.
    import requests
    import vlermv

    # Store the response text as-is instead of pickling it.
    @vlermv.cache('~/.http', serializer = vlermv.serializers.identity)
    def get(url):
        return requests.get(url).text
Read more about the keyword arguments in the Vlermv section above.
Non-identifying arguments
If you want to pass an argument but not use it as an identifier, pass it as a keyword argument; keyword arguments get passed along to the function but don’t form the identifier. For example,
    @vlermv.cache('~/.http')
    def get(url, auth = None):
        return requests.get(url, auth = auth)

    # Only the URL identifies the cache entry; auth is passed through.
    get('http://this.website.com', auth = ('username', 'password'))
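The consequence of keying only on positional arguments is easy to see with a small in-memory stand-in. This is a sketch with a plain dictionary rather than Vlermv, and `cache_on_positional` is a hypothetical name; the `get` function here fakes a request so that the example runs without a network.

```python
import functools


def cache_on_positional(func):
    """Cache results keyed on positional arguments only (illustrative stand-in)."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if args not in store:
            # Keyword arguments reach the function but are not part of the key.
            store[args] = func(*args, **kwargs)
        return store[args]

    wrapper.store = store
    return wrapper


@cache_on_positional
def get(url, auth=None):
    user = auth[0] if auth else 'anonymous'
    return 'fetched %s as %s' % (url, user)


first = get('http://this.website.com', auth=('alice', 'secret'))
second = get('http://this.website.com', auth=('bob', 'hunter2'))
print(first)            # fetched http://this.website.com as alice
print(first == second)  # True: the second call hit the cache
```

Note the trade-off: a second call with different credentials returns the first cached result, so only use non-identifying arguments for things that should not change the output.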
Refreshing the cache
I find that I sometimes want to refresh the cache for one particular file only. This is usually because an error occurred and I have since fixed it. You can delete the cached entry like this.
    @vlermv.cache()
    def is_prime(number):
        for n in range(2, number):
            if number % n == 0:
                return False
        return True

    is_prime(100)
    del is_prime[100]
Vlermv Cache has all of Vlermv’s features
The above method for refreshing the cache works because is_prime isn’t really a function; it’s actually a VlermvCache object, which is a sub-class of Vlermv. Thus, you can use it in all of the ways that you can use Vlermv.
    @vlermv.cache()
    def f(x, y):
        return x + y

    print(f(3,4))
    # 7

    print(list(f.keys()))
    # ['3/4']
You can even set the value to be something weird.
    f[('a', 8)] = None, {'key':'value'}
    print(f('a', 8))
    # {'key': 'value'}
Each value in f is a tuple of the error and the actual value. Exactly one of these is always None. If the error is None, the value is returned, and if the value is None, the error is raised.
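The (error, value) convention can be illustrated with a small on-disk cache decorator. This is a sketch under stated assumptions, not Vlermv Cache itself: `disk_cache` is a hypothetical name, the file name is built by joining the arguments with a dash, and only the standard library is used.

```python
import functools
import os
import pickle
import tempfile


def disk_cache(directory):
    """Cache (error, value) tuples as pickles, one file per argument tuple."""
    os.makedirs(directory, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            path = os.path.join(directory, '-'.join(str(arg) for arg in args))
            if os.path.exists(path):
                with open(path, 'rb') as fp:
                    error, value = pickle.load(fp)
            else:
                # Exactly one element of the tuple is None.
                try:
                    error, value = None, func(*args)
                except Exception as exc:
                    error, value = exc, None
                with open(path, 'wb') as fp:
                    pickle.dump((error, value), fp)
            if error is not None:
                raise error
            return value
        return wrapper
    return decorator


calls = []  # records how often the real function runs


@disk_cache(tempfile.mkdtemp())
def reciprocal(x):
    calls.append(x)
    return 1 / x


print(reciprocal(4))  # 0.25
print(reciprocal(4))  # 0.25
print(calls)          # [4]
```

Caching the error as well as the value means a failing call is not retried until its entry is deleted, which is why refreshing a single entry is useful.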
Better than Mongo
Vlermv is nearly better than Mongo, so you should use it anywhere you were previously using Mongo. Vlermv is designed for write-heavy workloads that need scalability (easy sharding), flexible schemas, and highly configurable indexing.
Things that are missing for a full Mongo replacement
- Protection against inode exhaustion
- Ability to treat a directory as a document and thus to make atomic edits within a document
- Transactions (Mongo doesn’t have them, but they would be cool.)
- Indices maybe? In case you want an index on something other than the filename
ACID properties
- Atomicity
Writes are made to a temporary file that gets renamed.
- Consistency
No validation is supported, so the database is always consistent by definition.
- Isolation
Vlermv has isolation within files/documents/values but not across. You may implement your own multi-file transactions.
- Durability
All data are saved to disk right away.
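The atomicity claim rests on the common write-to-a-temporary-file-then-rename pattern. Below is a minimal sketch of that pattern with the standard library; `atomic_dump` is a hypothetical helper, not a Vlermv function, and the explicit `fsync` is my addition rather than something the text above confirms.

```python
import os
import pickle
import tempfile


def atomic_dump(obj, path):
    """Pickle obj to path so that readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temporary file lives in the target directory so that the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as fp:
            pickle.dump(obj, fp)
            fp.flush()
            os.fsync(fp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # replaces any existing file in one step
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


target = os.path.join(tempfile.mkdtemp(), 'filename')
atomic_dump(list(range(5)), target)
with open(target, 'rb') as fp:
    print(pickle.load(fp))  # [0, 1, 2, 3, 4]
```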
History
I wrote pickle warehouse so I could easily store pickles in files with meaningful names. Then I extended it to support more than just pickles. Then I wrote picklecache, mainly for caching responses to HTTP requests.
And then I finally changed the names because these two packages don’t really have much to do with pickles. pickle_warehouse.Warehouse became vlermv.Vlermv, and picklecache.cache became vlermv.cache. I chose the name “vlermv” by banging on the keyboard; this is how I have been naming things now that I have discovered Dada.
During the transition from pickle warehouse and pickle cache to vlermv, I also added a particular feature that I wanted: the ability to refresh the cache. (And I implemented this by subclassing vlermv.Vlermv, as discussed in the documentation above.)