application-level distributed file system written in Python
Project description
KoboldFS is an application-level distributed file system written in Python. Inspired by MogileFS[1], it shares some of its properties and features:
Application level – no special kernel modules required;
No single point of failure – all the components of a KoboldFS setup (servers and database) can be run on multiple machines, so there’s no single point of failure (a minimum of 2 machines is recommended);
Automatic file replication – files are automatically replicated across all the servers. KoboldFS has no concept of “class”, so a file cannot be restricted to a subset of the available servers;
“Better than RAID” – in a non-SAN RAID setup, the disks are redundant, but the host isn’t. If you lose the entire machine, the files are inaccessible. KoboldFS replicates the files between devices on different hosts, so files remain available even if an entire machine is lost;
Flat name space – files are identified by named keys in a flat, global name space. You can create as many name spaces as you’d like, so multiple applications with potentially conflicting keys can run on the same KoboldFS installation;
Shared-Nothing – KoboldFS doesn’t depend on a pricey SAN with shared disks. Every machine maintains its own local disks;
No RAID required – local disks on KoboldFS storage nodes can be in a RAID, or not. It’s cheaper not to, as RAID doesn’t buy you any safety that KoboldFS doesn’t already provide;
Local file system agnostic – local disks on KoboldFS storage nodes can be formatted with your file system of choice (ext3, XFS, etc.). KoboldFS does its own internal directory hashing so it doesn’t hit file system limits such as “max files per directory” or “max directories per directory”. Use what you’re comfortable with;
Completely portable – it is a pure-Python module, so it runs on any operating system and architecture supported by Python;
Database-agnostic – it can run with any SQL database; currently only PostgreSQL support is implemented, but adding support for new databases is quick and easy;
Support for serving the stored files directly by an external web server, reducing the load on the application servers.
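The "internal directory hashing" mentioned in the list above can be illustrated with a minimal sketch. This is not KoboldFS's actual on-disk layout, which is not documented here: the function name `hashed_path`, the use of SHA-1, the two-level layout and the `/srv/kobold` root are all assumptions made for illustration. The idea is only that a digest of the key picks the subdirectories, so no single directory grows without bound.

```python
import hashlib
import os


def hashed_path(root, domain, key, levels=2, width=2):
    """Map a (domain, key) pair to a nested path under root.

    Hypothetical layout for illustration; not KoboldFS's real scheme.
    """
    name = "%s/%s" % (domain, key)
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    # The leading hex characters of the digest become directory names.
    # With width=2 each level holds at most 16**2 = 256 entries.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    # The full digest is used as the file name, so the stored name never
    # depends on characters that are awkward for the local file system.
    return os.path.join(root, *(parts + [digest]))


if __name__ == "__main__":
    print(hashed_path("/srv/kobold", "demo", "motd"))
```

The same key always maps to the same path, and keys from different name spaces land in unrelated directories because the name space is part of the hashed string.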
KoboldFS is not:
POSIX Compliant – you don’t run regular Unix applications or databases against KoboldFS; it’s meant for archiving write-once files and doing only sequential reads (though you can modify a file by way of overwriting it with a new version).
Sample usage:
>>> from StringIO import StringIO
>>> from koboldfs import Client
>>> client = Client('demo', servers=['127.0.0.1:9876', '127.0.0.1:9875'])
>>> print client.ping()
True
>>> print client.put('motd', '/etc/motd')
True
>>> output = StringIO()
>>> if client.get('motd', output):
...     output.seek(0)
...     print output.read()
Linux...
>>> print client.get_url('motd')
http://...
>>> print client.delete('motd')
True
>>> client.get('motd', output)
False
>>> client.get_url('motd') is None
True
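The changelog entry for 0.3.2 describes `client.put_stat(key, source)`, which returns None on error and a dictionary with `digest` and `size` on success. A minimal sketch of how a caller might use that return value follows. The helper name `store_and_verify` is hypothetical, and the code assumes that `size` is the stored file's length in bytes as an integer; the digest algorithm is not stated in this document, so the digest is passed through without being recomputed.

```python
import os


def store_and_verify(client, key, path):
    """Upload `path` under `key`; return the reported digest, or None on failure.

    `client` is any object exposing put_stat(key, source) as described in
    the 0.3.2 changelog entry. Hypothetical helper, not part of koboldfs.
    """
    stat = client.put_stat(key, path)
    if stat is None:
        # put_stat signals an error with None instead of False.
        return None
    # Assumption: 'size' is the byte length of the stored file.
    if stat['size'] != os.path.getsize(path):
        raise ValueError("size mismatch for key %r" % (key,))
    return stat['digest']
```

With a client created as in the sample above, `store_and_verify(client, 'motd', '/etc/motd')` would return the digest reported by the server, which the application can store alongside the key.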
References:
0.3.2 (2011-01-13)
Added the method “client.put_stat(key, source)” to the client; it works like the put method, but instead of a boolean it returns None in case of error, and a dictionary {'digest': …, 'size': …} in case of success.
0.3.1 (2011-01-10)
Require SQLAlchemy >= 0.6; this fixes some corner-cases in the UTF-8 encoding of data retrieved from the database; updated the models to the new syntax of SQLAlchemy.
Added support for parsing of ini files.
0.3.0 (2010-04-01)
Removed the koboldfs.zope module and related dependencies; the same functionality can be achieved using koboldfs.client.ClientPool, without depending on any zope package.
Introduced koboldfs.client.TransactionalClientPool, which supports (optionally two-phase) transactions.
Use SQLAlchemy instead of directly depending on psycopg2 for the database connection; koboldfs is now (virtually) compatible with any database back-end which is supported by SQLAlchemy.
Added init scripts using buildout.
Added unit tests using sqlite as database back-end.
0.2.2 (2009-07-05)
Fixes in the Data Manager: use the connection pool instead of always keeping a database connection open.
0.2.1 (2009-05-28)
koboldfs.zope is an extra package now.
0.2.0 (2009-05-28)
First public release.