Skip to main content

Export data from a SQL database to a set of file formats.

Project description

Build Status

datafreeze creates static extracts of SQL databases for use in interactive web applications. SQL databases are a great way to manage relational data, but exposing them on the web to drive data apps can be cumbersome. Often, the capacities of a proper database are not actually required, a few static JSON files and a bit of JavaScript can have the same effect. Still, exporting JSON by hand (or with a custom script) can also become a messy process.

With datafreeze, exports are scripted in a Makefile-like description, making them simple to repeat and replicate.

Installation

The easiest way to install datafreeze is to retrieve it from the Python package index using pip:

pip install datafreeze

Basic Usage

Calling DataFreeze is simple, the application is called with a freeze file as its argument:

datafreeze Freezefile.yaml

Freeze files can be either written in JSON or in YAML. The database URI indicated in the Freezefile can also be overridden via the command line:

datafreeze --db sqlite:///foo.db Freezefile.yaml

Example Freezefile.yaml

A freeze file is composed of a set of scripted queries and specifications on how their output is to be handled. An example could look like this:

common:

  database: "postgresql://user:password@localhost/operational_database"
  prefix: my_project/dumps/
  format: json

exports:

  - query: "SELECT id, title, date FROM events"
    filename: "index.json"

  - query: "SELECT id, title, date, country FROM events"
    filename: "countries/{{country}}.csv"
    format: csv

  - query: "SELECT * FROM events"
    filename: "events/{{id}}.json"
    mode: item

  - query: "SELECT * FROM events"
    filename: "all.json"
    format: tabson

An identical JSON configuration can be found in this repository.

Options in detail

The freeze file has two main sections, common and exports. Both accept many of the same arguments, with exports specifying a list of exports while common defines some shared properties, such as the database connection string.

The following options are recognized:

  • database is a database URI, including the database type, username and password, hostname and database name. Valid database types include sqlite, mysql and postgresql (requires psycopg2).

  • prefix specifies a common root directory for all extracted files.

  • format identifies the format to be generated, csv, json and tabson are supported. tabson is a condensed JSON representation in which rows are not represented by objects but by lists of values.

  • query needs to be a valid SQL statement. All selected fields will become keys or columns in the output, so it may make sense to define proper aliases if any overlap is to be expected.

  • mode specifies whether the query output is to be combined into a single file (list) or whether a file should be generated for each result row (item).

  • filename is the output file name, appended to prefix. All occurences of {{field}} are expanded to a fields value to allow the generation of file names e.g. by primary key. In list mode, templating can be used to group records into several buckets, e.g. by country or category.

  • wrap can be used to specify whether the output should be wrapped in a results hash in JSON output. This defaults to true for list-mode output and false for item-mode.

Contributors

dataset is written and maintained by Friedrich Lindenberg, Gregor Aisch and Stefan Wehrmeyer. We’re standing on the shoulders of giants.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datafreeze-0.1.0.tar.gz (10.8 kB view details)

Uploaded Source

Built Distribution

datafreeze-0.1.0-py2.py3-none-any.whl (13.8 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file datafreeze-0.1.0.tar.gz.

File metadata

  • Download URL: datafreeze-0.1.0.tar.gz
  • Upload date:
  • Size: 10.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for datafreeze-0.1.0.tar.gz
Algorithm Hash digest
SHA256 e9611df781b34d7183d06755979014230216ce9c274360dd186a9a20fcca317f
MD5 ccbde0ffaeffa3d9a802b7704aa38e21
BLAKE2b-256 82c3e90b668e08a185712f06467943284a4fc817eb689a262e16c383bef70a08

See more details on using hashes here.

File details

Details for the file datafreeze-0.1.0-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for datafreeze-0.1.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 0a7ff7e5e92c01d7b74cfff294618b799dd6e5698ad585be60877fc807bc5d26
MD5 526c0f63f6250db6d3a22ccd878ed4cc
BLAKE2b-256 e0419e86652a1293a0c4664e5aa34a8f5b1a2fe5a04f7cae287ae79385f65ec0

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page