
a build tool for data

Project description

make for your data.

An automation tool for data manipulation.

Inspired by Open Refine.

The general principles in Databuild are:

  • Low entry barrier

  • Easy to install

  • Easy to grasp

  • Extensible

Databuild can be useful for scenarios such as:

  • Documenting data transformations in your infoviz project

  • Automating data processing in a declarative way

Installation

Install databuild:

$ pip install databuild

Quickstart

For more details, see the Extended Documentation.

$ data-build.py buildfile.json

buildfile.json contains a list of operations to be performed on data. Think of it as a script for a spreadsheet.

An example build file:

[
  {
    "function": "sheets.import_data",
    "description": "Importing data from csv file",
    "params": {
      "sheet": "dataset1",
      "format": "csv",
      "filename": "dataset1.csv",
      "skip_last_lines": 1
    }
  },
  {
    "function": "columns.add_column",
    "description": "Calculate the gender ratio",
    "params": {
      "sheet": "dataset1",
      "name": "Gender Ratio",
      "expression": {
        "language": "python",
        "content": "return float(row['Male Total']) / float(row['Female Total'])"
      }
    }
  },
  {
    "function": "sheets.export_data",
    "description": "",
    "params": {
      "sheet": "dataset1",
      "format": "csv",
      "filename": "dataset2.csv"
    }
  }
]
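
To make the three operations concrete, here is a rough plain-Python equivalent of what the build file above describes, using only the standard library. It illustrates the intended data flow and is not databuild's implementation: the helper name run_example_build is hypothetical, and reading skip_last_lines as "drop that many trailing rows" is an assumption based on the parameter name.

```python
import csv


def run_example_build(src="dataset1.csv", dst="dataset2.csv", skip_last_lines=1):
    """Plain-Python approximation of the example build file (hypothetical helper)."""
    # sheets.import_data: read the CSV into a list of dict rows.
    # Assumption: skip_last_lines drops trailing rows (e.g. a totals line).
    with open(src, newline="") as f:
        rows = list(csv.DictReader(f))
    if skip_last_lines:
        rows = rows[:-skip_last_lines]

    # columns.add_column: evaluate the expression once per row.
    for row in rows:
        row["Gender Ratio"] = float(row["Male Total"]) / float(row["Female Total"])

    # sheets.export_data: write the sheet, including the new column.
    fieldnames = list(rows[0].keys()) if rows else []
    with open(dst, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The build file expresses the same steps as data rather than code, which is what lets it double as documentation of the transformation.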

YAML build files are also supported. databuild infers the format from the file extension.
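
As a sketch of what extension-based detection might look like: the function guess_buildfile_format and the exact set of recognized extensions are assumptions for illustration, not databuild's actual code.

```python
import os


def guess_buildfile_format(filename):
    """Hypothetical helper: map a build file's extension to a format name."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".json":
        return "json"
    # Assumption: both common YAML extensions are accepted.
    if ext in (".yaml", ".yml"):
        return "yaml"
    raise ValueError("Unrecognized build file extension: %r" % ext)
```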

License

Licensed under the BSD 3-Clause License.

Download files


Source Distribution

databuild-0.0.6.tar.gz (16.3 kB view details)

Uploaded Source

File details

Details for the file databuild-0.0.6.tar.gz.

File metadata

  • Download URL: databuild-0.0.6.tar.gz
  • Size: 16.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for databuild-0.0.6.tar.gz
Algorithm Hash digest
SHA256 b4a7c7c24a3518ec372fd2bc358e14fedea3935a760aa6d3afbeb3c24d41c563
MD5 e7deeb6bcb87c1a7945a9eb65c031e59
BLAKE2b-256 1373cb55b8ad8f8aa0679feffb2440150990e6e2ef1fef136a9ac499fe01483a

