CKAN extension that adds a "Download all" button to a dataset

ckanext-downloadall

This CKAN extension adds a “Download all” button to datasets. This downloads a zip file containing all the resource files and a datapackage.json.

demo.png

This zip file is a good way to package data for storing or sending, because:

  • you keep all the data files together

  • you include the documentation (metadata), which avoids the common problem of being handed some data files without knowing anything about them or where to find more information

  • the metadata is machine-readable, so can be used by tools, software and in automated workflows. For example:

    • validating a series of data releases all meet a standard schema

    • loading it into a database, using the column types and foreign key relations specified in the metadata

The datapackage.json file follows the Frictionless Data standard known as a Data Package.
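As a rough illustration of the machine-readable point above, the sketch below reads datapackage.json out of a zip using only the Python standard library. The zip is built in memory with made-up contents (the file name, resource name and values are hypothetical), so the layout of a real zip produced by this extension may differ.

```python
import io
import json
import zipfile


def build_sample_zip():
    """Build an in-memory zip resembling a "Download all" zip (contents are made up)."""
    datapackage = {
        "name": "gold-prices",
        "resources": [
            {"name": "annual", "path": "annual.csv", "format": "CSV"},
        ],
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("annual.csv", "year,price\n2000,100\n")
        zf.writestr("datapackage.json", json.dumps(datapackage))
    buf.seek(0)
    return buf


def list_resources(zip_file):
    """Return (name, path) for each resource described in datapackage.json."""
    with zipfile.ZipFile(zip_file) as zf:
        metadata = json.loads(zf.read("datapackage.json").decode("utf-8"))
    return [(r["name"], r["path"]) for r in metadata.get("resources", [])]


if __name__ == "__main__":
    for name, path in list_resources(build_sample_zip()):
        print(name, path)
```

To try it on a downloaded zip, pass the file path to list_resources instead of the in-memory sample.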

Technical notes

If the resource is pushed/xloaded to DataStore then the schema (column types) is also included in the datapackage.json file.
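To show how such a schema could drive a database load, here is a minimal sketch that turns Table Schema style fields into a CREATE TABLE statement for SQLite. The sample schema, the table name and the type mapping are illustrative assumptions, not output of this extension, and field names are not escaped beyond simple quoting.

```python
import sqlite3

# Mapping from a few Table Schema field types to SQLite column types.
# This mapping is an illustrative choice, not part of any standard.
TYPE_MAP = {
    "string": "TEXT",
    "integer": "INTEGER",
    "number": "REAL",
    "boolean": "INTEGER",
    "date": "TEXT",
}


def create_table_sql(table_name, schema):
    """Build a CREATE TABLE statement from a dict with a "fields" list."""
    columns = []
    for field in schema["fields"]:
        # Unknown or missing types fall back to TEXT.
        sql_type = TYPE_MAP.get(field.get("type", "string"), "TEXT")
        columns.append('"{}" {}'.format(field["name"], sql_type))
    return 'CREATE TABLE "{}" ({})'.format(table_name, ", ".join(columns))


if __name__ == "__main__":
    # Hypothetical schema, shaped like the "schema" entry of a resource.
    schema = {
        "fields": [
            {"name": "year", "type": "integer"},
            {"name": "price", "type": "number"},
        ]
    }
    sql = create_table_sql("annual", schema)
    conn = sqlite3.connect(":memory:")
    conn.execute(sql)  # raises if the generated SQL is invalid
    print(sql)
```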

This extension uses a CKAN background job to create the zip every time a dataset is created or updated (or its data dictionary is changed). This suits CKAN sites where all files are uploaded. If the underlying data file changes without its URL in CKAN changing, the zip will not include the update until something else triggers the zip to be rebuilt.

(This extension is inspired by ckanext-packagezip, but that is old and relied on ckanext-archiver and IPipe.)

Requirements

Designed to work with CKAN 2.7+

Ideally it is used in conjunction with DataStore and xloader (or datapusher), so that the Data Dictionary is included as a schema in the datapackage.json, to describe the column types.

Installation

To install ckanext-downloadall:

  1. Activate your CKAN virtual environment, for example:

    . /usr/lib/ckan/default/bin/activate
  2. Install the ckanext-downloadall Python package into your virtual environment:

    pip install ckanext-downloadall
  3. Add downloadall to the ckan.plugins setting in your CKAN config file (by default the config file is located at /etc/ckan/default/production.ini). e.g.

    ckan.plugins = downloadall
  4. Restart the CKAN worker. For example if you’ve deployed it with supervisord:

    sudo supervisorctl restart ckan-worker:ckan-worker-00
  5. Restart the CKAN server. For example if you’ve deployed CKAN with Apache on Ubuntu:

    sudo service apache2 reload

  6. Ensure the background job ‘worker’ process is running. See https://docs.ckan.org/en/2.8/maintaining/background-tasks.html#running-background-jobs
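If the button does not appear after these steps, one quick check is whether the plugin is actually listed in the config file. The sketch below parses ini text with the standard library and looks for downloadall in ckan.plugins under the [app:main] section. The sample ini text is made up, and the helper name plugin_enabled is hypothetical.

```python
import configparser


def plugin_enabled(ini_text, plugin="downloadall"):
    """Return True if `plugin` is listed in ckan.plugins under [app:main]."""
    # interpolation=None so values such as %(here)s in a real CKAN ini
    # are read as plain text instead of raising an interpolation error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    plugins = parser.get("app:main", "ckan.plugins", fallback="")
    return plugin in plugins.split()


# Made-up config text for demonstration only.
SAMPLE_INI = """
[app:main]
ckan.plugins = stats datastore downloadall
"""

if __name__ == "__main__":
    print(plugin_enabled(SAMPLE_INI))
```

For a real deployment, read the text from the config file first, for example from /etc/ckan/default/production.ini.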

Config Settings

# Include additional fields from the dataset in the datapackage.json (e.g.
# those defined in a ckanext-scheming schema)
# (optional, space separated list).
ckanext.downloadall.dataset_fields_to_add_to_datapackage = district county
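The effect of this setting can be pictured with the sketch below: the space-separated value is split into field names, and each named field found on the dataset is copied into the datapackage. This is only an assumption about the general idea. The function name and the sample dicts are hypothetical, and the extension's actual implementation may differ.

```python
def add_dataset_fields(datapackage, dataset, setting_value):
    """Return a copy of `datapackage` with the configured dataset fields added.

    `setting_value` is the raw, space-separated config value,
    e.g. "district county".
    """
    result = dict(datapackage)
    for field in setting_value.split():
        # Fields missing from the dataset are skipped silently.
        if field in dataset:
            result[field] = dataset[field]
    return result


if __name__ == "__main__":
    dataset = {"name": "gold-prices", "district": "North", "notes": "unused"}
    print(add_dataset_fields({"name": "gold-prices"}, dataset, "district county"))
```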

Command-line interface

There is a command-line interface:

downloadall --help

Examples of use:

downloadall update-zip gold-prices
downloadall update-all-zips

Troubleshooting

“All resource data” appears as a normal resource, instead of seeing a “Download All” button

You need to enable this extension in the CKAN config and restart the server. See the Installation section above.

ImportError: No module named datapackage

This means you have an older version of ckanapi, which is a dependency of ckanext-downloadall. Install a newer version.

OSError: [Errno 13] Permission denied: '/data/ckan/resources/c89'

You are trying to update zips from the command-line but running the tasks synchronously, rather than with the normal worker process. In this case you need to run it as the www-data user e.g.:

sudo -u www-data /usr/lib/ckan/default/bin/downloadall -c /etc/ckan/default/production.ini update-all-zips --synchronous
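Before running synchronously, it can help to confirm that the current user is able to write to the resource storage directory. The following is a minimal sketch using os.access. The path is taken from the error message above and will differ between installations, and the helper name can_write is hypothetical.

```python
import os


def can_write(path):
    """Return True if `path` is a directory the current user can create files in."""
    # Creating files in a directory needs both write and execute permission.
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


if __name__ == "__main__":
    storage_dir = "/data/ckan/resources"  # adjust to your storage path
    if can_write(storage_dir):
        print("OK: current user can write to", storage_dir)
    else:
        print("Not writable by this user; try running as www-data")
```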

Development Installation

To install ckanext-downloadall for development, activate your CKAN virtualenv and do:

git clone https://github.com/davidread/ckanext-downloadall.git
cd ckanext-downloadall
python setup.py develop
pip install -r dev-requirements.txt

Remember to run the worker (in a separate terminal):

paster --plugin=ckan jobs worker --config=/etc/ckan/default/development.ini

Running the Tests

To run the tests, do:

nosetests --nologcapture --with-pylons=test.ini

To run the tests and produce a coverage report, first make sure you have coverage installed in your virtualenv (pip install coverage) then run:

nosetests --nologcapture --with-pylons=test.ini --with-coverage --cover-package=ckanext.downloadall --cover-inclusive --cover-erase --cover-tests

Releasing a New Version of ckanext-downloadall

ckanext-downloadall is available on PyPI at https://pypi-hypernode.com/project/ckanext-downloadall/. To publish a new version to PyPI, follow these steps:

  1. Update the version number in the setup.py file. See PEP 440 for how to choose version numbers.

  2. Update the CHANGELOG.md with details of this release.

  3. Make sure you have the latest version of necessary packages:

    pip install --upgrade setuptools wheel twine
  4. Create source and binary distributions of the new version:

    python setup.py sdist bdist_wheel && twine check dist/*

    Fix any errors you get.

  5. Upload the distributions to PyPI:

    twine upload dist/*
  6. Commit any outstanding changes:

    git commit -a
    git push
  7. Tag the new release of the project on GitHub with the version number from the setup.py file. For example if the version number in setup.py is 0.0.1 then do:

    git tag 0.0.1
    git push --tags
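Since the tag in step 7 must match the version in setup.py, a small check like the sketch below can catch a mismatch before tagging. It assumes setup.py contains a literal version='...' argument, and the helper names are hypothetical.

```python
import re


def version_from_setup(setup_py_text):
    """Extract the version string from setup.py source text, or None if absent."""
    match = re.search(r"version\s*=\s*['\"]([^'\"]+)['\"]", setup_py_text)
    return match.group(1) if match else None


def tag_matches(setup_py_text, tag):
    """Return True if `tag` equals the version declared in setup.py."""
    return version_from_setup(setup_py_text) == tag


if __name__ == "__main__":
    # Made-up setup.py content for demonstration.
    sample = "setup(\n    name='ckanext-downloadall',\n    version='0.0.1',\n)\n"
    print(version_from_setup(sample), tag_matches(sample, "0.0.1"))
```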
