# ckanext-datapackage_pipelines
[![CKAN pipelines server Docker image: orihoch/datapackage-pipelines-ckan](https://img.shields.io/badge/CKAN%20pipelines%20server%20Docker%20image-orihoch/datapackage--pipelines--ckanext-darkgreen.svg)](https://hub.docker.com/r/orihoch/datapackage-pipelines-ckanext/)
Integrate [datapackage-pipelines](https://github.com/frictionlessdata/datapackage-pipelines) with CKAN
Minimal supported CKAN version: 2.8.1
## Installation
### Install the plugin
* Create a directory to hold the pipelines; CKAN pipeline extensions write to this directory:
  * `sudo mkdir -p /var/ckan/pipelines`
  * `sudo chown -R $USER:$GROUP /var/ckan`
  * This directory should be shared between the pipelines server and CKAN.
* Activate your CKAN virtual environment.
* Install the ckanext-datapackage_pipelines package into your virtual environment:
  * `pip install ckanext-datapackage_pipelines`
* Add `datapackage_pipelines` to the `ckan.plugins` setting in your CKAN configuration file.
* Restart CKAN.
### Start the datapackage-pipelines server
The following command starts a local pipelines server for development on the host network.
`CKAN_API_KEY` should be the API key of a CKAN user with sysadmin privileges.
If the CKAN Redis server runs on the same host, change the CKAN Redis port to avoid a collision with the pipelines server's Redis, which runs on port 6379.
The pipelines server itself runs on port 5050.
```
docker run -v /var/ckan/pipelines:/pipelines:rw \
-e CKAN_API_KEY=*** \
-e CKAN_URL=http://localhost:5000 \
--net=host \
orihoch/datapackage-pipelines-ckanext server
```
## Usage
The pipelines dashboard is publicly available at http://your-ckan-url/pipelines
CKAN plugins can use the pipelines server by implementing the `IDatapackagePipelines` interface, which contains the following methods:
* `register_pipelines` - returns the pipelines name (usually the name of the plugin) and the directory to get the plugin's pipelines from. When CKAN is restarted, the pipelines are copied, by default, to `/var/ckan/pipelines`; this directory should be shared between CKAN and the pipelines server. If the plugin's pipelines directory contains a `requirements.txt`, it is installed when the pipelines server restarts.
* `get_pipelines_config` - returns a dict of key-value pairs containing the plugin's configuration or other data that should be available to the pipeline processors. Pipeline processors can get this configuration using `datapackage_pipelines_ckanext.helpers.get_plugin_configuration(plugin_name)`.
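
As a rough illustration of the two methods described above, the following sketch shows a plain Python class with the same method names. It deliberately avoids importing CKAN so that it stays self-contained. In a real extension the class would presumably subclass `ckan.plugins.SingletonPlugin` and declare the interface with `ckan.plugins.implements(...)`. The plugin name, the directory path, the configuration key and the tuple return shape of `register_pipelines` are all assumptions made for illustration, not confirmed details of the interface.

```python
class ExamplePipelinesPlugin:
    """Hypothetical plugin sketch with the two IDatapackagePipelines methods.

    A real CKAN plugin would subclass ckan.plugins.SingletonPlugin and
    declare the interface; that wiring is omitted to keep this runnable
    without CKAN installed.
    """

    # Hypothetical plugin name, reused as the pipelines name.
    name = "example_plugin"

    # Hypothetical directory holding this plugin's pipeline specs.
    pipelines_dir = "/usr/lib/ckan/example_plugin/pipelines"

    def register_pipelines(self):
        # Assumed return shape: (pipelines name, pipelines directory).
        return self.name, self.pipelines_dir

    def get_pipelines_config(self):
        # Key-value pairs that should be available to the pipeline processors.
        # The key below is a made-up example.
        return {"source_url": "https://example.com/data.csv"}


plugin = ExamplePipelinesPlugin()
pipelines_name, directory = plugin.register_pipelines()
print(pipelines_name, directory)
```

Using the plugin name as the pipelines name follows the convention mentioned above and keeps the name passed to `get_plugin_configuration` predictable.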
The following pipeline processors are available:
* `ckanext.dump_to_path` - same as the standard library `dump.to_path`, but dumps to the CKAN data directory.
  * parameters:
    * `plugin`: **required** name of the plugin
    * `out-path`: relative path within the plugin's data directory
* `ckanext.load_resource` - same as the standard library `load_resource`, but loads from the CKAN data directory.
  * parameters:
    * `path`: **required** relative path to the datapackage in the plugin's data directory
    * `plugin`: **required** name of the plugin
To support pipeline dependencies, rename your `pipeline-spec.yaml` to `ckanext.source-spec.yaml`.
In the following example pipeline spec, the `download_data` pipeline runs on a schedule, and after each scheduled run the `load_data_to_ckan` pipeline runs:
```
download_data:
  schedule:
    crontab: "1 2 * * *"
  pipeline:
    - ...
load_data_to_ckan:
  dependencies:
    - ckanext-pipeline: your_plugin_name download_data
```
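
To make the dependency relationship concrete, the sketch below mirrors the spec above as a plain Python dict and looks up which pipelines declare a `ckanext-pipeline` dependency on a given pipeline. It only illustrates how the `plugin_name pipeline_name` string ties the two pipelines together. It is not how the pipelines server resolves dependencies internally, and the `dependents_of` helper is hypothetical.

```python
# The example spec as a plain dict; a real spec would be loaded from
# ckanext.source-spec.yaml. The elided pipeline steps are left empty.
SPEC = {
    "download_data": {
        "schedule": {"crontab": "1 2 * * *"},
        "pipeline": [],
    },
    "load_data_to_ckan": {
        "dependencies": [
            {"ckanext-pipeline": "your_plugin_name download_data"},
        ],
    },
}


def dependents_of(spec, plugin_name, pipeline_name):
    """Return ids of pipelines that depend on the given plugin pipeline."""
    target = f"{plugin_name} {pipeline_name}"
    return [
        pipeline_id
        for pipeline_id, pipeline in spec.items()
        if any(
            dep.get("ckanext-pipeline") == target
            for dep in pipeline.get("dependencies", [])
        )
    ]


print(dependents_of(SPEC, "your_plugin_name", "download_data"))
# prints ['load_data_to_ckan']
```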
## CKAN Plugin Configuration
The supported configuration options and their default values are:
```
ckanext.datapackage_pipelines.directory = /var/ckan/pipelines
ckanext.datapackage_pipelines.dashboard_url = http://localhost:5050
```
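
The defaults above can be thought of as fallbacks applied when a key is missing from the CKAN configuration. The following sketch shows that lookup against a plain dict standing in for the CKAN config object. The `pipelines_settings` helper is hypothetical and is not part of the extension.

```python
# Default values as listed in the configuration section above.
DEFAULTS = {
    "ckanext.datapackage_pipelines.directory": "/var/ckan/pipelines",
    "ckanext.datapackage_pipelines.dashboard_url": "http://localhost:5050",
}


def pipelines_settings(config):
    """Return the extension settings, using the default for any missing key."""
    return {key: config.get(key, default) for key, default in DEFAULTS.items()}


# Example: only the pipelines directory is overridden.
settings = pipelines_settings(
    {"ckanext.datapackage_pipelines.directory": "/srv/pipelines"}
)
print(settings["ckanext.datapackage_pipelines.directory"])
print(settings["ckanext.datapackage_pipelines.dashboard_url"])
```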