Python SDK and CLI for the Renku platform.
A Python library for the Renku collaborative data science platform. It allows the user to create projects, manage datasets, and capture data provenance while performing analysis tasks.
NOTE:
renku-python is the Python library for Renku that provides an SDK and a command-line interface (CLI). It does not start the Renku platform itself; for that, refer to the Renku docs on running the platform.
Installation
The latest release is available on PyPI and can be installed using pip:
$ pip install renku
The latest development versions are available on PyPI or from the Git repository:
$ pip install --pre renku
# - OR -
$ pip install -e git+https://github.com/SwissDataScienceCenter/renku-python.git#egg=renku
Use the following installation steps, depending on your operating system and preferences, if you only want to work with the command-line interface and do not need the Python library to be importable.
Homebrew
The recommended way of installing Renku on macOS and Linux is via Homebrew.
$ brew tap swissdatasciencecenter/renku
$ brew install renku
Isolated environments using pipx
Install and execute Renku in an isolated environment using pipx. This ensures that Renku's dependencies do not conflict with the package versions you use for your work and research.
Install pipx and make sure that the $PATH is correctly configured.
$ python3 -m pip install --user pipx
$ pipx ensurepath
Once pipx is installed, use the following command to install renku and then check where it was placed.
$ pipx install renku
$ which renku
~/.local/bin/renku
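The `which renku` check above can also be done from Python, which may be handy in setup scripts. The following is a minimal sketch using only the standard library; the helper name `find_cli` is hypothetical and not part of the renku package.

```python
import shutil
from typing import Optional


def find_cli(name: str = "renku") -> Optional[str]:
    """Return the path of the executable `name` found on PATH, or None.

    Hypothetical helper for illustration; it only inspects PATH and does
    not import or run renku.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_cli()
    if path is None:
        # pipx installs entry points into a directory that must be on PATH.
        print("renku not found on PATH; try `pipx ensurepath` and restart the shell")
    else:
        print(f"renku found at {path}")
```

If the function returns None right after installation, the most likely cause is that the shell has not yet picked up the PATH change made by `pipx ensurepath`.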
Previously we recommended pipsi. You can still use it or migrate to pipx.
Docker
The containerized version of the CLI can be launched using the docker command.
$ docker run -it -v "$PWD":"$PWD" -w="$PWD" renku/renku-python renku
This mounts your current directory at the same path inside the container and uses it as the working directory.
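To make the structure of the Docker invocation explicit, here is a small sketch that builds the same argument list in Python. The function name `docker_renku_command` is hypothetical; the image name and flags are the ones from the command above, with `-w` passed as a separate argument instead of `-w=...`.

```python
import os
from typing import List, Optional, Sequence


def docker_renku_command(args: Sequence[str], workdir: Optional[str] = None) -> List[str]:
    """Build the `docker run` argument list for running renku in a container.

    Hypothetical helper: it only assembles the arguments and does not
    start Docker. `workdir` defaults to the current directory.
    """
    cwd = os.path.abspath(workdir or os.getcwd())
    return [
        "docker", "run", "-it",
        "-v", f"{cwd}:{cwd}",   # mount the directory at the same path
        "-w", cwd,              # and use it as the working directory
        "renku/renku-python",
        "renku", *args,
    ]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(docker_renku_command(["init"])))
```

The list can be passed to `subprocess.run` from an interactive terminal; since `-it` asks Docker for a TTY, it is likely to fail when run from a non-interactive script.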
Usage
Initialize a renku project:
$ mkdir -p ~/temp/my-renku-project
$ cd ~/temp/my-renku-project
$ renku init
Create a dataset and add data to it:
$ renku dataset create my-dataset
$ renku dataset add my-dataset https://raw.githubusercontent.com/SwissDataScienceCenter/renku-python/master/README.rst
Run an analysis:
$ renku run wc < data/my-dataset/README.rst > wc_readme
Trace the data provenance:
$ renku log wc_readme
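The walkthrough above can be parameterized for other projects and datasets. The sketch below only generates the command lines (a dry run) and does not execute renku; the function name `usage_commands` is hypothetical, and the `data/<dataset>/<filename>` location is an assumption taken from the example path used above.

```python
import os
import shlex
from typing import List


def usage_commands(project_dir: str, dataset: str, url: str,
                   output: str = "wc_readme") -> List[str]:
    """Return the shell command lines of the walkthrough, in order.

    Hypothetical helper. Assumes an added file ends up at
    data/<dataset>/<last URL path segment>, as in the example above.
    """
    project = os.path.expanduser(project_dir)  # expand ~ before quoting
    data_file = f"data/{dataset}/{url.rsplit('/', 1)[-1]}"
    q = shlex.quote
    return [
        f"mkdir -p {q(project)}",
        f"cd {q(project)}",
        "renku init",
        f"renku dataset create {q(dataset)}",
        f"renku dataset add {q(dataset)} {q(url)}",
        f"renku run wc < {q(data_file)} > {q(output)}",
        f"renku log {q(output)}",
    ]


if __name__ == "__main__":
    readme = ("https://raw.githubusercontent.com/SwissDataScienceCenter/"
              "renku-python/master/README.rst")
    for line in usage_commands("~/temp/my-renku-project", "my-dataset", readme):
        print("$", line)
```

Printing the commands first makes it easy to review them before pasting them into a shell.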
These are the basics, but there is much more that Renku allows you to do with your data analysis workflows. The full documentation will soon be available at: https://renku-python.readthedocs.io/
Developing Renku
For development it’s convenient to install renku in editable mode. This is still possible with pipx. First clone the repository and then do:
$ pipx install \
    --editable \
    --spec <path-to-renku-python>[all] \
    renku
This will install all the extras for testing and debugging.
Using External Debuggers
To run renku via an external debugger, e.g. the Visual Studio Code debugger, you need to run it with the python executable of the virtual environment in which renku was installed. If the debugger needs an additional package, inject it into that virtual environment first, e.g.:
$ pipx inject renku ptvsd
Finally, run renku via the debugger:
$ ~/.local/pipx/venvs/renku/bin/python -m ptvsd --host localhost --wait -m renku.cli <command>
If using Visual Studio Code, you may also want to set pathMappings in the Remote Attach configuration so that it can find your source code, e.g.:
{
    "name": "Python: Remote Attach",
    "type": "python",
    "request": "attach",
    "port": 5678,
    "host": "localhost",
    "pathMappings": [
        {
            "localRoot": "<path-to-renku-python-source-code>",
            "remoteRoot": "<path-to-renku-python-source-code>"
        }
    ]
},
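Because localRoot and remoteRoot point at the same checkout, the configuration entry can be generated instead of edited by hand. This is a minimal sketch using only the standard library; the function name `remote_attach_config` is hypothetical, and the default host and port are the values shown in the configuration above.

```python
import json
import os
from typing import Any, Dict


def remote_attach_config(source_root: str, host: str = "localhost",
                         port: int = 5678) -> Dict[str, Any]:
    """Build a Remote Attach entry whose path mapping points both
    localRoot and remoteRoot at the same renku-python checkout.

    Hypothetical helper for illustration only.
    """
    root = os.path.abspath(source_root)
    return {
        "name": "Python: Remote Attach",
        "type": "python",
        "request": "attach",
        "port": port,
        "host": host,
        "pathMappings": [{"localRoot": root, "remoteRoot": root}],
    }


if __name__ == "__main__":
    # Print the entry for the checkout in the current directory.
    print(json.dumps(remote_attach_config("."), indent=4))
```

The printed object can be pasted into the list of debug configurations in the editor's launch settings.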
Changes
0.8.0 (2019-11-21)
Bug Fixes
Features
0.7.0 (2019-10-15)
Bug Fixes
0.6.1 (2019-10-10)
Bug Fixes
Features
0.6.0 (2019-09-18)
Bug Fixes
adds _label and commit data to imported dataset files, single commit for imports (#651) (75ce369)
always add commit to dataset if possible (#648) (7659bc8), closes #646
cleanup needed for integration tests on py35 (#653) (fdd7215)
fixed serialization of datetime to iso format (#629) (693d59d)
hide image, pull, runner, show, workon and deactivate commands (#672) (a3e9998)
Removes unnecessary call to git lfs with no paths (#658) (e32d48b)
use latest_html for version check (#647) (c6b0309), closes #641
zenodo export failing with relative paths (d40967c)
Features
0.5.2 (2019-07-26)
Bug Fixes
Features
0.5.1 (2019-07-12)
Bug Fixes
ensure external storage is handled correctly (#592) (7938ac4)
cli: allow renku run with many inputs (f60783e), closes #552
modify json-ld for datasets (#534) (ab6a719), closes #525 #526
refactored tests and docs to align with updated pydocstyle (#586) (6f981c8)
cli: add check of missing references (9a373da)
cli: fail when removing non existing dataset (dd728db)
status: fix renku status output when not in root folder (#564) (873270d), closes #551
datasets: strip query string from data filenames (450898b)
cli: remove dataset aliases (6206e62)
cwl: detect script as input parameter (e23b75a), closes #495
deps: updated dependencies (691644d)
Features
added support for working on dirty repo (ae67be7)
0.5.0 (2019-03-28)
Bug Fixes
Features
api: list datasets from a commit (04a9fe9)
cli: add dataset rm command (a70c7ce)
cli: add rm command (cf0f502)
cli: configurable format of dataset output (d37abf3)
dataset: add existing file from current repo (575686b), closes #99
datasets: added ls-files command (ccc4f59)
models: reference context for relative paths (5d1e8e7), closes #452
add JSON-LD output format for datasets (c755d7b), closes #426
generate Makefile with log --format Makefile (1e440ce)
v0.4.0
(released 2019-03-05)
Adds renku mv command which updates dataset metadata, .gitattributes and symlinks.
Pulls LFS objects from submodules correctly.
Adds listing of datasets.
Adds reduced dot format for renku log.
Adds doctor command to check missing files in datasets.
Moves dataset metadata to .renku/datasets, adds the migrate datasets command, and uses UUIDs for metadata paths.
Gets git attrs for files to prevent duplicates in .gitattributes.
Fixes renku show outputs for directories.
Runs Git LFS checkout in a worktree and lazily pulls necessary LFS files before running commands.
Asks user before overriding an existing file using renku init or renku runner template.
Fixes renku init --force in an empty dir.
Renames CommitMixin._location to _project.
Addresses issue with commits editing multiple CWL files.
Exports merge commits for full lineage.
Exports path and parent directories.
Adds an automatic check for the latest version.
Simplifies issue submission from traceback to GitHub or Sentry. Requires SENTRY_DSN variable to be set and sentry-sdk package to be installed before sending any data.
Removes outputs before run.
Allows update of directories.
Improves readability of the status message.
Checks ignored path when added to a dataset.
Adds API method for finding ignored paths.
Uses branches for init --force.
Fixes CVE-2017-18342.
Fixes regex for parsing Git remote URLs.
Handles --isolation option using git worktree.
Renames client.git to client.repo.
Supports python -m renku.
Allows ‘.’ and ‘-’ in repo path.
v0.3.3
(released 2018-12-07)
Fixes generated Homebrew formula.
Renames renku pull path to renku storage pull with deprecation warning.
v0.3.2
(released 2018-11-29)
Fixes display of workflows in renku log.
v0.3.1
(released 2018-11-29)
Fixes issues with parsing remote Git URLs.
v0.3.0
(released 2018-11-26)
Adds JSON-LD context to objects extracted from the Git repository (see renku show context --list).
Uses PROV-O and WFPROV as provenance vocabularies and generates “stable” object identifiers (@id) for RDF and JSON-LD output formats.
Refactors the log output to allow linking files and directories.
Adds support for aliasing tools and workflows.
Adds option to install shell completion (renku --install-completion).
Fixes initialization of Git submodules.
Uses relative submodule paths when appropriate.
Simplifies external storage configuration.
v0.2.0
(released 2018-09-25)
Refactored version using Git and Common Workflow Language.
v0.1.0
(released 2017-09-06)
Initial public release as Renga.