Tekton Compiler for Kubeflow Pipelines
Compiler for Tekton
The Kubeflow Pipelines SDK allows data scientists to define end-to-end machine learning and data pipelines. The output of the Kubeflow Pipelines SDK compiler is YAML for Argo. We are extending the compiler of the Kubeflow Pipelines SDK to generate YAML for Tekton.
Table of Contents
- Project Prerequisites
- Tested Pipelines
- How to use the KFP-Tekton Compiler
- Build Tekton from Master
- Additional Features
- List of Available Features
- Troubleshooting
Project Prerequisites
Follow the instructions for installing project prerequisites and take note of some important caveats.
Tested Pipelines
We are testing the compiler on more than 80 pipelines found in the Kubeflow Pipelines repository, specifically the pipelines in the KFP compiler testdata folder, the KFP core samples, and the samples contributed by third parties.
A report card of the Kubeflow Pipelines samples that are currently supported by the kfp-tekton compiler can be found here.
If you work on a PR that enables another of the missing features, please ensure that your code changes increase the number of successfully compiled KFP pipeline samples.
How to use the KFP-Tekton Compiler
Installation
You can install the latest release of the kfp-tekton compiler from PyPI. We recommend creating a Python virtual environment first:
python3 -m venv .venv
source .venv/bin/activate
pip install kfp-tekton
Alternatively, you can install the latest version of the kfp-tekton compiler from source by cloning the repository https://github.com/kubeflow/kfp-tekton:
- Clone the kfp-tekton repo:
  git clone https://github.com/kubeflow/kfp-tekton.git
  cd kfp-tekton
- Set up a Python environment with Conda or a Python virtual environment:
  python3 -m venv .venv
  source .venv/bin/activate
- Build the compiler:
  pip install -e sdk/python
- Run the compiler tests (optional):
  make test
Compiling a Kubeflow Pipelines DSL Script
The kfp-tekton Python package comes with the dsl-compile-tekton command-line executable, which should be available in your terminal shell environment after installing the kfp-tekton Python package.
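Before compiling, it can help to confirm that the executables used in this guide are actually on your PATH. The following is a minimal sketch using only the Python standard library; the helper name locate_tools is hypothetical and not part of kfp-tekton.

```python
import shutil
from typing import Dict, Iterable, Optional


def locate_tools(
    names: Iterable[str] = ("dsl-compile-tekton", "kubectl", "tkn"),
) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    # shutil.which performs the same lookup the shell does for a bare command name.
    return {name: shutil.which(name) for name in names}


# Report which of the tools mentioned in this guide are available.
for tool_name, tool_path in locate_tools().items():
    print(f"{tool_name}: {tool_path or 'not found on PATH'}")
```

If dsl-compile-tekton is reported as missing, the virtual environment in which kfp-tekton was installed is most likely not activated.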
If you cloned the kfp-tekton project, you can find example pipelines in the samples folder or under the sdk/python/tests/compiler/testdata folder.
dsl-compile-tekton \
--py sdk/python/tests/compiler/testdata/parallel_join.py \
--output pipeline.yaml
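If you want to drive the compiler from a script rather than the shell, the command above can be assembled and executed with the standard library. This is only a sketch: the function names build_compile_command and compile_pipeline are hypothetical, and the only flags used are the ones shown in this document.

```python
import subprocess
from typing import List, Optional


def build_compile_command(
    py_file: str,
    output: str = "pipeline.yaml",
    extra_flags: Optional[List[str]] = None,
) -> List[str]:
    """Build the argument list for the dsl-compile-tekton executable."""
    command = ["dsl-compile-tekton", "--py", py_file, "--output", output]
    # Extra flags (for example --generate-pipelinerun) are appended unchanged.
    command.extend(extra_flags or [])
    return command


def compile_pipeline(py_file: str, output: str = "pipeline.yaml") -> None:
    """Run the compiler; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_compile_command(py_file, output), check=True)


# Show the command that compile_pipeline would execute for the example pipeline.
example = build_compile_command("sdk/python/tests/compiler/testdata/parallel_join.py")
print(" ".join(example))
```

Keeping the command construction separate from the subprocess call makes it easy to inspect or log the exact invocation before running it.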
Running the Pipeline on a Tekton Cluster
After compiling the sdk/python/tests/compiler/testdata/parallel_join.py DSL script in the step above, we need to deploy the generated Tekton YAML to our Kubernetes cluster with kubectl and start a pipeline run with tkn:
kubectl apply -f pipeline.yaml
tkn pipeline start parallel-pipeline --showlog
A prompt asks for the pipeline arguments. Press Enter to accept the defaults:
? Value for param `url1` of type `string`? (Default is `gs://ml-pipeline-playground/shakespeare1.txt`) gs://ml-pipeline-playground/shakespeare1.txt
? Value for param `url2` of type `string`? (Default is `gs://ml-pipeline-playground/shakespeare2.txt`) gs://ml-pipeline-playground/shakespeare2.txt
Pipelinerun started: parallel-pipeline-run-th4x6
Once the Tekton Pipeline is running, the logs should start streaming:
Waiting for logs to be available...
[gcs-download-2 : gcs-download-2] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[gcs-download : gcs-download] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : echo] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : echo] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
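The deploy-and-start sequence above can also be scripted. The sketch below only builds and runs the same kubectl and tkn commands shown in this section; the function names are hypothetical, and the optional namespace argument maps to the -n flag of both CLIs.

```python
import subprocess
from typing import List, Optional


def build_deploy_commands(
    yaml_file: str, pipeline_name: str, namespace: Optional[str] = None
) -> List[List[str]]:
    """Return the kubectl and tkn argument lists, in the order they should run."""
    apply_cmd = ["kubectl", "apply", "-f", yaml_file]
    start_cmd = ["tkn", "pipeline", "start", pipeline_name, "--showlog"]
    if namespace:
        # Target the same namespace for both the deployment and the run.
        apply_cmd += ["-n", namespace]
        start_cmd += ["-n", namespace]
    return [apply_cmd, start_cmd]


def deploy_and_start(
    yaml_file: str, pipeline_name: str, namespace: Optional[str] = None
) -> None:
    """Apply the YAML, then start the pipeline; stops at the first failing command."""
    for command in build_deploy_commands(yaml_file, pipeline_name, namespace):
        # stdin is inherited, so the interactive tkn parameter prompt still works.
        subprocess.run(command, check=True)


# Print the commands for the example pipeline without executing them.
for command in build_deploy_commands("pipeline.yaml", "parallel-pipeline"):
    print(" ".join(command))
```

Calling deploy_and_start("pipeline.yaml", "parallel-pipeline") from a terminal should behave like typing the two commands by hand, including the parameter prompt.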
Build Tekton from Master
In order to utilize the latest features and functions of the kfp-tekton compiler, we suggest installing Tekton from a nightly build or building it from the master branch. Features that require a special build, different from the 'Tested Version', will be listed below.
Additional Features
1. Compile Kubeflow Pipelines as a Tekton PipelineRun
By default, a Tekton PipelineRun is generated by the tkn CLI so that users can interactively change their pipeline parameters during each execution. However, the tkn CLI lacks several important features when generating a PipelineRun. Therefore, we added support for generating a PipelineRun using dsl-compile-tekton with all the latest kfp-tekton compiler features. The comparison between Tekton pipelines and Argo workflows is described in our design docs.
Compiling Kubeflow Pipelines into a Tekton PipelineRun is currently in the experimental stage. Here is the list of supported features in PipelineRun.
As of today, the following PipelineRun features are available within dsl-compile-tekton:
- Affinity
- Node Selector
- Tolerations
To compile Kubeflow Pipelines as a Tekton PipelineRun, add the --generate-pipelinerun parameter to the dsl-compile-tekton command:
dsl-compile-tekton \
--py sdk/python/tests/compiler/testdata/tolerations.py \
--output pipeline.yaml \
--generate-pipelinerun
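One quick way to confirm that the generated file contains a PipelineRun is to list the top-level kind: entries in the YAML. The sketch below uses a regular expression rather than a YAML parser so that it needs nothing beyond the standard library. The SAMPLE text is a hand-written stand-in, not actual compiler output, and its apiVersion value is an assumption; in practice you would read pipeline.yaml instead.

```python
import re
from typing import List

# Matches only unindented "kind:" keys, i.e. the top-level kind of each document.
KIND_PATTERN = re.compile(r"^kind:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def list_resource_kinds(yaml_text: str) -> List[str]:
    """Return the top-level resource kinds in document order."""
    return KIND_PATTERN.findall(yaml_text)


# Hypothetical multi-document YAML used only to illustrate the helper.
SAMPLE = """\
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: parallel-pipeline
---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: parallel-pipeline-run
spec:
  pipelineRef:
    kind: ignored-because-indented
"""

kinds = list_resource_kinds(SAMPLE)
print("PipelineRun present:", "PipelineRun" in kinds)
```

To check a real file, pass the result of open("pipeline.yaml").read() to list_resource_kinds.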
2. Compile Kubeflow Pipelines with Artifacts Enabled
Prerequisites: Install Kubeflow Pipelines.
By default, artifacts are disabled because they depend on the MinIO storage that is deployed with Kubeflow Pipelines. When artifacts are enabled, all output parameters are also treated as artifacts and persisted to the default object storage. Enabling artifacts also allows files to be downloaded or stored as artifact inputs/outputs. Since artifacts depend on the Kubeflow Pipelines deployment, the generated Tekton pipeline must be deployed to the same namespace as Kubeflow Pipelines.
To compile Kubeflow Pipelines with artifacts enabled, add the --enable-artifacts argument to your dsl-compile-tekton command. Then run the pipeline in the same namespace that is used by Kubeflow Pipelines (typically kubeflow) by using the -n flag, e.g.:
dsl-compile-tekton \
--py sdk/python/tests/compiler/testdata/parallel_join.py \
--output pipeline.yaml \
--enable-artifacts
kubectl apply -f pipeline.yaml -n kubeflow
tkn pipeline start parallel-pipeline --showlog -n kubeflow
You should see log messages saying the artifacts were stored in the object storage you specified:
? Value for param `url1` of type `string`? (Default is `gs://ml-pipeline-playground/shakespeare1.txt`) gs://ml-pipeline-playground/shakespeare1.txt
? Value for param `url2` of type `string`? (Default is `gs://ml-pipeline-playground/shakespeare2.txt`) gs://ml-pipeline-playground/shakespeare2.txt
Pipelinerun started: parallel-pipeline-run-g87bs
Waiting for logs to be available...
[gcs-download : main] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[gcs-download : copy-artifacts] Added `storage` successfully.
[gcs-download : copy-artifacts] tekton/results/data
[gcs-download : copy-artifacts] tar: removing leading '/' from member names
[gcs-download : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline-run/gcs-download/data.tgz`
[gcs-download : copy-artifacts] Total: 0 B, Transferred: 194 B, Speed: 12.07 KiB/s
[gcs-download-2 : main] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[gcs-download-2 : copy-artifacts] Added `storage` successfully.
[gcs-download-2 : copy-artifacts] tar: removing leading '/' from member names
[gcs-download-2 : copy-artifacts] tekton/results/data
[gcs-download-2 : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline-run/gcs-download-2/data.tgz`
[gcs-download-2 : copy-artifacts] Total: 0 B, Transferred: 204 B, Speed: 22.86 KiB/s
[echo : main] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : main]
[echo : main] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[echo : main]
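The log lines above follow a "[task : step] message" layout, which makes it straightforward to check which tasks actually uploaded an artifact. The following is a rough sketch under that assumption; the function names are hypothetical, and the "-> `storage/" marker is taken from the sample output shown here, so it may differ in other versions or storage setups.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Splits "[task : step] message" into its three parts.
LOG_LINE = re.compile(r"^\[(?P<task>[^\]:]+?)\s*:\s*(?P<step>[^\]]+?)\]\s?(?P<message>.*)$")


def group_log_lines(lines: Iterable[str]) -> Dict[Tuple[str, str], List[str]]:
    """Group messages by (task, step); lines that do not match are skipped."""
    grouped = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            grouped[(match["task"], match["step"])].append(match["message"])
    return dict(grouped)


def tasks_with_stored_artifacts(lines: Iterable[str]) -> List[str]:
    """Return tasks whose copy-artifacts step reported a copy into storage/."""
    stored = []
    for (task, step), messages in group_log_lines(lines).items():
        if step == "copy-artifacts" and any("-> `storage/" in m for m in messages):
            stored.append(task)
    return stored


# A few lines copied from the sample output above.
sample_logs = [
    "[gcs-download : copy-artifacts] Added `storage` successfully.",
    "[gcs-download : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline-run/gcs-download/data.tgz`",
    "[gcs-download-2 : copy-artifacts] `data.tgz` -> `storage/mlpipeline/artifacts/parallel-pipeline-run/gcs-download-2/data.tgz`",
    "[echo : main]",
]
print(tasks_with_stored_artifacts(sample_logs))
```

In this run, the echo task has no copy-artifacts step in the sample output, so only the two download tasks would be reported.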
List of Available Features
To understand how each feature is implemented and its current status, please visit the FEATURES doc.
Troubleshooting
- When you encounter permission issues related to ServiceAccount, refer to the Service Account and RBAC doc.
- If you run into bad interpreter: No such file or directory when trying to use Python's venv, remove the current virtual environment in the .venv directory and create a new one using virtualenv .venv