Tekton Compiler for Kubeflow Pipelines
Kubeflow Pipelines SDK for Tekton
The Kubeflow Pipelines SDK allows data scientists to define end-to-end machine learning and data pipelines. The output of the Kubeflow Pipelines SDK compiler is YAML for Argo.
The kfp-tekton SDK extends the Compiler and the Client of the Kubeflow Pipelines SDK to generate Tekton YAML and to subsequently upload and run the pipeline with the Kubeflow Pipelines engine backed by Tekton.
Table of Contents
- SDK Packages Overview
- Project Prerequisites
- Installation
- Compiling a Kubeflow Pipelines DSL Script
- Big data passing workspace configuration
- Running the Compiled Pipeline on a Tekton Cluster
- List of Available Features
- Tested Pipelines
- Troubleshooting
SDK Packages Overview
The kfp-tekton SDK is an extension to the Kubeflow Pipelines SDK, adding the TektonCompiler and the TektonClient:
- kfp_tekton.compiler includes classes and methods for compiling pipeline Python DSL into a Tekton PipelineRun YAML spec. The methods in this package include, but are not limited to, the following:
  - kfp_tekton.compiler.TektonCompiler.compile compiles your Python DSL code into a single static configuration (in YAML format) that the Kubeflow Pipelines service can process. The Kubeflow Pipelines service converts the static configuration into a set of Kubernetes resources for execution.
- kfp_tekton.TektonClient contains the Python client libraries for the Kubeflow Pipelines API. Methods in this package include, but are not limited to, the following:
  - kfp_tekton.TektonClient.upload_pipeline uploads a local file to create a new pipeline in Kubeflow Pipelines.
  - kfp_tekton.TektonClient.create_experiment creates a pipeline experiment and returns an experiment object.
  - kfp_tekton.TektonClient.run_pipeline runs a pipeline and returns a run object.
  - kfp_tekton.TektonClient.create_run_from_pipeline_func compiles a pipeline function and submits it for execution on Kubeflow Pipelines.
  - kfp_tekton.TektonClient.create_run_from_pipeline_package runs a local pipeline package on Kubeflow Pipelines.
Project Prerequisites
- Python: 3.5.3 or later
- Tekton: v0.16.3 or later
- Tekton CLI: 0.11.0
- Kubeflow Pipelines: KFP with Tekton backend
Follow the instructions for installing project prerequisites and take note of some important caveats.
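To check an environment against the minimum versions listed above, a small comparison helper can be useful. The following is a minimal sketch, not part of the kfp-tekton SDK: the helper names `parse_version` and `meets_minimum` are hypothetical, and the parser assumes plain dotted numeric versions with an optional leading `v` (pre-release suffixes such as `-rc1` are not handled).

```python
import sys


def parse_version(text):
    """Turn 'v0.16.3' or '0.11.0' into a comparable tuple such as (0, 16, 3)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def meets_minimum(found, minimum):
    """Return True when the found version is at least the minimum version."""
    return parse_version(found) >= parse_version(minimum)


# Minimum versions copied from the prerequisites list.
MIN_PYTHON = "3.5.3"
MIN_TEKTON = "v0.16.3"

# The running interpreter can be checked directly; the Tekton version has to
# be obtained from your cluster and passed to meets_minimum() as a string.
python_ok = sys.version_info[:3] >= parse_version(MIN_PYTHON)
print("Python version OK:", python_ok)
```

For example, `meets_minimum("v0.17.0", MIN_TEKTON)` evaluates to `True`, while `meets_minimum("v0.15.9", MIN_TEKTON)` evaluates to `False`.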
Installation
You can install the latest release of the kfp-tekton compiler from PyPI. We recommend creating a Python virtual environment first:
python3 -m venv .venv
source .venv/bin/activate
pip install kfp-tekton
Alternatively, you can install the latest version of the kfp-tekton compiler from source by cloning the repository https://github.com/kubeflow/kfp-tekton:
- Clone the kfp-tekton repo:
  git clone https://github.com/kubeflow/kfp-tekton.git
  cd kfp-tekton
- Set up a Python environment with Conda or a Python virtual environment:
  python3 -m venv .venv
  source .venv/bin/activate
- Build the compiler:
  pip install -e sdk/python
- Run the compiler tests (optional):
  pip install pytest
  make test
Compiling a Kubeflow Pipelines DSL Script
The kfp-tekton Python package comes with the dsl-compile-tekton command line executable, which should be available in your terminal shell environment after installing the kfp-tekton Python package.
If you cloned the kfp-tekton project, you can find example pipelines in the samples folder or under the sdk/python/tests/compiler/testdata folder.
dsl-compile-tekton \
--py sdk/python/tests/compiler/testdata/parallel_join.py \
--output pipeline.yaml
Note: If the KFP DSL script contains a __main__ block calling the kfp_tekton.compiler.TektonCompiler.compile() function:
if __name__ == "__main__":
    from kfp_tekton.compiler import TektonCompiler
    TektonCompiler().compile(pipeline_func, "pipeline.yaml")
... then the pipeline can be compiled by running the DSL script with the python3 executable from a command line shell, producing a Tekton YAML file pipeline.yaml in the same directory:
python3 pipeline.py
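When compilation has to be scripted, for instance from a build step, the dsl-compile-tekton invocation shown in this section can be wrapped in Python. This is only a sketch: the function names `build_compile_command` and `compile_pipeline` are hypothetical, and it uses nothing beyond the `--py` and `--output` flags that appear in this section.

```python
import subprocess


def build_compile_command(dsl_script, output_path="pipeline.yaml"):
    """Build the argument list for the dsl-compile-tekton executable."""
    return ["dsl-compile-tekton", "--py", dsl_script, "--output", output_path]


def compile_pipeline(dsl_script, output_path="pipeline.yaml"):
    """Run dsl-compile-tekton and return the path of the generated YAML.

    Raises subprocess.CalledProcessError if the compiler exits with an error,
    and FileNotFoundError if the executable is not on the PATH.
    """
    subprocess.run(build_compile_command(dsl_script, output_path), check=True)
    return output_path
```

Calling `compile_pipeline("sdk/python/tests/compiler/testdata/parallel_join.py")` from the root of a cloned repository should be equivalent to the shell command shown earlier, provided the kfp-tekton package is installed in the active environment.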
Big data passing workspace configuration
When big data files are defined in KFP, Tekton creates a workspace to share these files among tasks that run in the same pipeline. By default, the workspace is a ReadWriteMany PVC with 2Gi of storage, but you can change this configuration using the environment variables below:
export DEFAULT_ACCESSMODES=ReadWriteMany
export DEFAULT_STORAGE_SIZE=2Gi
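To illustrate how these two variables relate to the documented defaults, here is a minimal sketch that reads them and falls back to ReadWriteMany and 2Gi. It is not the kfp-tekton implementation: the function name `workspace_settings` is hypothetical, and the returned dictionary merely mirrors the `accessModes` and `resources.requests.storage` fields of a Kubernetes PersistentVolumeClaim spec.

```python
import os


def workspace_settings(environ=None):
    """Return PVC-style settings, using the documented defaults when unset."""
    env = os.environ if environ is None else environ
    access_mode = env.get("DEFAULT_ACCESSMODES", "ReadWriteMany")
    storage_size = env.get("DEFAULT_STORAGE_SIZE", "2Gi")
    return {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": storage_size}},
    }
```

Passing a plain dictionary instead of `os.environ`, as in `workspace_settings({"DEFAULT_STORAGE_SIZE": "5Gi"})`, makes it easy to preview the effect of an override without touching the shell environment.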
Running the Compiled Pipeline on a Tekton Cluster
After compiling the sdk/python/tests/compiler/testdata/parallel_join.py DSL script in the step above, we need to deploy the generated Tekton YAML to the Kubeflow Pipelines engine.
You can run the pipeline directly using a pre-compiled file and the KFP-Tekton SDK. For more details, please look at the KFP-Tekton user guide SDK documentation.
import kfp_tekton
client = kfp_tekton.TektonClient()  # client instance; pass host=... if your deployment needs it
experiment = client.create_experiment(name=EXPERIMENT_NAME, namespace=KUBEFLOW_PROFILE_NAME)
run = client.run_pipeline(experiment.id, 'parallel-join-pipeline', 'pipeline.yaml')
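If the same sequence is needed in several scripts, the two client calls can be wrapped in a helper that receives the client as an argument. The sketch below is an illustration only: `submit_compiled_pipeline` is a hypothetical name, and the helper assumes nothing about the client beyond the `create_experiment` and `run_pipeline` calls used above.

```python
def submit_compiled_pipeline(client, experiment_name, namespace, run_name, package_path):
    """Create an experiment, then start a run from a compiled pipeline file.

    `client` is expected to be a kfp_tekton.TektonClient instance, but any
    object offering the same two methods works, which keeps the helper easy
    to exercise without a cluster.
    """
    experiment = client.create_experiment(name=experiment_name, namespace=namespace)
    run = client.run_pipeline(experiment.id, run_name, package_path)
    return experiment, run
```

With a real deployment this could be called as `submit_compiled_pipeline(kfp_tekton.TektonClient(), EXPERIMENT_NAME, KUBEFLOW_PROFILE_NAME, 'parallel-join-pipeline', 'pipeline.yaml')`.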
You can also deploy directly on a Tekton cluster with kubectl. The Tekton server will automatically start a pipeline run. We can then follow the logs using the tkn CLI.
kubectl apply -f pipeline.yaml
tkn pipelinerun logs --last --follow
Once the Tekton Pipeline is running, the logs should start streaming:
Waiting for logs to be available...
[gcs-download : main] With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[gcs-download-2 : main] I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[echo : main] Text 1: With which he yoketh your rebellious necks Razeth your cities and subverts your towns And in a moment makes them desolate
[echo : main]
[echo : main] Text 2: I find thou art no less than fame hath bruited And more than may be gatherd by thy shape Let my presumption not provoke thy wrath
[echo : main]
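Each streamed line above follows the pattern `[task : step] message`. If the output needs to be post-processed, for example to collect the messages per task, a small parser along these lines may help. It is a sketch based solely on the sample output shown here; the names `parse_log_line` and `messages_by_task` are hypothetical, and other tkn output formats are not covered.

```python
import re

# Matches lines such as "[gcs-download : main] some text".
LOG_LINE = re.compile(r"^\[(?P<task>[^\]:]+?) : (?P<step>[^\]]+)\]\s?(?P<message>.*)$")


def parse_log_line(line):
    """Return (task, step, message), or None for lines without a [task : step] prefix."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    return match.group("task"), match.group("step"), match.group("message")


def messages_by_task(lines):
    """Group non-empty log messages by task name, preserving their order."""
    grouped = {}
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue  # e.g. "Waiting for logs to be available..."
        task, _step, message = parsed
        if message:
            grouped.setdefault(task, []).append(message)
    return grouped
```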
List of Available Features
To understand how each feature is implemented and its current status, please visit the FEATURES doc.
Tested Pipelines
We are testing the compiler on more than 80 pipelines found in the Kubeflow Pipelines repository, specifically the pipelines in the KFP compiler testdata folder, the KFP core samples, and the samples contributed by third parties.
A report card of Kubeflow Pipelines samples that are currently supported by the kfp-tekton compiler can be found here.
If you work on a PR that enables another of the missing features, please ensure that your code changes are improving the number of successfully compiled KFP pipeline samples.
Troubleshooting
- When you encounter ServiceAccount related permission issues, refer to the "Service Account and RBAC" doc.
- If you run into the error bad interpreter: No such file or directory when trying to use Python's venv, remove the current virtual environment in the .venv directory and create a new one using virtualenv .venv