
Jupyter Notebook operator for Apache Airflow.

Project description


airflow-notebook implements NotebookOp, an Apache Airflow operator that supports running notebooks and Python scripts in DAGs. To use the operator, configure Airflow to use the Elyra-enabled container image or install this package on the host(s) where the Apache Airflow webserver, scheduler, and workers are running.

Using the Elyra-enabled airflow container image

Follow the instructions in this document.

Installing the airflow-notebook package

You can install the airflow-notebook package from PyPI or source code.

Installing from PyPI

To install airflow-notebook from PyPI:

pip install airflow-notebook

Installing from source code

To build airflow-notebook from source, Python 3.6 (or later) must be installed.

git clone https://github.com/elyra-ai/airflow-notebook.git
cd airflow-notebook
make clean install

Test coverage

The operator was tested with Apache Airflow v1.10.12.

Usage

The example below shows how to use the operator. This particular DAG was generated with a Jinja template in Elyra's pipeline editor.

from airflow import DAG
from airflow_notebook.pipeline import NotebookOp
from airflow.utils.dates import days_ago

# Set up default args; the DAG's older start_date makes it trigger automatically when uploaded
args = {
    'project_id': 'untitled-0105163134',
}

dag = DAG(
    'untitled-0105163134',
    default_args=args,
    schedule_interval=None,
    start_date=days_ago(1),
    description='Created with Elyra 2.0.0.dev0 pipeline editor using untitled.pipeline.',
    is_paused_upon_creation=False,
)


notebook_op_6055fdfb_908c_43c1_a536_637205009c79 = NotebookOp(name='notebookA',
                                                              namespace='default',
                                                              task_id='notebookA',
                                                              notebook='notebookA.ipynb',
                                                              cos_endpoint='http://endpoint.com:31671',
                                                              cos_bucket='test',
                                                              cos_directory='untitled-0105163134',
                                                              cos_dependencies_archive='notebookA-6055fdfb-908c-43c1-a536-637205009c79.tar.gz',
                                                              pipeline_outputs=[
                                                                  'subdir/A.txt'],
                                                              pipeline_inputs=[],
                                                              image='tensorflow/tensorflow:2.3.0',
                                                              in_cluster=True,
                                                              env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
                                                                        'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
                                                              config_file="None",
                                                              dag=dag,
                                                              )


notebook_op_074355ce_2119_4190_8cde_892a4bc57bab = NotebookOp(name='notebookB',
                                                              namespace='default',
                                                              task_id='notebookB',
                                                              notebook='notebookB.ipynb',
                                                              cos_endpoint='http://endpoint.com:31671',
                                                              cos_bucket='test',
                                                              cos_directory='untitled-0105163134',
                                                              cos_dependencies_archive='notebookB-074355ce-2119-4190-8cde-892a4bc57bab.tar.gz',
                                                              pipeline_outputs=[
                                                                  'B.txt'],
                                                              pipeline_inputs=[
                                                                  'subdir/A.txt'],
                                                              image='elyra/tensorflow:1.15.2-py3',
                                                              in_cluster=True,
                                                              env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
                                                                        'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
                                                              config_file="None",
                                                              dag=dag,
                                                              )

notebook_op_074355ce_2119_4190_8cde_892a4bc57bab << notebook_op_6055fdfb_908c_43c1_a536_637205009c79


notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 = NotebookOp(name='notebookC',
                                                              namespace='default',
                                                              task_id='notebookC',
                                                              notebook='notebookC.ipynb',
                                                              cos_endpoint='http://endpoint.com:31671',
                                                              cos_bucket='test',
                                                              cos_directory='untitled-0105163134',
                                                              cos_dependencies_archive='notebookC-68120415-86c9-4dd9-8bd6-b2f33443fcc7.tar.gz',
                                                              pipeline_outputs=[
                                                                  'C.txt', 'C2.txt'],
                                                              pipeline_inputs=[
                                                                  'subdir/A.txt'],
                                                              image='elyra/tensorflow:1.15.2-py3',
                                                              in_cluster=True,
                                                              env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
                                                                        'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
                                                              config_file="None",
                                                              dag=dag,
                                                              )

notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 << notebook_op_6055fdfb_908c_43c1_a536_637205009c79
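The `<<` lines above use Airflow's bitshift dependency syntax: `downstream << upstream` is equivalent to `downstream.set_upstream(upstream)`, so here notebookB and notebookC both run after notebookA. A minimal stand-in class (purely illustrative, no Airflow required) shows how that composition builds the dependency graph:

```python
# Minimal stand-in for an Airflow operator, used only to illustrate
# how `<<` wires up upstream/downstream dependencies. Not Airflow code.
class FakeTask:
    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()

    def __lshift__(self, other):
        # task_b << task_a  =>  task_a becomes upstream of task_b
        self.upstream.add(other.task_id)
        return other  # Airflow also returns the right operand, enabling chaining


a = FakeTask("notebookA")
b = FakeTask("notebookB")
c = FakeTask("notebookC")

# Same shape as the generated DAG: B and C both depend on A.
b << a
c << a

print(b.upstream)  # {'notebookA'}
print(c.upstream)  # {'notebookA'}
```

With real Airflow operators, the scheduler reads these upstream relationships to decide that notebookA must finish before notebookB or notebookC start.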

Generated Airflow DAG

(Screenshot: the example DAG rendered in the Airflow UI.)


Download files

Download the file for your platform.

Source Distribution

airflow-notebook-0.0.7.tar.gz (10.8 kB)

Uploaded Source

Built Distribution

airflow_notebook-0.0.7-py3-none-any.whl (11.2 kB)

Uploaded Python 3

File details

Details for the file airflow-notebook-0.0.7.tar.gz.

File metadata

  • File: airflow-notebook-0.0.7.tar.gz
  • Size: 10.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.7.9

File hashes

Hashes for airflow-notebook-0.0.7.tar.gz
  • SHA256: ffcf2f45a773786e72492a690dba5a5291a6d2a652074c9e40dd7f91082f731e
  • MD5: cb5007ec8cea61e3a3de3a70eb80336c
  • BLAKE2b-256: ec30d67c4fd9832d6a896745c4670e3b6d29de42d77d815eab8d33d3d2cf05b8

See the Python Packaging User Guide for more details on using hashes.
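To check a downloaded distribution against the SHA256 digest listed above, hash the file locally and compare. A quick sketch (the file path is whatever location you downloaded the archive to):

```python
import hashlib


def sha256_of(path, chunk_size=8192):
    """Return the hex SHA256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Compare against the digest published on PyPI, e.g.:
# expected = "ffcf2f45a773786e72492a690dba5a5291a6d2a652074c9e40dd7f91082f731e"
# assert sha256_of("airflow-notebook-0.0.7.tar.gz") == expected
```

Reading in chunks keeps memory use constant regardless of archive size; `pip` performs an equivalent check automatically when you pass `--require-hashes` with a requirements file.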

File details

Details for the file airflow_notebook-0.0.7-py3-none-any.whl.

File metadata

  • File: airflow_notebook-0.0.7-py3-none-any.whl
  • Size: 11.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.7.9

File hashes

Hashes for airflow_notebook-0.0.7-py3-none-any.whl
  • SHA256: 216b2e2e84625b32554261a11e2f8ea5b1beeb673fb1e7b906f0cb8e87a3265c
  • MD5: 3f6a609c5077ff5e55794cfad19b1b00
  • BLAKE2b-256: de6693715fac2a49da3927ee335b6415abc04f8748be970fee4c8969bb3649f9

See the Python Packaging User Guide for more details on using hashes.
