Skip to main content

Jupyter Notebook operator for Apache Airflow.

Project description

PyPI version

airflow-notebook implements an Apache Airflow operator NotebookOp that supports running of notebooks and Python scripts in DAGs. To use the operator, install this package on the host(s) where the Apache Airflow webserver, scheduler, and workers are running.

Installing the airflow-notebook package

You can install the airflow-notebook package from PyPI or source code.

Installing from PyPI

To install airflow-notebook from PyPI:

pip install airflow-notebook

Installing from source code

To build airflow-notebook from source, Python 3.6 (or later) must be installed.

git clone https://github.com/elyra-ai/airflow-notebook.git
cd airflow-notebook
make clean install

Test coverage

The operator was tested with Apache Airflow v1.10.12.

Usage

Example below on how to use the airflow operator. This particular DAG was generated with a jinja template in Elyra's pipeline editor.

from airflow import DAG
from airflow_notebook.pipeline import NotebookOp
from airflow.utils.dates import days_ago

# Setup default args with older date to automatically trigger when uploaded
args = {
    'project_id': 'untitled-0105163134',
}

dag = DAG(
    'untitled-0105163134',
    default_args=args,
    schedule_interval=None,
    start_date=days_ago(1),
    description='Created with Elyra 2.0.0.dev0 pipeline editor using untitled.pipeline.',
    is_paused_upon_creation=False,
)


notebook_op_6055fdfb_908c_43c1_a536_637205009c79 = NotebookOp(name='notebookA',
                                                              namespace='default',
                                                              task_id='notebookA',
                                                              notebook='notebookA.ipynb',
                                                              cos_endpoint='http://endpoint.com:31671',
                                                              cos_bucket='test',
                                                              cos_directory='untitled-0105163134',
                                                              cos_dependencies_archive='notebookA-6055fdfb-908c-43c1-a536-637205009c79.tar.gz',
                                                              pipeline_outputs=[
                                                                  'subdir/A.txt'],
                                                              pipeline_inputs=[],
                                                              image='tensorflow/tensorflow:2.3.0',
                                                              in_cluster=True,
                                                              env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
                                                                        'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
                                                              config_file="None",
                                                              dag=dag,
                                                              )


notebook_op_074355ce_2119_4190_8cde_892a4bc57bab = NotebookOp(name='notebookB',
                                                              namespace='default',
                                                              task_id='notebookB',
                                                              notebook='notebookB.ipynb',
                                                              cos_endpoint='http://endpoint.com:31671',
                                                              cos_bucket='test',
                                                              cos_directory='untitled-0105163134',
                                                              cos_dependencies_archive='notebookB-074355ce-2119-4190-8cde-892a4bc57bab.tar.gz',
                                                              pipeline_outputs=[
                                                                  'B.txt'],
                                                              pipeline_inputs=[
                                                                  'subdir/A.txt'],
                                                              image='elyra/tensorflow:1.15.2-py3',
                                                              in_cluster=True,
                                                              env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
                                                                        'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
                                                              config_file="None",
                                                              dag=dag,
                                                              )

notebook_op_074355ce_2119_4190_8cde_892a4bc57bab << notebook_op_6055fdfb_908c_43c1_a536_637205009c79


notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 = NotebookOp(name='notebookC',
                                                              namespace='default',
                                                              task_id='notebookC',
                                                              notebook='notebookC.ipynb',
                                                              cos_endpoint='http://endpoint.com:31671',
                                                              cos_bucket='test',
                                                              cos_directory='untitled-0105163134',
                                                              cos_dependencies_archive='notebookC-68120415-86c9-4dd9-8bd6-b2f33443fcc7.tar.gz',
                                                              pipeline_outputs=[
                                                                  'C.txt', 'C2.txt'],
                                                              pipeline_inputs=[
                                                                  'subdir/A.txt'],
                                                              image='elyra/tensorflow:1.15.2-py3',
                                                              in_cluster=True,
                                                              env_vars={'AWS_ACCESS_KEY_ID': 'a_key',
                                                                        'AWS_SECRET_ACCESS_KEY': 'a_secret_key', 'ELYRA_ENABLE_PIPELINE_INFO': 'True'},
                                                              config_file="None",
                                                              dag=dag,
                                                              )

notebook_op_68120415_86c9_4dd9_8bd6_b2f33443fcc7 << notebook_op_6055fdfb_908c_43c1_a536_637205009c79

Generated Airflow DAG

Airflow DAG Example

Project details


Release history Release notifications | RSS feed

This version

0.0.5

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

airflow-notebook-0.0.5.tar.gz (10.1 kB view details)

Uploaded Source

Built Distribution

airflow_notebook-0.0.5-py3-none-any.whl (11.0 kB view details)

Uploaded Python 3

File details

Details for the file airflow-notebook-0.0.5.tar.gz.

File metadata

  • Download URL: airflow-notebook-0.0.5.tar.gz
  • Upload date:
  • Size: 10.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.7.9

File hashes

Hashes for airflow-notebook-0.0.5.tar.gz
Algorithm Hash digest
SHA256 17902930374cdac612aab8568ef31ab6bd91b758d7aa1cb46c4c9f9215fabbb9
MD5 2469b7a04a9120df65f7dcd2a9f1da8e
BLAKE2b-256 b0009f6f4d8ba53598a0168aea3044cac35edc70b4e82e385d3adeac37f0fc07

See more details on using hashes here.

File details

Details for the file airflow_notebook-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: airflow_notebook-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 11.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.7.3 pkginfo/1.7.0 requests/2.23.0 requests-toolbelt/0.9.1 tqdm/4.59.0 CPython/3.7.9

File hashes

Hashes for airflow_notebook-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 99de90d6ded10f3b9a8d7f5ccf12f6c21fb5f22ad3c4a980e87c1f9af54e24e5
MD5 4b1909e5130dfea32d1049445f73abc9
BLAKE2b-256 0ab39844c1300c7ff7f57b604e69c6a33e4c5fa67074191e814f1e2a2b251a12

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page