Skip to main content

High Availability (HA) DAG Utility

Project description

airflow-ha

High Availability (HA) DAG Utility

Build Status codecov License PyPI

Overview

This library provides an operator called HighAvailabilityOperator, which inherits from PythonSensor and does the following:

  • runs a user-provided python_callable as a sensor
    • if this returns "done", mark the DAG as passed and finish
    • if this returns "running", keep checking
    • if this returns "failed", mark the DAG as failed and re-run
  • if the sensor times out, mark the DAG as passed and re-run

Consider the following DAG:

from datetime import datetime, timedelta
from random import choice

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow_ha import HighAvailabilityOperator


with DAG(
    dag_id="test-high-availability",
    description="Test HA Operator",
    schedule=timedelta(days=1),
    start_date=datetime(2024, 1, 1),
    catchup=False,
):
    ha = HighAvailabilityOperator(
        task_id="ha",
        timeout=30,
        poke_interval=5,
        python_callable=lambda **kwargs: choice(("done", "failed", "running", ""))
    )
    
    pre = PythonOperator(task_id="pre", python_callable=lambda **kwargs: "test")
    pre >> ha
    
    fail = PythonOperator(task_id="fail", python_callable=lambda **kwargs: "test")
    ha.failed >> fail
    
    passed = PythonOperator(task_id="passed", python_callable=lambda **kwargs: "test")
    ha.passed >> passed

    done = PythonOperator(task_id="done", python_callable=lambda **kwargs: "test")
    ha.done >> done

This produces a DAG with the following topology:

This DAG exhibits cool behavior. If a check fails or the interval elapses, the DAG will re-trigger itself. If the check passes, the DAG will finish. This allows the one to build "always-on" DAGs without having individual long blocking tasks.

This library is used to build airflow-supervisor, which uses supervisor as a process-monitor while checking and restarting jobs via airflow-ha.

License

This software is licensed under the Apache 2.0 license. See the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

airflow_ha-0.1.0.tar.gz (9.0 kB view hashes)

Uploaded Source

Built Distribution

airflow_ha-0.1.0-py3-none-any.whl (8.8 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page