Astronomer Cosmos

A framework for generating Apache Airflow DAGs from other workflows.

Quickstart

Clone this repository to set up a local environment. Then, head over to our astronomer-cosmos/examples directory and follow its README!

Installation

Install and update using pip:

pip install astronomer-cosmos

This installs only the dependencies for the core provider. To install all dependencies, run:

pip install 'astronomer-cosmos[all]'

To install the dependencies for a specific integration only, specify the integration name as an extra. For example, to install the dbt integration dependencies, run:

pip install 'astronomer-cosmos[dbt]'
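
If the dbt extra installed correctly, the classes used in the examples below should be importable. A quick sanity check (assuming the cosmos.providers.dbt import paths shown later on this page):

# Verify the dbt integration is importable after installing the [dbt] extra.
from cosmos.providers.dbt.dag import DbtDag
from cosmos.providers.dbt.task_group import DbtTaskGroup

print(DbtDag, DbtTaskGroup)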

Extras

Extra Name  Installation Command                   Dependencies
all         pip install 'astronomer-cosmos[all]'   All
dbt         pip install 'astronomer-cosmos[dbt]'   dbt Core

Example Usage

Imagine we have dbt projects located at ./dbt/{{DBT_PROJECT_NAME}}. We can render these projects as Airflow DAGs using the DbtDag class:

from pendulum import datetime
from cosmos.providers.dbt.dag import DbtDag

# dag for the project jaffle_shop
jaffle_shop = DbtDag(
    dbt_project_name="jaffle_shop",
    conn_id="airflow_db",
    dbt_args={
        "schema": "public",
    },
    dag_id="jaffle_shop",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
)

# dag for the project attribution-playbook
attribution_playbook = DbtDag(
    dbt_project_name="attribution-playbook",
    conn_id="airflow_db",
    dbt_args={
        "schema": "public",
    },
    dag_id="attribution_playbook",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
)

# dag for the project mrr-playbook
mrr_playbook = DbtDag(
    dbt_project_name="mrr-playbook",
    conn_id="airflow_db",
    dbt_args={
        "schema": "public",
    },
    dag_id="mrr_playbook",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
)

Similarly, we can render these projects as Airflow TaskGroups using the DbtTaskGroup class. Here’s an example with the jaffle_shop project:

"""
## Extract DAG

This DAG is used to illustrate setting an upstream dependency from the dbt DAGs. Notice that the `outlets` parameter on the
`EmptyOperator` objects creates a
[Dataset](https://airflow.apache.org/docs/apache-airflow/stable/concepts/datasets.html) that can be used in the `schedule`
parameter of the dbt DAGs (`attribution-playbook`, `jaffle_shop`, `mrr-playbook`).

"""

from pendulum import datetime

from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.empty import EmptyOperator
from cosmos.providers.dbt.task_group import DbtTaskGroup


with DAG(
    dag_id="extract_dag",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
) as dag:

    e1 = EmptyOperator(
        task_id="ingestion_workflow", outlets=[Dataset("DAG://EXTRACT_DAG")]
    )

    dbt_tg = DbtTaskGroup(
        group_id="dbt_tg",
        dbt_project_name="jaffle_shop",
        conn_id="airflow_db",
        dbt_args={
            "schema": "public",
        },
        dag=dag,
    )

    e2 = EmptyOperator(
        task_id="some_extraction", outlets=[Dataset("DAG://EXTRACT_DAG")]
    )

    e1 >> dbt_tg >> e2
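
To make that dependency concrete, the dbt DAGs from the first example can be scheduled on the Dataset that extract_dag publishes instead of a cron preset. A minimal sketch, reusing the DbtDag example above with the @daily schedule swapped for a Dataset trigger:

from pendulum import datetime
from airflow.datasets import Dataset
from cosmos.providers.dbt.dag import DbtDag

# Runs whenever extract_dag updates the "DAG://EXTRACT_DAG" Dataset.
jaffle_shop = DbtDag(
    dbt_project_name="jaffle_shop",
    conn_id="airflow_db",
    dbt_args={"schema": "public"},
    dag_id="jaffle_shop",
    start_date=datetime(2022, 11, 27),
    schedule=[Dataset("DAG://EXTRACT_DAG")],
    catchup=False,
)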

Principles

Astronomer Cosmos provides a framework for generating Apache Airflow DAGs from other workflows. Every provider comes with two main components:

  • extractors: These are responsible for extracting the workflow from the provider and converting it into Task and Group objects.

  • operators: These are used when the workflow is converted into a DAG. They are responsible for executing the tasks in the workflow.

Astronomer Cosmos is not opinionated about how workflows are rendered: it does not enforce any single rendering method, and instead provides the tools to render workflows as Airflow DAGs, task groups, or individual tasks, as sketched below.
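
For example, with the dbt provider the same project can be rendered either as a standalone DAG or as a task group inside a hand-written DAG. A minimal sketch using only the DbtDag and DbtTaskGroup classes shown earlier (individual-task rendering is not shown here):

from pendulum import datetime

from airflow import DAG
from cosmos.providers.dbt.dag import DbtDag
from cosmos.providers.dbt.task_group import DbtTaskGroup

# Option 1: render the whole dbt project as its own DAG.
jaffle_shop_dag = DbtDag(
    dbt_project_name="jaffle_shop",
    conn_id="airflow_db",
    dbt_args={"schema": "public"},
    dag_id="jaffle_shop_standalone",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
)

# Option 2: render the same project as a task group inside an existing DAG.
with DAG(dag_id="elt", start_date=datetime(2022, 11, 27), schedule="@daily") as dag:
    jaffle_shop_tg = DbtTaskGroup(
        group_id="jaffle_shop",
        dbt_project_name="jaffle_shop",
        conn_id="airflow_db",
        dbt_args={"schema": "public"},
        dag=dag,
    )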

Changelog

We follow Semantic Versioning for releases. Check CHANGELOG.rst for the latest changes.

Contributing Guide

All contributions, bug reports, bug fixes, documentation improvements, and enhancements are welcome.

A detailed overview of how to contribute can be found in the Contributing Guide.

As contributors and maintainers to this project, you are expected to abide by the Contributor Code of Conduct.

Goals for the project

  • Goal 1

  • Goal 2

  • Goal 3

Limitations

  • List any limitations

License

Apache License 2.0

Download files

Source Distribution

astronomer-cosmos-0.0.6.tar.gz (14.4 kB)

File details

Details for the file astronomer-cosmos-0.0.6.tar.gz.

File metadata

  • Download URL: astronomer-cosmos-0.0.6.tar.gz
  • Upload date:
  • Size: 14.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.15

File hashes

Hashes for astronomer-cosmos-0.0.6.tar.gz

Algorithm    Hash digest
SHA256       a41a8556d0557c638249e9d4f69862ba0c1893507f201cfd1814a33f2544f8e8
MD5          a4d0f9414d972eea663655f3ca31991a
BLAKE2b-256  de2ea9dc9e5638d4a76098436dd3504fc870f3d51c41b4bc083c6071c86fdeb7
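
To verify a downloaded sdist against the SHA256 value above, a minimal check using Python's standard library (the local filename is an assumption; adjust it to wherever the file was saved):

import hashlib

# Path to the downloaded sdist; adjust if it was saved elsewhere.
sdist_path = "astronomer-cosmos-0.0.6.tar.gz"
expected_sha256 = "a41a8556d0557c638249e9d4f69862ba0c1893507f201cfd1814a33f2544f8e8"

with open(sdist_path, "rb") as f:
    actual = hashlib.sha256(f.read()).hexdigest()

print("hash matches" if actual == expected_sha256 else "hash mismatch: " + actual)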
