Apache Airflow Providers containing 3rd party integrations supported natively in Airflow
Astronomer Cosmos
A framework for generating Apache Airflow DAGs from other workflows.
Quickstart
Clone this repository to set up a local environment. Then, head over to our astronomer-cosmos/examples
directory and follow its README!
Installation
Install and update using pip:
pip install astronomer-cosmos
This installs only the dependencies for the core provider. To install all dependencies, run:
pip install 'astronomer-cosmos[all]'
To install only the dependencies for a specific integration, pass the integration name as an extra. For example, to install the dbt integration dependencies, run:
pip install 'astronomer-cosmos[dbt]'
Extras
Extra Name | Installation Command | Dependencies |
---|---|---|
all | pip install 'astronomer-cosmos[all]' | All |
dbt | pip install 'astronomer-cosmos[dbt]' | dbt core |
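To confirm which version of the package ended up in the environment, a quick check with the Python standard library can help. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of Cosmos.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("astronomer-cosmos")
    print(found if found else "astronomer-cosmos is not installed")
```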
Example Usage
Imagine we have dbt projects located at ./dbt/{{DBT_PROJECT_NAME}}. We can render these projects as Airflow DAGs using the DbtDag class:
from pendulum import datetime
from airflow import DAG
from cosmos.providers.dbt.dag import DbtDag
# dag for the project jaffle_shop
jaffle_shop = DbtDag(
    dbt_project_name="jaffle_shop",
    conn_id="airflow_db",
    dbt_args={
        "schema": "public",
    },
    dag_id="jaffle_shop",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
)
# dag for the project attribution-playbook
attribution_playbook = DbtDag(
    dbt_project_name="attribution-playbook",
    conn_id="airflow_db",
    dbt_args={
        "schema": "public",
    },
    dag_id="attribution_playbook",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
)
# dag for the project mrr-playbook
mrr_playbook = DbtDag(
    dbt_project_name="mrr-playbook",
    conn_id="airflow_db",
    dbt_args={
        "schema": "public",
    },
    dag_id="mrr_playbook",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
)
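The three definitions above differ only in the project name and the `dag_id`, so the shared arguments can be built in one place. The following is a sketch under stated assumptions, not a Cosmos feature: the helpers `dbt_dag_kwargs` and `register_dbt_dags` are hypothetical, the argument names are copied from the examples above, `doc_md` is left out for brevity, and a timezone-aware standard-library `datetime` stands in for the pendulum one.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List

# Project names taken from the examples above.
DBT_PROJECTS: List[str] = ["jaffle_shop", "attribution-playbook", "mrr-playbook"]


def dbt_dag_kwargs(project_name: str) -> Dict[str, Any]:
    """Build the keyword arguments used in the DbtDag examples for one project."""
    return {
        "dbt_project_name": project_name,
        "conn_id": "airflow_db",
        "dbt_args": {"schema": "public"},
        # Same convention as the examples: hyphens become underscores.
        "dag_id": project_name.replace("-", "_"),
        # Assumption: a UTC stdlib datetime in place of pendulum's datetime.
        "start_date": datetime(2022, 11, 27, tzinfo=timezone.utc),
        "schedule": "@daily",
        "catchup": False,
        "default_args": {"owner": "01-EXTRACT"},
    }


def register_dbt_dags(namespace: Dict[str, Any]) -> None:
    """Create one DbtDag per project and store it in the given namespace."""
    # Imported lazily so dbt_dag_kwargs stays usable without Airflow installed.
    from cosmos.providers.dbt.dag import DbtDag

    for name in DBT_PROJECTS:
        kwargs = dbt_dag_kwargs(name)
        namespace[kwargs["dag_id"]] = DbtDag(**kwargs)
```

In a DAG file, calling `register_dbt_dags(globals())` at module level would place each DAG in the module's global namespace, which is where Airflow looks for DAG objects.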
Similarly, we can render these projects as Airflow TaskGroups using the DbtTaskGroup class. Here’s an example with the jaffle_shop project:
"""
## Extract DAG
This DAG is used to illustrate setting an upstream dependency from the dbt DAGs. Notice the `outlets` parameter on the
`EmptyOperator` object is creating a
[Dataset](https://airflow.apache.org/docs/apache-airflow/stable/concepts/datasets.html) that is used in the `schedule`
parameter of the dbt DAGs (`attribution-playbook`, `jaffle_shop`, `mrr-playbook`).
"""
from pendulum import datetime
from airflow import DAG
from airflow.datasets import Dataset
from airflow.operators.empty import EmptyOperator
from cosmos.providers.dbt.task_group import DbtTaskGroup
with DAG(
    dag_id="extract_dag",
    start_date=datetime(2022, 11, 27),
    schedule="@daily",
    doc_md=__doc__,
    catchup=False,
    default_args={"owner": "01-EXTRACT"},
) as dag:
    e1 = EmptyOperator(
        task_id="ingestion_workflow", outlets=[Dataset("DAG://EXTRACT_DAG")]
    )
    dbt_tg = DbtTaskGroup(
        group_id="dbt_tg",
        dbt_project_name="jaffle_shop",
        conn_id="airflow_db",
        dbt_args={
            "schema": "public",
        },
        dag=dag,
    )
    e2 = EmptyOperator(
        task_id="some_extraction", outlets=[Dataset("DAG://EXTRACT_DAG")]
    )
    e1 >> dbt_tg >> e2
Principles
Astronomer Cosmos provides a framework for generating Apache Airflow DAGs from other workflows. Every provider comes with two main components:
- extractors: These are responsible for extracting the workflow from the provider and converting it into Task and Group objects.
- operators: These are used when the workflow is converted into a DAG. They are responsible for executing the tasks in the workflow.
Astronomer Cosmos is not opinionated in the sense that it does not enforce any rendering method. Rather, it comes with the tools to render workflows as Airflow DAGs, task groups, or individual tasks.
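The split between extracting a workflow and executing it can be illustrated with a toy model. This is not the Cosmos API: `ToyTask`, `ToyGroup`, `extract`, `execution_order`, and `run` are hypothetical names invented for this sketch, and the step names in the usage example are made up. It only shows the general idea of turning a raw dependency mapping into task objects and then running them in dependency order.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


@dataclass
class ToyTask:
    """One unit of work plus the ids of the tasks it depends on."""
    id: str
    upstream: List[str] = field(default_factory=list)


@dataclass
class ToyGroup:
    """A named collection of tasks extracted from one workflow."""
    id: str
    tasks: List[ToyTask] = field(default_factory=list)


def extract(workflow_name: str, dependencies: Dict[str, List[str]]) -> ToyGroup:
    """Extractor role: turn a {step: [upstream steps]} mapping into objects."""
    tasks = [ToyTask(id=step, upstream=list(ups)) for step, ups in dependencies.items()]
    return ToyGroup(id=workflow_name, tasks=tasks)


def execution_order(group: ToyGroup) -> List[str]:
    """Order task ids so that every task comes after its upstream tasks."""
    graph = {task.id: set(task.upstream) for task in group.tasks}
    return list(TopologicalSorter(graph).static_order())


def run(group: ToyGroup, actions: Dict[str, Callable[[], None]]) -> None:
    """Operator role: execute each task's action in dependency order."""
    for task_id in execution_order(group):
        actions[task_id]()


if __name__ == "__main__":
    group = extract(
        "jaffle_shop",
        {"stg_orders": [], "orders": ["stg_orders"], "report": ["orders"]},
    )
    print(execution_order(group))  # ['stg_orders', 'orders', 'report']
```

Keeping extraction separate from execution is what lets the same extracted objects be rendered in different ways, for example as a whole DAG or as a task group inside another DAG.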
Changelog
We follow Semantic Versioning for releases. Check CHANGELOG.rst for the latest changes.
Contributing Guide
All contributions, bug reports, bug fixes, documentation improvements, and enhancements are welcome.
A detailed overview of how to contribute can be found in the Contributing Guide.
As contributors and maintainers to this project, you are expected to abide by the Contributor Code of Conduct.
License