An Apache Airflow provider package built by Astronomer to integrate with Ray.

Ray provider

Docs | Getting Started | Slack (#airflow-ray) | Contribute

Orchestrate your Ray jobs using Apache Airflow®, combining Airflow's workflow management with Ray's distributed computing capabilities.

Benefits of using this provider include:

  • Integration: Incorporate Ray jobs into Airflow DAGs for unified workflow management.
  • Distributed computing: Use Ray's distributed capabilities within Airflow pipelines for scalable workloads such as ETL and LLM fine-tuning.
  • Monitoring: Track Ray job progress through Airflow's user interface.
  • Dependency management: Define and manage dependencies between Ray jobs and other tasks in DAGs.
  • Resource allocation: Run Ray jobs alongside other task types within a single pipeline.

Quickstart

Check out the Getting Started guide in our docs. Sample DAGs are available at example_dags/.

Sample DAGs

Example 1: Using @ray.task for the job lifecycle

The example below shows how to use the @ray.task decorator to manage the full lifecycle of a Ray cluster: setup, job execution, and teardown. The configuration for the decorator can be provided statically or at runtime.

This approach is ideal for jobs that require a dedicated, short-lived cluster. It optimizes resource usage by cleaning up the cluster once the task completes.

https://github.com/astronomer/astro-provider-ray/blob/bd6d847818be08fae78bc1e4c9bf3334adb1d2ee/example_dags/ray_taskflow_example.py#L1-L57
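The linked DAG is the authoritative example. As a rough illustration of the lifecycle pattern described above, the following standard-library sketch mimics a decorator that sets up a cluster, runs the job, and always tears the cluster down, with the configuration given either as a dict (static) or as a callable evaluated when the task runs (runtime). It does not use the provider's API: `ephemeral_cluster_task`, `EVENTS`, and the `cluster_name` and `num_cpus` keys are hypothetical names chosen for this sketch.

```python
from functools import wraps
from typing import Any, Callable, Dict, List, Union

Config = Dict[str, Any]

# Records lifecycle events so the ordering is easy to inspect.
EVENTS: List[str] = []


def ephemeral_cluster_task(config: Union[Config, Callable[[], Config]]):
    """Hypothetical stand-in for a lifecycle-managing task decorator.

    `config` is either a dict (static) or a zero-argument callable that is
    evaluated only when the task runs (runtime).
    """

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            resolved = config() if callable(config) else config
            EVENTS.append(f"setup:{resolved['cluster_name']}")
            try:
                EVENTS.append("run")
                return func(*args, **kwargs)
            finally:
                # Teardown runs even if the job raises, so the cluster
                # never outlives the task.
                EVENTS.append(f"teardown:{resolved['cluster_name']}")

        return wrapper

    return decorator


# Static configuration: known when the module is parsed.
STATIC_CONFIG = {"cluster_name": "static-cluster", "num_cpus": 1}


@ephemeral_cluster_task(STATIC_CONFIG)
def square_sum(values: List[int]) -> int:
    return sum(v * v for v in values)


# Runtime configuration: the callable is evaluated on each task run.
@ephemeral_cluster_task(lambda: {"cluster_name": "runtime-cluster", "num_cpus": 2})
def failing_job() -> None:
    raise RuntimeError("job failed")
```

Calling `square_sum([1, 2, 3])` records setup, run, and teardown in that order. `failing_job()` raises, yet teardown is still recorded, which is the property that makes a short-lived cluster safe to use.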

Example 2: Using SetupRayCluster, SubmitRayJob & DeleteRayCluster

This example shows how to use separate operators for cluster setup, job submission, and teardown, providing more granular control over the process.

This approach allows for more complex workflows involving Ray clusters.

Key Points:

  • Uses SetupRayCluster, SubmitRayJob, and DeleteRayCluster operators separately.
  • Allows for multiple jobs to be submitted to the same cluster before deletion.
  • Demonstrates how to pass cluster information between tasks using XCom.

This method is ideal for scenarios where you need fine-grained control over the cluster lifecycle, such as running multiple jobs on the same cluster or keeping the cluster alive for a certain period.

https://github.com/astronomer/astro-provider-ray/blob/bd6d847818be08fae78bc1e4c9bf3334adb1d2ee/example_dags/setup-teardown.py#L1-L44
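The linked DAG shows the real operators. To make the control flow concrete, here is a minimal standard-library sketch of the same shape: one setup step publishes the cluster address, several job submissions read it, and a delete step always runs last. Everything here is a hypothetical stand-in, including the in-memory `xcom` dict, the function names, the `dashboard` key, and the example address. None of it is the provider's API.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical in-memory stand-in for Airflow's XCom store,
# keyed by (task_id, key).
xcom: Dict[Tuple[str, str], Any] = {}
log: List[str] = []


def setup_cluster(task_id: str) -> None:
    # Publish the cluster address for downstream tasks (address is made up).
    xcom[(task_id, "dashboard")] = "http://ray-head.example:8265"
    log.append("setup")


def submit_job(entrypoint: str, setup_task_id: str) -> str:
    # Pull the address published by the setup step instead of hard-coding it.
    address = xcom[(setup_task_id, "dashboard")]
    log.append(f"submit:{entrypoint}")
    return f"{entrypoint} -> {address}"


def delete_cluster(setup_task_id: str) -> None:
    xcom.pop((setup_task_id, "dashboard"), None)
    log.append("delete")


def run_pipeline(entrypoints: List[str]) -> List[str]:
    setup_cluster("setup_ray_cluster")
    try:
        # Several jobs share the one cluster before it is deleted.
        return [submit_job(e, "setup_ray_cluster") for e in entrypoints]
    finally:
        # Deletion runs even if a submission fails.
        delete_cluster("setup_ray_cluster")
```

In a real DAG, Airflow's scheduler and task dependencies take the place of `run_pipeline`. The sketch only illustrates why separating the three steps lets multiple jobs reuse one cluster.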

Getting Involved

Platform          | Purpose                             | Est. response time
Discussion Forum  | General inquiries and discussions   | < 3 days
GitHub Issues     | Bug reports and feature requests    | < 1-2 days
Slack             | Quick questions and real-time chat  | 12 hrs

Changelog

We follow Semantic Versioning for releases. Check CHANGELOG.rst for the latest changes.

Contributing Guide

All contributions, including bug reports, bug fixes, documentation improvements, and enhancements, are welcome.

A detailed overview of how to contribute can be found in the Contributing Guide.

License

Apache 2.0 License
