Provider for Apache Airflow. Implements apache-airflow-providers-apache-beam package
Project description
Package apache-airflow-providers-apache-beam
Release: 4.2.0rc1
Provider package
This is a provider package for the apache.beam provider. All classes for this provider package are in the airflow.providers.apache.beam Python package.
You can find package information and changelog for the provider in the documentation.
Installation
You can install this package on top of an existing Airflow 2 installation (see Requirements below for the minimum supported Airflow version) via pip install apache-airflow-providers-apache-beam.
The package supports the following Python versions: 3.7, 3.8, 3.9, and 3.10.
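In practice, provider packages are usually installed together with a constraints file so that transitive dependencies stay consistent with the Airflow version in use. The sketch below follows the standard Airflow constraints URL pattern; the Airflow and Python versions shown are placeholders, not requirements of this package.

```shell
# Sketch: install the provider into an existing Airflow 2 environment.
# AIRFLOW_VERSION and PYTHON_VERSION are examples -- substitute your own.
AIRFLOW_VERSION=2.5.1
PYTHON_VERSION=3.9
pip install "apache-airflow-providers-apache-beam==4.2.0rc1" \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-${AIRFLOW_VERSION}/constraints-${PYTHON_VERSION}.txt"
```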
Requirements
PIP package | Version required
---|---
apache-airflow | >=2.3.0
apache-beam | >=2.33.0
Cross provider package dependencies
These are dependencies that may be needed in order to use all the features of the package. You need to install the specified provider packages in order to use them.
You can install such cross-provider dependencies when installing from PyPI. For example:
pip install apache-airflow-providers-apache-beam[google]
Dependent package | Extra
---|---
apache-airflow-providers-google | google
Changelog
4.2.0
Features
Add support for running a Beam Go pipeline with an executable binary (#28764)
Misc
Deprecate 'delegate_to' param in GCP operators and update docs (#29088)
4.1.1
Bug Fixes
Ensure Beam Go file downloaded from GCS still exists when referenced (#28664)
4.1.0
This release of the provider is only available for Airflow 2.3+ as explained in the Apache Airflow providers support policy.
Misc
Move min airflow version to 2.3.0 for all providers (#27196)
Features
Add backward compatibility with old versions of Apache Beam (#27263)
4.0.0
Breaking changes
This release of the provider is only available for Airflow 2.2+ as explained in the Apache Airflow providers support policy (https://github.com/apache/airflow/blob/main/README.md#support-for-providers).
Features
Added missing project_id to the wait_for_job (#24020)
Support impersonation service account parameter for Dataflow runner (#23961)
Misc
chore: Refactoring and Cleaning Apache Providers (#24219)
3.4.0
Features
Support serviceAccount attr for dataflow in the Apache beam
3.3.0
Features
Add recipe for BeamRunGoPipelineOperator (#22296)
Bug Fixes
Fix mistakenly added install_requires for all providers (#22382)
3.2.1
Misc
Add Trove classifiers in PyPI (Framework :: Apache Airflow :: Provider)
3.2.0
Features
Add support for BeamGoPipelineOperator (#20386)
Misc
Support for Python 3.10
3.1.0
Features
Use google cloud credentials when executing beam command in subprocess (#18992)
3.0.1
Misc
Optimise connection importing for Airflow 2.2.0
3.0.0
Breaking changes
Auto-apply apply_default decorator (#15667)
2.0.0
Breaking changes
Integration with the google provider
In version 2.0.0 of the provider we changed the way it integrates with the google provider. The previous versions of both providers caused conflicts when trying to install them together using pip > 20.2.4. The conflict was not detected by pip 20.2.4 and below, but it was there: the version of the Google BigQuery Python client did not match on both sides. As a result, when both the apache.beam and google providers were installed, some features of the BigQuery operators might not have worked properly. This was caused by apache-beam not yet supporting the new Google Python clients when the apache-beam[gcp] extra was used. The apache-beam[gcp] extra is used by the Dataflow operators, and while they might work with the newer version of the Google BigQuery Python client, this is not guaranteed.
This version introduces an additional extra requirement for the apache.beam extra of the google provider and, symmetrically, an additional requirement for the google extra of the apache.beam provider. Neither provider uses these extras by default, but you can specify them when installing the providers. The consequence is that some functionality of the Dataflow operators might not be available.
Unfortunately, the only complete solution to the problem is for apache-beam to migrate to the new (>=2.0.0) Google Python clients.
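If you are unsure whether your environment is affected by the conflict, a quick way to check is to inspect the installed package metadata. This is a minimal diagnostic sketch (Python 3.8+ for importlib.metadata); the list of package names to check is an assumption, so adjust it to your setup:

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Packages whose versions need to agree for the Dataflow/BigQuery operators
# to work together (names assumed for illustration):
for name in ("apache-beam", "google-cloud-bigquery", "apache-airflow-providers-google"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Comparing the reported versions against the constraints of both providers shows which side needs to be up- or downgraded.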
This is the extra for the google provider:
extras_require = (
{
# ...
"apache.beam": ["apache-airflow-providers-apache-beam", "apache-beam[gcp]"],
# ...
},
)
And likewise this is the extra for the apache.beam provider:
extras_require = ({"google": ["apache-airflow-providers-google", "apache-beam[gcp]"]},)
You can still run this with pip <= 20.2.4 and get the previous behaviour:
pip install apache-airflow-providers-google[apache.beam]
or
pip install apache-airflow-providers-apache-beam[google]
But be aware that some BigQuery operator functionality might not be available in this case.
1.0.1
Bug Fixes
Improve Apache Beam operators - refactor operator - common Dataflow logic (#14094)
Corrections in docs and tools after releasing provider RCs (#14082)
Remove WARNINGs from BeamHook (#14554)
1.0.0
Initial version of the provider.
Hashes for apache-airflow-providers-apache-beam-4.2.0rc1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 44e2189529f9fcaae0f3c743e874689d43d26dbcaa0cc175cc159970bf73042b
MD5 | 9e45c8e75a243dc939b55a7bfe8ebe66
BLAKE2b-256 | 652c6ce14657ef49f1edc53a3346ab8a8616c28c0dc99d839c5f98e4177b1e4d
Hashes for apache_airflow_providers_apache_beam-4.2.0rc1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 7620c5470358f85aef0ed79a005367ccdd81e7eef37dc23c421df08c23a5644b
MD5 | ce349a3da7d43685b6de8b99338f9f4f
BLAKE2b-256 | ac0496cc9b72325939851fe642a8ce4ae084acc3e3e7ce16616dd4046c4e827d