Provider package apache-airflow-providers-google for Apache Airflow
Package apache-airflow-providers-google
Release: 6.3.0
Google services including:
Google Workspace (formerly Google Suite)
Provider package
This is a provider package for the google provider. All classes for this provider package are in the airflow.providers.google Python package.
You can find package information and changelog for the provider in the documentation.
Installation
You can install this package on top of an existing Airflow 2.1+ installation via pip install apache-airflow-providers-google
The package supports the following Python versions: 3.7, 3.8, 3.9
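As a quick pre-install check, the supported-version list above can be turned into a small guard. This is only a sketch: the helper name `is_supported_python` is made up for illustration and is not part of the provider.

```python
import sys

# (major, minor) pairs listed as supported for this release of the provider.
SUPPORTED_PYTHONS = {(3, 7), (3, 8), (3, 9)}


def is_supported_python(version_info=None):
    """Return True if the interpreter's (major, minor) pair is in the list.

    `version_info` defaults to the running interpreter; any tuple whose
    first two items are major and minor can be passed instead.
    """
    info = sys.version_info if version_info is None else version_info
    return (info[0], info[1]) in SUPPORTED_PYTHONS
```

Calling `is_supported_python()` with no argument checks the running interpreter, which makes it usable in a bootstrap script before running pip.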
PIP requirements
PIP package | Version required |
---|---|
apache-airflow | >=2.1.0 |
PyOpenSSL | |
google-ads | >=12.0.0,<14.0.1 |
google-api-core | >=1.25.1,<3.0.0 |
google-api-python-client | >=1.6.0,<2.0.0 |
google-auth-httplib2 | >=0.0.1 |
google-auth | >=1.0.0,<3.0.0 |
google-cloud-automl | >=2.1.0,<3.0.0 |
google-cloud-bigquery-datatransfer | >=3.0.0,<4.0.0 |
google-cloud-bigtable | >=1.0.0,<2.0.0 |
google-cloud-build | >=3.0.0,<4.0.0 |
google-cloud-container | >=0.1.1,<2.0.0 |
google-cloud-datacatalog | >=3.0.0,<4.0.0 |
google-cloud-dataproc-metastore | >=1.2.0,<2.0.0 |
google-cloud-dataproc | >=3.1.0,<4.0.0 |
google-cloud-dlp | >=0.11.0,<2.0.0 |
google-cloud-kms | >=2.0.0,<3.0.0 |
google-cloud-language | >=1.1.1,<2.0.0 |
google-cloud-logging | >=2.1.1,<3.0.0 |
google-cloud-memcache | >=0.2.0,<1.1.0 |
google-cloud-monitoring | >=2.0.0,<3.0.0 |
google-cloud-os-login | >=2.0.0,<3.0.0 |
google-cloud-pubsub | >=2.0.0,<3.0.0 |
google-cloud-redis | >=2.0.0,<3.0.0 |
google-cloud-secret-manager | >=0.2.0,<2.0.0 |
google-cloud-spanner | >=1.10.0,<2.0.0 |
google-cloud-speech | >=0.36.3,<2.0.0 |
google-cloud-storage | >=1.30,<2.0.0 |
google-cloud-tasks | >=2.0.0,<3.0.0 |
google-cloud-texttospeech | >=0.4.0,<2.0.0 |
google-cloud-translate | >=1.5.0,<2.0.0 |
google-cloud-videointelligence | >=1.7.0,<2.0.0 |
google-cloud-vision | >=0.35.2,<2.0.0 |
google-cloud-workflows | >=0.1.0,<2.0.0 |
grpcio-gcp | >=0.2.2 |
httpx | |
json-merge-patch | ~=0.2 |
pandas-gbq | <0.15.0 |
pandas | >=0.17.1, <2.0 |
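To make the constraint strings in the table concrete, here is a deliberately simplified checker for comma-separated clauses such as `>=1.30,<2.0.0`. It is a sketch under stated assumptions: it only understands plain dotted integer versions and the operators `>=`, `<=`, `==`, `>`, `<`. It rejects `~=` and does not handle pre-releases. The function names are illustrative. For real dependency checks, a full PEP 440 implementation such as the `packaging` library is the safer choice.

```python
import operator
import re

# Comparison operators this sketch understands (no "~=" or "!=").
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.30' into (1, 30); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(left, right):
    """Right-pad the shorter tuple with zeros so 1.30 compares like 1.30.0."""
    width = max(len(left), len(right))
    return left + (0,) * (width - len(left)), right + (0,) * (width - len(right))


def satisfies(version, constraint):
    """Return True if `version` meets every clause in `constraint`."""
    candidate = parse_version(version)
    for clause in constraint.split(","):
        clause = clause.strip()
        if not clause:
            continue  # an empty constraint (e.g. PyOpenSSL) accepts anything
        match = re.fullmatch(r"(>=|<=|==|>|<)\s*([0-9.]+)", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        left, right = _pad(candidate, parse_version(bound))
        if not _OPS[op](left, right):
            return False
    return True
```

For example, `satisfies("1.44.0", ">=1.30,<2.0.0")` evaluates both clauses for google-cloud-storage and returns True only if both hold.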
Cross provider package dependencies
These are dependencies that might be needed to use all the features of the package. You need to install the specified provider packages in order to use them.
You can install such cross-provider dependencies when installing from PyPI. For example:
pip install apache-airflow-providers-google[amazon]
Dependent package | Extra |
---|---|
amazon | |
apache.beam | |
apache.cassandra | |
cncf.kubernetes | |
microsoft.azure | |
microsoft.mssql | |
mysql | |
oracle | |
postgres | |
presto | |
salesforce | |
sftp | |
ssh | |
trino | |
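When several cross-provider dependencies are needed at once, the requirement string passed to pip can be assembled programmatically. The sketch below assumes the names in the table above are used as extras in the same way as `amazon` in the pip example. The helper `requirement_with_extras` is hypothetical and only builds a string; it does not run pip.

```python
PROVIDER = "apache-airflow-providers-google"

# Names taken from the cross-provider table above.
KNOWN_EXTRAS = (
    "amazon", "apache.beam", "apache.cassandra", "cncf.kubernetes",
    "microsoft.azure", "microsoft.mssql", "mysql", "oracle", "postgres",
    "presto", "salesforce", "sftp", "ssh", "trino",
)


def requirement_with_extras(extras):
    """Build a requirement string such as 'pkg[amazon,sftp]'.

    Raises ValueError for names that are not in the table.
    """
    extras = list(extras)
    unknown = sorted(set(extras) - set(KNOWN_EXTRAS))
    if unknown:
        raise ValueError(f"unknown extras: {unknown}")
    if not extras:
        return PROVIDER
    return f"{PROVIDER}[{','.join(extras)}]"
```

The resulting string can be handed to `pip install`. In most shells it should be quoted, because square brackets may be treated as glob characters.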
Changelog
6.3.0
Features
Add optional location to bigquery data transfer service (#15088) (#20221)
Add Google Cloud Tasks how-to documentation (#20145)
Added example DAG for MSSQL to Google Cloud Storage (GCS) (#19873)
Support regional GKE cluster (#18966)
Delete pods by default in KubernetesPodOperator (#20575)
Bug Fixes
Fixes docstring for PubSubCreateSubscriptionOperator (#20237)
Fix missing get_backup method for Dataproc Metastore (#20326)
BigQueryHook fix typo in run_load doc string (#19924)
Fix passing the gzip compression parameter on sftp_to_gcs. (#20553)
switch to follow_redirects on httpx.get call in CloudSQL provider (#20239)
avoid deprecation warnings in BigQuery transfer operators (#20502)
Change download_video parameter to resourceName (#20528)
Fix big query to mssql/mysql transfer issues (#20001)
Fix setting of project ID in 'provide_authorized_gcloud' (#20428)
Misc
Move source_objects datatype check out of GCSToBigQueryOperator.__init__ (#20347)
Organize S3 Classes in Amazon Provider (#20167)
Providers facebook hook multiple account (#19377)
Remove deprecated method call (blob.download_as_string) (#20091)
Remove deprecated template_fields from GoogleDriveToGCSOperator (#19991)
Note: optional features of the apache-airflow-providers-facebook and apache-airflow-providers-amazon providers require newer versions of those providers (as specified in the dependencies).
6.2.0
Features
Added wait mechanism to the DataprocJobSensor to avoid 509 errors when Job is not available (#19740)
Add support in GCP connection for reading key from Secret Manager (#19164)
Add dataproc metastore operators (#18945)
Add support of 'path' parameter for GCloud Storage Transfer Service operators (#17446)
Move 'bucket_name' validation out of '__init__' in Google Marketing Platform operators (#19383)
Create dataproc serverless spark batches operator (#19248)
updates pipeline_timeout CloudDataFusionStartPipelineOperator (#18773)
Support impersonation_chain parameter in the GKEStartPodOperator (#19518)
Bug Fixes
Fix badly merged impersonation in GKEPodOperator (#19696)
6.1.0
Features
Add value to 'namespaceId' of query (#19163)
Add pre-commit hook for common misspelling check in files (#18964)
Support query timeout as an argument in CassandraToGCSOperator (#18927)
Update BigQueryCreateExternalTableOperator doc and parameters (#18676)
Replacing non-attribute template_fields for BigQueryToMsSqlOperator (#19052)
Upgrade the Dataproc package to 3.0.0 and migrate from v1beta2 to v1 api (#18879)
Use google cloud credentials when executing beam command in subprocess (#18992)
Replace default api_version of FacebookAdsReportToGcsOperator (#18996)
Dataflow Operators - use project and location from job in on_kill method. (#18699)
Bug Fixes
Fix hard-coded /tmp directory in CloudSQL Hook (#19229)
Fix bug in Dataflow hook when no jobs are returned (#18981)
Fix BigQueryToMsSqlOperator documentation (#18995)
Move validation of templated input params to run after the context init (#19048)
Google provider catch invalid secret name (#18790)
6.0.0
Breaking changes
Migrate Google Cloud Build from Discovery API to Python SDK (#18184)
Features
Add index to the dataset name to have separate dataset for each example DAG (#18459)
Add missing __init__.py files for some test packages (#18142)
Add possibility to run DAGs from system tests and see DAGs logs (#17868)
Rename AzureDataLakeStorage to ADLS (#18493)
Make next_dagrun_info take a data interval (#18088)
Use parameters instead of params (#18143)
New google operator: SQLToGoogleSheetsOperator (#17887)
Bug Fixes
Fix part of Google system tests (#18494)
Fix kubernetes engine system test (#18548)
Fix BigQuery system test (#18373)
Fix error when create external table using table resource (#17998)
Fix 'BigQuery' data extraction in 'BigQueryToMySqlOperator' (#18073)
Fix providers tests in main branch with eager upgrades (#18040)
fix(CloudSqlProxyRunner): don't query connections from Airflow DB (#18006)
Remove check for at least one schema in GCSToBigquery (#18150)
deduplicate running jobs on BigQueryInsertJobOperator (#17496)
5.1.0
Features
Add error check for config_file parameter in GKEStartPodOperator (#17700)
Gcp ai hyperparameter tuning (#17790)
Allow omission of 'initial_node_count' if 'node_pools' is specified (#17820)
[Airflow 13779] use provided parameters in the wait_for_pipeline_state hook (#17137)
Enable specifying dictionary paths in 'template_fields_renderers' (#17321)
Don't cache Google Secret Manager client (#17539)
[AIRFLOW-9300] Add DatafusionPipelineStateSensor and async option to the CloudDataFusionStartPipelineOperator (#17787)
Bug Fixes
GCP Secret Manager error handling for missing credentials (#17264)
Misc
Optimise connection importing for Airflow 2.2.0
Adds secrets backend/logging/auth information to provider yaml (#17625)
5.0.0
Breaking changes
Updated GoogleAdsHook to support newer API versions after google deprecated v5. Google Ads v8 is the new default API. (#17111)
Google Ads Hook: Support newer versions of the google-ads library (#17160)
Features
Standardise dataproc location param to region (#16034)
Adding custom Salesforce connection type + SalesforceToS3Operator updates (#17162)
Bug Fixes
Update alias for field_mask in Google Memcache (#16975)
fix: dataprocpysparkjob project_id as self.project_id (#17075)
Fix GCStoGCS operator with replace disabled and existing destination object (#16991)
4.0.0
Breaking changes
Auto-apply apply_default decorator (#15667)
Move plyvel to google provider extra (#15812)
Fixes AzureFileShare connection extras (#16388)
Features
Add extra links for google dataproc (#10343)
add oracle connection link (#15632)
pass wait_for_done parameter down to _DataflowJobsController (#15541)
Use api version only in GoogleAdsHook not operators (#15266)
Implement BigQuery Table Schema Update Operator (#15367)
Add BigQueryToMsSqlOperator (#15422)
Bug Fixes
Fix: GCS To BigQuery source_object (#16160)
Fix: Unnecessary downloads in 'GCSToLocalFilesystemOperator' (#16171)
Fix bigquery type error when export format is parquet (#16027)
Fix argument ordering and type of bucket and object (#15738)
Fix sql_to_gcs docstring lint error (#15730)
fix: ensure datetime-related values fully compatible with MySQL and BigQuery (#15026)
Fix deprecation warnings location in google provider (#16403)
3.0.0
Breaking changes
Change in AutoMLPredictOperator
The params parameter in the airflow.providers.google.cloud.operators.automl.AutoMLPredictOperator class was renamed to operation_params because it conflicted with the params parameter of the BaseOperator class.
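If operator arguments are kept in a dictionary (for example, loaded from a config file), the rename can be applied mechanically. The following is a minimal sketch. The helper `migrate_automl_predict_kwargs` is hypothetical, and it assumes that any `params` key in the dictionary was meant as the prediction parameters rather than as the BaseOperator `params`.

```python
def migrate_automl_predict_kwargs(kwargs):
    """Return a copy of `kwargs` with 'params' renamed to 'operation_params'.

    Assumption: 'params' held the prediction parameters that
    AutoMLPredictOperator accepted before provider version 3.0.0.
    """
    migrated = dict(kwargs)  # never mutate the caller's dictionary
    if "params" in migrated:
        if "operation_params" in migrated:
            raise ValueError("both 'params' and 'operation_params' were given")
        migrated["operation_params"] = migrated.pop("params")
    return migrated
```

The migrated dictionary can then be unpacked into the operator call. Dictionaries that already use `operation_params` pass through unchanged.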
Integration with the apache.beam provider
In version 3.0.0 of the provider, we changed how it integrates with the apache.beam provider. Earlier versions of the two providers conflicted when installed together with PIP > 20.2.4. PIP 20.2.4 and below did not detect the conflict, but it was still there: the two providers required mismatched versions of the Google BigQuery Python client. As a result, when both the apache.beam and google providers were installed, some features of the BigQuery operators might not work properly. This was caused by the apache-beam client not yet supporting the new Google Python clients when the apache-beam[gcp] extra was used. The apache-beam[gcp] extra is used by the Dataflow operators, and while they might work with the newer version of the Google BigQuery Python client, this is not guaranteed.
This version introduces an additional requirement for the apache.beam extra of the google provider and, symmetrically, an additional requirement for the google extra of the apache.beam provider. Neither provider uses these extras by default, but you can specify them when installing the providers. As a consequence, some functionality of the Dataflow operators might not be available by default.
Unfortunately, the only complete solution to the problem is for apache.beam to migrate to the new (>=2.0.0) Google Python clients.
This is the extra for the google provider:
extras_require = (
    {
        # ...
        "apache.beam": ["apache-airflow-providers-apache-beam", "apache-beam[gcp]"],
        # ...
    },
)
And likewise this is the extra for the apache.beam provider:
extras_require = ({"google": ["apache-airflow-providers-google", "apache-beam[gcp]"]},)
You can still run this with PIP version <= 20.2.4 and go back to the previous behaviour:
pip install apache-airflow-providers-google[apache.beam]
or
pip install apache-airflow-providers-apache-beam[google]
But be aware that some BigQuery operators functionality might not be available in this case.
Features
[Airflow-15245] - passing custom image family name to the DataProcClusterCreateoperator (#15250)
Bug Fixes
Bugfix: Fix rendering of 'object_name' in 'GCSToLocalFilesystemOperator' (#15487)
Fix typo in DataprocCreateClusterOperator (#15462)
Fixes wrongly specified path for leveldb hook (#15453)
2.2.0
Features
Adds 'Trino' provider (with lower memory footprint for tests) (#15187)
update remaining old import paths of operators (#15127)
Override project in dataprocSubmitJobOperator (#14981)
GCS to BigQuery Transfer Operator with Labels and Description parameter (#14881)
Add GCS timespan transform operator (#13996)
Add job labels to bigquery check operators. (#14685)
Use libyaml C library when available. (#14577)
Add Google leveldb hook and operator (#13109) (#14105)
Bug Fixes
Google Dataflow Hook to handle no Job Type (#14914)
2.1.0
Features
Corrects order of argument in docstring in GCSHook.download method (#14497)
Refactor SQL/BigQuery/Qubole/Druid Check operators (#12677)
Add GoogleDriveToLocalOperator (#14191)
Add 'exists_ok' flag to BigQueryCreateEmptyTable(Dataset)Operator (#14026)
Add materialized view support for BigQuery (#14201)
Add BigQueryUpdateTableOperator (#14149)
Add param to CloudDataTransferServiceOperator (#14118)
Add gdrive_to_gcs operator, drive sensor, additional functionality to drive hook (#13982)
Improve GCSToSFTPOperator paths handling (#11284)
Bug Fixes
Fixes to dataproc operators and hook (#14086)
#9803 fix bug in copy operation without wildcard (#13919)
2.0.0
Breaking changes
Updated google-cloud-* libraries
This release of the provider package contains third-party library updates, which may require updating your DAG files or custom hooks and operators, if you were using objects from those libraries. Updating of these libraries is necessary to be able to use new features made available by new versions of the libraries and to obtain bug fixes that are only available for new versions of the library.
Details are covered in the UPDATING.md files for each library, but there are some details that you should pay attention to.
Library name | Previous constraints | Current constraints | Upgrade Documentation |
---|---|---|---|
| | >=0.4.0,<2.0.0 | >=2.1.0,<3.0.0 | |
| | >=0.4.0,<2.0.0 | >=3.0.0,<4.0.0 | |
| | >=0.5.0,<0.8 | >=3.0.0,<4.0.0 | |
| | >=1.0.1,<2.0.0 | >=2.2.0,<3.0.0 | |
| | >=1.2.1,<2.0.0 | >=2.0.0,<3.0.0 | |
| | >=1.14.0,<2.0.0 | >=2.0.0,<3.0.0 | |
| | >=0.34.0,<2.0.0 | >=2.0.0,<3.0.0 | |
| | >=1.0.0,<2.0.0 | >=2.0.0,<3.0.0 | |
| | >=1.0.0,<2.0.0 | >=2.0.0,<3.0.0 | |
| | >=1.2.1,<2.0.0 | >=2.0.0,<3.0.0 | |
The field names use the snake_case convention
If your DAG uses an object from the above-mentioned libraries passed by XCom, you must update the names of the fields that are read. Previously, the fields used the camelCase convention; now they use the snake_case convention.
Before:
set_acl_permission = GCSBucketCreateAclEntryOperator(
    task_id="gcs-set-acl-permission",
    bucket=BUCKET_NAME,
    entity="user-{{ task_instance.xcom_pull('get-instance')['persistenceIamIdentity']"
    ".split(':', 2)[1] }}",
    role="OWNER",
)
After:
set_acl_permission = GCSBucketCreateAclEntryOperator(
    task_id="gcs-set-acl-permission",
    bucket=BUCKET_NAME,
    entity="user-{{ task_instance.xcom_pull('get-instance')['persistence_iam_identity']"
    ".split(':', 2)[1] }}",
    role="OWNER",
)
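For code that still holds dictionaries with the old field names, the renaming shown in the Before/After example can be approximated with a small converter. This is an illustrative sketch, not something the provider ships. The helpers `camel_to_snake` and `rename_keys` are hypothetical, assume string keys, and do not split runs of capitals (acronyms) into separate words.

```python
import re


def camel_to_snake(name):
    """Convert 'persistenceIamIdentity' to 'persistence_iam_identity'.

    An underscore is inserted before every capital letter that follows a
    lowercase letter or a digit, then the whole name is lowercased.
    """
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


def rename_keys(value):
    """Recursively apply camel_to_snake to every dictionary key."""
    if isinstance(value, dict):
        return {camel_to_snake(key): rename_keys(item) for key, item in value.items()}
    if isinstance(value, list):
        return [rename_keys(item) for item in value]
    return value
```

In templated fields, as in the example above, the key inside the Jinja expression still has to be changed by hand, because the template is rendered against the new snake_case dictionary.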
Features
Add Apache Beam operators (#12814)
Add Google Cloud Workflows Operators (#13366)
Replace 'google_cloud_storage_conn_id' by 'gcp_conn_id' when using 'GCSHook' (#13851)
Add How To Guide for Dataflow (#13461)
Generalize MLEngineStartTrainingJobOperator to custom images (#13318)
Add Parquet data type to BaseSQLToGCSOperator (#13359)
Add DataprocCreateWorkflowTemplateOperator (#13338)
Add OracleToGCS Transfer (#13246)
Add timeout option to gcs hook methods. (#13156)
Add regional support to dataproc workflow template operators (#12907)
Add project_id to client inside BigQuery hook update_table method (#13018)
Bug Fixes
Fix four bugs in StackdriverTaskHandler (#13784)
Decode Remote Google Logs (#13115)
Fix and improve GCP BigTable hook and system test (#13896)
updated Google DV360 Hook to fix SDF issue (#13703)
Fix insert_all method of BigQueryHook to support tables without schema (#13138)
Fix Google BigQueryHook method get_schema() (#13136)
Fix Data Catalog operators (#13096)
1.0.0
Initial version of the provider.
Hashes for apache-airflow-providers-google-6.3.0.tar.gz
Algorithm | Hash digest |
---|---|
SHA256 | b59c707a26a2afa95065a3c425004ac89bbefa74927bff1629effdfffcb2e669 |
MD5 | 0a850d8e77bef1f61ce14944b450fd50 |
BLAKE2b-256 | c0c4962bb872a05493012846edf86a1e6cfd8deab9b9e563372d285ec691acf3 |
Hashes for apache_airflow_providers_google-6.3.0-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | fc4281ea00b5bc83ae3a1f2c2dbe55fc479918fcae703b2bb8167409b16187fd |
MD5 | 0da06d7d88eaf78a8e8db775c5796196 |
BLAKE2b-256 | 6470a7e17bebb74dbbeeb22005c13fc6830454062bac0d983cde42810003353a |
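The published digests can be compared against a downloaded file with the standard library. A minimal sketch, assuming the file has already been downloaded to a local path. The function names are illustrative, and only the SHA256 digest is checked.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """Compare a file's SHA256 with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the wheel listed above, `expected_hex` would be the SHA256 value from the table. A mismatch means the file is not byte-identical to the published artifact.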