Dask + Delta Table
Dask Deltatable Reader
Reads a Delta table from a directory using the Dask engine.
To try out the package:
pip install dask-deltatable
Features:
- Reads the parquet files listed in the Delta log in parallel using the Dask engine
- Supports three remote filesystems: s3, azurefs, and gcsfs
- Supports some Delta features, such as:
  - Time travel
  - Schema evolution
  - Parquet filters
    - Row filter
    - Partition filter
- Queries Delta commit info (history)
- Vacuums old/unused parquet files
- Loads different versions of the data by datetime
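To illustrate what "reading parquet files based on the Delta log" and time travel by version involve, here is a minimal sketch that replays the JSON commit files under `_delta_log` and collects the parquet paths that are live at a given version. It is not this package's implementation: the helper names `active_files` and `_write_commit` are hypothetical, and the sketch ignores checkpoint files and every action other than `add` and `remove`.

```python
import json
import tempfile
from pathlib import Path


def active_files(table_path, version=None):
    """Replay the commit files in _delta_log and return the parquet paths
    that are live at `version` (or at the latest version when None).

    Hypothetical helper for illustration; checkpoints are ignored.
    """
    log_dir = Path(table_path) / "_delta_log"
    live = set()
    # Commit files are named with a zero-padded version number,
    # so sorting by file name also sorts by version.
    for commit in sorted(log_dir.glob("*.json")):
        if version is not None and int(commit.stem) > version:
            break
        # Each commit file holds one JSON action per line.
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return sorted(live)


def _write_commit(log_dir, version, actions):
    """Write a fake commit file for the demo below (hypothetical helper)."""
    path = log_dir / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(a) for a in actions))


# Build a tiny fake table log with three commits.
demo_table = Path(tempfile.mkdtemp())
demo_log = demo_table / "_delta_log"
demo_log.mkdir()
_write_commit(demo_log, 0, [{"add": {"path": "part-0.parquet"}}])
_write_commit(demo_log, 1, [{"add": {"path": "part-1.parquet"}}])
_write_commit(
    demo_log,
    2,
    [{"remove": {"path": "part-0.parquet"}}, {"add": {"path": "part-2.parquet"}}],
)

print(active_files(demo_table))             # ['part-1.parquet', 'part-2.parquet']
print(active_files(demo_table, version=1))  # ['part-0.parquet', 'part-1.parquet']
```

Once the set of live files is known, each one can be read as an independent task, which is what makes the read easy to parallelize. The same snapshot also shows which files a vacuum could consider unused: anything on disk that is no longer in the live set.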
Usage:
import dask_deltatable as ddt
# read the latest version of a delta table
ddt.read_delta_table("delta_path")
# read a specific version of the delta table
ddt.read_delta_table("delta_path", version=3)
# read the delta table as of a specific datetime
ddt.read_delta_table("delta_path", datetime="2018-12-19T16:39:57-08:00")
# read the complete delta history
ddt.read_delta_history("delta_path")
# read the delta history up to a given limit
ddt.read_delta_history("delta_path", limit=5)
# vacuum: delete old/unused parquet files (dry_run=False performs the deletion)
ddt.vacuum("delta_path", dry_run=False)
# can also read from s3, azurefs, gcsfs, etc.
ddt.read_delta_table("s3://bucket_name/delta_path", version=3)
# please ensure the credentials are properly configured as environment variables
# or in ~/.aws/credentials
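The datetime form of time travel shown above amounts to mapping a timestamp onto a table version. As a sketch of one plausible way to do that, the snippet below picks the latest commit whose timestamp is at or before the requested datetime. It assumes commit timestamps are available as milliseconds since the epoch; `version_at` and the sample `commits` list are hypothetical and are not part of this package's API.

```python
from datetime import datetime


def to_ms(iso_string):
    """Convert an ISO 8601 string with a UTC offset to epoch milliseconds."""
    return int(datetime.fromisoformat(iso_string).timestamp() * 1000)


def version_at(commits, when):
    """Return the latest version committed at or before `when`.

    `commits` is a list of (version, timestamp_ms) pairs and `when` is an
    ISO 8601 string. Hypothetical helper for illustration only.
    """
    target_ms = to_ms(when)
    candidates = [version for version, ts in commits if ts <= target_ms]
    if not candidates:
        raise ValueError(f"no commit exists at or before {when}")
    return max(candidates)


# Made-up commit history: one commit per day at midnight UTC.
commits = [
    (0, to_ms("2018-12-19T00:00:00+00:00")),
    (1, to_ms("2018-12-20T00:00:00+00:00")),
    (2, to_ms("2018-12-21T00:00:00+00:00")),
]

# 16:39:57 at UTC-08:00 is 00:39:57 UTC on the next day, so version 1 applies.
print(version_at(commits, "2018-12-19T16:39:57-08:00"))  # 1
# The same wall-clock time at UTC falls before the second commit.
print(version_at(commits, "2018-12-19T16:39:57+00:00"))  # 0
```

The two calls differ only in their UTC offset, which is why it is worth passing a datetime string with an explicit offset, as in the usage example.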