A general-purpose Python ETL/pipeline utility library, intended especially for use with Hive streaming.
Project description
transformpy is a Python 2/3 module for applying transforms to “streams” of data. Transforms can be applied to any Python iterable, so they work on continuous real-time streams as well as static ones (such as the contents of a file). The module is designed to use very little memory, except where clustering and/or aggregation routines require more. It was originally designed to allow Python transformations (maps and reductions) of data stored in Hive, using the Hadoop streaming paradigm.
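The streaming model described above can be illustrated with plain Python generators. The sketch below does not use the transformpy API; the function names (`parse_rows`, `sum_by_key`, `format_rows`) and the two-column tab-separated layout are assumptions made for illustration only. It shows a lazy map step followed by a reduce step over rows that are assumed to arrive already clustered by key, which is the usual arrangement for a reducer in Hadoop streaming.

```python
import itertools


def parse_rows(lines, sep="\t"):
    """Map step: lazily split each input line into a list of fields."""
    for line in lines:
        yield line.rstrip("\n").split(sep)


def sum_by_key(rows):
    """Reduce step: sum the second field over consecutive rows sharing a key.

    Assumes the rows are already clustered by the first field, so only
    the current row and a running total are held in memory.
    """
    for key, group in itertools.groupby(rows, key=lambda row: row[0]):
        yield key, sum(int(row[1]) for row in group)


def format_rows(pairs, sep="\t"):
    """Serialize (key, total) pairs back into delimited lines."""
    for key, total in pairs:
        yield "{0}{1}{2}".format(key, sep, total)


# Any iterable of lines works as input: a list, an open file, or sys.stdin.
sample = ["a\t1\n", "a\t2\n", "b\t5\n"]
result = list(format_rows(sum_by_key(parse_rows(sample))))
```

Because every stage is a generator, nothing is read until the final consumer asks for it. In a Hive streaming script the same chain would typically read from `sys.stdin` and print each output line to standard output.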
NOTE: TransformPy is not guaranteed to be API stable before version 1.0, but any changes from the current version should be small.
Download files
Source Distribution
File details
Details for the file transformpy-0.3.2.tar.gz.
File metadata
- Download URL: transformpy-0.3.2.tar.gz
- Upload date:
- Size: 5.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9d8be791193dd444a919d83d5f42e9c62f63225087dd2ee10effb17eee0b3280
MD5 | 956074c996f4163af83f4f59f6cf2499
BLAKE2b-256 | 66ebc358fc9453544d36e351008e0be6fe045f5689fee22d0b44cc616f4254c7