A general-purpose Python ETL/pipeline utility library, intended especially for use with Hive streaming.

Project description

transformpy is a Python 2/3 module for applying transforms to “streams” of data. The transforms can be applied to any Python iterable, so they work equally well on continuous real-time streams and on static streams (such as the lines of a file). It is designed to use very little memory, except where clustering and/or aggregation routines need to hold data. It was originally written to allow Python transformations (maps and reductions) of data stored in Hive, using the Hadoop streaming paradigm.
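The streaming idea described above can be sketched with plain Python generators. This is a minimal illustration only: it does not use transformpy's own API, and the names `parse_rows` and `sum_by_key` as well as the tab-separated two-column input are assumptions made for the example.

```python
import itertools


def parse_rows(lines, sep="\t"):
    """Map step: lazily split each text line into a list of fields."""
    for line in lines:
        yield line.rstrip("\n").split(sep)


def sum_by_key(rows):
    """Reduce step: sum the second field for each run of equal keys.

    Assumes the input is already grouped by key, so only one
    cluster of rows is consumed at a time.
    """
    for key, group in itertools.groupby(rows, key=lambda row: row[0]):
        yield key, sum(int(row[1]) for row in group)


# A static stream; any iterable of lines (for example sys.stdin in a
# Hadoop streaming job) could be passed in instead of this list.
static_stream = ["alice\t3\n", "alice\t4\n", "bob\t5\n"]
totals = list(sum_by_key(parse_rows(static_stream)))
print(totals)  # [('alice', 7), ('bob', 5)]
```

Because each stage is a generator, rows are pulled through the pipeline one at a time, and memory use grows only with the size of a single cluster during the aggregation step.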

NOTE: TransformPy is not guaranteed to be API-stable before version 1.0, but any changes relative to the current version should be small.

Download files

Source Distribution

transformpy-0.3.2.tar.gz (5.7 kB view details)

Uploaded Source

File details

Details for the file transformpy-0.3.2.tar.gz.

File metadata

  • Download URL: transformpy-0.3.2.tar.gz
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for transformpy-0.3.2.tar.gz
Algorithm    Hash digest
SHA256       9d8be791193dd444a919d83d5f42e9c62f63225087dd2ee10effb17eee0b3280
MD5          956074c996f4163af83f4f59f6cf2499
BLAKE2b-256  66ebc358fc9453544d36e351008e0be6fe045f5689fee22d0b44cc616f4254c7
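
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path is assumed to point at wherever the archive was saved.

```python
import hashlib

# SHA256 digest published for transformpy-0.3.2.tar.gz (copied from the list above).
EXPECTED_SHA256 = "9d8be791193dd444a919d83d5f42e9c62f63225087dd2ee10effb17eee0b3280"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """True if the file at path has the published SHA256 digest."""
    return sha256_of_file(path) == EXPECTED_SHA256
```

Calling `matches_published_hash("transformpy-0.3.2.tar.gz")` on the downloaded file should return `True` if the archive is intact.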
