Data Aggregation and Transformation component for Monasca
Project description
Monasca Transform
monasca-transform is a data-driven aggregation engine that collects, groups, and aggregates existing individual Monasca metrics according to business requirements and publishes new transformed (derived) metrics to the Monasca Kafka queue.
Since the transformed metrics are published like any other Monasca metric, alarms can be set and triggered on them.
Monasca Transform uses Apache Spark to aggregate data. Apache Spark is a highly scalable, fast, in-memory, fault-tolerant, parallel data processing framework. All monasca-transform components are implemented in Python and use Spark's PySpark API to interact with Spark.
Monasca Transform transforms and aggregates incoming metrics in two phases.
In the first phase, a Spark Streaming application retrieves data from Kafka at a configurable stream interval (the default stream_interval is 10 minutes) and writes the data aggregated over that interval to the metrics_pre_hourly topic in Kafka.
In the second phase, which is kicked off every hour, all metrics in the metrics_pre_hourly topic are aggregated again, this time over the larger one-hour interval. These hourly aggregated metrics are published to the metrics topic in Kafka.
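The two-phase flow above can be sketched in plain Python. This is a simplified illustration of the windowing idea only, not the actual Spark Streaming code; the metric tuples and the `aggregate_window` helper are hypothetical:

```python
from collections import defaultdict

def aggregate_window(metrics, window_seconds):
    """Group metrics by name and sum their values over fixed windows.

    `metrics` is a list of (name, timestamp, value) tuples; the real
    pipeline performs this grouping with Spark transformations.
    """
    buckets = defaultdict(float)
    for name, ts, value in metrics:
        # Bucket key: metric name plus the window the timestamp falls in.
        buckets[(name, ts // window_seconds)] += value
    return [(name, window * window_seconds, total)
            for (name, window), total in buckets.items()]

# Phase 1: aggregate raw metrics over the 10-minute stream interval.
raw = [("cpu.total_time", 0, 5.0), ("cpu.total_time", 300, 7.0),
       ("cpu.total_time", 900, 4.0)]
pre_hourly = aggregate_window(raw, 600)    # written to metrics_pre_hourly

# Phase 2: every hour, roll the pre-hourly records up to one-hour windows.
hourly = aggregate_window(pre_hourly, 3600)  # published to the metrics topic
```

The key point the sketch shows is that phase two consumes phase one's already-aggregated output, so the hourly job never has to reprocess the raw metric stream.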
Use Cases handled by Monasca Transform
Please refer to Problem Description section on the Monasca/Transform wiki
Operation
Please refer to How Monasca Transform Operates section on the Monasca/Transform wiki
Architecture
Please refer to Architecture and Logical processing data flow sections on the Monasca/Transform wiki
To set up the development environment
monasca-transform uses DevStack as a common development environment. See the README.md in the devstack directory for details on how to include monasca-transform in a DevStack deployment.
Generic aggregation components
Monasca Transform uses a set of generic aggregation components which can be assembled into an aggregation pipeline.
Please refer to the generic-aggregation-components document for the list of available generic aggregation components.
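Conceptually, an aggregation pipeline is a chain of components in which each stage consumes the previous stage's output. The following plain-Python sketch illustrates that composition; the component names (`usage_component`, `setter_component`) are hypothetical stand-ins, not monasca-transform's actual classes:

```python
def usage_component(records):
    """Sum the value field across all records (a 'usage'-style stage)."""
    return {"value": sum(r["value"] for r in records)}

def setter_component(aggregated):
    """Attach metadata to the aggregated result (a 'setter'-style stage)."""
    aggregated["aggregation_period"] = "hourly"
    return aggregated

def run_pipeline(records, stages):
    """Feed records through each stage in order, threading the output."""
    result = records
    for stage in stages:
        result = stage(result)
    return result

out = run_pipeline([{"value": 2.0}, {"value": 3.0}],
                   [usage_component, setter_component])
# out == {"value": 5.0, "aggregation_period": "hourly"}
```

Because each stage only depends on the shape of its input, stages can be mixed and matched per metric, which is what makes a library of generic components reusable across pipelines.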
Create a new aggregation pipeline example
Generic aggregation components make it easy to build new aggregation pipelines for different Monasca metrics.
The create a new aggregation pipeline example shows how to write pre_transform_specs and transform_specs that build an aggregation pipeline for a new set of Monasca metrics, leveraging the existing generic aggregation components.
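For illustration, each spec is a small JSON document: a pre_transform_spec routes an incoming metric to one or more transform specs, and a transform_spec names the pipeline stages that aggregate it. The field names below are a simplified, hypothetical sketch of that shape; refer to the example linked above for the authoritative format:

```python
import json

# Hypothetical pre_transform_spec: maps an incoming metric name to a
# transform-spec id and lists the raw fields the pipeline requires.
pre_transform_spec = {
    "event_type": "cpu.total_time_sec",
    "metric_id_list": ["cpu_total_time_hourly"],
    "required_raw_fields_list": ["creation_time"],
}

# Hypothetical transform_spec: names the generic aggregation components
# (source, usage, setters, insert) that process the routed metric.
transform_spec = {
    "metric_id": "cpu_total_time_hourly",
    "aggregation_params_map": {
        "aggregation_pipeline": {
            "source": "streaming",
            "usage": "fetch_quantity",
            "setters": ["set_aggregated_metric_name"],
            "insert": ["insert_data"],
        },
        "aggregation_period": "hourly",
    },
}

print(json.dumps(transform_spec, indent=2))
```

Splitting routing (pre_transform_specs) from aggregation logic (transform_specs) means a new metric can often be onboarded by writing two small JSON records rather than any new code.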
Original proposal and blueprint
Original proposal: Monasca/Transform-proposal
Blueprint: monasca-transform blueprint
Project details
Release history
Download files
Source Distribution
Built Distribution
Hashes for monasca_transform-1.0.0.0rc1.tar.gz

| Algorithm | Hash digest |
|---|---|
| SHA256 | b7e3a051df37851460b74a17ea9fdc318dbc65141f45213c4da4fb1d9ac95f0c |
| MD5 | e779963f918f83112d26968dca6f3817 |
| BLAKE2b-256 | 13e9769d13588192384db8877ace4950223c37e389e54ca9b433f231c6c20544 |
Hashes for monasca_transform-1.0.0.0rc1-py3-none-any.whl

| Algorithm | Hash digest |
|---|---|
| SHA256 | fe64e379dc9515cc5e8ae80df63ca2d2b31d549f3e2992acf1b4ebcc75689b31 |
| MD5 | c3013dec0c2feab901437ef3c6b66454 |
| BLAKE2b-256 | d66947967e784b94cd7ae783edf9ae646e69d679ef6202747fc223ebf20030d3 |