monasca-notification

Reads alarms from Kafka and then notifies the customer using their configured notification method.

Team and repository tags

https://governance.openstack.org/tc/badges/monasca-notification.svg

Notification Engine

This engine reads alarms from Kafka and then notifies the customer using the configured notification method. Multiple notification and retry engines can run in parallel, up to one per available Kafka partition. Zookeeper is used to negotiate access to the Kafka partitions whenever a new process joins or leaves the working set.

Architecture

The notification engine generates notifications using the following steps:

1. Read alarms from Kafka, with no auto commit (monasca_common.kafka.KafkaConsumer class).
2. Determine the notification type for an alarm by reading from MySQL (AlarmProcessor class).
3. Send the notification (NotificationProcessor class).
4. Add successful notifications to a sent notification topic (NotificationEngine class).
5. Add failed notifications to a retry topic (NotificationEngine class).
6. Commit the offset to Kafka (KafkaConsumer class).
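
The six steps above can be sketched as a small loop. This is only an illustration under stated assumptions: `InMemoryTopic`, `run_notification_engine`, `lookup`, and `send` are hypothetical names standing in for Kafka, the MySQL lookup, and the notification plugins. They are not the monasca-notification or monasca_common API.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic; not the monasca_common API."""

    messages: list = field(default_factory=list)
    committed_offset: int = 0  # position of the next unprocessed message

    def publish(self, message):
        self.messages.append(message)


def run_notification_engine(alarm_topic, sent_topic, retry_topic, lookup, send):
    """Process every uncommitted alarm, committing only after it is handled."""
    while alarm_topic.committed_offset < len(alarm_topic.messages):
        # Step 1: read the alarm without committing its offset.
        alarm = alarm_topic.messages[alarm_topic.committed_offset]
        # Step 2: look up the notifications configured for this alarm.
        for notification in lookup(alarm):
            # Step 3: try to send; `send` returns True on success.
            if send(notification):
                sent_topic.publish(notification)   # Step 4
            else:
                retry_topic.publish(notification)  # Step 5
        # Step 6: commit only once the alarm has been fully processed.
        alarm_topic.committed_offset += 1
```

Because the offset advances only at the end of each iteration, a crash in the middle of the loop leaves the current alarm uncommitted, so it is read again after a restart.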

The notification engine uses three Kafka topics:

- alarm_topic: Alarms inbound to the notification engine.
- notification_topic: Successfully sent notifications.
- notification_retry_topic: Failed notifications.

A retry engine runs in parallel with the notification engine and gives any failed notification a configurable number of extra chances at success.

The retry engine generates notifications using the following steps:

1. Read notification JSON data from Kafka, with no auto commit (KafkaConsumer class).
2. Rebuild the notification that failed (RetryEngine class).
3. Send the notification (NotificationProcessor class).
4. Add successful notifications to a sent notification topic (RetryEngine class).
5. Add failed notifications that have not hit the retry limit back to the retry topic (RetryEngine class).
6. Discard failed notifications that have hit the retry limit (RetryEngine class).
7. Commit the offset to Kafka (KafkaConsumer class).
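
The routing decision in the retry steps above can be sketched as a single function. This is a minimal sketch, not the RetryEngine implementation: the `retry_count` field, the `max_attempts` parameter, and the `handle_retry` and `send` names are assumptions made for illustration.

```python
import json


def handle_retry(raw_message, send, max_attempts):
    """Decide where one message read from the retry topic should go.

    Returns (destination, notification). The destination is
    "notification_topic", "notification_retry_topic", or None when the
    notification is discarded. `retry_count` is a hypothetical field name.
    """
    # Rebuild the failed notification from its JSON form.
    notification = json.loads(raw_message)
    if send(notification):
        return "notification_topic", notification
    # Count this failed attempt, then requeue or discard.
    notification["retry_count"] = notification.get("retry_count", 0) + 1
    if notification["retry_count"] < max_attempts:
        return "notification_retry_topic", notification
    return None, notification
```

Keeping the attempt count inside the message means the retry engine needs no state of its own, so any instance can pick up any retry message.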

The retry engine uses two Kafka topics:

- notification_retry_topic: Notifications that need to be retried.
- notification_topic: Successfully sent notifications.

Fault Tolerance

Offsets are not committed when alarms are read from the alarm topic; they are committed only after processing. This allows processing to continue even when some notifications are slow. After a catastrophic failure, some notifications may already have been sent for alarms whose offsets were not yet committed, so those alarms are processed again on restart. This is an acceptable failure mode: it is better to send a notification twice than not at all.

When a major error is encountered, the daemon generally exits, which should allow the other processes to renegotiate access to the Kafka partitions. It is also assumed that the notification engine runs under a process supervisor that restarts it after a failure. In this way, errors that are not easy to recover from are handled automatically: the service restarts and the active daemon switches to another instance.

Though this should cover all errors, there is a risk that an alarm or a set of alarms is processed more than once and its notifications are sent out multiple times. To minimize this risk, a number of techniques are used:

- Timeouts are implemented for all notification types.
- An alarm TTL is used. Any alarm older than the TTL is not processed.
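
The TTL check can be illustrated with a few lines. This is a sketch only: the function name, the use of epoch seconds, and the `ttl_seconds` parameter are assumptions, not the project's configuration options.

```python
import time


def is_expired(alarm_timestamp, ttl_seconds, now=None):
    """Return True when the alarm is older than the TTL and should be skipped.

    Both timestamps are assumed to be seconds since the epoch.
    """
    if now is None:
        now = time.time()
    return (now - alarm_timestamp) > ttl_seconds
```

An engine that restarts after a long outage would then skip stale alarms from the backlog rather than sending notifications about conditions that are no longer current.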

Operation

oslo.config is used for handling configuration options. A sample configuration file etc/monasca/notification.conf.sample can be generated by running:

tox -e genconfig

Monitoring

StatsD is incorporated into the daemon and sends all stats to the StatsD server launched by monasca-agent. The default host and port point to localhost:8125.

Counters
- ConsumedFromKafka
- AlarmsFailedParse
- AlarmsNoNotification
- NotificationsCreated
- NotificationsSentSMTP
- NotificationsSentWebhook
- NotificationsSentPagerduty
- NotificationsSentFailed
- NotificationsInvalidType
- AlarmsFinished
- PublishedToKafka
Timers
- ConfigDBTime
- SendNotificationTime
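
For orientation, the sketch below shows how a counter and a timer look in the plain StatsD line protocol (`name:value|c` and `name:value|ms`) when sent over UDP. It is an assumption that the daemon's metrics reach the monasca-agent StatsD server in exactly this form; the daemon may add prefixes or dimensions, and the helper names here are hypothetical.

```python
import socket


def format_counter(name, value=1):
    """Format a StatsD counter increment, e.g. 'ConsumedFromKafka:1|c'."""
    return f"{name}:{value}|c"


def format_timer(name, milliseconds):
    """Format a StatsD timing sample in milliseconds."""
    return f"{name}:{milliseconds}|ms"


def send_stat(payload, host="localhost", port=8125):
    """Send one metric as a UDP datagram; delivery is fire-and-forget."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))
```

For example, `send_stat(format_counter("NotificationsSentWebhook"))` would emit one counter increment to the default address.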

Future Considerations

More extensive load testing is needed:
- How fast is the MySQL database, and how much load do we put on it? Initially it makes the most sense to read notification details for each alarm, but eventually that information may need to be cached.
- How expensive are commits to Kafka for every message we read? Should we commit every N messages?
- How efficient is the default Kafka consumer batch size?
- Currently we can get ~200 notifications per second per NotificationEngine instance using webhooks to a local HTTP server. Is that fast enough?
- Are we putting too much load on Kafka at ~200 commits per second?
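
To make the "commit every N messages" question concrete, here is a sketch of batched commits. It is not how the engine currently works, and all names (`process_with_batched_commits`, `handle`, `commit`, `batch_size`) are hypothetical.

```python
def process_with_batched_commits(messages, handle, commit, batch_size):
    """Handle each message, committing the latest offset once per batch."""
    uncommitted = 0
    last_offset = None
    for offset, message in enumerate(messages):
        handle(message)
        last_offset = offset
        uncommitted += 1
        if uncommitted >= batch_size:
            commit(last_offset)
            uncommitted = 0
    # Commit whatever is left over from a partial final batch.
    if uncommitted:
        commit(last_offset)
```

The trade-off is that a crash can cause up to `batch_size` already-processed alarms to be read again, so fewer commits mean more potential duplicate notifications.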

Release history

Version	Release date
9.0.0	Oct 4, 2023
9.0.0.0rc1 (pre-release)	Sep 15, 2023
8.0.0	Mar 22, 2023
8.0.0.0rc1 (pre-release)	Mar 3, 2023
7.0.0	Oct 5, 2022
7.0.0.0rc1 (pre-release)	Sep 16, 2022
6.0.0	Mar 30, 2022
6.0.0.0rc1 (pre-release)	Mar 10, 2022
5.0.0	Oct 6, 2021
5.0.0.0rc1 (pre-release)	Sep 16, 2021
4.0.1	Sep 1, 2022
4.0.0	Apr 14, 2021
4.0.0.0rc1 (pre-release)	Mar 26, 2021
3.0.0	Oct 14, 2020
3.0.0.0rc1 (pre-release)	Sep 28, 2020
2.0.1	Jul 29, 2021
2.0.0	May 13, 2020
2.0.0.0rc1 (pre-release)	Apr 24, 2020
1.18.0	Sep 27, 2019
1.17.1	Jul 24, 2020
1.17.0	Apr 17, 2019
1.16.0	Apr 1, 2019
1.15.0	Oct 25, 2018
1.14.1	Jun 6, 2019
1.14.0	Aug 9, 2018
1.13.1 (this version)	Oct 25, 2019
1.13.0	Feb 7, 2018
1.12.0	Dec 21, 2017
1.11.0	Oct 26, 2017
1.10.1	Aug 21, 2017
1.10.0	Aug 10, 2017
1.9.0	Jun 5, 2017
1.8.0	Apr 19, 2017
1.7.0	Feb 15, 2017
1.6.0	Dec 19, 2016
1.5.0	Dec 5, 2016
1.4.0	Sep 23, 2016
1.3.0	Jun 20, 2016
1.2.13	Mar 25, 2016
1.2.12	Mar 3, 2016
1.2.11	Mar 1, 2016
1.2.10	Feb 1, 2016
1.2.9	Jan 27, 2016
1.2.8	Jan 25, 2016
1.2.7	Jul 28, 2015
1.2.6	Jul 20, 2015
1.2.5	May 7, 2015
1.2.4	Apr 30, 2015
1.2.3	Apr 29, 2015
1.2.2	Apr 17, 2015
1.2.1	Mar 5, 2015
1.2.0	Feb 11, 2015
1.1.6	Jan 6, 2015
1.1.5	Dec 4, 2014
1.1.4	Nov 25, 2014
1.1.3	Nov 20, 2014
1.1.2	Oct 31, 2014
1.1.1	Sep 30, 2014
1.1.0	Aug 29, 2014
1.0.3	Jul 29, 2014
1.0.2	Jul 22, 2014
1.0.1	Jul 22, 2014
1.0.0	Jun 19, 2014

Download files

Source Distribution

monasca-notification-1.13.1.tar.gz (60.6 kB)

Uploaded Oct 25, 2019 Source

Built Distribution

monasca_notification-1.13.1-py2.py3-none-any.whl (65.0 kB)

Uploaded Oct 25, 2019 Python 2 Python 3

Hashes for monasca-notification-1.13.1.tar.gz

Algorithm	Hash digest
SHA256	`026e6065196977f050716150c52b4244acd425f8a30f78c62dc74355d6ef21f3`
MD5	`25dc41211cb86de773d3a22526d4fe67`
BLAKE2b-256	`409d1388e41e7ef3b2903dfa6949055bbe7ca21f4c256d9c0f94a531dfd035ae`

Hashes for monasca_notification-1.13.1-py2.py3-none-any.whl

Algorithm	Hash digest
SHA256	`f7448b3d0311af006d56c2871ab2cb5f974480e0ac2c358afe602c8ebb74dcc7`
MD5	`c87acbe33d1789da98d18c7870d28d16`
BLAKE2b-256	`afda8c04ae44adda761401cfbffe62da672d623eeddaa4c628101a1eb281ec99`
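
A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a generic sketch; the function name and the local file path are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `sha256_of_file("monasca-notification-1.13.1.tar.gz")` with the SHA256 value in the table confirms that the download is intact.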