Failed Jobs Monitor
Detecting and Managing Job Failures
The Failed Jobs Monitor tracks job failures in real time. An alert is triggered when a new failed job that has not been flagged before appears at the top of the queue.
Configuration Requirements
To ensure proper monitoring:
Your queues must retain at least one failed job.
The removeOnFailed option should be set to false (default) or to a value ≥ 1.
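The retention rule above can be expressed as a small check. This is only a sketch: retainsFailedJobs is a hypothetical helper, not part of any library, and it assumes the option accepts a boolean or a number, with true meaning that failed jobs are removed and a number meaning how many are kept.

```typescript
// Hypothetical helper: reports whether a removeOnFailed setting leaves
// at least one failed job in the queue for the monitor to inspect.
type RemoveOnFailed = boolean | number;

function retainsFailedJobs(removeOnFailed: RemoveOnFailed = false): boolean {
  if (typeof removeOnFailed === "boolean") {
    // false (the default) keeps failed jobs; true is assumed to remove them all.
    return removeOnFailed === false;
  }
  // A numeric value is assumed to be the number of failed jobs retained.
  return removeOnFailed >= 1;
}
```

With these assumptions, retainsFailedJobs(false) and retainsFailedJobs(100) pass, while retainsFailedJobs(true) and retainsFailedJobs(0) do not.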
Alert Behavior
Since multiple jobs often fail together, no additional alerts will be generated until the initial alert is acknowledged.
Alerts are triggered per queue, meaning multiple alerts may exist simultaneously for different queues.
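The alert rules above can be modeled roughly as follows. This is an illustrative sketch, not the monitor's actual implementation: the FailedJobAlerts class, its method names, and the in-memory storage are all hypothetical. It also assumes that a job seen while an alert is still open counts as flagged, which the text above does not specify.

```typescript
// Hypothetical in-memory model of the alert behavior described above.
class FailedJobAlerts {
  // Queues that currently have an unacknowledged alert.
  private open = new Set<string>();
  // Per queue, the IDs of failed jobs that have already been flagged.
  private flagged = new Map<string, Set<string>>();

  // Call with the failed job currently at the top of a queue.
  // Returns true when a new alert should be raised.
  observe(queue: string, topFailedJobId: string): boolean {
    let seen = this.flagged.get(queue);
    if (seen === undefined) {
      seen = new Set<string>();
      this.flagged.set(queue, seen);
    }
    if (seen.has(topFailedJobId)) return false; // already flagged before
    seen.add(topFailedJobId); // assumption: flagged even while suppressed
    if (this.open.has(queue)) return false; // earlier alert not yet acknowledged
    this.open.add(queue);
    return true;
  }

  // Acknowledging clears the open alert for that queue only.
  acknowledge(queue: string): void {
    this.open.delete(queue);
  }

  // Names of queues with an open alert, sorted for stable output.
  openAlerts(): string[] {
    return Array.from(this.open).sort();
  }
}
```

Each queue is tracked separately, so an open alert on one queue does not suppress a first alert on another, and acknowledging one queue leaves the others untouched.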
