Active Monitoring Module
The Active Monitoring Module helps you make sure that errors are acknowledged and fixed by IT staff, or escalated if they are not addressed quickly.
Alerting revolves around the Operator Message system. When an Alert is raised, an Operator Message is created. This Operator Message must be replied to in order to acknowledge the Alert.
Alerts themselves are children of Operator Messages, and can be monitored on the Operator Messages screen (Configure > Control > Operator Messages). An Operator Message that has Alerts attached to it will have a [+] next to it.
There are two ways operators and managers can be informed of errors:
- Operator Messages
- Email or SMS
The key features of the Active Monitoring Module are:
- Operator Message integration: All Alerts are raised as Operator Messages and are visible in the Operator Messages screen.
- Acknowledgment: An Alert must be acknowledged by replying to its Operator Message in order to be cleared.
- Alert escalations: Unacknowledged Alerts can be escalated to email or SMS through shared escalation pathways.
- Rule-based configuration: Alerts can be defined as rules that are separate from the objects that cause them to be raised.
Alerts can be defined for the following objects:
- Job Servers: Any status change. Rules are defined based on a Job Server name pattern.
- Processes: Any status change. Rules are defined based on a Job Definition name pattern and the Parameters.
- Monitors: Any change in severity. Rules are defined based on a Monitor Condition.
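The idea of a rule that lives apart from the objects it watches can be sketched in a few lines of Python. This is an illustration only, not the product's API: the `AlertRule` class, its field names, the glob-style pattern syntax, and the example names are all assumptions. The text above states only that rules match on a name pattern and react to status changes.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class AlertRule:
    """Hypothetical rule, defined separately from the objects it watches."""
    name_pattern: str     # glob-style pattern (assumed syntax)
    statuses: frozenset   # statuses that should raise an Alert

    def matches(self, object_name: str, new_status: str) -> bool:
        # The rule fires only when both the name and the new status match.
        return (fnmatchcase(object_name, self.name_pattern)
                and new_status in self.statuses)


# Example rule: any object whose name starts with PROD_ that ends in error.
rule = AlertRule(name_pattern="PROD_*",
                 statuses=frozenset({"Error", "Killed"}))
```

Because the rule holds only a pattern and a set of statuses, one rule can cover many objects, and objects can be added or renamed without touching the rule.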
Note: The Active Monitoring Module requires the Module.Alerting license key.
The alerting system consists of three types of Object:
- Alert Sources: Object-specific rules that define when Alerts should be created. There are four kinds:
  - Job Server Alert Sources: Used when a Job Server loses the connection to a remote system.
  - Process Alert Sources: Used when Jobs, Steps, or Workflows reach an undesired status.
  - Ad Hoc Alert Sources: Used in chains to fire Alerts with the System_Alert_Send Job Definition.
  - Monitor Alert Sources: Used when a Monitor Check reaches a certain severity.
- Alert Escalations: A set of rules stating who to send an Alert to, how long to wait for acknowledgment, and which Alert to escalate to if the Alert is unacknowledged.
- Alert Gateways: A set of rules determining how messages are formatted and sent.
An Alert is raised by an Alert Source, which creates an Operator Message that needs a reply. The Alert Source specifies the first Alert Escalation to use. From then on, the escalation system decides how long to wait for acknowledgment and which Alert Escalation comes next. While the Alert is being escalated, messages are sent through the Alert Gateways to elicit a response. As soon as the Operator Message is replied to, the Alert is acknowledged and no further automatic action is taken.
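The flow described above can be modelled as a walk along a chain of escalation steps that stops as soon as a reply arrives. The following Python sketch is a simplified simulation under stated assumptions, not the module's implementation: the `Escalation` class, `run_escalations`, the addresses, and the timeouts are hypothetical, and the chain is assumed to end rather than loop.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Escalation:
    """Hypothetical escalation step."""
    address: str                    # who to notify
    timeout_minutes: int            # how long to wait for acknowledgment
    next_escalation: Optional[str]  # step to use if nobody replies


def run_escalations(escalations, first, acknowledged_after=None):
    """Return the (minute, address) notifications that would be sent.

    acknowledged_after is the minute at which the Operator Message is
    replied to, or None if it is never replied to.
    """
    sent = []
    clock = 0
    current = first
    while current is not None:
        if acknowledged_after is not None and acknowledged_after <= clock:
            break  # reply received: no further automatic action
        step = escalations[current]
        sent.append((clock, step.address))  # message goes out via a gateway
        clock += step.timeout_minutes       # wait for acknowledgment
        current = step.next_escalation
    return sent


chain = {
    "operators": Escalation("ops@example.com", 15, "managers"),
    "managers": Escalation("managers@example.com", 30, None),
}
```

With this chain, a reply after 10 minutes means only the operators are notified, while an Alert that is never acknowledged reaches the managers at minute 15.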
Alerting CAR File
An alerting CAR file is available in Configure > Admin > Configuration > Software Groups under Download CAR Files. It contains the following Active Monitoring Module Objects:
- GLOBAL.DelayedProcesses: Job Definition Alert Source that fires for delayed Jobs.
- GLOBAL.ErroneousProcesses: Job Definition Alert Source that fires for Jobs that have reached status Error, Killed, or Unknown.
- GLOBAL.NotConnectedProcessServers: Job Server Alert Source that fires for Job Servers that have reached status Connecting, PartiallyRunning, or Shutdown for more than two minutes.
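The condition behind the last of these Alert Sources combines a status check with a minimum duration. A minimal Python sketch of that condition follows. The function name `should_alert` and the way the status start time is tracked are assumptions for illustration; how the product measures the two minutes internally is not stated here.

```python
from datetime import datetime, timedelta

# Statuses and delay taken from the description of the Alert Source.
NOT_CONNECTED = {"Connecting", "PartiallyRunning", "Shutdown"}
THRESHOLD = timedelta(minutes=2)


def should_alert(status: str, status_since: datetime, now: datetime) -> bool:
    """True if a Job Server has been in a not-connected status too long."""
    return status in NOT_CONNECTED and (now - status_since) > THRESHOLD
```

The delay keeps short-lived states, such as a brief Connecting phase during a restart, from raising an Alert.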