alarm plugin

Description

The alarm plugin performs basic status checks on the monitored system and reports an alarm to WombatOAM when a monitored parameter reaches its configured threshold.

Applications it depends on

kernel

Modules

wombat_plugin_alarm

Reports

The plugin reports the following alarms:

  • process_limit
  • port_limit
  • ets_limit
  • atom_limit
  • module_limit
  • export_limit
  • memory_limit
  • open_file_limit
  • open_socket_limit
  • os_cpu_load
  • disk_capacity
  • shell_history_size
  • process_message_queue
  • system_information
  • old_code

Configuration options

The intervals at which the checks are performed are configurable, in case the plugin's moderate resource use needs to be regulated:

  • collection_interval (integer, default: 60000): Specifies how many milliseconds to wait between checking whether any process-related alarm (e.g. process_message_queue) should be raised or ceased.
  • interval (integer, default: 60000): Specifies how many milliseconds to wait between checking whether any system limit-related alarm (e.g. atom_limit) should be raised or ceased.
  • app_check_interval (integer, default: 60000): Specifies how many milliseconds to wait between checking whether the version of any application has changed. If a version has changed, WombatOAM will raise a different_application_versions alarm.
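
As a sketch of how these options might be tuned when the cost of the checks is a concern, the following wombat.config entries lengthen all three intervals to five minutes. The value 300000 is an illustrative assumption, not a recommended setting; the entry format is the one used in the example entries on this page.

```erlang
%% Illustrative values only: run each check every 5 minutes
%% (300000 ms) instead of the default 60000 ms.
{set, wo_plugins, plugins, alarm, collection_interval, 300000}.
{set, wo_plugins, plugins, alarm, interval, 300000}.
{set, wo_plugins, plugins, alarm, app_check_interval, 300000}.
```

Longer intervals reduce the work done by the plugin, at the cost of alarms being raised and ceased later.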

The "node info alarms" are raised by the WombatOAM server, based on the node info reported by this plugin:

  • node_info_opts/app_version_alarms (default: true): If true, an alarm will be raised if nodes in the same family have different versions of the same application, or if the application is not running on all nodes. Applications being started or stopped on nodes will be logged as notifications.
  • node_info_opts/time_diff_alarms (default: true): If true, then alarms will be raised if nodes in the same family are in different time zones.
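
For example, if the nodes of a family are intentionally deployed in different time zones, the corresponding alarm could be switched off while the application version alarm stays enabled. This is a minimal sketch; it assumes that false is accepted in the same entry format that the example entries on this page use for true.

```erlang
%% Assumption: false disables the alarm, mirroring the documented default of true.
{set, wo_plugins, plugins, alarm, node_info_opts, time_diff_alarms, false}.
%% Kept at its default value, shown here only for contrast.
{set, wo_plugins, plugins, alarm, node_info_opts, app_version_alarms, true}.
```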

The system_checks option is a list of system checks that the plugin shall perform.

  • process_limit, port_limit, ets_limit, atom_limit, module_limit, export_limit, memory_limit, open_file_limit, open_socket_limit, os_cpu_load, disk_capacity: These system checks have a minor alarm limit and a major alarm limit. When a limit is reached, an alarm of the corresponding severity is raised. The thresholds are expressed as percentages.
  • shell_history_size, process_message_queue: These system checks have a minor alarm limit and a major alarm limit. When a limit is reached, an alarm of the corresponding severity is raised. The thresholds are absolute numbers.
  • system_information, old_code: These system checks only have an alarm severity, which specifies the severity of the alarm that should be raised for them.

Example wombat.config entries

{set, wo_plugins, plugins, alarm, collection_interval, 60000}.
{set, wo_plugins, plugins, alarm, interval, 60000}.
{set, wo_plugins, plugins, alarm, app_check_interval, 60000}.

{set, wo_plugins, plugins, alarm, node_info_opts, app_version_alarms, true}.
{set, wo_plugins, plugins, alarm, node_info_opts, module_version_alarms, true}.
{set, wo_plugins, plugins, alarm, node_info_opts, time_diff_alarms, true}.

The system check configuration entries can be overridden individually:

{set, wo_plugins, plugins, alarm, system_checks,
 process_limit, minor, 80}.
{set, wo_plugins, plugins, alarm, system_checks,
 process_limit, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 port_limit, minor, 80}.
{set, wo_plugins, plugins, alarm, system_checks,
 port_limit, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 ets_limit, minor, 80}.
{set, wo_plugins, plugins, alarm, system_checks,
 ets_limit, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 atom_limit, minor, 80}.
{set, wo_plugins, plugins, alarm, system_checks,
 atom_limit, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 module_limit, minor, 80}.
{set, wo_plugins, plugins, alarm, system_checks,
 module_limit, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 export_limit, minor, 80}.
{set, wo_plugins, plugins, alarm, system_checks,
 export_limit, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 memory_limit, minor, 70}.
{set, wo_plugins, plugins, alarm, system_checks,
 memory_limit, major, 75}.

{set, wo_plugins, plugins, alarm, system_checks,
 open_file_limit, minor, 60}.
{set, wo_plugins, plugins, alarm, system_checks,
 open_file_limit, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 open_socket_limit, minor, 60}.
{set, wo_plugins, plugins, alarm, system_checks,
 open_socket_limit, major, 75}.

{set, wo_plugins, plugins, alarm, system_checks,
 os_cpu_load, minor, 75}.
{set, wo_plugins, plugins, alarm, system_checks,
 os_cpu_load, major, 90}.

{set, wo_plugins, plugins, alarm, system_checks,
 disk_capacity, minor, 80}.
{set, wo_plugins, plugins, alarm, system_checks,
 disk_capacity, major, 90}.

%% The following thresholds are absolute values, not percentages.

{set, wo_plugins, plugins, alarm, system_checks,
 shell_history_size, minor, 20000000}.
{set, wo_plugins, plugins, alarm, system_checks,
 shell_history_size, major, 100000000}.

{set, wo_plugins, plugins, alarm, system_checks,
 process_message_queue, minor, 10000}.
{set, wo_plugins, plugins, alarm, system_checks,
 process_message_queue, major, 100000}.

%% The following alarms don't have a threshold, only a severity.

%% system_information is not enabled by default because of its runtime impact.
{set, wo_plugins, plugins, alarm, system_checks,
 system_information, major}.

{set, wo_plugins, plugins, alarm, system_checks,
 old_code, minor}.

To disable all system checks or enable only a few of them, list only those that shall be performed:

%% Enable only three system checks
{set, wo_plugins, plugins, alarm, system_checks,
 [{process_limit, [{minor, 80}, {major, 90}]},
  {port_limit,    [{minor, 80}, {major, 90}]},
  {ets_limit,     [{minor, 80}, {major, 90}]}]}.

%% Disable all system checks
{set, wo_plugins, plugins, alarm, system_checks, []}.