Role based information retrieval

Role based information retrieval helps users filter information to focus only on data related to their role.

Developers are usually interested in fine-grained information that gives insight into the Erlang VM and reveals performance bottlenecks and potential overuse of resources. Operators maintaining production Erlang systems tend to use host and node level information: operating system level metrics and alarms are usually enough to uncover bad trends. For instance, a process with a very large message queue is useful information for both roles. However, operators should not be bothered with the amount of memory allocated by the active timers, but should be warned when an Erlang node is running out of file descriptors.

The filtering is based on tags that classify metrics, notifications and alarms. There are two roles used as built-in tags: dev and op. Every metric, notification and alarm is tagged with dev, op, or both, to split information between developers and operators. The documentation of metrics, notifications and alarms includes the default tags. In addition to the two roles, object attributes can be used as tags.

You may want to override the default assignments or introduce your own tags. In both cases, you should use custom tags.

The tags assigned to a user select the metrics, notifications and alarms that match any of the given tags, and so hide irrelevant data from that user. This filtering mechanism works on the REST layer, so it can be used to customise all REST based integrations even if authentication is switched off. By defining a set of tags, you can also control the behaviour of non-REST based integrations (e.g. narrowing down the set of metrics pushed to Datadog).

Note that the tags assigned to users are used only if authentication is switched on. When authentication is disabled, the tags defined by the default_user_tags configuration parameter are used by the REST layer.
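
As an illustration, a wombat.config line that makes unauthenticated REST requests see only operator data might look like the sketch below. The parameter name default_user_tags comes from the text above; the owning application (wo_core) and the value format (a list of binary tags) are assumptions modelled on the custom_tags example later on this page, so check them against the configuration reference before use.

```erlang
%% Assumption: default_user_tags belongs to wo_core and takes a list of tags.
{set, wo_core, default_user_tags, [<<"op">>]}.
```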

Built-in tags

Roles

The following roles can be used as tags to filter metrics, notifications and alarms: <<"dev">> and <<"op">>.

Object attributes

The following object attributes, which should be given as binary strings, can be used as tags.

Defined for metrics, notifications and alarms:

  • The object type: <<"metric">>, <<"log">> and <<"alarm">> for metrics, notifications and alarms, respectively.
  • NodeId: The UUID of the managed node that the object corresponds to (e.g. <<"7e7e1188-ff5d-45ed-8103-1d32542efa25">>).
  • NodeName: The node name of the managed node that the object corresponds to (e.g. <<"wombat@127.0.0.1">>).
  • DisplayName: The display name of the managed node that the object corresponds to (e.g. <<"my-wombat-node">>).
  • NodeFamilyId: The id of the node family including the managed node that the object corresponds to (e.g. <<"056e1058-f646-494a-8fe5-2cc0f5d7bd1e">>).
  • NodeFamilyName: The name of the node family including the managed node that the object corresponds to (e.g. <<"wombat-family">>).
  • Originator: The short name of the plugin responsible for providing the data (e.g. <<"poolboy">> stands for wombat_plugin_poolboy).

Object attributes that can only be used to filter alarms are:

  • Severity: The severity of the alarm (e.g. <<"minor">>).
  • AlarmKey: The built-in alarms are either parametric alarms or global alarms. Parametric alarms (e.g. {missing_application, App}) are identified as tuples. In this case the AlarmKey is the first element of the tuple (e.g. <<"missing_application">>). The identifiers of global alarms are atoms (e.g. ets_limit). In this case, the AlarmKey is the binary string representation of the atom (e.g. <<"ets_limit">>).

Object attributes that can only be used to filter metrics are:

  • MetricType: The type of the metric (e.g. <<"counter">>).
  • MetricGroup: The group of the metric (e.g. <<"I/O">>).
  • MetricName: The name of the metric (e.g. <<"Output I/O bytes">>).

Object attributes that can only be used to filter notifications are:

  • Severity: The severity of the log entry (e.g. <<"info">>).
  • PropertyValues: Values stored in the log object's property list.
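
To make the attribute lists above concrete, the following Erlang term sketches the tags that a hypothetical ets_limit alarm could carry. The node values reuse the examples above; the severity, the role and the flat-list representation are illustrative assumptions, not WombatOAM's internal format.

```erlang
%% Illustrative only: tags of a hypothetical ets_limit alarm.
[<<"alarm">>,                                  % object type
 <<"7e7e1188-ff5d-45ed-8103-1d32542efa25">>,   % NodeId
 <<"wombat@127.0.0.1">>,                       % NodeName
 <<"my-wombat-node">>,                         % DisplayName
 <<"056e1058-f646-494a-8fe5-2cc0f5d7bd1e">>,   % NodeFamilyId
 <<"wombat-family">>,                          % NodeFamilyName
 <<"minor">>,                                  % Severity (assumed)
 <<"ets_limit">>,                              % AlarmKey
 <<"op">>].                                    % role (assumed)
```

A user whose tags include any of these values, for example <<"wombat-family">>, would see this alarm.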

Custom tags

To introduce a custom tag, define the name of the tag and a logical condition that must be satisfied by any alarm or metric to be tagged with your custom tag.

The logical condition is given as a filter expression. A filter expression is built from the all, any and not operators (logical conjunction, disjunction and negation, respectively) applied to object attributes or roles, and expressions can be nested.

Before going into details, let's see two examples.

Suppose we only want to create new PagerDuty incidents for major alarms related to the two production nodes, node1 and node2. We define a custom tag named <<"MajorAlarmsFromProdNodes">> and then use it to configure PagerDuty.

{<<"MajorAlarmsFromProdNodes">>,
 {all, [<<"major">>,
        <<"alarm">>,
        {any, [<<"node1">>,
               <<"node2">>]}]}}.

Suppose we are almost satisfied with the built-in dev role, but want to remove anything related to node N1, hide the invalid_application_version alarm, and show the ets_limit and os_cpu_load alarms. In this case we can customise the dev role as follows.

{<<"dev">>,
 {any, [{all, [<<"dev">>,
               {'not', <<"invalid_application_version">>},
               {'not', <<"N1">>}]},
        <<"ets_limit">>,
        <<"os_cpu_load">>]}}.

Constructing filter expressions

A valid filter expression is described by the following type specification.

-type filter_expr() :: any_expr() | all_expr() | not_expr().
-type any_expr() :: {any, nonempty_list(expr())}.
-type all_expr() :: {all, nonempty_list(expr())}.
-type not_expr() :: {'not', expr()}.
-type expr() :: filter_expr() | object_attr().
-type object_attr() :: binary(). % an object attribute or a role
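
The meaning of the three operators can be sketched as a small evaluator. This is a minimal illustration, not WombatOAM's implementation: the module name filter_sketch, the functions matches/2 and demo/0, and the assumption that each object carries a flat list of binary tags are all hypothetical.

```erlang
-module(filter_sketch).
-export([matches/2, demo/0]).

-type expr() :: binary()
              | {any, nonempty_list(expr())}
              | {all, nonempty_list(expr())}
              | {'not', expr()}.

%% Tags is the (assumed) list of roles and object attributes carried by
%% one metric, notification or alarm.
-spec matches(expr(), [binary()]) -> boolean().
matches(Tag, Tags) when is_binary(Tag) ->
    %% A bare binary matches if the object carries that tag.
    lists:member(Tag, Tags);
matches({any, Exprs}, Tags) ->
    %% Disjunction: at least one sub-expression must match.
    lists:any(fun(E) -> matches(E, Tags) end, Exprs);
matches({all, Exprs}, Tags) ->
    %% Conjunction: every sub-expression must match.
    lists:all(fun(E) -> matches(E, Tags) end, Exprs);
matches({'not', Expr}, Tags) ->
    %% Negation of the sub-expression.
    not matches(Expr, Tags).

%% The MajorAlarmsFromProdNodes condition applied to two hypothetical alarms.
-spec demo() -> [boolean()].
demo() ->
    Expr = {all, [<<"major">>, <<"alarm">>,
                  {any, [<<"node1">>, <<"node2">>]}]},
    [matches(Expr, [<<"alarm">>, <<"major">>, <<"node1">>, <<"op">>]),
     matches(Expr, [<<"alarm">>, <<"major">>, <<"node3">>, <<"op">>])].
```

Under these assumptions, filter_sketch:demo() returns [true, false]: the major alarm from node1 satisfies the condition, while the one from node3 does not.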

Defining custom tags

All custom tags should be defined as one list and loaded into WombatOAM by adding the following line to wombat.config.

{set, wo_core, custom_tags, [{<<"MyTag1">>, {any, [<<"T1">>, <<"T2">>]}},
                             {<<"MyTag2">>, {'not', <<"T3">>}}]}.

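Another example defines a tag that selects the Process memory metric of node1@127.0.0.1 and the System memory and Total memory metrics of node2@127.0.0.1.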

{set, wo_core, custom_tags,
 [{<<"datadog_tag">>,
   {any,
    [{all, [{any, [<<"node1@127.0.0.1">>]},
            {any, [<<"Process memory">>]}]},
     {all, [{any, [<<"node2@127.0.0.1">>]},
            {any, [<<"System memory">>, <<"Total memory">>]}]}]}}]}.
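
Because all custom tags are defined in one list, the two examples from the beginning of this section could be loaded together. The following wombat.config line is a sketch that only combines definitions already shown on this page.

```erlang
{set, wo_core, custom_tags,
 [{<<"MajorAlarmsFromProdNodes">>,
   {all, [<<"major">>,
          <<"alarm">>,
          {any, [<<"node1">>, <<"node2">>]}]}},
  {<<"dev">>,
   {any, [{all, [<<"dev">>,
                 {'not', <<"invalid_application_version">>},
                 {'not', <<"N1">>}]},
          <<"ets_limit">>,
          <<"os_cpu_load">>]}}]}.
```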