WombatOAM Plugin Guide
What is a plugin?
Plugins for WombatOAM are user-supplied modules that extend its capabilities to
monitor new Erlang applications. A plugin lets WombatOAM collect
application-specific metrics and notifications, raise custom alarms and
implement services. Plugins can also be added to a binary WombatOAM release;
this requires a restart of the system. Plugins run on the managed node
and communicate with the WombatOAM server.
Viewing and managing plugins
The web dashboard shows the plugins that are running on each node.
Click Topology → select a node → Agents.
You can turn individual agents on or off on a per-node basis.
Configuring & deploying plugins
WombatOAM will start a plugin automatically on the managed node if the node is
running the matching versions of the dependant Erlang applications declared for
that plugin in one of the WombatOAM config files (sys.config
or
wombat.config
).
For WombatOAM to find a plugin, the compiled BEAM files for the plugin need to be
present in the plugins
directory of the WombatOAM release when WombatOAM is started.
This directory can be overridden: in the wombat.config
file, set the
environment variable plugin_dir
of the wo_utils
application:
| {set, wo_utils, plugin_dir, "...path/to/plugins"}.
|
The main module of the plugin should be named wombat_plugin_<APPNAME>.beam
.
For example, a plugin for the wo_test
application should be named
wombat_plugin_wo_test.beam
.
A plugin's settings need to be declared in WombatOAM's wombat.config
file.
For example, to enable the plugin for the wo_test
application, wombat.config
needs the following:
| %% {PluginName, DependantApplications, ExtraModules, Options}
{replace, wo_plugins, plugins, wo_test,
 {wo_test,
  [{myapp1, "1\\.3"}, [{myapp2, ".*"}, {myapp3, ".*"}]],
  [wombat_plugin_dummy_module],
  [{test_option, 42},
   {required_wombat_apps, [{kernel, "^[3-9]\\..*"}]}]}}.
|
The main plugin module is wombat_plugin_wo_test
and there is a library module
called wombat_plugin_dummy_module
which gets loaded into the managed node.
Module names must also have a wombat_plugin_
prefix.
A dependant application is an application that must be running on the managed
node in order to be able to use the plugin. A plugin can have more than one
dependant application. In the example above, the wo_test
plugin depends on
{myapp1, "1\\.3"}
and [{myapp2, ".*"}, {myapp3, ".*"}]
. This means that
wo_test
can be used if either:
- version 1.3 of
myapp1
; or
- any version of both
myapp2
and myapp3
are available on the target node. [{myapp2, ".*"}, {myapp3, ".*"}]
is known as
a dependant application group, which means that all the dependencies defined
within the group must be met in order to be able to use the plugin.
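The rule above can be sketched in Erlang as follows. This only illustrates the semantics and is not WombatOAM's implementation: the module name, the function names and the shape of the `Running` list are hypothetical, and the sketch assumes the version pattern is applied as an unanchored regular expression, which the text does not state.

```erlang
-module(dep_check_sketch).
-export([plugin_usable/2]).

%% Deps:    the DependantApplications list from wombat.config.
%% Running: [{App, VersionString}] found on the managed node (assumed shape).
%% The plugin is usable if at least one top-level entry is met.
plugin_usable(Deps, Running) ->
    lists:any(fun(Dep) -> alternative_met(Dep, Running) end, Deps).

%% A nested list is a dependant application group: every member must be met.
alternative_met(Group, Running) when is_list(Group) ->
    lists:all(fun(Dep) -> app_met(Dep, Running) end, Group);
alternative_met(Dep, Running) ->
    app_met(Dep, Running).

%% A single dependency is met if the application is present and its
%% version matches the regular expression (unanchored match: an assumption).
app_met({App, VsnRegex}, Running) ->
    case lists:keyfind(App, 1, Running) of
        {App, Vsn} -> re:run(Vsn, VsnRegex, [{capture, none}]) =:= match;
        false -> false
    end.
```

With the dependency list from the example, a node running only `myapp1` version 1.3 satisfies the first alternative, while a node running `myapp2` without `myapp3` satisfies neither.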
Extra options to the plugins can be passed via the Options
property list.
Use this list when the plugin's binaries require extra applications to be
loaded into the Wombat node. Declare them with the
required_wombat_apps
property, which lists the dependant applications. In
the example above, the wo_test
plugin is loaded into Wombat only if
Wombat is running on Erlang/OTP 17 or newer, because its binary contains
opcodes that older Erlang versions cannot interpret (for example, it uses maps).
If the property is not set, Wombat always tries to load the plugin's binaries.
When developing a plugin, it is often useful to set the verbosity that belongs
to the plugin to the debug level:
| {set, wo_plugins, plugin_action_verbosity, my_plugin, <<"debug">>}.
|
This means that notifications will be generated for many plugin actions, such as
starting, terminating, sending metrics and alarms to WombatOAM, handling requests,
etc. For more information about this setting, see the
Configuration section.
Note that this setting may have a negative impact on WombatOAM's performance.
It should only be used while you are developing a plugin or experiencing issues.
Writing plugins for WombatOAM
The main module of the plugin must implement the wombat_plugin
behaviour by
implementing its callback functions. Each plugin will run in its own process and
it has a state in which it can store information. The callback functions init
and terminate
are called when the plugin is started/stopped. The callback
handle_info/2
is called when the plugin process receives a message.
A plugin can report metrics, notifications, alarms and other information by:
- Implementing the
wombat_plugin
callbacks.
- Using the library functions in
wombat_plugin
and wombat_plugin_utils
.
Reporting metrics is mostly callback-oriented: the callbacks
capabilities/1
and live_metrics2comp_units/2
return the list of available
metrics, while the callbacks collect_metrics/1
and collect_live_metrics/1
return the metric samples. If the list of available metrics changes, the plugin
should call the wombat_plugin:announce_capabilities/1
function.
The notifications, alarms and other information are reported not via
callbacks but via calling library functions: wombat_plugin:report_log/2
for
reporting notifications, wombat_plugin:raise_alarm/2
and
wombat_plugin:clear_alarm/1
for reporting alarms, and
wombat_plugin:report_internal_data/2
for reporting other information.
A plugin can also implement services. First, it has to report the services it
implements as capabilities (see the Capabilities
section). Then it has to implement 3 callbacks to serve requests for a
certain service (see the Callbacks section).
There are many useful library functions too, organised into 3 modules.
-
wombat_plugin
contains functions to implement the core wombat_plugin
behaviour,
-
wombat_plugin_utils
includes general utility functions,
-
wombat_plugin_services
provides functions to create new services by
implementing the wombat_plugin_services
behaviour.
The plugin is started on the managed node, supervised by other WombatOAM processes.
Some general guidelines:
- Plugins are expected to generate a moderate amount of data; currently, WombatOAM
doesn't try to throttle plugins.
- As WombatOAM supervises the plugin processes, start-up notifications for the
plugins and their supervisors might show up in logs on the managed node, for
example, when SASL is started.
Types
The types used by the plugins are defined in the wombat_types
module and
exported:
| %%%-----------------------------------------------------------------------------
%%% Capabilities
%%%-----------------------------------------------------------------------------
-type capabilities() :: [capability()].
%% A plugin exposes a list of capabilities.
-type capability() :: {capability_id(), capability_info()}.
%% Each capability has an id and a list of information.
-type capability_id() :: [binary()].
%% For metrics, the final component is the name of the metric, the prefix list
%% is the name of the metric group. Notifications (aka logs) are
%% currently not using these ids.
-type capability_info() :: metric_info()
| notification_info()
| alarm_info()
| service_info().
-type capability_tags_item() :: {tags, capability_tags()}.
%% List of tags should be assigned to alarms and metrics in the capabilities.
-type capability_tag() :: binary().
%% A tag is represented as a binary string.
-type capability_tags() :: list(capability_tag()).
%%%-----------------------------------------------------------------------------
%%% Capabilities: Metrics
%%%-----------------------------------------------------------------------------
-type metric_info() :: [metric_info_item()].
-type metric_info_item() :: {type, metric}
% UTF-8 binary string (not currently used, could be
% tooltip, for example)
| {description, binary()}
| {metric_type, metric_type()}
| {metric_unit, metric_unit()}
| capability_tags_item().
-type metric_cap_id_last() :: term().
%% The last element of the capability id of a metric, but as a term and not as a
%% binary. E.g. 'Folsom cpu'.
%%
%% From the plugin's point of view, this data type is used to administer live
%% metrics: e.g. when a live metric is enabled on the dashboard, a list of
%% metric_cap_id_last() items is sent to the plugins to start sending the
%% appropriate metrics periodically.
%%%-----------------------------------------------------------------------------
%%% Capabilities: Notifications
%%%-----------------------------------------------------------------------------
-type notification_info() :: [notification_info_item()].
-type notification_info_item() :: {type, notification}
| {description, binary()}.
%%%-----------------------------------------------------------------------------
%%% Capabilities: Alarms
%%%-----------------------------------------------------------------------------
-type alarm_info() :: [alarm_info_item()].
-type alarm_info_item() :: {type, alarm}
| {probable_cause, binary()}
| {proposed_repair_action, binary()}
| {severity, alarm_severity()}
| capability_tags_item().
-type alarm_severity() ::
critical | major | minor | warning | indeterminate | cleared.
%%%-----------------------------------------------------------------------------
%%% Capabilities: Services
%%%-----------------------------------------------------------------------------
-type service_info() :: [service_info_item()].
-type service_info_item() :: {type, service_capability_type()}
| {description, binary()}
| {label, binary()}
| {feature, term()}
| {is_internal, boolean()}
| {is_exclusive, boolean()}
| {priority, integer()}
| {arguments, [service_option_name()]}
| {options, [service_option()]}.
-type service_capability_type() :: configurator | explorer | executor.
-type service_option() :: [service_option_item()].
-type service_option_item() ::
%% This key is used in the request.
{option_name, service_option_name()}
%% The label of the option on the Dashboard.
| {option_label, binary()}
%% The type of the option. It determines whether `option_values' or
%% `listitem_type' also needs to be specified.
%%
%% If the option type is a list, then the listitem_type field contains
%% the types in the list. A list type will actually form a table. E.g.
%% if the listitem_type field describes key1 and key2, then the user
%% can fill in a table with two columns (key1 and key2) and any number
%% of rows.
| {option_type, service_option_type()}
%% The default value of the option.
| {option_default, binary()}
%% Whether the option can be seen and directly set by the users.
| {option_enabled, boolean()}
%% The list of possible values that the option can have. When type is
%% not enum, it is an empty list.
| {option_values, [OptionValue :: binary()]}
%% This definition is recursive, but in reality only the top service
%% option can be a list, its children cannot. So the listitem_type
%% will be empty for the children.
| {listitem_type, [service_option()]}.
-type service_option_name() :: binary().
-type service_option_type() :: string | number | enum | list.
%%%-----------------------------------------------------------------------------
%%% Reporting
%%%-----------------------------------------------------------------------------
-type capability_data() :: metric_data().
%%%-----------------------------------------------------------------------------
%%% Reporting: Metrics
%%%-----------------------------------------------------------------------------
-type metric_data() :: {metric, capability_id(), metric_type(), metric_value()}.
-type live_metric_data() :: {live_metric, capability_id(),
metric_type(), metric_value()}.
-type live_metric_comp_unit() :: term().
-type metric_type() :: gauge | counter | histogram | meter | spiral | duration.
-type metric_unit() :: numeric | byte | percentage.
-type metric_value() :: term().
%%%-----------------------------------------------------------------------------
%%% Reporting: Notifications
%%%-----------------------------------------------------------------------------
-type severity() :: binary().
%% The severity of a notification. The recommended values are the following:
%% critical, error, warning, info, debug.
-type log_message() :: binary().
%% The text of a notification.
%%%-----------------------------------------------------------------------------
%%% Reporting: Alarms
%%%-----------------------------------------------------------------------------
-type alarm_id() :: term().
%% This is the same as alarm_id() in elarm.hrl.
-type alarm_add_info() :: term().
%% This is the same as additional_information() in elarm.hrl.
%%%-----------------------------------------------------------------------------
%%% Implementing: Services
%%%-----------------------------------------------------------------------------
-type request_args() ::
[{KeyBinStr :: binary(),
ValueBinStr :: binary() |
[[{InnerKeyBinStr :: binary(),
InnerValueBinStr :: binary()}]]}].
%% A piece of input given by the user, which is used to execute a certain
%% request.
%%
%% The type of the key (as defined in the service capability of this
%% request_args term) defines the type of the value:
%%
%% * For keys that have string, number, enum type, ValueBinStr is a binary.
%% * For keys that have list type, ValueBinStr is a list that contains inner
%% lists. Each inner list is a proplist, and each inner list has the same
%% keys (as defined in the capability).
-type display_info() :: #display_info{}.
%% A display info term describes for the GUI how to display streamed data. It is
%% created by the plugin process.
-type display_info_option_item() :: {is_interactive, boolean()} |
{table_headers, [binary()]}.
%% Modifiers for the display_info().
-type execution_info() :: #execution_info{}.
%% An execution info term describes for the wombat_plugin behaviour how to
%% execute the request. It is created by the plugin process.
-type stream_data() :: stream_data_value() | stream_data_table().
%% Data to be streamed from the plugin to the GUI
%% Stream data constructions (values and tables):
-type stream_data_value() :: plain_value() | interactive_value().
-type stream_data_table() :: stream_data_plain_table()
| stream_data_interactive_table().
-type stream_data_plain_table() :: [[plain_value()]].
-type stream_data_interactive_table() :: [[interactive_value()]].
%% Basic building blocks for stream data:
-type plain_value() :: binary().
-type interactive_value() :: #interactive_value{}.
-type action() :: #action{}.
-type from_ref() :: {pid(), wombat_plugin_services:exec_req_ref()}.
%% A from_ref() reference is used by the plugins to identify an execute_request
%% call that will reply later asynchronously.
-type async_reply() :: {continue | close,
no_data |
{data, StreamData :: wombat_types:stream_data()}} |
{error, ReasonBinStr :: binary()}.
%%%-----------------------------------------------------------------------------
%%% Miscellaneous
%%%-----------------------------------------------------------------------------
-type plugin_state() :: term().
%% The plugins usually define and use their own state() type.
|
Capabilities
The metrics and services plugin interfaces use a list of capabilities to
return information back to WombatOAM. When WombatOAM asks for information on available
metrics, the capabilities/1
function of all plugins will be queried.
A plugin exposes a list of capabilities (capabilities()
). Currently this is
used to report the metrics that the plugin can report and services that can be
requested from the plugin. Optionally, the alarms the plugin may raise can be
reported. Each capability has an id (capability_id()
) and a
list that contains further information about the capability
(capability_info()
). An id is made up of a list of binary UTF-8 strings.
Metrics capabilities
In case of metrics capabilities the final component of the id is the name of the
metric. The prefix list of the id (i.e. the list containing all elements of the
capability id except the last one) becomes the name of the metric group.
Actual metrics samples either have the type metric_data()
(in case of
so-called collected metrics that are collected automatically) or
live_metric_data()
(in case of live metrics that are collected on-demand).
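As a minimal sketch of these terms, the module below builds one metric capability and a matching `metric_data()` sample from the type definitions in the Types section. The module name, the metric id, the description and the tag value are hypothetical, and the text does not say which `metric_info_item()` entries are mandatory.

```erlang
-module(metric_cap_sketch).
-export([capabilities/0, sample/1, group_and_name/1]).

%% Hypothetical metric id: group [<<"My app">>], metric <<"Queue length">>.
-define(QUEUE_LEN_ID, [<<"My app">>, <<"Queue length">>]).

%% A capabilities() list holding a single metric capability.
capabilities() ->
    [{?QUEUE_LEN_ID,
      [{type, metric},
       {description, <<"Messages waiting in the work queue">>},
       {metric_type, gauge},
       {metric_unit, numeric},
       {tags, [<<"dev">>]}]}].   % tag value is made up for illustration

%% A metric_data() sample for the capability above.
sample(Value) ->
    {metric, ?QUEUE_LEN_ID, gauge, Value}.

%% Split a capability id into the metric group (prefix list) and the
%% metric name (final component).
group_and_name(CapabilityId) ->
    {lists:droplast(CapabilityId), lists:last(CapabilityId)}.
```

Here `group_and_name/1` only demonstrates how the id is read; WombatOAM performs this split itself.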
Note about the order of the entries in the capability list: When WombatOAM presents
the metrics to the user, it shows them in the order they were received from the
plugin in the capabilities
callback or by calling
wombat_plugin:announce_capabilities
. If new metrics are added and reported
later (either by capabilities
or by wombat_plugin:announce_capabilities
),
WombatOAM will insert the new metrics. If metrics are deleted, they will still be
shown to the user at least as long as WombatOAM stores samples from the metric. If
the metrics are reordered, WombatOAM will prefer the new ordering. However, the
recommendation is to keep a consistent order and not to reorder existing metrics,
for two reasons:
- It is a better user experience to always see the metrics in the same place.
- When some metrics are reordered and some are removed, WombatOAM is not always
able to locate the correct position of the removed metrics, so they will be
moved to unexpected locations.
Adding and removing metrics causes no problem, unless the same metric appears
in different places at different times.
Services capabilities
For each service announced the plugin should declare its identifier, priority,
type (configurator
, explorer
, executor
), a label (displayed name on the
dashboard) and description, whether the service is internal and exclusive,
specification of the arguments of the service (label, type, default value;
options
field) and a subset of these which are the mandatory
argument names (arguments
field).
The feature
field is the identifier of a service. Multiple plugins can
implement the same service. WombatOAM aggregates the announced implementations of
a service using the feature
field. Then the generalised interface of
the service is available for users to submit new requests. When a request
arrives WombatOAM will try to initiate the request by asking the satisfied
implementations in order. The implementations are ordered by priority and
mandatory arguments count. This mechanism allows a way to override
built-in services or implement more specific ones (e.g. a custom configuration
service which does not use the OTP application environment).
The plugin should use the wombat_plugin_services:create_capability/6
function
to create a service capability (see the
wombat_plugin_services API section).
Alarms capabilities
Announcing alarm capabilities is optional. An alarm capability defines the
assigned tags and provides additional information about the severity,
the probable cause and the proposed repair action.
Alarm capabilities are matched to alarms using two identifiers: the
capability id and the alarm id. If there is no matching alarm capability for a
certain alarm, only the information included in the alarm will be available and it
will be tagged with the default tags. Although an alarm id can be an arbitrary
Erlang term, the matching algorithm works only with atoms and with tuples
whose first element is an atom. This atom is converted to a UTF-8 binary
string, which is matched against the list item of the capability id.
As examples of the matching, consider the ETS limit and the missing application alarms. The alarm id of the ETS limit alarm is ets_limit
, which matches the following capability id: [<<"ets_limit">>]
. For the missing application alarm, which has the parametric id {missing_application, App},
the correct capability id is
[<<"missing_application">>]
.
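The matching rule can be sketched as follows. This is an illustration rather than WombatOAM's code: the module and function names are hypothetical, and the sketch assumes a single-element capability id, as in both examples above.

```erlang
-module(alarm_match_sketch).
-export([alarm_key/1, matches/2]).

%% Derive the binary used for matching from an alarm id. Only atoms and
%% tuples whose first element is an atom can be matched.
alarm_key(AlarmId) when is_atom(AlarmId) ->
    {ok, atom_to_binary(AlarmId, utf8)};
alarm_key(AlarmId) when is_tuple(AlarmId), tuple_size(AlarmId) > 0,
                        is_atom(element(1, AlarmId)) ->
    {ok, atom_to_binary(element(1, AlarmId), utf8)};
alarm_key(_Other) ->
    no_match.

%% Assumption: the capability id is a one-element list, e.g. [<<"ets_limit">>].
matches(AlarmId, CapabilityId) ->
    case alarm_key(AlarmId) of
        {ok, Key} -> CapabilityId =:= [Key];
        no_match -> false
    end.
```

An alarm id such as the string `"ets_limit"` is neither an atom nor a tuple, so under this rule it never matches a capability.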
The wombat_plugin_utils:create_alarm_capability/5
utility function (refer to
the Useful functions in wombat_plugin_utils
section for further detail) should
be used to create an alarm capability, which should be returned by the
capabilities/1
callback or be announced using the
wombat_plugin:announce_capabilities/1
function.
Callbacks of the wombat_plugin
behaviour
The following callbacks are defined in the wombat_plugin
behaviour:
| -callback init(Arguments :: [term()]) -> {ok, wombat_types:plugin_state()} |
{skip, Msg :: binary()} |
{error, _}.
|
When the plugin is started, its init
function is called with the arguments
specified for the plugin in sys.config
/wombat.config
. It either returns the
initial state, which is typically a record (just like in case of a gen_server
module) or an error to indicate an unexpected problem. Alternatively skip can
be returned to gracefully exit without generating a crash report on the managed
node. Msg will be shown as a notification from the plugin on the Wombat dashboard.
| -callback capabilities(wombat_types:plugin_state()) ->
{wombat_types:capabilities(),
NewState :: wombat_types:plugin_state()}.
|
After the plugin is started, WombatOAM will retrieve the list of capabilities
provided by this plugin by calling the capabilities
function. Currently only
metrics, alarms and services are handled as capabilities. In the future, this
function might be called on other occasions too. A typical pattern is to
calculate the capabilities in init
, store them in the state record and simply
read and return them in capabilities
.
| -callback handle_info(Message :: term(),
wombat_types:plugin_state()) ->
{noreply,
NewState :: wombat_types:plugin_state()}.
|
This function is called when the plugin process receives a message, just like in
case of a gen_server
.
| -callback terminate(wombat_types:plugin_state()) -> any().
|
This function is called when the plugin is terminated. This can happen for a
number of reasons: the plugin is disabled by the user; WombatOAM is stopped; the
node is removed from WombatOAM; the connection between WombatOAM and the node is
stopped; etc.
| -callback collect_metrics(wombat_types:plugin_state()) ->
{ok,
[wombat_types:metric_data()],
NewState :: wombat_types:plugin_state()}.
|
This function is called periodically for those plugins whose capabilities
function reported that they have at least one metric. (Plugins that
reported no metrics but later find that they do have
some can use the wombat_plugin:announce_capabilities
function.) The function
should return the list of metric samples (i.e. metric values). The order of the
samples is irrelevant. This function should not report metrics that have not
been announced beforehand via capabilities
or
wombat_plugin:announce_capabilities
.
| -callback live_metrics2comp_units([wombat_types:metric_cap_id_last()],
wombat_types:plugin_state()) ->
{ok,
[wombat_types:live_metric_comp_unit()],
NewState :: wombat_types:plugin_state()} |
{error, term(), wombat_types:plugin_state()}.
|
When handling live metrics, metrics are divided into "computation units" for the
sake of optimization. There will be one collect_live_metrics
call for each
computation unit (i.e. the metrics in the same computation unit are calculated
in one collect_live_metrics
call).
As an example, let's assume that metric a
and metric b
are
computed by calling a costly function that computes a proplist, and a
returns
one item of the proplist while b
returns another. Metric c
is a different
independent metric. In this case, we would put a
and b
into the same
computation unit (let's call it ab_group
), which would then be calculated
in a common collect_live_metrics
function call. c
could be in a different
computation unit (let's call it c_group
), so it would be calculated
independently.
The live_metrics2comp_units
function gets the list of metrics that the user
currently wants to monitor as live metrics, and it should return the computation
units that include those metrics. The data type of the computation units is up
to the plugin.
In the example above, the function would return {ok, [ab_group, c_group],
State}
.
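A minimal sketch of this callback for the example, assuming the metrics arrive as the atoms `a`, `b` and `c` (the actual `metric_cap_id_last()` terms depend on the plugin's capability ids). The module name and the `unknown_metric` error term are hypothetical.

```erlang
-module(comp_units_sketch).
-export([live_metrics2comp_units/2]).

%% Map the requested live metrics to the computation units that cover them.
%% lists:usort/1 removes duplicates, so a and b yield ab_group only once.
live_metrics2comp_units(Metrics, State) ->
    try lists:usort([comp_unit(M) || M <- Metrics]) of
        Units -> {ok, Units, State}
    catch
        throw:{unknown_metric, _} = Reason -> {error, Reason, State}
    end.

%% a and b share one costly computation; c is independent.
comp_unit(a) -> ab_group;
comp_unit(b) -> ab_group;
comp_unit(c) -> c_group;
comp_unit(Other) -> throw({unknown_metric, Other}).
```

Returning an error tuple for an unknown metric, instead of letting the call crash, keeps the plugin running; an uncaught exception in a callback stops the plugin.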
| -callback collect_live_metrics(wombat_types:live_metric_comp_unit()) ->
{ok, [wombat_types:live_metric_data()]} | {error, term()}.
|
For each computation unit that is returned by live_metrics2comp_units
, a
process will be started. Each process will periodically (once per second by
default) call collect_live_metrics
with one of the computation units.
In the example above, if the user wanted to monitor metrics a
, b
and c
, we
would have two processes, one of them calling collect_live_metrics(ab_group)
and the other calling collect_live_metrics(c_group).
When the process for a computation unit crashes, the plugin won't be stopped,
only the live collection of the metrics handled by the computation unit.
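Continuing the example, the collection side could look like the sketch below. The capability ids, the `costly_stats/0` stand-in and the values reported are hypothetical; only the return shapes follow the callback specification.

```erlang
-module(live_collect_sketch).
-export([collect_live_metrics/1]).

%% One call per computation unit: a and b share a single costly computation.
collect_live_metrics(ab_group) ->
    Props = costly_stats(),
    {ok, [{live_metric, [<<"Example">>, <<"a">>], gauge,
           proplists:get_value(a, Props)},
          {live_metric, [<<"Example">>, <<"b">>], gauge,
           proplists:get_value(b, Props)}]};
collect_live_metrics(c_group) ->
    {ok, [{live_metric, [<<"Example">>, <<"c">>], gauge,
           erlang:system_info(process_count)}]};
collect_live_metrics(Other) ->
    {error, {unknown_comp_unit, Other}}.

%% Stand-in for the costly function that computes a proplist.
costly_stats() ->
    [{a, erlang:memory(total)}, {b, erlang:memory(processes)}].
```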
Notes:
- All the above callbacks are obligatory.
- If any of the functions throws an exception, the terminate function is called
and the plugin is stopped. (Note that terminate will receive the state data
that was given to the function that threw the exception; changes applied to
the state but not returned by that function are lost.)
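Putting the callbacks together, a minimal plugin skeleton might look like the sketch below. The application name `myapp`, the metric id and the description are hypothetical. The `-behaviour(wombat_plugin)` attribute is left as a comment so that the module compiles without WombatOAM's modules on the code path; a real plugin would declare it.

```erlang
-module(wombat_plugin_myapp).
%% In a real plugin, also declare: -behaviour(wombat_plugin).

-export([init/1, capabilities/1, handle_info/2, terminate/1,
         collect_metrics/1, live_metrics2comp_units/2,
         collect_live_metrics/1]).

-record(state, {capabilities = []}).

%% Hypothetical metric: group "My app", metric "Process count".
-define(PROC_COUNT_ID, [<<"My app">>, <<"Process count">>]).

%% Compute the capabilities once and keep them in the state.
init(_Arguments) ->
    Caps = [{?PROC_COUNT_ID,
             [{type, metric},
              {description, <<"Number of processes on the node">>},
              {metric_type, gauge},
              {metric_unit, numeric}]}],
    {ok, #state{capabilities = Caps}}.

%% Read the stored capabilities back from the state.
capabilities(#state{capabilities = Caps} = State) ->
    {Caps, State}.

%% Ignore unexpected messages.
handle_info(_Message, State) ->
    {noreply, State}.

terminate(_State) ->
    ok.

%% Called periodically; returns one sample for the announced metric.
collect_metrics(State) ->
    Sample = {metric, ?PROC_COUNT_ID, gauge,
              erlang:system_info(process_count)},
    {ok, [Sample], State}.

%% This sketch offers no live metrics, so there are no computation units.
live_metrics2comp_units(_Metrics, State) ->
    {ok, [], State}.

collect_live_metrics(_CompUnit) ->
    {ok, []}.
```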
Services callbacks
A request for a service is fulfilled by executing a suitable implementation
of the service. A certain request is identified by ReqId
, while the
implementation to be executed is specified by CapabilityId
. State
is the
current plugin state; it is shared among concurrent requests being served by a
certain plugin. The execution consists of the following 3 phases.
| init_request -> (execute_request)+ -> cleanup_request.
|
-
The process begins with asking the implementations whether they are willing
to serve the request (init_request/4
). The first implementation which
accepts the request will be executed.
-
Real execution takes place by calling the execute_request/4
callback of the
implementation. Data pushed back to WombatOAM is created during this phase. In
case of periodic requests this callback will be called multiple times (once
every period).
-
After the execution has finished, the implementation is allowed to clean up.
Releasing resources and cleaning the plugin state should be part of
cleanup_request/3
.
For each phase a callback is defined in the wombat_plugin_services
behaviour
described below. All 3 callbacks must be implemented to implement a new service.
| -spec init_request(ReqId :: binary(),
                     CapabilityId :: wombat_types:capability_id(),
                     ReqArgs :: wombat_types:request_args(),
                     State :: wombat_types:plugin_state()) ->
    {out_of_scope,
     ReasonBinStr :: binary(),
     NewState :: wombat_types:plugin_state()} |
    {error,
     ReasonBinStr :: binary(),
     NewState :: wombat_types:plugin_state()} |
    {ok,
     DisplayInfo :: wombat_types:display_info(),
     ExecutionInfo :: wombat_types:execution_info(),
     NewState :: wombat_types:plugin_state()}.
|
This callback initializes a request (identified by ReqId
) for a service
(announced as CapabilityId
by the plugin) based on the input arguments. The
validation can have the following outcomes:
- Serving the request is out of the plugin's scope. For instance, consider a
special configurator plugin that changes only the configs of the MongooseIM
application. If it is asked to change the config of a Riak application, it is
simply not capable of performing the change.
- The plugin is capable of serving such a request (it has all necessary input)
but the provided arguments are incorrect. For instance, consider the Etop
service that receives the
<<"ETC">>
binary as the value of its interval
argument for which it only accepts binaries that can be converted to
non-negative integers.
- The plugin is capable and willing to serve the request (all mandatory
arguments are given, have been checked and considered to be valid). In this
case it initialises the request by storing any necessary data in its state,
provides information about how to display the result of the request
(refer to
wombat_plugin_services:create_display_info/3
) and how to execute
the request (refer to wombat_plugin_services:create_execution_info/3
).
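The argument check described in the second outcome can be sketched as a small helper that `init_request/4` could call before deciding what to return. The module name, the `<<"interval">>` key and the error texts are assumptions made for illustration.

```erlang
-module(req_args_sketch).
-export([interval/1]).

%% ReqArgs is a request_args() proplist with binary keys and values.
%% Returns {ok, NonNegInteger} or {error, ReasonBinStr}; the reason can be
%% passed on as the ReasonBinStr of an {error, _, State} reply.
interval(ReqArgs) ->
    case lists:keyfind(<<"interval">>, 1, ReqArgs) of
        {_, Value} when is_binary(Value) ->
            try binary_to_integer(Value) of
                N when N >= 0 -> {ok, N};
                _Negative -> {error, <<"interval must be non-negative">>}
            catch
                error:badarg -> {error, <<"interval must be an integer">>}
            end;
        _MissingOrList ->
            {error, <<"interval is missing">>}
    end.
```

With this helper, the `<<"ETC">>` value from the example is rejected with a human-readable reason instead of crashing the plugin process.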
| -spec execute_request(ReqId :: binary(),
                        CapabilityId :: wombat_types:capability_id(),
                        From :: wombat_types:from_ref(),
                        State :: wombat_types:plugin_state()) ->
    {continue | close,
     no_data | {data, StreamData :: wombat_types:stream_data()},
     NewState :: wombat_types:plugin_state()} |
    {reply_later,
     NewState :: wombat_types:plugin_state()} |
    {error,
     ReasonBinStr :: binary(),
     NewState :: wombat_types:plugin_state()}.
|
The goal is to really execute the request (identified by ReqId
), to provide
data to be streamed and then to be displayed based on the previously given
DisplayInfo
, and to define what will happen to the stream (should be closed or
kept open to continue the execution).
It can return:
-
continue
to continue a periodic request. (Non-periodic requests cannot
return this.) It can carry stream data or no_data
.
-
close
to indicate that the request has completed. It can carry stream data or
no_data
.
- The plugin can return
reply_later
. This is useful to execute longer
jobs in a separate worker process. In this case it can use the
wombat_plugin:spawn_worker/1
function to initiate a worker and the From
reference received as input argument and the
wombat_plugin_services:request_reply/2
function to send stream data back to
WombatOAM later.
- The plugin can indicate an
error
with a human readable reason to be
displayed on the dashboard. In this case depending on the specified restart
strategy and the number of previous retries the execution can continue or
finish.
Data to be pushed to WombatOAM fall into the following 3 categories.
-
plain_value()
. The simplest category. This will be displayed as is.
-
stream_data_plain_table()
. List of lists built up from plain_value()
.
This will be rendered as a table on the dashboard.
-
stream_data_interactive_table()
. List of lists built up from
interactive_value()
. To each value a list of actions is assigned which
will be listed under the value's local menu on the dashboard. 2 general API
functions and a utility function are available to construct such data, which
are wombat_plugin_services:create_interactive_value/2
,
wombat_plugin_services:create_action/4
,
wombat_plugin_services:create_process_actions/1
.
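As a sketch of the second category, the callback below answers a non-periodic request with a plain table listing the registered processes and their message queue lengths. The module name and the choice of data are hypothetical, and the sketch assumes that the display info created in `init_request/4` declared matching table headers.

```erlang
-module(exec_request_sketch).
-export([execute_request/4]).

%% Build a stream_data_plain_table(): a list of rows, each row a list of
%% binaries. close is returned because the request completes in one call.
execute_request(_ReqId, _CapabilityId, _From, State) ->
    Rows = [[atom_to_binary(Name, utf8), integer_to_binary(Len)]
            || Name <- lists:sort(registered()),
               {ok, Len} <- [queue_len(Name)]],
    {close, {data, Rows}, State}.

%% Registered names may belong to ports, and a process may exit between the
%% two calls, so both cases are filtered out.
queue_len(Name) ->
    case whereis(Name) of
        Pid when is_pid(Pid) ->
            case process_info(Pid, message_queue_len) of
                {message_queue_len, Len} -> {ok, Len};
                undefined -> gone
            end;
        _NotAProcess ->
            gone
    end.
```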
| -spec cleanup_request(ReqId :: binary(),
CapabilityId :: wombat_types:capability_id(),
State :: wombat_types:plugin_state()) ->
{ok, NewState :: wombat_types:plugin_state()}.
|
This callback can do any cleanup necessary after the execution of the request
has finished. It will always be called, regardless of how the execution
finished (successfully completed, failed, or runtime error occurred).
Notes:
-
All these callbacks are evaluated in the plugin process. That means while the
callbacks are being evaluated the plugin process cannot handle other tasks
(i.e.: cannot push metrics, logs, alarms).
-
The execution of periodic requests can always be stopped by the users. It is
stopped by finalising the request instead of scheduling its next execution.
Requests being executed are not affected by stop commands; they are allowed to
terminate normally. Non-periodic requests cannot be stopped by the users.
-
Information provided in the capabilities is used by
-
WombatOAM to create services by aggregating the capabilities that describe
different implementations of the same feature.
-
WombatOAM to categorise the services. Services will be displayed under their
category group (configurator
, explorer
, executor
) on the dashboard.
-
Information provided in the display info (DisplayInfo
) is used by
-
The wombat_plugin
behaviour to control the execution of requests.
-
The dashboard to display data to be streamed by the plugins.
-
The dashboard to decide whether users are allowed to stop requests.
The wombat_plugin_services
API
The following functions in the wombat_plugin_services
module can be used to
implement services (for example to create structures).
| -spec create_capability(CapabilityID :: binary(),
Type :: wombat_types:service_capability_type(),
Description :: binary(),
Label :: binary(),
Feature :: term(),
Options :: [wombat_types:service_info_item()]) ->
wombat_types:capability().
|
Create a service capability. Properties given in Options
override the default
properties of the service. These properties together with their defaults are:
is_internal
(false
)
is_exclusive
(false
)
priority
(0
)
arguments
([]
)
options
([]
)
Note
- The options defined by the same capability should have unique names.
- Mandatory options are specified by listing their names (in arguments); each
listed name should also be defined as an option (in options).
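The two notes above can be checked before a capability is created. The following is a minimal sketch of such a check, assuming that option names are binaries (as in this guide's example plugin). The module capability_options_check and its check/2 function are hypothetical and not part of the WombatOAM API.

```erlang
-module(capability_options_check).
-export([check/2]).

%% Hypothetical pre-flight check, not part of the WombatOAM API.
%% OptionNames: names of the options defined for one capability.
%% Mandatory:   names listed as mandatory for that capability.
-spec check([binary()], [binary()]) -> ok | {error, term()}.
check(OptionNames, Mandatory) ->
    %% Removing one occurrence of every distinct name leaves the duplicates.
    Duplicates = OptionNames -- lists:usort(OptionNames),
    %% Mandatory names that have no matching option definition.
    Undefined = [N || N <- Mandatory, not lists:member(N, OptionNames)],
    case {Duplicates, Undefined} of
        {[], []} ->
            ok;
        {[_ | _], _} ->
            {error, {duplicate_options, lists:usort(Duplicates)}};
        {[], _} ->
            {error, {undefined_mandatory_options, Undefined}}
    end.
```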
| -spec create_string_option(Name :: wombat_types:service_option_name(),
Label :: binary(),
Default :: binary(),
IsEnabled :: boolean()) ->
wombat_types:service_option().
|
| -spec create_number_option(Name :: wombat_types:service_option_name(),
Label :: binary(),
Default :: binary(),
IsEnabled :: boolean()) ->
wombat_types:service_option().
|
| -spec create_enum_option(Name :: wombat_types:service_option_name(),
Label :: binary(),
Default :: binary(),
IsEnabled :: boolean(),
OptionValues :: [binary()]) ->
wombat_types:service_option().
|
These three functions each create a scalar option. An empty binary (<<"">>
) means no
default value. Note that the Default
value for enums should be a member of
OptionValues
(or an empty binary).
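The constraint on enum defaults can be verified before the option is created. The sketch below assumes option names are binaries, as in this guide's example plugin. The module enum_option_example and the helper valid_enum_default/2 are hypothetical; only create_enum_option/5 comes from the API described here, and calling it requires the WombatOAM plugin modules on the code path.

```erlang
-module(enum_option_example).
-export([valid_enum_default/2, mode_option/0]).

%% Hypothetical helper: an enum default is valid if it is empty
%% (no default) or one of the allowed values.
-spec valid_enum_default(binary(), [binary()]) -> boolean().
valid_enum_default(<<>>, _Values) ->
    true;
valid_enum_default(Default, Values) ->
    lists:member(Default, Values).

%% Create an enum option with a checked default value.
%% Requires wombat_plugin_services to be available at runtime.
mode_option() ->
    Values = [<<"Persistent">>, <<"Temporary">>],
    Default = <<"Temporary">>,
    true = valid_enum_default(Default, Values),
    wombat_plugin_services:create_enum_option(
      <<"mode">>,   % Name
      <<"Mode">>,   % Label
      Default,      % Default
      true,         % IsEnabled
      Values).      % OptionValues
```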
| -spec create_list_option(Name :: wombat_types:service_option_name(),
Label :: binary(),
Components :: [wombat_types:service_option()]) ->
wombat_types:service_option().
|
Create a list option. The components of the list are specified as options. For
instance, suppose users should supply a list of module names. A list option
with one component, a string option, is suitable for requesting this input.
For another example, see the built-in configurator service, which allows
changing a batch of configuration values at once.
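The module-name case described above might look like the following sketch. The option names and labels are made up for illustration, and the module list_option_example is hypothetical. The calls follow the specs of create_string_option/4 and create_list_option/3 given in this section and need the WombatOAM plugin modules on the code path to run.

```erlang
-module(list_option_example).
-export([module_names_option/0]).

%% A list option whose single component is a string option.
%% Requires wombat_plugin_services to be available at runtime.
module_names_option() ->
    ModuleName = wombat_plugin_services:create_string_option(
                   <<"module">>,        % Name (illustrative)
                   <<"Module name">>,   % Label (illustrative)
                   <<"">>,              % No default value
                   true),               % IsEnabled
    wombat_plugin_services:create_list_option(
      <<"modules">>,                    % Name (illustrative)
      <<"Modules">>,                    % Label (illustrative)
      [ModuleName]).                    % Components
```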
| -spec create_display_info(DataStructure :: value | table,
Label :: binary(),
Options :: [display_info_option_item()]) ->
wombat_types:display_info().
|
Create the display info for a service. Properties given in Options
override
the default properties of display info. These properties together with their
defaults are:
is_interactive
(false
)
table_headers
([]
)
| -spec create_execution_info(Period :: once | non_neg_integer(),
RetryAfter :: never | non_neg_integer(),
MaxRetries :: non_neg_integer()) ->
wombat_types:execution_info().
|
Create the execution info for a service. The execution info returned by the
init_request/4
callback tells the framework how to execute a
request.
-
The Period
specifies how often data will be streamed. It can be
once
or a non-negative integer. once
means that data will be streamed only
once and users are not allowed to stop the execution of the request. Periodic
requests can always be stopped by users. For periodic requests, the value
is the number of milliseconds between two consecutive executions.
-
The RetryAfter
specifies how the failures should be handled. It can be
never
or a non-negative integer. never
means the evaluation of the request
should never be retried, whereas a number is the delay in
milliseconds after which the evaluation can be retried.
-
The MaxRetries
defines the maximum number of attempts to evaluate the
request in a row. If the number of attempts reaches the defined maximum, the
plugin process gives up and finalises the request. If its value is 0
, the
plugin process never retries the evaluation and gives up immediately after
the first failure.
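To make the three parameters concrete, here is a small sketch with two execution infos. The numeric values are arbitrary illustrations, and the module execution_info_example is hypothetical. The calls follow the create_execution_info/3 spec above and need the WombatOAM plugin modules on the code path to run.

```erlang
-module(execution_info_example).
-export([one_shot/0, every_five_seconds/0]).

%% Stream data once and never retry after a failure.
one_shot() ->
    wombat_plugin_services:create_execution_info(
      once,    % Period: execute only once
      never,   % RetryAfter: do not retry
      0).      % MaxRetries: give up after the first failure

%% Stream data periodically and retry failed evaluations.
every_five_seconds() ->
    wombat_plugin_services:create_execution_info(
      5000,    % Period: 5000 ms between two executions
      1000,    % RetryAfter: retry 1000 ms after a failure
      3).      % MaxRetries: at most 3 attempts in a row
```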
| -spec create_interactive_value(Data :: binary(),
Actions :: [wombat_types:action()]) ->
wombat_types:interactive_value().
|
Create an interactive value for use in stream data. It can be used to construct
a cell in an interactive table. Data
will be displayed on the dashboard as
the content of the cell. Actions
specifies the content of the local menu. To
construct an arbitrary action, use create_action/4
. If the Data
is a pid,
use the create_process_actions/1
utility function to define the same local menu
that appears for processes in the Etop service's output.
| -spec create_action(Label :: binary(),
ObjectType :: node | family,
FeatureName :: term(),
FeatureArgs :: wombat_types:request_args()) ->
wombat_types:action().
|
Create an arbitrary action for an interactive value. Think of an action as a
zero-arity fun expression that is evaluated when the user requests it.
The body of the fun expression is a complete request for another, already
implemented service. The target of the request can be the node creating the
action or this node's family. This is specified by ObjectType
. FeatureName
is the feature identifying the service, which is implemented as a capability by
a plugin. (This is the same as the fifth argument passed to create_capability/6
).
FeatureArgs
are the request arguments, with which the request will be
initialised.
Label
will be shown as the link of this action in the local menu.
| -spec create_process_actions(pid()) -> [wombat_types:action()].
|
Create a list of actions related to the given process. The list can
be used directly as the actions of an interactive value. The actions are: terminate process,
process info, process messages, process dictionary, process state and
process stack trace.
| -spec request_reply(From :: wombat_types:from_ref(),
Reply :: wombat_types:async_reply()) ->
ok.
|
This function can be used by a plugin to explicitly send stream data to
WombatOAM.
When the execute_request/4
callback needs to return before the stream data is ready,
it can return reply_later
and call this function later to send the
stream data. The From
parameter received in execute_request/4
must be
provided to this function. Note that each From
value can be used only once
(i.e. it cannot be used to send back multiple stream data messages).
The wombat_plugin
API
The following functions in the wombat_plugin
module can be used by plugins.
| -spec report_log(Severity :: wombat_types:severity(),
LogMessage :: wombat_types:log_message()) -> ok.
|
Report a notification entry.
| -spec raise_alarm(AlarmId :: wombat_types:alarm_id(),
AddInfo :: wombat_types:alarm_add_info()) -> ok.
-spec clear_alarm(AlarmId :: wombat_types:alarm_id()) -> ok.
|
Raise/clear an alarm.
| -spec announce_capabilities(Capabilities :: wombat_types:capabilities()) -> ok.
|
Push the list of capabilities to WombatOAM. It needs to be called with the list of
all capabilities of the plugin (not only the new ones).
Calling the wombat_plugin
API from outside of the plugin process
The wombat_plugin
API is simple because when its functions are called from the plugin process,
WombatOAM's plugin infrastructure knows who the caller is. When other processes call
these functions, WombatOAM does not know who the caller is; therefore calling these
functions from other processes is not allowed. Instead, those processes need to
obtain the counterparts of these functions in the wombat_plugin_utils
module:
| -spec report_log_cb(Options :: plugin_options()) ->
fun((Severity :: wombat_types:severity(),
LogMsg :: wombat_types:log_message()) -> ok) |
undefined.
|
Report a notification entry.
The Options
parameter that needs to be passed to these functions is the same
as the Arguments
parameter that is received by the init
function of the main plugin
module.
The following is an example that shows how this function can be used:
| init(Options) ->
LogCB = wombat_plugin_utils:report_log_cb(Options),
LogCB(<<"error">>, <<"Test notification">>),
{ok, #state{}}.
|
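The spec above allows report_log_cb/1 to return undefined instead of a fun. This guide does not say when that happens, so the following sketch simply guards against it. The module safe_cb and the function call_log_cb/3 are hypothetical, and the {error, no_callback} return value is an illustrative choice.

```erlang
-module(safe_cb).
-export([call_log_cb/3]).

%% Call a log callback obtained from report_log_cb/1, tolerating the
%% 'undefined' result that its spec allows.
-spec call_log_cb(undefined | fun((term(), term()) -> ok),
                  Severity :: binary(),
                  Msg :: binary()) -> ok | {error, no_callback}.
call_log_cb(undefined, _Severity, _Msg) ->
    %% No callback is available; report this instead of crashing.
    {error, no_callback};
call_log_cb(LogCB, Severity, Msg) when is_function(LogCB, 2) ->
    LogCB(Severity, Msg).
```

A worker process could then call safe_cb:call_log_cb(LogCB, <<"error">>, <<"Test notification">>) without having to check the callback itself.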
| -spec raise_alarm_cb(Options :: plugin_options()) ->
fun((AlarmId :: term(), Message :: term()) -> ok) | undefined.
-spec clear_alarm_cb(Options :: plugin_options()) ->
fun((AlarmId :: term()) -> ok) | undefined.
|
Raise/clear an alarm.
| -spec announce_capabilities_cb(Options :: plugin_options()) ->
fun((Capabilities :: wombat_types:capabilities()) -> ok) | undefined.
|
Push the list of capabilities to WombatOAM. It needs to be called with the list of
all capabilities of the plugin (not only the new ones).
Useful functions in wombat_plugin_utils
| -spec binfmt(Fmt :: io:format(), Args :: [term()]) -> binary().
|
Print the given arguments into a binary.
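For readers who want to try format strings outside a plugin, the following stand-in uses only the Erlang standard library. It is an assumption that binfmt/2 behaves similarly; the actual implementation is not shown in this guide. The module binfmt_sketch and the function fmt/2 are hypothetical.

```erlang
-module(binfmt_sketch).
-export([fmt/2]).

%% Format Args according to Fmt and return the result as a binary.
%% Stand-in only: assumed, not confirmed, to mirror binfmt/2.
-spec fmt(io:format(), [term()]) -> binary().
fmt(Fmt, Args) ->
    unicode:characters_to_binary(io_lib:format(Fmt, Args)).
```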
| -spec spawn_worker(fun(() -> any())) -> pid().
|
Spawn a worker process from a plugin. The return value of the fun is
ignored. The process is linked to the plugin process and has special
treatment. (Never use plain erlang:spawn_link
from a plugin process!)
| -spec create_metric_capability(MetricId :: wombat_types:capability_id(),
Description :: binary(),
Type :: wombat_types:metric_type(),
Unit :: wombat_types:metric_unit(),
Tags :: wombat_types:capability_tags()) ->
wombat_types:capability().
|
Create a metric capability term. Note that the create_metric_capability/4
function is deprecated, kept only for backward compatibility. It uses the dev
and the op tags to create the metric capability.
| -spec cap_id_to_cap_id_last(wombat_types:capability_id()) ->
wombat_types:metric_cap_id_last().
|
Return the last element of the capability id as an atom.
| -spec create_alarm_capability(CapabilityId :: wombat_types:capability_id(),
Severity :: wombat_types:alarm_severity(),
ProbableCause :: binary(),
ProposedRepairAction :: binary(),
DefaultTags :: wombat_types:capability_tags()) ->
wombat_types:capability().
|
Create an alarm capability term.
Starting periodic jobs
The types used are defined in wombat_types.erl
:
| -type task_fun() :: fun(() -> ok | stop).
-type millisecs() :: non_neg_integer().
|
| -spec periodic(Period :: wombat_types:millisecs(),
Job :: wombat_types:task_fun()) -> pid().
|
Start a periodic job from the main module of a WombatOAM plugin. The process stops either
when the fun returns a value other than ok
or when the plugin is stopped.
| -spec stream_task_data(term()) -> ok.
|
This function is called by job (task) processes to stream results
back to the WombatOAM plugin process. The format of the streamed data is {'$task_data',
TaskPid, Data}
.
Using wombat_tracer
as a service
If you want to write a plugin that collects trace information, you should use
the tracing service provided by the wombat_plugin
application. The service is
implemented as a server that is locally registered under the name
wombat_plugin_tracer
.
To subscribe, call wombat_plugin_tracer:subscribe(Who, Topic, Filter)
, where:
Who
is the pid of the receiver,
Topic
is {FlagList, MFA}
, where the variables share the types defined in
the documentation of erlang:trace_pattern
.
Filter
has the type fun((TraceMsg) -> boolean() | {true, Msg})
. It needs
to pre-select the trace messages that shall be delivered to the receiver. The
type of TraceMsg
is defined in the documentation of erlang:trace
.
If the filter returns true, the TraceMsg
is forwarded to the subscriber as
is. If the filter returns {true, Msg}, the custom Msg
is sent to the
subscriber instead of the original TraceMsg
.
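A filter that rewrites the trace message could be sketched as follows. It matches calls to erlang:load_module/2 and replaces the trace message with a smaller custom term. The module trace_filter_example and the {module_loaded, Module} message shape are hypothetical choices for illustration.

```erlang
-module(trace_filter_example).
-export([load_module_filter/0]).

%% Return a filter fun suitable as the Filter argument of subscribe/3.
-spec load_module_filter() -> fun((term()) -> boolean() | {true, term()}).
load_module_filter() ->
    fun({trace, _Pid, call, {erlang, load_module, [Module, _Binary]}}) ->
            %% Deliver a custom message instead of the full trace message.
            {true, {module_loaded, Module}};
       (_Other) ->
            %% Drop every other trace message.
            false
    end.
```

With this filter, the subscriber would presumably receive {wombat_plugin_tracer, {module_loaded, Module}} rather than the original trace tuple.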
Optionally, a FinishFlag
can also be provided by calling
wombat_plugin_tracer:subscribe(Who, Topic, Filter, FinishFlag)
which can have
the following values:
undefined
(default): only call
trace messages are sent to the tracer
return_trace
: in addition to the call
messages, a return_from trace
message is sent upon return from the traced function.
exception_trace
: same as return_trace
, plus, if the traced function exits
due to an exception, an exception_from trace message is generated,
whether the exception is caught or not.
The result of the call can be:
ok
, meaning the subscription was successful and the tracing is active.
{warning, Reason}
, meaning the subscription was okay but the tracing is not
active.
{error, Reason}
, meaning the subscription failed because bad arguments
were passed.
The tracer sends messages that have the form {wombat_plugin_tracer, Msg}
,
where Msg
is one of the following:
TraceMsg
as defined in the documentation of erlang:trace
.
tracer_inactived
, meaning that no trace messages can be expected because tracing is not
active.
tracer_actived
, meaning that trace messages can be expected because tracing is active.
It is strongly recommended to link the receivers (the plugins) to the tracer.
That way, a plugin is restarted whenever the tracer restarts, which simplifies the
implementation of the plugins.
Notes:
- There is no need to unsubscribe from the tracer, as the plugin is monitored
by the tracer.
- Tracing calls to the functions of a module that is loaded or reloaded
after the trace pattern has been enabled is supported, with one
exception: trace patterns matching any module (
'_'
) won't receive traces
for modules that have been loaded or reloaded after the pattern has been
activated. Also note that when the loading of a module is triggered by the first call
to that module, this first call will not be traced.
Example
Assume you want to keep track of which modules are loaded into the VM. To do this,
first subscribe to trace erlang:load_module/2
calls during init
:
| init(_Args) ->
Topic = {[], {erlang, load_module, 2}},
Filter = fun({trace, _, call, {erlang, load_module, [_ | _]}}) -> true;
(_) -> false
end,
case wombat_plugin_tracer:subscribe(self(), Topic, Filter) of
ok ->
ok;
{warning, Warning} ->
Formatted =
wombat_plugin_utils:binfmt("Tracers response: ~p", [Warning]),
wombat_plugin:report_log(<<"warning">>, Formatted);
{error, Reason} ->
Formatted =
wombat_plugin_utils:binfmt("Tracers response: ~p", [Reason]),
wombat_plugin:report_log(<<"error">>, Formatted)
end,
link(whereis(wombat_plugin_tracer)),
%% init/1 must return {ok, State} rather than the result of link/1.
{ok, #state{}}.
|
To receive the collected trace messages and other system messages sent by the
tracer, add the following function clauses to handle_info/2
.
| handle_info({wombat_plugin_tracer, tracer_actived}, State) ->
wombat_plugin:report_plugin_error(<<"info">>, <<"Tracer activated.">>),
{noreply, State};
handle_info({wombat_plugin_tracer, tracer_inactived}, State) ->
wombat_plugin:report_plugin_error(<<"warning">>, <<"Tracer is inactive.">>),
{noreply, State};
handle_info({wombat_plugin_tracer, {trace, _Pid, call, MFA}}, State) ->
{erlang, load_module, [Module, _Binary]} = MFA,
Msg = wombat_plugin_utils:binfmt("~p module is loaded", [Module]),
wombat_plugin:report_log(<<"info">>, Msg),
{noreply, State};
|
Example of a complete plugin
| %%%=============================================================================
%%% @copyright 2015-2016, Erlang Solutions Ltd
%%% @doc Example WombatOAM plugin.
%%%
%%% This example plugin demonstrates how to write a simple WombatOAM plugin. It does
%%% the following:
%%%
%%% - It provides two metrics: nodes_count and hidden_nodes_count.
%%% - It raises an alarm and sends a notification if there is a process
%%% registered with the name "Troublemaker". This is checked once a second.
%%% (In a real plugin, this value would be much higher to avoid overloading
%%% the system, e.g. one minute.)
%%%
%%% To activate this plugin, the following entry needs to be added to
%%% wombat.config:
%%%
%%% ```
%%% {replace, wo_plugins, plugins, example,
%%% {example, [{kernel, ".*"}], [], []}}.
%%% '''
%%% @end
%%%=============================================================================
-module(wombat_plugin_example).
-copyright("2015-2016, Erlang Solutions Ltd.").
-behaviour(wombat_plugin).
-behaviour(wombat_plugin_services).
%% wombat_plugin callbacks
-export([init/1, capabilities/1,
handle_info/2, terminate/1,
collect_metrics/1, live_metrics2comp_units/2, collect_live_metrics/1]).
%% wombat_plugin_services callbacks
-export([init_request/4, execute_request/4, cleanup_request/3]).
-define(CHECK_INTERVAL, 1000). % 1 second
%%------------------------------------------------------------------------------
%% Types
%%------------------------------------------------------------------------------
-record(state,
{
%% The plugin's internal representation of the metrics provided.
metric_info_tuples = [] :: [metric_info_tuple()],
%% The WombatOAM representation of the metrics provided.
capabilities = [] :: [wombat_types:capability()],
%% True if there is a process called 'troublemaker'.
troublemaker_exists :: boolean(),
requests = [] :: [{ReqId :: binary(),
ReqInfo :: tm_mode()}]
}).
-type state() :: #state{}.
%% Plugin state.
-type metric_internal_id() :: atom().
%% An id used by this plugin to identify a metric.
-type metric_info_tuple() :: {MetricInternalId :: metric_internal_id(),
MetricNameBin :: binary(),
Type :: wombat_types:metric_type(),
Unit :: wombat_types:metric_unit(),
Tags :: wombat_types:capability_tags()}.
%% A tuple that is used by this plugin to describe a metric.
-type tm_mode() :: binary().
%% the mode of the troublemaker to be started
%%%=============================================================================
%% wombat_plugin callbacks
%%%=============================================================================
%%------------------------------------------------------------------------------
%% @doc Initialise the plugin state.
%% @end
%%------------------------------------------------------------------------------
-spec init(Arguments :: [term()]) -> {ok, state()} | {error, _}.
init(_) ->
%% Metrics
Metrics = get_metric_info_tuples(),
MetricCapabilities =
[ wombat_plugin_utils:create_metric_capability(
metric_name_to_capability_id(Name), Name, Type, Unit, Tags)
|| {_Id, Name, Type, Unit, Tags} <- Metrics ],
ServiceCapabilities = service_capabilities(),
AlarmCapabilities = alarm_capabilities(),
Capabilities =
MetricCapabilities ++ ServiceCapabilities
% Note that announcing alarm capabilities is optional.
++ AlarmCapabilities,
%% Alarms and notifications pushed to WombatOAM based on periodic checks
%% The process started as periodic task will check whether a process
%% registered as 'troublemaker' exists. The result of each check is
%% streamed to the plugin process to perform any necessary further actions.
wombat_plugin:periodic(
?CHECK_INTERVAL,
fun() ->
%% Determine the current status of the troublemaker process.
TroubleMaker = erlang:whereis(troublemaker),
%% Inform the plugin process about the troublemaker process.
ok = wombat_plugin:stream_task_data(TroubleMaker)
end),
%% Perform the initial check.
TroublemakerExists =
case erlang:whereis(troublemaker) of
undefined ->
wombat_plugin:clear_alarm(there_is_a_troublemaker),
false;
Pid ->
wombat_plugin:raise_alarm(there_is_a_troublemaker,
[{pid, Pid}]),
true
end,
{ok, #state{metric_info_tuples = Metrics,
capabilities = Capabilities,
troublemaker_exists = TroublemakerExists}}.
%%------------------------------------------------------------------------------
%% @doc Return the capabilities of the plugin.
%% @end
%%------------------------------------------------------------------------------
-spec capabilities(state()) -> {wombat_types:capabilities(), state()}.
capabilities(#state{capabilities = Capabilities} = State) ->
{Capabilities, State}.
%%------------------------------------------------------------------------------
%% @doc Handle a message.
%% @end
%%------------------------------------------------------------------------------
-spec handle_info(Message :: term(), state()) -> {noreply, state()}.
handle_info({'$task_data', _Pid, Troublemaker},
#state{troublemaker_exists = TroublemakerExistsOld} = State) ->
NewState =
case {TroublemakerExistsOld, Troublemaker} of
{false, undefined} ->
%% No troublemaker.
State;
{true, undefined} ->
%% The troublemaker disappeared.
wombat_plugin:clear_alarm(there_is_a_troublemaker),
State#state{troublemaker_exists = false};
{false, Pid} ->
%% The troublemaker appeared.
wombat_plugin:raise_alarm(there_is_a_troublemaker,
[{pid, Pid}]),
Msg = wombat_plugin_utils:binfmt(
"We have a troublemaker: ~p", [Pid]),
wombat_plugin:report_log(<<"warning">>, Msg),
State#state{troublemaker_exists = true};
{true, Pid} ->
%% The troublemaker is still there.
Msg = wombat_plugin_utils:binfmt(
"The troublemaker is still there: ~p", [Pid]),
wombat_plugin:report_log(<<"warning">>, Msg),
State
end,
{noreply, NewState};
handle_info(_Message, State) ->
{noreply, State}.
%%------------------------------------------------------------------------------
%% @doc Terminate the plugin.
%% @end
%%------------------------------------------------------------------------------
-spec terminate(state()) -> any().
terminate(_State) ->
ok.
%%------------------------------------------------------------------------------
%% @doc Return the metrics' values belonging to the already announced
%% capabilities.
%% @end
%%------------------------------------------------------------------------------
-spec collect_metrics(state()) -> {ok, [wombat_types:metric_data()], state()}.
collect_metrics(#state{metric_info_tuples = Metrics} = State) ->
Samples = [ {metric, metric_name_to_capability_id(Name), Type,
get_metric_value(Id)}
|| {Id, Name, Type, _Unit, _Tags} <- Metrics ],
{ok, Samples, State}.
%%------------------------------------------------------------------------------
%% @doc Convert live metrics into computation units.
%% @end
%%------------------------------------------------------------------------------
-spec live_metrics2comp_units(LiveMs :: [wombat_types:metric_cap_id_last()],
state()) ->
{ok, [metric_info_tuple()], state()} | {error, term(), state()}.
live_metrics2comp_units(LiveMs, #state{metric_info_tuples = Metrics} = State) ->
%% Return those metric_info_tuples whose cap_id_last is present in LiveMS
%% (i.e. those metrics that shall be collected).
CompUnits = [MetricInfoTuple
|| MetricInfoTuple <- Metrics,
lists:member(
metric_info_tuple_to_cap_id_last(MetricInfoTuple),
LiveMs)],
{ok, CompUnits, State}.
%%------------------------------------------------------------------------------
%% @doc Return the values of the given live metric.
%% @end
%%------------------------------------------------------------------------------
-spec collect_live_metrics(MetricInfoTuple :: metric_info_tuple()) ->
{ok, [wombat_types:live_metric_data()]} | {error, term()}.
collect_live_metrics({Id, Name, Type, _Unit, _Tags}) ->
{ok, [{live_metric, metric_name_to_capability_id(Name), Type,
get_metric_value(Id)}]}.
%%------------------------------------------------------------------------------
%% @doc Initialize a service request.
%% @end
%%------------------------------------------------------------------------------
-spec init_request(ReqID :: binary(),
CapabilityID :: wombat_types:capability_id(),
ReqArgs :: wombat_types:request_args(),
State :: wombat_types:plugin_state()) ->
{ok,
DisplayInfo :: wombat_types:display_info(),
ExecutionInfo :: wombat_types:execution_info(),
NewState :: wombat_types:plugin_state()} |
{out_of_scope,
ReasonBinStr :: binary(),
NewState :: wombat_types:plugin_state()} |
{error,
ReasonBinStr :: binary(),
NewState :: wombat_types:plugin_state()}.
init_request(_ReqID, [<<"troublemaker status">>], _ReqArgs, State) ->
{ok,
wombat_plugin_services:create_display_info(
_DataStructure = value,
_Label = <<"Troublemaker status">>,
_Options = []),
wombat_plugin_services:create_execution_info(
_Period = once,
_RetryAfter = never,
_MaxRetries = 0),
State};
init_request(_ReqID, [<<"troublemaker watcher">>], _ReqArgs, State) ->
{ok,
wombat_plugin_services:create_display_info(
_DataStructure = table,
_Label = <<"Troublemaker status">>,
_Options = [
{is_interactive, true},
{table_headers, [<<"Status">>]}
]),
wombat_plugin_services:create_execution_info(
_Period = 3000,
_RetryAfter = never,
_MaxRetries = 0),
State};
init_request(ReqID, [<<"troublemaker start">>], ReqArgs, State) ->
case proplists:get_value(<<"mode">>, ReqArgs) of
undefined ->
{error, <<"Mandatory argument 'mode' missing.">>, State};
Mode when Mode =:= <<"Persistent">>; Mode =:= <<"Temporary">> ->
{ok,
wombat_plugin_services:create_display_info(
_DataStructure = table,
_Label = <<"Troublemaker process id">>,
_Options = [
{is_interactive, true},
{table_headers, [<<"Result">>, <<"Pid">>]}
]),
wombat_plugin_services:create_execution_info(
_Period = once,
_RetryAfter = never,
_MaxRetries = 0),
add_req_info(ReqID, Mode, State)};
Mode ->
{out_of_scope,
wombat_plugin_utils:binfmt(
"Unknown value for argument mode: ~p", [Mode]), State}
end;
init_request(_ReqID, [<<"troublemaker stop">>], _ReqArgs, State) ->
{ok,
wombat_plugin_services:create_display_info(
_DataStructure = value,
_Label = <<"Result">>,
_Options = []),
wombat_plugin_services:create_execution_info(
_Period = once,
_RetryAfter = never,
_MaxRetries = 0),
State}.
%%------------------------------------------------------------------------------
%% @doc Execute a service request.
%% @end
%%------------------------------------------------------------------------------
-spec execute_request(ReqID :: binary(),
CapabilityID :: wombat_types:capability_id(),
From :: wombat_types:from_ref(),
State :: wombat_types:plugin_state()) ->
{continue | close,
no_data | {data, StreamData :: wombat_types:stream_data()},
NewState :: wombat_types:plugin_state()} |
{reply_later,
NewState :: wombat_types:plugin_state()} |
{error,
ReasonBinStr :: binary(),
NewState :: wombat_types:plugin_state()}.
execute_request(_ReqId, [<<"troublemaker status">>], _From, State) ->
Status = troublemaker_status(),
{close, {data, Status}, State};
execute_request(_ReqId, [<<"troublemaker watcher">>], From, State) ->
%% The worker is only spawned for the sake of example
wombat_plugin_utils:spawn_worker(
fun() ->
Data = watch_troublemaker(),
wombat_plugin_services:request_reply(From, {continue, {data, Data}})
end),
{reply_later, State};
execute_request(ReqId, [<<"troublemaker start">>], _From, State) ->
Mode = get_req_info(ReqId, State),
{Result, Pid} = start_troublemaker(Mode),
BinPid = wombat_plugin_utils:binfmt("~p", [Pid]),
PidActions = wombat_plugin_services:create_process_actions(Pid),
Data =
[[
wombat_plugin_services:create_interactive_value(Result, []),
wombat_plugin_services:create_interactive_value(BinPid, PidActions)
]],
{close, {data, Data}, State};
execute_request(_ReqId, [<<"troublemaker stop">>], _From, State) ->
case stop_troublemaker() of
error ->
{error, <<"No Troublemaker process running">>, State};
ok ->
{close, {data, <<"Done">>}, State}
end.
%%------------------------------------------------------------------------------
%% @doc Clean up a service request.
%% @end
%%------------------------------------------------------------------------------
-spec cleanup_request(ReqID :: binary(),
CapabilityID :: wombat_types:capability_id(),
State :: wombat_types:plugin_state()) ->
{ok, NewState :: wombat_types:plugin_state()}.
cleanup_request(ReqID, [<<"troublemaker start">>], State) ->
{ok, delete_req_info(ReqID, State)};
cleanup_request(_, _, State) ->
{ok, State}.
%%==============================================================================
%% Internal functions
%%==============================================================================
alarm_capabilities() ->
% The capability Id defines the matching alarms.
% Considering CapabilityId, the matching alarms are identified by
% - the 'there_is_a_troublemaker' atom, or
% - any tuple with arbitrary size while the first element of the
% tuple is the 'there_is_a_troublemaker' atom, for examples,
% {there_is_a_troublemaker, Pid} and {there_is_a_troublemaker,[Pid]}.
CapabilityId = [<<"there_is_a_troublemaker">>],
Severity = minor,
ProbableCause =
<<"A process has been registered with the name troublemaker.">>,
ProposedRepairAction =
<<"Use the Stop Troublemaker service to terminate the process.">>,
% Relevant only for operators
Tags = [<<"op">>],
[wombat_plugin_utils:create_alarm_capability(
CapabilityId, Severity, ProbableCause, ProposedRepairAction, Tags)].
%%------------------------------------------------------------------------------
%% @doc Return the metrics that this plugin provides (in its own internal
%% representation).
%% @end
%%------------------------------------------------------------------------------
-spec get_metric_info_tuples() -> [metric_info_tuple()].
get_metric_info_tuples() ->
[{nodes_count,
<<"Number of non-hidden nodes">>,
counter,
numeric,
% Metric is relevant for developers.
[<<"dev">>]},
{hidden_nodes_count,
<<"Number of hidden nodes">>,
counter,
numeric,
% Metric is relevant for both developers and operators.
[<<"dev">>, <<"op">>]}].
%%------------------------------------------------------------------------------
%% @doc Calculate the value of a given metric.
%% @end
%%------------------------------------------------------------------------------
-spec get_metric_value(MetricInternalId :: metric_internal_id()) -> integer().
get_metric_value(nodes_count) ->
length(nodes());
get_metric_value(hidden_nodes_count) ->
length(nodes(hidden)).
%%------------------------------------------------------------------------------
%% @doc Converts a metric name into a capability id.
%% @end
%%------------------------------------------------------------------------------
-spec metric_name_to_capability_id(MetricName :: binary()) ->
wombat_types:capability_id().
metric_name_to_capability_id(MetricName) ->
[<<"Example metrics">>, MetricName].
%%------------------------------------------------------------------------------
%% @doc Convert a metric from a metric_info_tuple into a metric_cap_id_last
%% value.
%% @end
%%------------------------------------------------------------------------------
-spec metric_info_tuple_to_cap_id_last(metric_info_tuple()) ->
wombat_types:metric_cap_id_last().
metric_info_tuple_to_cap_id_last({_Id, Name, _Type, _Unit, _Tags}) ->
wombat_plugin_utils:cap_id_to_cap_id_last(
metric_name_to_capability_id(Name)).
%%==============================================================================
%% Internal functions - Services
%%==============================================================================
service_capabilities() ->
[wombat_plugin_services:create_capability(
[<<"troublemaker status">>], % CapabilityId
explorer, % Type
<<"Return whether the Troublemaker process is alive or not.">>, % Description
<<"Get Troublemaker status">>, % Label
troublemaker_status, % Feature
[]), % Options - no arguments, use defaults
wombat_plugin_services:create_capability(
[<<"troublemaker watcher">>], % CapabilityId
explorer, % Type
<<"Periodically return whether the Troublemaker process is alive or not.">>, % Description
<<"Watch Troublemaker">>, % Label
troublemaker_watcher, % Feature
[]), % Options - no arguments, use defaults
wombat_plugin_services:create_capability(
[<<"troublemaker start">>], % CapabilityId
executor, % Type
<<"Start the Troublemaker process.">>, % Description
<<"Start Troublemaker">>, % Label
troublemaker_start, % Feature
[
{is_internal, false},
{options, [wombat_plugin_services:create_enum_option(
<<"mode">>, % Name
<<"Mode">>, % Label
<<"">>, % No Default
true, % IsEnabled
[<<"Persistent">>, <<"Temporary">>] % EnumValues
)]},
{arguments, [<<"mode">>]}
]), % Options
wombat_plugin_services:create_capability(
[<<"troublemaker stop">>], % CapabilityId
executor, % Type
<<"Stop the Troublemaker process.">>, % Description
<<"Stop Troublemaker">>, % Label
troublemaker_stop, % Feature
[]) % Options
].
%%------------------------------------------------------------------------------
%% @doc Start a troublemaker process if one is not started already
%% @end
%%------------------------------------------------------------------------------
-spec start_troublemaker(tm_mode()) -> {binary(), pid()}.
start_troublemaker(Mode) ->
Parent = self(),
TMPid =
spawn(
fun() ->
try register(troublemaker, self()) of
true ->
Parent ! {started, self()},
Timeout =
case Mode of
<<"Persistent">> ->
infinity;
<<"Temporary">> ->
10000
end,
receive
stop -> ok
after
Timeout -> ok
end
catch error:badarg ->
Parent ! {already_started, self()}
end
end),
receive
{started, TMPid} ->
{<<"Started">>, TMPid};
{already_started, TMPid} ->
{<<"Already started">>, whereis(troublemaker)}
end.
%%------------------------------------------------------------------------------
%% @doc Stop the troublemaker process
%% @end
%%------------------------------------------------------------------------------
-spec stop_troublemaker() -> ok | error.
stop_troublemaker() ->
case whereis(troublemaker) of
undefined ->
error;
Pid ->
exit(Pid, shutdown),
ok
end.
%%------------------------------------------------------------------------------
%% @doc Check troublemaker status and create an interactive table data
%% accordingly.
%% @end
%%------------------------------------------------------------------------------
-spec watch_troublemaker() -> wombat_types:stream_data_interactive_table().
watch_troublemaker() ->
case whereis(troublemaker) of
undefined ->
[[wombat_plugin_services:create_interactive_value(
<<"Not running">>, [])]];
_Pid ->
Action = wombat_plugin_services:create_action(
<<"Stop Troublemaker">>, %% Label
node, %% Object type - this node
troublemaker_stop, %% Feature name
[] %% No arguments
),
[[wombat_plugin_services:create_interactive_value(
<<"Running">>, [Action])]]
end.
%%------------------------------------------------------------------------------
%% @doc Return troublemaker status as a binstring
%% @end
%%------------------------------------------------------------------------------
-spec troublemaker_status() -> binary().
troublemaker_status() ->
case whereis(troublemaker) of
undefined ->
<<"Not running">>;
_Pid ->
<<"Running">>
end.
%%------------------------------------------------------------------------------
%% @doc Add info about a request to the state.
%% @end
%%------------------------------------------------------------------------------
-spec add_req_info(ReqId :: binary(),
ReqInfo :: tm_mode(),
State :: state()) -> NewState :: state().
add_req_info(ReqId, ReqInfo, #state{requests = Requests} = State)->
State#state{requests = [{ReqId, ReqInfo}|Requests]}.
%%------------------------------------------------------------------------------
%% @doc Get the info of a request from the state.
%% @end
%%------------------------------------------------------------------------------
-spec get_req_info(ReqId :: binary(),
State :: state()) -> tm_mode().
get_req_info(ReqId, #state{requests = Requests}) ->
{_ReqId, ReqInfo} = lists:keyfind(ReqId, 1, Requests),
ReqInfo.
%%------------------------------------------------------------------------------
%% @doc Delete the info of a request from the state.
%% @end
%%------------------------------------------------------------------------------
-spec delete_req_info(ReqId :: binary(),
State :: state()) -> NewState :: state().
delete_req_info(ReqId, #state{requests = Requests} = State)->
NewRequests = lists:keydelete(ReqId, 1, Requests),
State#state{requests = NewRequests}.
|
Rules about passing callback functions to non-WombatOAM processes
There are two important rules to keep in mind when passing a callback function
to a non-WombatOAM process:
- Agent modules (including plugin modules and plugin infrastructure modules)
should never pass a reference to an anonymous or local function to a
non-agent process, e.g.
sys:install(interesting_gen_server, {fun (FuncState, SysMsg, ServerState) -> ... end, FuncState0})
or
sys:install(interesting_gen_server, {fun my_dbg/3, FuncState0}).
When the agent module is purged, code:purge kills any process that still
holds a reference to the unloaded module.
Agent modules should pass only exported functions using the MFA syntax (e.g.
sys:install(interesting_gen_server, {fun ?MODULE:my_dbg/3, FuncState0})),
because this way the non-WombatOAM process keeps only the MFA in its memory
as opposed to a reference, so it is not affected by the agent module being
purged. If the callback is called after the purge, the caller gets an
"undefined function" error, which the non-WombatOAM process can catch easily.
The plugin developer should check whether the error is indeed caught by the
process that the plugin is observing.
- The callback functions should be very quick: they should not take more than 1
second even if the system is loaded heavily. This is because if a process is
executing a callback function defined in an agent module, WombatOAM will give 1
second for that function call to finish before doing a hard purge (which
would kill the process if it were still executing the callback).
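As a minimal sketch of how both rules can be followed together, the hypothetical module below installs an exported debug function through an external fun and keeps the callback down to a single message send. The module name dbg_callback_sketch and the functions attach/2, detach/1 and my_dbg/3 are assumptions made for illustration; they are not part of WombatOAM.

```erlang
-module(dbg_callback_sketch).
-export([attach/2, detach/1, my_dbg/3]).

%% Install the debug function into the observed process. The callback is
%% passed as an external fun (fun Module:Function/Arity), so the observed
%% process stores only the names, not a reference into this module's code.
%% Collector is the pid that will receive the forwarded events.
attach(Server, Collector) when is_pid(Collector) ->
    sys:install(Server, {fun ?MODULE:my_dbg/3, Collector}).

%% Remove the debug function again. With a timeout of 0 the call may exit
%% before the reply arrives, hence the catch.
detach(Server) ->
    catch sys:remove(Server, fun ?MODULE:my_dbg/3, 0),
    ok.

%% Debug callback; it runs inside the observed process, so it only
%% forwards the event and returns the unchanged function state.
my_dbg(Collector, Event, _ProcState) ->
    Collector ! {sys_event, Event},
    Collector.
```

Because my_dbg/3 only forwards the event, any expensive processing happens in the collector process, which helps keep the callback within the 1-second limit.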
The following snippet demonstrates the problem behind the first rule:
| $ cat test.erl
-module(test).
-compile(export_all).
f() ->
io:format("Finished f").
f_fun() ->
fun() ->
io:format("Finished f_fun")
end.
|
| $ erl
% We load the test module.
1> c(test).
{ok,test}
% We create a reference to a function in the test module.
2> F = test:f_fun().
#Fun<test.0.124694843>
% We don't have old code yet (only new code), so check_process_code is
% false when called with the shell process.
3> erlang:check_process_code(self(), test).
false
% We mark the test module as old code.
4> code:delete(test).
true
% Now check_process_code says that we do have old code.
5> erlang:check_process_code(self(), test).
true
% Code purge calls check_process_code on each process to decide if it
% uses the old version of the purged module, and if so, it kills the
% process. In this case it kills the shell process.
6> code:purge(test).
*** ERROR: Shell process terminated! ***
Eshell V5.10.4 (abort with ^G)
|
If command 2 of the shell session is replaced with F = fun test:f/0, then this
problem will not occur:
| % Now F only contains the information that it should call
% test:f/0, and not a real reference that points inside the byte
% code of the test module.
2> F = fun test:f/0.
#Fun<test.f.0>
[...]
% Therefore it doesn't use the test module...
5> erlang:check_process_code(self(), test).
false
% ...and therefore it is not killed by purge.
6> code:purge(test).
false
% If we now call F(), we will simply get an undef error that can be
% caught by 'catch'. Before doing that, let's set the path to an
% empty list, otherwise Erlang would automatically load test.beam
% when we call F.
7> code:set_path([]).
true
8> F().
** exception error: undefined function test:f/0
9> catch F().
{'EXIT',{undef,[{test,f,[],[]},
{erl_eval,do_apply,5,[{file,"erl_eval.erl"},{line,560}]},
{erl_eval,expr,5,[{file,"erl_eval.erl"},{line,357}]},
{shell,exprs,7,[{file,"shell.erl"},{line,674}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,629}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,614}]}]}}
|
A typical scenario is to pass an MFA (which points to a WombatOAM plugin) to a
non-WombatOAM process that will use it as a callback. Examples include:
- Passing debug functions to gen processes using sys:install/sys:remove.
This scenario is analysed below.
- Passing callback functions to event handler processes.
Let's say the plugin uses sys:install to install a debug function into the
interesting_gen_server process. When WombatOAM wants to stop the plugin, the
plugin shall call sys:remove with a timeout of 0:
| terminate(_State) ->
...
catch sys:remove(interesting_gen_server, fun ?MODULE:my_dbg_function/3, 0),
...
|