Metrics
WombatOAM has more than 100 built-in metrics, organised into metric categories on the dashboard (for example Memory, Runtime and I/O). See the description of all built-in metrics below.
WombatOAM also collects metrics from the folsom and exometer applications if they are running on the managed node. These are shown under the "folsom" and "exometer" metric categories on the dashboard.
Viewing metrics
On the Metrics page, select a node or node family. This reveals the available metrics, grouped by type and category. Select a type (counter, gauge, meter, spiral, histogram or duration – these are shown as tabs) and a category to reveal individual metrics, and then select a metric to display it.
For individual nodes, each numeric metric is displayed as a line chart. If you select a node family, the metrics are displayed as stacked graphs showing all the nodes in the family. For node families, only one metric can be selected for stacked display, and only counters and gauges are supported (as they have only single data points).
You can superimpose metrics to view several metrics on the same graph. Each metric you select is added to the graph. To remove a metric, click it again to clear the checkbox next to it. To clear all metrics from the graph, click the "trash" icon on the right above the graph. You can also remove individual metrics on the "Configure metrics" window.
Other viewing options:
- Refresh interval: To change the frequency at which the graph refreshes, click the "settings" icon on the right above the graph to open the "Configure metrics" window, and then select an option in the "Refresh interval" list.
- Markers: Different metrics are distinguished on the graph by different colours and markers. To hide or show these markers, click the "settings" icon, and select or clear the "Enable markers" checkbox.
- Delta: To view deltas instead of actual values, select the "Delta" (Δ) icon on the right above the graph.
By default, each metric is polled every 60 seconds, i.e. the metric graphs will show a new data point once a minute. To change this setting, click the cogwheel icon.
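As a rough illustration of what the Delta view conveys, the sketch below turns a series of polled values into the change between consecutive polls. It is not WombatOAM's implementation; the function name and the sample values are hypothetical.

```python
from typing import List


def deltas(samples: List[float]) -> List[float]:
    """Return the change between each pair of consecutive polled samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


# Hypothetical cumulative counter values, one per poll.
polled_values = [1000, 1450, 1450, 2100]
print(deltas(polled_values))  # [450, 0, 650]
```

A series of N samples yields N - 1 deltas, so the first poll has no delta value.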
Built-in metrics
I/O
Input I/O bytes
The total number of bytes received through ports.
- Tags:
dev, op
Output I/O bytes
The total number of bytes output to ports.
- Tags:
dev, op
TCP: Total bytes received
The total number of bytes that have been received by TCP.
- Tags:
dev, op
TCP: Packets received
The number of TCP packets that have been received.
- Tags:
dev
TCP: Average received packet size
The average size of TCP packets that have been received.
- Tags:
dev
TCP: Maximum received packet size
The maximum size of TCP packets that have been received.
- Tags:
dev
TCP: Total bytes sent
The total number of bytes that have been sent by TCP.
- Tags:
dev, op
TCP: Packets sent
The number of TCP packets that have been sent.
- Tags:
dev, op
TCP: Average sent packet size
The average size of TCP packets that have been sent.
- Tags:
dev
TCP: Maximum sent packet size
The maximum size of TCP packets that have been sent.
- Tags:
dev
UDP: Total bytes received
The total number of bytes that have been received by UDP.
- Tags:
dev, op
UDP: Packets received
The number of UDP packets that have been received.
- Tags:
dev
UDP: Average received packet size
The average size of UDP packets that have been received.
- Tags:
dev
UDP: Maximum received packet size
The maximum size of UDP packets that have been received.
- Tags:
dev
UDP: Total bytes sent
The total number of bytes that have been sent by UDP.
- Tags:
dev, op
UDP: Packets sent
The number of UDP packets that have been sent.
- Tags:
dev, op
UDP: Average sent packet size
The average size of UDP packets that have been sent.
- Tags:
dev
UDP: Maximum sent packet size
The maximum size of UDP packets that have been sent.
- Tags:
dev
Disk usage on x
The result of the latest disk check for each partition. Reports the disk usage (i.e. the percentage of disk space occupied) on a mounted partition.
- Tags:
op
Inode usage on x
The result of the latest disk check for each local mount point. Reports the inode usage (i.e. the percentage of inodes used) on a mounted partition.
- Tags:
op
Memory
Total memory
The total amount of memory currently allocated, which is the same as the sum of the memory size for processes and system.
- Tags:
dev, op
Process memory
The total amount of memory currently allocated by the Erlang processes.
- Tags:
dev, op
Process memory used
The total amount of memory currently used by the Erlang processes. This memory is part of the memory presented as process memory.
- Tags:
dev
System memory
The total amount of memory currently allocated by the emulator that is not directly related to any Erlang process. Memory presented as processes is not included in this memory.
- Tags:
dev, op
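The relationships between these memory metrics can be sketched as simple arithmetic. The snippet below is only an illustration with hypothetical byte values and helper names: total memory is the sum of process and system memory, and the gap between allocated and used process memory is what has been allocated but is not currently in use.

```python
def total_memory(process_bytes: int, system_bytes: int) -> int:
    """Total memory is the sum of process memory and system memory."""
    return process_bytes + system_bytes


def unused_process_memory(process_bytes: int, process_used_bytes: int) -> int:
    """Process memory that is allocated but not currently used."""
    return process_bytes - process_used_bytes


# Hypothetical values: 50 MiB process memory, 30 MiB system memory,
# 40 MiB of the process memory actually in use.
print(total_memory(52_428_800, 31_457_280))           # 83886080
print(unused_process_memory(52_428_800, 41_943_040))  # 10485760
```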
Atom memory
The total amount of memory currently allocated for atoms. This memory is part of the memory presented as system memory.
- Tags:
dev
Atom memory used
The total amount of memory currently used for atoms. This memory is part of the memory presented as atom memory.
- Tags:
dev
Binary memory
The total amount of memory currently allocated for binaries. This memory is part of the memory presented as system memory.
- Tags:
dev
Code memory
The total amount of memory currently allocated for Erlang code. This memory is part of the memory presented as system memory.
- Tags:
dev
ETS memory
The total amount of memory currently allocated for ETS tables. This memory is part of the memory presented as system memory.
- Tags:
dev
System total memory
The amount of memory available to the whole operating system.
- Tags:
dev, op
Total memory available
The total amount of memory available to the Erlang emulator, allocated and free. May or may not be equal to the amount of memory configured in the system.
- Tags:
dev, op
Buffered memory
The amount of memory the system uses for temporarily storing raw disk blocks.
- Tags:
dev
Cached memory
The amount of memory the system uses for cached files read from disk.
- Tags:
dev
Free memory
The amount of free memory available to the Erlang emulator for allocation.
- Tags:
dev, op
Free swap memory
The amount of memory the system has available for disk swap.
- Tags:
dev
Swap memory used
The amount of memory the system is using for disk swap.
- Tags:
dev, op
Atoms
The total number of atoms in the system.
- Tags:
dev
DETS tables
The number of open DETS tables on the selected node.
- Tags:
dev, op
ETS tables
The number of ETS tables at the selected node.
- Tags:
dev, op
Low memory
The total amount of memory allocated in low memory areas that are restricted to less than 4 GB even though the system may have more physical memory. The metric is available only on the 64-bit halfword emulator.
- Tags:
dev
Maximum memory
The maximum total amount of memory allocated since the emulator was started. The metric is available only when the emulator is run with instrumentation.
- Tags:
dev
Allocated atom_table area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated bif_timer area
Memory allocated for timers in bytes.
- Tags:
dev
Allocated bits_bufs_size area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated dist_table area
Memory allocated for the distribution table in bytes.
- Tags:
dev
Allocated ets_misc area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated export_list area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated export_table area
Memory allocated for the export table in bytes.
- Tags:
dev
Allocated fun_table area
Memory allocated for the function table in bytes.
- Tags:
dev
Allocated link_lh area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated loaded_code area
Memory allocated for all the loaded code in bytes.
- Tags:
dev
Allocated module_refs area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated module_table area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated node_table area
Memory allocated for the table of nodes in bytes.
- Tags:
dev
Allocated process_table area
Memory allocated for the process table in bytes.
- Tags:
dev
Allocated register_table area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated static area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Allocated sys_misc area
The amount of allocated memory for this area in bytes.
- Tags:
dev
Mnesia System metrics
These metrics are collected by the mnesia plugin. For more information, see the "mnesia plugin" section.
checkpoints
The checkpoints currently active on the node.
db_nodes
The nodes which make up the persistent database.
dc_dump_limit
Controls how often disc_copies tables are dumped from memory. Lower values reduce CPU overhead but increase disk space usage and startup times.
dump_log_time_threshold
The time threshold for transaction log dumps in milliseconds.
dump_log_write_threshold
The write threshold for transaction log dumps as the number of writes to the transaction log.
extra_db_nodes
Extra db_nodes to be contacted at start-up.
held_locks
Locks held by the local Mnesia lock manager.
local_tables
Tables which are configured to reside locally.
lock_queue
Transactions that are queued for execution by the local lock manager.
master_node_tables
Tables with at least one master node.
no_table_loaders
The number of parallel table loaders during start. More loaders can be good if the network latency is high or if many tables contain few records.
running_db_nodes
Nodes where Mnesia is currently running. For more information, see mnesia:system_info/1.
subscribers
Local processes currently subscribing to system events.
tables
Locally known tables.
transaction_commits
A number that indicates how many transactions have terminated successfully since Mnesia was started.
transaction_failures
A number that indicates how many transactions have failed since Mnesia was started.
transaction_log_writes
A number that indicates the number of write operations that have been performed to the transaction log since start-up.
transaction_restarts
A number which indicates how many transactions have been restarted since Mnesia was started.
transactions
All currently active local transactions.
Nodes, modules and applications
Known nodes
The number of nodes that are known to the selected node; this includes not only visible nodes, but also hidden nodes and previously known nodes, etc.
- Tags:
dev, op
Connected nodes
The number of nodes that are connected to the selected node.
- Tags:
dev, op
Visible nodes
The number of nodes that are connected to the selected node through normal connections.
- Tags:
dev
Hidden nodes
The number of nodes that are connected to the selected node through hidden connections.
- Tags:
dev
Traced nodes
The number of nodes traced from the current node by the Erlang dbg facility.
- Tags:
dev
Loaded modules
The number of loaded Erlang modules (current and/or old code), including preloaded modules.
- Tags:
dev
Old modules
The number of modules that have old code. For more details, see "Current and Old Code" on the following page.
- Tags:
dev
Module name clashes
The number of module name clashes. The function searches the entire code space for modules with identical names.
- Tags:
dev, op
Loaded applications
The number of applications that have been loaded into the application controller. This includes any included applications.
- Tags:
dev
Started applications
The number of processes started by the application controller process, which starts all other applications.
- Tags:
dev
Running applications
The number of applications that are currently running.
- Tags:
dev
Ports and sockets
Open ports
The number of ports currently existing on the local node.
- Tags:
dev, op
Ports with driver level locking
Number of ports with driver level locking. Driver level locking implies that all
instances (ports) of the same port driver will use a global lock and only one
emulator thread will execute code in the driver at a time. (As opposed to port
level locking where each instance of the same port driver will use a
per-instance lock and multiple emulator threads may execute code in the driver
at the same time.) It might indicate a bottleneck if such a driver has many
instances. (See erlang:port_info/2 and the ERL_DRV_FLAG_USE_PORT_LOCKING driver flag.)
This metric is always zero on VMs with SMP support disabled.
- Tags:
dev
Alive ports total input in bytes
The total amount of data, in bytes, queued by all ports using the ERTS driver queue implementation.
- Tags:
dev
Alive ports total output in bytes
The total number of bytes written to all ports from Erlang processes using either port_command/2, port_command/3, or Port ! {Owner, {command, Data}}.
- Tags:
dev
Open TCP sockets
The number of TCP sockets that are connected.
- Tags:
dev, op
Open UDP sockets
The number of UDP sockets that are connected.
- Tags:
dev, op
Open SCTP sockets
The number of SCTP sockets that are connected.
- Tags:
dev, op
Open x ports
The number of open ports belonging to a specific type. This type is obtained from erlang:port_info/1 using the name key of the proplist.
- Tags:
dev, op
By default, only ports with type TCP/UDP/SCTP are displayed. The port_type_counters_mode option can be used to configure WombatOAM to show all ports, including ports opened by a running application. See the "builtin_*_metrics plugins" for more information.
Process notifications
Long GC
The number of "Long GC" triggers from the system monitor in the last period (minute or second). A "Long GC" trigger means that a garbage collection in the system took longer than expected.
- Tags:
dev
Long schedule
The number of "Long schedule" triggers from the system monitor in the last period (minute or second). A "Long schedule" trigger means that a process or port in the system has been running uninterrupted for a longer time than expected.
- Tags:
dev
Large heap
The number of "Large heap" triggers from the system monitor in the last period (minute or second). A "Large heap" trigger means that a garbage collection in the system resulted in the size of a heap being unusually large.
- Tags:
dev
Busy port
The number of "Busy port" triggers from the system monitor in the last period (minute or second). A "Busy port" trigger means that a process in the system was suspended because it was sending to a busy port.
- Tags:
dev, op
Busy dist port
The number of "Busy dist port" triggers from the system monitor in the last period (minute or second). A "Busy dist port" trigger means that a process in the system was suspended because it was sending to a process on a remote node whose inter-node communication was handled by a busy port.
- Tags:
dev, op
Processes
Processes
The number of processes currently existing at the selected node.
- Tags:
dev
Process limit
The maximum number of processes that can exist simultaneously on the selected node.
- Tags:
dev, op
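Comparing the Processes metric with the Process limit gives a sense of how close a node is to exhausting its process table. The following is a minimal sketch of that comparison; the function name and the sample values are hypothetical and are not part of WombatOAM.

```python
def process_table_usage(process_count: int, process_limit: int) -> float:
    """Return the share of the process limit in use, as a percentage."""
    if process_limit <= 0:
        raise ValueError("process_limit must be positive")
    return 100.0 * process_count / process_limit


# Hypothetical readings: 65536 processes against a limit of 262144.
print(process_table_usage(65_536, 262_144))  # 25.0
```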
Registered processes
The number of process names that have been registered.
- Tags:
dev
OS threads
Returns the result of calling the function cpu_sup:nprocs/0. Note that this function returns the number of LWPs (aka threads) that are alive in the system. That is something similar to what you get when you run ps -eLF. It is a rudimentary way of measuring the system load that may be of interest in some cases.
- Tags:
dev, op
Processes traced
The total number of processes traced by Erlang's tracing mechanism.
- Tags:
dev
Sum process dictionary size
The total size of the process dictionaries of all processes.
- Tags:
dev
Sum message queue length
The total number of messages currently in the message queues of all processes.
- Tags:
dev, op
Error logger message queue length
The number of messages currently in the message queue of the Erlang error logger.
- Tags:
dev, op
Memory size of all processes
The total size of all processes. This includes call stack, heap and internal structures.
- Tags:
dev
Number of process groups
The total number of known process groups.
- Tags:
dev
Shell history length
Sum of history length (the number of commands evaluated by a shell process) of all shell processes.
- Tags:
dev, op
Shell process size
Sum of the size of all shell processes in bytes. This includes call stack, heap, and internal structures.
- Tags:
dev
Processes in running state
The number of processes where the status of the process is running.
- Tags:
dev
Processes in runnable state
The number of processes where the status of the process is runnable (ready to run, but another process is running).
- Tags:
dev
Processes in exiting state
The number of processes where the status of the process is exiting.
- Tags:
dev
Processes in GC state
The number of processes where the status of the process is garbage_collecting.
- Tags:
dev
Processes in waiting state
The number of processes where the status of the process is waiting (for a message).
- Tags:
dev
Processes in suspended state
The number of processes where the status of the process is suspended (suspended on a "busy" port or by the erlang:suspend_process/[1,2] built-in function).
- Tags:
dev
Processes with max priority
The number of processes where the current priority level for the process is max.
- Tags:
dev
Processes with high priority
The number of processes where the current priority level for the process is high.
- Tags:
dev
Processes with normal priority
The number of processes where the current priority level for the process is normal.
- Tags:
dev
Processes with low priority
The number of processes where the current priority level for the process is low.
- Tags:
dev
Processes in app x
Every process is a member of some process group and all groups have a group leader process. Every application has a group leader. For each application, this metric shows the number of processes associated with the application group leader.
- Tags:
dev
Orphan processes
The number of processes that do not belong to any group leader.
- Tags:
dev
Unknown processes
The number of processes that belong to a non-application group leader process.
- Tags:
dev
Runtime
Context switch count
The total number of context switches since the system started.
- Tags:
dev
Scheduler run queue size
The total length of the run queues, that is, the number of processes that are ready to run on all available run queues.
- Tags:
dev, op
Reductions total
The total number of reductions performed by processes. This is an approximate measure of how much CPU time they have used.
- Tags:
dev
Reduction count for alive processes
The total number of reductions executed by all processes that are still running.
- Tags:
dev
Garbage collections
The total number of garbage collections since the system started.
- Tags:
dev
Minor garbage collections
The total number of minor garbage collections that have occurred so far across all processes in the system.
- Tags:
dev
Bytes reclaimed by GC
The total number of bytes reclaimed through garbage collection.
- Tags:
dev
Average fullsweep after
The average value of the fullsweep_after parameter for all processes. Relates to garbage collection.
- Tags:
dev
Average min binary vheap size
The average of minimum binary virtual heap sizes (in words) for all processes.
- Tags:
dev
Average min heap size
The average of minimum heap sizes (in words) for all processes.
- Tags:
dev
CPU utilization total
The sum of the CPU utilization percentages of all cores. With 4 cores, for example, the maximum is 400.
- Tags:
dev, op
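Because this metric is a sum rather than an average, its upper bound scales with the number of cores. A small sketch of the arithmetic, using hypothetical per-core readings and helper names:

```python
from typing import List


def total_cpu_utilization(per_core_percent: List[float]) -> float:
    """Sum the per-core utilization percentages."""
    return sum(per_core_percent)


def max_total_utilization(core_count: int) -> int:
    """Upper bound of the summed metric: 100 per core."""
    return 100 * core_count


# Hypothetical per-core readings on a 4-core machine.
readings = [50.0, 25.0, 100.0, 25.0]
print(total_cpu_utilization(readings))       # 200.0
print(max_total_utilization(len(readings)))  # 400
```

Dividing the total by the core count gives the average per-core utilization, which is 50.0 for the readings above.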
CPU load for 1 avg
The average system load in the last minute, as described in the cpu_sup documentation. 0 represents no load, and 256 represents the load reported as 1.00 by rup.
- Tags:
dev, op
CPU load for 5 avg
The average system load in the last five minutes, as described in the cpu_sup documentation. 0 represents no load, and 256 represents the load reported as 1.00 by rup.
- Tags:
dev, op
CPU load for 15 avg
The average system load in the last 15 minutes, as described in the cpu_sup documentation. 0 represents no load, and 256 represents the load reported as 1.00 by rup.
- Tags:
dev, op
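Since 256 on this scale corresponds to a conventional load average of 1.00, the three CPU load metrics can be converted by dividing by 256. A minimal sketch of that conversion (the function name is hypothetical):

```python
def to_load_average(cpu_sup_value: int) -> float:
    """Convert a cpu_sup-style load value (256 == 1.00) to a load average."""
    return cpu_sup_value / 256


print(to_load_average(0))    # 0.0
print(to_load_average(256))  # 1.0
print(to_load_average(640))  # 2.5
```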
CPU utilization - kernel on core x
CPU utilization (the percentage share of the CPU cycles spent in this processor state) for executing code in kernel mode. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Solaris and Linux only.
- Tags:
dev, op
CPU utilization - user on core x
CPU utilization (the percentage share of the CPU cycles spent in this processor state) for executing code in user mode. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Solaris and Linux only.
- Tags:
dev, op
CPU utilization - nice user on core x
CPU utilization (the percentage share of the CPU cycles spent in this processor state) for executing code in low priority (nice) user mode. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Linux only.
- Tags:
dev, op
CPU utilization - idle on core x
The percentage share of the CPU cycles spent in the idle state. Each CPU is specified separately, if this information can be retrieved from the operating system. Available for Solaris and Linux only.
- Tags:
dev, op
Time
Active timers
The number of all timers (one-shot and interval timers) in the table holding timing requests and timer objects; this is an ETS table maintained by Erlang's timer server.
- Tags:
dev
Interval timers
The number of interval timers in the timer interval table, an ETS table maintained by Erlang's timer server.
- Tags:
dev
CPU utilization total
The sum of the runtime for all threads in the Erlang run-time system. This may be greater than the wallclock time. The original source of the calculated value is the times kernel call.
- Tags:
dev
Wallclock total
The time that has elapsed since the program started, measured in real time (as if checking with a clock on the wall).
- Tags:
dev
Scheduler x active wall time
Active time of scheduler x in terms of wall-clock time.
- Tags:
dev
Scheduler x total wall time
The total wall-clock time elapsed since wall-time measurement was activated for scheduler x.
- Tags:
dev
Scheduler x utilization
The average utilization of the scheduler (i.e. the ratio of the elapsed active wall time to the elapsed total wall time) over the last one-minute interval (one second for live metrics).
- Tags:
dev
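The utilization ratio described above can be sketched from two consecutive samples of a scheduler's active and total wall time. This is an illustration under assumed inputs, not WombatOAM's code; the function name, tuple layout and sample values are hypothetical.

```python
from typing import Tuple

# A sample is (active_wall_time, total_wall_time) for one scheduler.
Sample = Tuple[int, int]


def scheduler_utilization(previous: Sample, current: Sample) -> float:
    """Ratio of elapsed active wall time to elapsed total wall time."""
    active_delta = current[0] - previous[0]
    total_delta = current[1] - previous[1]
    if total_delta <= 0:
        raise ValueError("total wall time must advance between samples")
    return active_delta / total_delta


# Hypothetical samples taken at the start and end of one interval.
print(scheduler_utilization((1000, 4000), (1600, 5000)))  # 0.6
```

Using the difference between samples, rather than the raw cumulative values, is what makes the result reflect only the most recent interval.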