rabbitmq_netsplit_recovery
RabbitMQ Recovery from netsplit:
About:
This service recovers a RabbitMQ cluster after a network partition (netsplit) that has left the Mnesia database inconsistent across the nodes.
Requirements:
To perform the recovery, the following is required:
- All the nodes of the cluster are added to Wombat.
- All the nodes are running.
- All the nodes have the Rabbit application running.
- The network connection is stable.
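The requirements above can be sketched as a simple pre-check. This is only an illustration: `NodeStatus`, its fields, and `unmet_requirements` are hypothetical names, not part of Wombat or RabbitMQ, and how the status values are actually collected is not covered here.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    # Hypothetical record of what is known about one cluster node.
    name: str
    in_wombat: bool           # node has been added to Wombat
    node_running: bool        # the Erlang node is up
    rabbit_app_running: bool  # the Rabbit application is running on it


def unmet_requirements(nodes, network_stable):
    """Return a list of human-readable problems; an empty list means recovery may start."""
    problems = []
    for node in nodes:
        if not node.in_wombat:
            problems.append(f"{node.name}: not added to Wombat")
        if not node.node_running:
            problems.append(f"{node.name}: node is not running")
        elif not node.rabbit_app_running:
            # Only meaningful when the node itself is up.
            problems.append(f"{node.name}: Rabbit application is not running")
    if not network_stable:
        problems.append("network connection is not stable")
    return problems
```

An empty result means all four requirements hold; otherwise each entry names the node and the requirement it fails.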
Structure:
Every RabbitMQ node added to Wombat has the rabbitmq_fix_netsplit plugin enabled. The plugin handles the commands sent by the wo_services_rabbitmq_fix_netsplit service, which manages the recovery process.
The recovery process can be started from any node in the cluster. Wombat then gathers information about the cluster nodes and reports the partitioning situation to the user:
- Number of partitions
- Nodes in every partition
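As a rough illustration of how such a report could be derived, the sketch below groups nodes into partitions from each node's own view of its reachable peers. It is not Wombat's implementation: `summarize_partitions` and the shape of `views` are assumptions, and the sketch also assumes that the views are symmetric (if A sees B, then B sees A).

```python
def summarize_partitions(views):
    """Group nodes into partitions.

    views: dict mapping a node name to the peers that node can currently reach
           (hypothetical input; the node itself need not be listed).
    Returns the number of partitions and the sorted node list of each one.
    """
    # Every node belongs to the partition formed by itself plus its reachable peers.
    # With symmetric views, all nodes of one partition produce the same frozenset.
    partitions = {frozenset(peers) | {node} for node, peers in views.items()}
    ordered = sorted(sorted(p) for p in partitions)
    return {"count": len(ordered), "partitions": ordered}


if __name__ == "__main__":
    example = {
        "rabbit@a": ["rabbit@b"],
        "rabbit@b": ["rabbit@a"],
        "rabbit@c": [],
    }
    print(summarize_partitions(example))
```

A count of 1 corresponds to the case where no partitioning problem exists.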
The user can choose between two ways of recovery:
- Automatic recovery (the Default button).
- Manual recovery: the user chooses which partition is the winner, and all the nodes in the other partitions are restarted.
One button is shown for each existing partition, so that the user can select which partition is the winner.
Important:
Choosing a partition makes it the winner; all the nodes in the other partitions (the losers) are restarted.
The Default button performs the automatic recovery based on the Auto-heal algorithm, which selects the losers based on the number of nodes in every partition and the number of local connections on every node.
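A minimal sketch of this kind of selection follows. The ranking order is an assumption modelled on RabbitMQ's documented autoheal mode, which prefers the partition with the most client connections and breaks a draw by the number of nodes; whether the service applies exactly this order is not stated here. `pick_winner` and its arguments are hypothetical names.

```python
def pick_winner(partitions, connections):
    """Choose the winning partition and list the nodes to restart.

    partitions:  list of partitions, each a list of node names
    connections: dict mapping a node name to its number of local connections
    Returns (winner_nodes, loser_nodes), both sorted.
    """
    def score(partition):
        total_connections = sum(connections.get(node, 0) for node in partition)
        # Assumed order: connections first, node count as the tie-breaker.
        return (total_connections, len(partition))

    # On a full draw, max() keeps the first candidate; the real choice may differ.
    winner = max(partitions, key=score)
    losers = sorted(node for p in partitions if p is not winner for node in p)
    return sorted(winner), losers


if __name__ == "__main__":
    parts = [["rabbit@a", "rabbit@b"], ["rabbit@c"]]
    conns = {"rabbit@a": 1, "rabbit@b": 2, "rabbit@c": 10}
    print(pick_winner(parts, conns))
```

In the manual recovery, the same loser list applies, except that the winner comes from the user's choice instead of the score.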
If this request is started when no partitioning problem exists, the user is informed about it.