Details
- Bug
- Status: Resolved
- Medium
- Resolution: Done
- None
- None
Description
In a 3-node cluster or a geo-cluster, consider the following scenario:
1. Set up a geo-cluster with nodes odl1 to odl6.
2. Set odl1-odl3 as voting and odl4-odl6 as non-voting.
3. Isolate odl6 using iptables.
4. Wait for 5 minutes.
5. Remove the isolation using iptables -F.
6. odl1, odl2, odl3 and odl6 go into a quarantine loop, as shown below:
2020-05-27T09:06:32,036 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.103:2550] is still unreachable or has not been restarted. Keeping it quarantined.
2020-05-27T09:06:32,037 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.101:2550] is still unreachable or has not been restarted. Keeping it quarantined.
2020-05-27T09:06:32,037 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.102:2550] is still unreachable or has not been restarted. Keeping it quarantined.
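To check whether a node is stuck in this loop, the quarantined addresses can be pulled out of the log. The following is a minimal sketch, not part of the fix: the function name `quarantined_addresses` and the `SAMPLE_LOG` constant are hypothetical, and the pattern assumes the message wording stays exactly as in the log above.

```python
import re

# Matches the Akka remoting message shown in the log above.
# Assumption: the message wording is unchanged across the log file.
QUARANTINE_RE = re.compile(
    r"Quarantined address \[(?P<address>akka\.tcp://[^\]]+)\] "
    r"is still unreachable or has not been restarted"
)


def quarantined_addresses(log_text):
    """Return the distinct quarantined addresses, in order of first appearance."""
    seen = []
    for match in QUARANTINE_RE.finditer(log_text):
        address = match.group("address")
        if address not in seen:
            seen.append(address)
    return seen


# Message parts copied from the log above (log prefixes omitted for brevity).
SAMPLE_LOG = "\n".join(
    "Quarantined address [akka.tcp://opendaylight-cluster-data@%s:2550] "
    "is still unreachable or has not been restarted. Keeping it quarantined." % ip
    for ip in ("10.18.165.103", "10.18.165.101", "10.18.165.102")
)

if __name__ == "__main__":
    for address in quarantined_addresses(SAMPLE_LOG):
        print(address)
```

If the same addresses keep reappearing in successive log windows after connectivity is restored, the node is likely in the loop described here.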
This is an upstream Akka issue: https://github.com/akka/akka/issues/24764.
However, the upstream fix is available only in Artery-based Akka remoting.
Hence, until we move to Artery, we need to add this workaround to make this scenario work.
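The isolation steps in the scenario above could be scripted roughly as follows. This is only a sketch: the issue does not state which iptables rules were used, so the per-peer DROP rules are an assumption, as are the helper names `isolation_commands` and `unisolation_commands`. The peer list is taken from the addresses in the log and is assumed to be incomplete.

```python
def isolation_commands(peer_ips):
    """Build iptables commands that drop all traffic to and from each peer.

    Assumption: the issue only says "isolate odl6 using iptables";
    these DROP rules are one plausible way to do that.
    """
    commands = []
    for ip in peer_ips:
        commands.append(["iptables", "-A", "INPUT", "-s", ip, "-j", "DROP"])
        commands.append(["iptables", "-A", "OUTPUT", "-d", ip, "-j", "DROP"])
    return commands


def unisolation_commands():
    """Step 5 of the scenario: flush the rules with iptables -F."""
    return [["iptables", "-F"]]


if __name__ == "__main__":
    # Peer addresses as they appear in the log above; the remaining
    # cluster members would need to be added to this list.
    peers = ["10.18.165.101", "10.18.165.102", "10.18.165.103"]
    for cmd in isolation_commands(peers) + unisolation_commands():
        # Only print the commands; running them requires root on odl6.
        print(" ".join(cmd))
```

Note that `iptables -F` flushes every rule in the filter table, not only the ones added for the test, so it is best used on a node with no other rules that need to survive.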
Issue Links
- relates to CONTROLLER-1901 cluster node quarantined, but the node did not auto restart when restore the network connection (Resolved)