- Bug
- Resolution: Done
- Medium
- None
- None
In a 3-node cluster or a geo-cluster, consider the following scenario:
1. Set up a geo-cluster with nodes odl1 to odl6.
2. Set odl1-odl3 as voting and odl4-odl6 as non-voting.
3. Isolate odl6 using iptables.
4. Wait for 5 minutes.
5. Remove the isolation using iptables -F.
6. odl1/2/3 and odl6 go into a quarantine loop, as shown in the log below:
2020-05-27T09:06:32,036 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.103:2550] is still unreachable or has not been restarted. Keeping it quarantined.
2020-05-27T09:06:32,037 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.101:2550] is still unreachable or has not been restarted. Keeping it quarantined.
2020-05-27T09:06:32,037 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.102:2550] is still unreachable or has not been restarted. Keeping it quarantined.
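To check whether a node is stuck in this loop, it can help to count how often each peer is reported as still quarantined. The following is only a rough sketch for triage, not part of the fix: the function name `quarantined_addresses` is hypothetical, and the pattern assumes the message text matches the excerpt above exactly.

```python
import re
from collections import Counter

# Assumption: the Remoting message has the same wording as in the excerpt above.
QUARANTINE_RE = re.compile(
    r"Quarantined address \[(?P<address>akka\.tcp://[^\]]+)\] is still unreachable"
)


def quarantined_addresses(log_text: str) -> Counter:
    """Count 'still quarantined' reports per remote address (hypothetical helper)."""
    return Counter(
        match.group("address") for match in QUARANTINE_RE.finditer(log_text)
    )
```

Feeding the log text of one node to `quarantined_addresses` and seeing the counts for the same addresses keep growing after the isolation has been removed would be consistent with the loop described here.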
This is actually an upstream Akka issue: https://github.com/akka/akka/issues/24764.
However, the fix for it is only available in Artery-based Akka remoting.
Hence, until we move to Artery, we need to add this workaround to make this scenario work.
- relates to
- CONTROLLER-1901 cluster node quarantined, but the node did not auto restart when restore the network connection
- Resolved