[CONTROLLER-1941] Controller does not quarantine on isolation/unisolation in cluster Created: 01/Jun/20  Updated: 02/Jun/21  Resolved: 19/Jun/20

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: Magnesium SR2, Sodium SR4, 2.0.3

Type: Bug Priority: Medium
Reporter: Tejas Nevrekar Assignee: Tejas Nevrekar
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to CONTROLLER-1901 cluster node quarantined, but the nod... Resolved

 Description   

In a 3-node cluster or a geo-cluster, consider the following scenario:

1. Set up a geo-cluster with six nodes, odl1 to odl6.
2. Set odl1-odl3 as voting and odl4-odl6 as non-voting.
3. Isolate odl6 using iptables.
4. Wait for 5 minutes.
5. Unisolate odl6 using iptables -F.
6. odl1, odl2, odl3 and odl6 go into a quarantine loop, as shown in the log below:

2020-05-27T09:06:32,036 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting                         | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.103:2550] is still unreachable or has not been restarted. Keeping it quarantined.
2020-05-27T09:06:32,037 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting                         | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.101:2550] is still unreachable or has not been restarted. Keeping it quarantined.
2020-05-27T09:06:32,037 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-40 | Remoting                         | 48 - com.typesafe.akka.slf4j - 2.5.22 | Quarantined address [akka.tcp://opendaylight-cluster-data@10.18.165.102:2550] is still unreachable or has not been restarted. Keeping it quarantined.
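
To check whether a node is stuck in this loop, the messages above can be counted per quarantined address. The following is only a minimal sketch: the function names and the threshold of 3 are illustrative assumptions, not part of the controller or of Akka, and the pattern is taken directly from the log excerpt above.

```python
import re
from collections import Counter

# Pattern copied from the Akka remoting message in the log excerpt above.
QUARANTINE_RE = re.compile(
    r"Quarantined address \[(?P<address>[^\]]+)\] is still unreachable "
    r"or has not been restarted\. Keeping it quarantined\."
)


def count_quarantine_messages(lines):
    """Map each quarantined address to the number of 'Keeping it quarantined' messages."""
    counts = Counter()
    for line in lines:
        match = QUARANTINE_RE.search(line)
        if match:
            counts[match.group("address")] += 1
    return counts


def looping_addresses(lines, threshold=3):
    """Return addresses reported at least `threshold` times.

    The threshold is an arbitrary assumption; tune it to the log window inspected.
    """
    counts = count_quarantine_messages(lines)
    return sorted(addr for addr, n in counts.items() if n >= threshold)
```

Passing the lines of a node's log file to looping_addresses() lists the peers that the node keeps quarantined; an address that keeps appearing after the iptables rules are flushed indicates the loop described in step 6.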

This is an upstream Akka issue: https://github.com/akka/akka/issues/24764.
However, the fix for it is only available in Artery-based Akka.
Hence, until we move to Artery, we need to add this workaround to make this scenario work.


Generated at Wed Feb 07 19:56:49 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.