controller / CONTROLLER-1901

Cluster node quarantined, but the node did not restart automatically after the network connection was restored


    • Type: Bug
    • Resolution: Done
    • Priority: Medium
    • Sodium SR4
    • Oxygen SR4
    • clustering
    • Normal

      In a three-node cluster, one cluster member was isolated by manually cutting it off from the network, and the network was then restored. After the connection came back, the isolated member stayed quarantined and was not restarted.

      Environment:

      3 cluster nodes

      member-01:172.20.14.162
      member-02:172.20.14.163
      member-03:172.20.14.164

       

      ODL version: Oxygen SR4 (0.8.4)

       

      Steps to reproduce:

      1. Configure the cluster.
      2. Start the cluster nodes.
      3. Install the feature odl-mdsal-all.
      4. On node01, add reject routes to the other two members:
        route add -host 172.20.14.163 reject
        route add -host 172.20.14.164 reject
      5. A few minutes later, delete the reject routes:
        route del -host 172.20.14.163 reject
        route del -host 172.20.14.164 reject
      6. The log keeps printing "is still unreachable or has not been restarted. Keeping it quarantined."
      7. The node does not restart.
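The isolation and recovery in steps 4 to 6 can be sketched as a small shell helper. This is only an illustration, not part of the original report: the function names (run, isolate, restore, count_quarantine_lines) and the DRY_RUN switch are hypothetical, and the log file passed to count_quarantine_lines is whatever file the node writes its log to. By default the helper only prints the route commands instead of executing them.

```shell
#!/bin/sh
# Peers that node01 is cut off from (addresses taken from the report).
PEERS="172.20.14.163 172.20.14.164"

# Hypothetical switch: DRY_RUN=1 (default) prints each command,
# DRY_RUN=0 executes it (requires root for route changes).
DRY_RUN="${DRY_RUN:-1}"

# Print or execute a command, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 4: add a reject route for every peer.
isolate() {
    for peer in $PEERS; do
        run route add -host "$peer" reject
    done
}

# Step 5: delete the reject routes again.
restore() {
    for peer in $PEERS; do
        run route del -host "$peer" reject
    done
}

# Step 6: count log lines containing the quarantine message.
# $1 is the path to the log file (an assumption, supplied by the caller).
count_quarantine_lines() {
    grep -c "Keeping it quarantined" "$1"
}
```

Running isolate, waiting a few minutes, and then running restore corresponds to steps 4 and 5. If count_quarantine_lines keeps growing on the other members after restore, the behaviour described in step 6 is being reproduced.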

      The cluster configuration uses the default settings. For example, on node01:

       

      odl-cluster-data {
        akka {
          remote {
            artery {
              enabled = off
              canonical.hostname = "172.20.14.162"
              canonical.port = 2550
            }
      
            netty.tcp {
              hostname = "172.20.14.162"
              port = 2550
            }
      
            # when under load we might trip a false positive on the failure detector
            transport-failure-detector {
              # heartbeat-interval = 4 s
              # acceptable-heartbeat-pause = 16s #
            }
          }
      
          cluster {
            # Remove ".tcp" when using artery.
            seed-nodes = ["akka.tcp://opendaylight-cluster-data@172.20.14.162:2550", "akka.tcp://opendaylight-cluster-data@172.20.14.163:2550", "akka.tcp://opendaylight-cluster-data@172.20.14.164:2550"]
            roles = ["member-1"]
          }
      
          persistence {
            # By default the snapshots/journal directories live in KARAF_HOME. You can choose to put it somewhere else by
            # modifying the following two properties. The directory location specified may be a relative or absolute path.
            # The relative path is always relative to KARAF_HOME.
            snapshot-store.local.dir = "target/snapshots"
            journal.leveldb.dir = "target/journal"
            journal {
              leveldb {
                # Set native = off to use a Java-only implementation of leveldb.
                # Note that the Java-only version is not currently considered by Akka to be production quality.
                # native = off
              }
            }
          }
        }
      }
      

       

       

            Bosong Bo Song