[CONTROLLER-804] Clustering : Restarting a controller node in a 3 node cluster causes it to not be able to find the leader Created: 11/Sep/14  Updated: 16/Sep/14  Resolved: 16/Sep/14

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Moiz Raja Assignee: Moiz Raja
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Mac OS
Platform: PC


External issue ID: 1803
Priority: High

 Description   

For steps to reproduce, check out the clustering disaster recovery scripts in the integration repo.

https://git.opendaylight.org/gerrit/#/c/11021/



 Comments   
Comment by Moiz Raja [ 11/Sep/14 ]

2014-09-11 02:59:09,916 | WARN | lt-dispatcher-18 | ShardManager | 152 - com.typesafe.akka.slf4j - 2.3.4 | akka://opendaylight-cluster-data/user/shardmanager-config | Supervisor Strategy of resume applied
at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2412)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2380)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257)
at com.google.common.cache.LocalCache.get(LocalCache.java:4000)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.DataNodeContainerModificationStrategy.getChild(DataNodeContainerModificationStrategy.java:81)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.DataNodeContainerModificationStrategy$ContainerModificationStrategy.getChild(DataNodeContainerModificationStrategy.java:119)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.RootModificationApplyOperation.getChild(RootModificationApplyOperation.java:66)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.TreeNodeUtils.findNodeChecked(TreeNodeUtils.java:53)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.InMemoryDataTreeModification.resolveModificationStrategy(InMemoryDataTreeModification.java:137)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.InMemoryDataTreeModification.resolveModificationFor(InMemoryDataTreeModification.java:143)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.InMemoryDataTreeModification.merge(InMemoryDataTreeModification.java:73)
at org.opendaylight.controller.md.sal.dom.store.impl.SnapshotBackedWriteTransaction.merge(SnapshotBackedWriteTransaction.java:85)
at org.opendaylight.controller.cluster.datastore.modification.MergeModification.apply(MergeModification.java:37)
at org.opendaylight.controller.cluster.datastore.modification.MutableCompositeModification.apply(MutableCompositeModification.java:33)
at org.opendaylight.controller.cluster.datastore.Shard.commit(Shard.java:330)
at org.opendaylight.controller.cluster.datastore.Shard.applyState(Shard.java:444)
at org.opendaylight.controller.cluster.raft.RaftActor.onReceiveRecover(RaftActor.java:151)
at org.opendaylight.controller.cluster.datastore.Shard.onReceiveRecover(Shard.java:175)
at akka.persistence.UntypedPersistentActor$$anonfun$receiveRecover$1.applyOrElse(Eventsourced.scala:433)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33)
at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25)
at akka.persistence.Eventsourced$$anonfun$akka$persistence$Eventsourced$$recoveryBehavior$1.applyOrElse(Eventsourced.scala:168)
at akka.persistence.Recovery$State$$anonfun$processPersistent$1.apply(Recovery.scala:33)
at akka.persistence.Recovery$State$$anonfun$processPersistent$1.apply(Recovery.scala:33)
at akka.persistence.Recovery$class.withCurrentPersistent(Recovery.scala:176)
at akka.persistence.UntypedPersistentActor.withCurrentPersistent(Eventsourced.scala:428)
at akka.persistence.Recovery$State$class.processPersistent(Recovery.scala:33)
at akka.persistence.Recovery$$anon$1.processPersistent(Recovery.scala:95)
at akka.persistence.Recovery$$anon$1.aroundReceive(Recovery.scala:101)
at akka.persistence.Recovery$class.aroundReceive(Recovery.scala:256)
at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(Eventsourced.scala:428)
at akka.persistence.Eventsourced$$anon$1.aroundReceive(Eventsourced.scala:35)
at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:369)
at akka.persistence.UntypedPersistentActor.aroundReceive(Eventsourced.scala:428)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2014-09-11 02:59:09,916 | WARN | lt-dispatcher-18 | OneForOneStrategy | 152 - com.typesafe.akka.slf4j - 2.3.4 | akka://opendaylight-cluster-data/user/shardmanager-config/member-3-shard-default-config | CacheLoader returned null for key (urn:opendaylight:params:xml:ns:yang:controller:config:sal-clustering-it:car-people?revision=2014-08-18)car-people.

Comment by Moiz Raja [ 16/Sep/14 ]

This turned out to be a configuration issue. Roughly, the problem was as follows.

Suppose you have 3 nodes and node 1 is set up as the only seed node. You bring down the seed node and then bring it back up. When this node, which had been the leader, comes back up, it tries to join the cluster, but the cluster has changed because a new leader was elected.

The fix is one of the following:

a. When the leader is restarted, set its seed node to one of the existing nodes in the cluster (one that is still running).
b. In a 3 node cluster, configure all nodes as seed nodes.
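
As a rough sketch of option (b), the seed-node list in each node's Akka configuration might look like the fragment below. The `seed-nodes` key and the `akka.tcp://<system>@<host>:<port>` address form are standard Akka 2.3 cluster settings, and the actor system name `opendaylight-cluster-data` is taken from the log above. The IP addresses, the port, and the enclosing `odl-cluster-data` block are assumptions for illustration and may differ in your installation.

```hocon
# Hypothetical akka.conf fragment; hosts, port, and the outer block name are assumed.
odl-cluster-data {
  akka {
    cluster {
      # List every member of the 3 node cluster as a seed node, so that a
      # restarted node can contact whichever members are still running.
      seed-nodes = [
        "akka.tcp://opendaylight-cluster-data@10.0.0.1:2550",
        "akka.tcp://opendaylight-cluster-data@10.0.0.2:2550",
        "akka.tcp://opendaylight-cluster-data@10.0.0.3:2550"
      ]
    }
  }
}
```

The same list would be used on all three nodes. For option (a), the restarted node's list would instead name only a node that is currently running.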
