Details
- Type: Bug
- Status: Resolved
- Priority: Highest
- Resolution: Done
Description
OwnerSupervisor, the Cluster Singleton Manager actor responsible for assigning owners to DOMEntities, is terminated in the following scenarios:
- When performing multiple NETCONF device mounts, or multiple deletes of NETCONF devices, in an ODL cluster setup.
- When an ODL instance holding the master mounts is restarted and rejoins the cluster.
Once terminated, it is never restarted, so further new mounts fail because there is no actor left to assign owners to the entities.
According to the Akka documentation for the cluster singleton manager, the singleton actor should always be running on one of the ODL instances in the cluster.
2022-03-09T12:32:30,836 | ERROR | opendaylight-cluster-data-akka.actor.default-dispatcher-36 | Behavior$ | 210 - org.opendaylight.controller.repackaged-akka - 4.0.10 | Supervisor StopSupervisor saw failure: Ask timed out on Actor[akka://opendaylight-cluster-data/system/typedDdataReplicator#971206880] after [5000 ms]. Message of type [akka.cluster.ddata.typed.javadsl.Replicator$Update]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
java.util.concurrent.TimeoutException: Ask timed out on Actor[akka://opendaylight-cluster-data/system/typedDdataReplicator#971206880] after [5000 ms]. Message of type [akka.cluster.ddata.typed.javadsl.Replicator$Update]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
    at akka.actor.typed.scaladsl.AskPattern$.$anonfun$onTimeout$1(AskPattern.scala:131) ~[bundleFile:?]
    at akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:730) ~[bundleFile:?]
    at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:479) ~[bundleFile:?]
    at scala.concurrent.ExecutionContext$parasitic$.execute(ExecutionContext.scala:222) ~[bundleFile:?]
    at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:365) ~[bundleFile:?]
    at akka.actor.LightArrayRevolverScheduler$$anon$3.executeBucket$1(LightArrayRevolverScheduler.scala:314) ~[bundleFile:?]
    at akka.actor.LightArrayRevolverScheduler$$anon$3.nextTick(LightArrayRevolverScheduler.scala:318) ~[bundleFile:?]
    at akka.actor.LightArrayRevolverScheduler$$anon$3.run(LightArrayRevolverScheduler.scala:270) ~[bundleFile:?]
    at java.lang.Thread.run(Unknown Source) ~[?:?]
2022-03-09T12:32:30,984 | INFO | opendaylight-cluster-data-akka.actor.default-dispatcher-6 | ClusterSingletonManager | 210 - org.opendaylight.controller.repackaged-akka - 4.0.10 | Singleton actor [akka://opendaylight-cluster-data/system/singletonManagerOwnerSupervisor/OwnerSupervisor] was terminated
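The log reports "Supervisor StopSupervisor saw failure" immediately before the singleton is terminated, which suggests the actor is stopped on its first unhandled failure. The following is a minimal plain-Java model of the difference between a stop and a restart supervision strategy. It is not Akka code and not the merged fix; `SupervisionSketch`, `Strategy`, `Handler`, and `deliver` are hypothetical names introduced only for illustration.

```java
import java.util.concurrent.TimeoutException;

/** Plain-Java model of stop-versus-restart supervision; not Akka code. */
final class SupervisionSketch {
    /** Hypothetical supervision choices, named after the behaviour they model. */
    enum Strategy { STOP, RESTART }

    /** Hypothetical stand-in for the singleton actor's message handler. */
    interface Handler {
        void handle(int message) throws Exception;
    }

    /**
     * Delivers messages 0..count-1 to the handler and returns how many were
     * handled successfully. Under STOP the first failure ends processing for
     * good; under RESTART the failed message is dropped and processing continues.
     */
    static int deliver(Strategy strategy, Handler handler, int count) {
        int processed = 0;
        for (int message = 0; message < count; message++) {
            try {
                handler.handle(message);
                processed++;
            } catch (Exception failure) {
                if (strategy == Strategy.STOP) {
                    break; // actor terminated; later messages are never handled
                }
                // RESTART: a fresh instance would take over, so keep going
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Message 1 simulates the ask timeout seen in the log.
        Handler flaky = message -> {
            if (message == 1) {
                throw new TimeoutException("Ask timed out");
            }
        };
        System.out.println("STOP processed:    " + deliver(Strategy.STOP, flaky, 5));
        System.out.println("RESTART processed: " + deliver(Strategy.RESTART, flaky, 5));
    }
}
```

In this model a single timeout under STOP leaves every later message unhandled, which mirrors the report that new mounts fail once OwnerSupervisor is gone. In Akka Typed, restart-on-failure is usually expressed by wrapping a behavior with `Behaviors.supervise(...).onFailure(SupervisorStrategy.restart())`; whether the linked changes do exactly that is not stated in this issue.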
Issue Links
- is cloned by: CONTROLLER-2046 CLONE - ODL Cluster - Akka Cluster Singleton Manager Actor - OwnerSupervisor Getting Terminated and never restarts (Open)
- is duplicated by: CONTROLLER-2036 Failure of initial removal of candidates from previous iteration (Resolved)
- relates to: CONTROLLER-2047 ODL Clustering issues (Open)
Change,Patch set | Subject | Branch | Project | Status | Code-Review | Verified |
---|---|---|---|---|---|---|
100347,1 | Add supervisor to EOS singleton actor | 4.0.x | controller | MERGED | +2 | +1 |
100357,3 | Add supervisor to EOS singleton actor | master | controller | MERGED | +2 | +1 |