  1. controller
  2. CONTROLLER-2035

ODL Cluster - Akka Cluster Singleton Manager Actor - OwnerSupervisor Getting Terminated and never restarts


    • Type: Bug
    • Resolution: Done
    • Priority: Highest
    • 4.0.11, 5.0.2
    • None
    • clustering
    • None

      OwnerSupervisor, the Akka cluster singleton actor responsible for assigning owners to the DOMEntities, gets terminated in the following scenarios:

       

      1. When performing multiple NETCONF device mounts or multiple deletes of NETCONF devices in an ODL cluster setup.
      2. When an ODL instance holding the master mounts is restarted and joins the cluster.

       

      Once terminated, it is never restarted, so further new mounts fail because there is no actor to assign owners to entities.

      According to the Akka documentation for the cluster singleton manager, the singleton actor should ideally always be running on one of the ODL instances in the cluster.

       

      2022-03-09T12:32:30,836 | ERROR | opendaylight-cluster-data-akka.actor.default-dispatcher-36 | Behavior$                        | 210 - org.opendaylight.controller.repackaged-akka - 4.0.10 | Supervisor StopSupervisor saw failure: Ask timed out on Actor[akka://opendaylight-cluster-data/system/typedDdataReplicator#971206880] after [5000 ms]. Message of type [akka.cluster.ddata.typed.javadsl.Replicator$Update]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
      java.util.concurrent.TimeoutException: Ask timed out on Actor[akka://opendaylight-cluster-data/system/typedDdataReplicator#971206880] after [5000 ms]. Message of type [akka.cluster.ddata.typed.javadsl.Replicator$Update]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
        at akka.actor.typed.scaladsl.AskPattern$.$anonfun$onTimeout$1(AskPattern.scala:131) ~[bundleFile:?]
        at akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:730) ~[bundleFile:?]
        at akka.actor.Scheduler$$anon$7.run(Scheduler.scala:479) ~[bundleFile:?]
        at scala.concurrent.ExecutionContext$parasitic$.execute(ExecutionContext.scala:222) ~[bundleFile:?]
        at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:365) ~[bundleFile:?]
        at akka.actor.LightArrayRevolverScheduler$$anon$3.executeBucket$1(LightArrayRevolverScheduler.scala:314) ~[bundleFile:?]
        at akka.actor.LightArrayRevolverScheduler$$anon$3.nextTick(LightArrayRevolverScheduler.scala:318) ~[bundleFile:?]
        at akka.actor.LightArrayRevolverScheduler$$anon$3.run(LightArrayRevolverScheduler.scala:270) ~[bundleFile:?]
        at java.lang.Thread.run(Unknown Source) ~[?:?]
      2022-03-09T12:32:30,984 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-6 | ClusterSingletonManager          | 210 - org.opendaylight.controller.repackaged-akka - 4.0.10 | Singleton actor [akka://opendaylight-cluster-data/system/singletonManagerOwnerSupervisor/OwnerSupervisor] was terminated
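
To check whether a given node has hit this condition, its log can be scanned for the "Singleton actor [...] was terminated" message shown above. The following is only a minimal diagnostic sketch, assuming the pipe-delimited log layout seen in this excerpt (timestamp, level, thread, logger, bundle, message). The names `LogRecord`, `parse_line` and `find_singleton_terminations` are hypothetical helpers and are not part of ODL or Akka.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple


class LogRecord(NamedTuple):
    """One log entry, split into the six pipe-delimited fields seen above."""
    timestamp: str
    level: str
    thread: str
    logger: str
    bundle: str
    message: str


# Matches the ClusterSingletonManager message from the excerpt and captures
# the actor path between the square brackets.
TERMINATED = re.compile(r"Singleton actor \[(?P<path>[^\]]+)\] was terminated")


def parse_line(line: str) -> Optional[LogRecord]:
    """Parse a single log line; return None for continuation lines
    (stack-trace frames, exception text) that lack the six fields."""
    # Split at most 5 times so any '|' inside the message is preserved.
    parts = [p.strip() for p in line.split("|", 5)]
    if len(parts) != 6:
        return None
    return LogRecord(*parts)


def find_singleton_terminations(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, actor path) for every singleton termination found."""
    hits = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        match = TERMINATED.search(record.message)
        if match:
            hits.append((record.timestamp, match.group("path")))
    return hits
```

Passing the lines of a node's log file to `find_singleton_terminations` returns one entry per termination. An entry with no later sign of the OwnerSupervisor singleton starting again would be consistent with the behaviour reported here.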

            shibu.vijayakumar Shibu Vijayakumar