  1. netconf
  2. NETCONF-994

Akka AskTimeoutException in a single-node cluster


    • Type: Bug
    • Resolution: Unresolved
    • Priority: High
    • Component: restconf-nb

      1. Used NETCONF 5.0.4
      2. JAVA_OPTS=-Xms512M -Xmx12288m -XX:+UseG1GC
      3. Tried mounting 40K devices one after another, using payloads like this:

       8<---------------

       <node xmlns="urn:TBD:params:xml:ns:yang:network-topology">
        <node-id>$NODE_NAME</node-id>
        <host xmlns="urn:opendaylight:netconf-node-topology">172.17.0.2</host>
        <port xmlns="urn:opendaylight:netconf-node-topology">$NODE_PORT</port>
        <username xmlns="urn:opendaylight:netconf-node-topology">admin</username>
        <password xmlns="urn:opendaylight:netconf-node-topology">topsecret</password>
        <tcp-only xmlns="urn:opendaylight:netconf-node-topology">false</tcp-only>
        <reconnect-on-changed-schema xmlns="urn:opendaylight:netconf-node-topology">false</reconnect-on-changed-schema>
        <connection-timeout-millis xmlns="urn:opendaylight:netconf-node-topology">20000</connection-timeout-millis>
        <default-request-timeout-millis xmlns="urn:opendaylight:netconf-node-topology">60000</default-request-timeout-millis>
        <max-connection-attempts xmlns="urn:opendaylight:netconf-node-topology">24</max-connection-attempts>
        <between-attempts-timeout-millis xmlns="urn:opendaylight:netconf-node-topology">60000</between-attempts-timeout-millis>
        <sleep-factor xmlns="urn:opendaylight:netconf-node-topology">1</sleep-factor>
        <keepalive-delay xmlns="urn:opendaylight:netconf-node-topology">300</keepalive-delay>
      </node>

      ----->8-----

      4. Sent nearly 20K PUT requests in sequence, sleeping 0.01 seconds after each PUT.

      5. ODL then stopped responding to PUT requests, although GET requests still worked.
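
The mounting loop in steps 3 to 5 can be sketched roughly as follows. This is not the script used for this report, only a minimal reconstruction under assumptions: the RESTCONF URL (`localhost:8181`, `/rests/data`, `topology-netconf`), the `admin`/`admin` credentials, the `node-<i>` naming, and the starting port `17830` are all hypothetical and need adjusting to the actual deployment. The leaf values are copied from the payload above.

```python
import base64
import time
import urllib.request
from urllib.parse import quote
from xml.sax.saxutils import escape

# ASSUMPTION: RFC 8040 RESTCONF endpoint of a local ODL instance.
BASE_URL = ("http://localhost:8181/rests/data/"
            "network-topology:network-topology/topology=topology-netconf")

# Leaf values taken from the payload in this report; only node-id and port vary.
PAYLOAD_TEMPLATE = """<node xmlns="urn:TBD:params:xml:ns:yang:network-topology">
  <node-id>{node_id}</node-id>
  <host xmlns="{ns}">172.17.0.2</host>
  <port xmlns="{ns}">{port}</port>
  <username xmlns="{ns}">admin</username>
  <password xmlns="{ns}">topsecret</password>
  <tcp-only xmlns="{ns}">false</tcp-only>
  <reconnect-on-changed-schema xmlns="{ns}">false</reconnect-on-changed-schema>
  <connection-timeout-millis xmlns="{ns}">20000</connection-timeout-millis>
  <default-request-timeout-millis xmlns="{ns}">60000</default-request-timeout-millis>
  <max-connection-attempts xmlns="{ns}">24</max-connection-attempts>
  <between-attempts-timeout-millis xmlns="{ns}">60000</between-attempts-timeout-millis>
  <sleep-factor xmlns="{ns}">1</sleep-factor>
  <keepalive-delay xmlns="{ns}">300</keepalive-delay>
</node>"""


def build_payload(node_id, port):
    """Fill $NODE_NAME and $NODE_PORT into the node payload."""
    return PAYLOAD_TEMPLATE.format(
        node_id=escape(node_id), port=int(port),
        ns="urn:opendaylight:netconf-node-topology")


def put_node(node_id, port, user="admin", password="admin", timeout=30):
    """PUT one node and return the HTTP status (credentials are assumptions)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{BASE_URL}/node={quote(node_id, safe='')}",
        data=build_payload(node_id, port).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/xml",
                 "Authorization": f"Basic {token}"})
    # The timeout makes a hung PUT (step 5) fail visibly instead of blocking.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def mount_all(count, first_port=17830, send=put_node, delay=0.01):
    """PUT `count` nodes one after another, sleeping `delay` seconds after each."""
    statuses = []
    for i in range(count):
        # ASSUMPTION: hypothetical naming and port scheme, one port per node.
        statuses.append(send(f"node-{i}", first_port + i))
        time.sleep(delay)
    return statuses
```

Calling `mount_all(40000)` would approximate the run described here. Passing a different `send` callable makes it possible to dry-run the loop without a controller.
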

      karaf.log kept repeating this warning:

      ----8<---------

      2023-04-17T17:48:18,974 | WARN  | ForkJoinPool.commonPool-worker-9 | AbstractShardBackendResolver     | 194 - org.opendaylight.controller.sal-distributed-datastore - 7.0.4 | Failed to resolve shard
      java.util.concurrent.TimeoutException: Connection attempt failed
              at org.opendaylight.controller.cluster.databroker.actors.dds.AbstractShardBackendResolver.wrap(AbstractShardBackendResolver.java:151) ~[?:?]
              at org.opendaylight.controller.cluster.databroker.actors.dds.AbstractShardBackendResolver.onConnectResponse(AbstractShardBackendResolver.java:168) ~[?:?]
              at org.opendaylight.controller.cluster.databroker.actors.dds.AbstractShardBackendResolver.lambda$connectShard$4(AbstractShardBackendResolver.java:161) ~[?:?]
              at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
              at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
              at java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:483) ~[?:?]
              at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373) ~[?:?]
              at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182) ~[?:?]
              at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655) ~[?:?]
              at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622) ~[?:?]
              at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165) ~[?:?]
      Caused by: akka.pattern.AskTimeoutException: Ask timed out on ActorSelection[Anchor(akka://opendaylight-cluster-data/), Path(/user/shardmanager-config/member-1-shard-topology-config#900040822)] after [5000 ms]. Message of type [org.opendaylight.controller.cluster.access.commands.ConnectClientRequest]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
      2023-04-17T17:48:25,013 | WARN  | ForkJoinPool.commonPool-worker-9 | AbstractShardBackendResolver     | 194 - org.opendaylight.controller.sal-distributed-datastore - 7.0.4 | Failed to resolve shard
      java.util.concurrent.TimeoutException: Connection attempt failed
              at org.opendaylight.controller.cluster.databroker.actors.dds.AbstractShardBackendResolver.wrap(AbstractShardBackendResolver.java:151) ~[?:?]
              at org.opendaylight.controller.cluster.databroker.actors.dds.AbstractShardBackendResolver.onConnectResponse(AbstractShardBackendResolver.java:168) ~[?:?]
              at org.opendaylight.controller.cluster.databroker.actors.dds.AbstractShardBackendResolver.lambda$connectShard$4(AbstractShardBackendResolver.java:161) ~[?:?]
              at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
              at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
              at java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:483) ~[?:?]
              at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373) ~[?:?]
              at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182) ~[?:?]
              at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655) ~[?:?]
              at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622) ~[?:?]
              at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165) ~[?:?]

      ----->8----------------
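
To put a number on how often the warning repeats, the log can be scanned with a small helper. The sketch below is an illustration only: it assumes that the messages keep exactly the wording shown in the excerpt above, and the function name `summarize` is made up for this example.

```python
import re
from collections import Counter

# Matches the "Caused by" line from the excerpt and captures the actor path
# and the ask timeout in milliseconds.
ASK_TIMEOUT = re.compile(
    r"AskTimeoutException: Ask timed out on .*?Path\((?P<path>[^)]*)\)\]"
    r" after \[(?P<ms>\d+) ms\]")


def summarize(lines):
    """Count 'Failed to resolve shard' warnings and group ask timeouts by target."""
    warnings = 0
    timeouts = Counter()
    for line in lines:
        if "Failed to resolve shard" in line:
            warnings += 1
        match = ASK_TIMEOUT.search(line)
        if match:
            timeouts[(match.group("path"), int(match.group("ms")))] += 1
    return warnings, timeouts
```

It can be fed an open file, for example `summarize(open("karaf.log", encoding="utf-8"))`, where the log path is again an assumption.
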

       

      Why does this Akka timeout occur in a single-node deployment? Is there any way to disable Akka for single-node deployments?

            gvrangan Venkatrangan Govindarajan