[CONTROLLER-1753] AskTimeoutException in dsbenchmark nitrogen -all- job Created: 21/Aug/17  Updated: 19/Oct/17

Status: Open
Project: controller
Component/s: clustering
Affects Version/s: Nitrogen
Fix Version/s: None

Type: Bug
Reporter: Vratko Polak Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 9021

 Description   

This occasionally affects CSIT results: [0].

Here is a copy of the restconf error message:
{"errors":{"error":[

{"error-type":"application","error-tag":"operation-failed","error-message":"The operation encountered an unexpected error while executing.","error-info":"akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://opendaylight-cluster-data@10.29.15.213:2550/user/rpc/broker#1754800661]] after [15000 ms]. Sender[null] sent message of type \"org.opendaylight.controller.remote.rpc.messages.ExecuteRpc\".\n\tat akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)\n\tat akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)\n\tat scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)\n\tat scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)\n\tat scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)\n\tat akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)\n\tat java.lang.Thread.run(Thread.java:748)\n"}

]}}
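For triage, the relevant fields of such a response can be pulled out mechanically. The following is a minimal sketch, not part of the CSIT suite: the helper name `summarize_ask_timeouts`, the `ASK_TIMEOUT` pattern, and the abbreviated `RESPONSE` constant (stack trace cut to one frame) are all illustrative assumptions. It only assumes the error-info wording shown above.

```python
import json
import re

# Abbreviated copy of the restconf response above (stack trace cut to one frame).
RESPONSE = r'''{"errors":{"error":[{"error-type":"application","error-tag":"operation-failed","error-message":"The operation encountered an unexpected error while executing.","error-info":"akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://opendaylight-cluster-data@10.29.15.213:2550/user/rpc/broker#1754800661]] after [15000 ms]. Sender[null] sent message of type \"org.opendaylight.controller.remote.rpc.messages.ExecuteRpc\".\n\tat akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)\n"}]}}'''

# Hypothetical pattern matching the AskTimeoutException wording seen in this report.
ASK_TIMEOUT = re.compile(
    r"Ask timed out on \[Actor\[(?P<actor>[^\]]+)\]\] after \[(?P<ms>\d+) ms\]\."
    r".*?message of type \"(?P<msg>[^\"]+)\""
)


def summarize_ask_timeouts(response_text):
    """Return one dict per error entry whose error-info reports an ask timeout."""
    body = json.loads(response_text)
    found = []
    for err in body["errors"]["error"]:
        match = ASK_TIMEOUT.search(err.get("error-info", ""))
        if match:
            found.append({
                "actor": match.group("actor"),
                "timeout_ms": int(match.group("ms")),
                # Keep only the simple class name of the message type.
                "message_type": match.group("msg").rsplit(".", 1)[-1],
            })
    return found


if __name__ == "__main__":
    for entry in summarize_ask_timeouts(RESPONSE):
        print(entry)
```

Applied to the response above, this reports the target actor path (under /user/rpc/broker), the 15000 ms timeout, and the ExecuteRpc message type.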

karaf.log does not contain any exception. Moreover, even though the test targets the current leader [1] (member-3), the INFO messages appear in karaf.log on member-2 [2] instead.

As the title suggests, this happens neither in -only- jobs nor in Carbon jobs.
Apparently, some other ODL feature is slowing the datastore down, but even then, AskTimeoutException usually occurs only after long pauses (the job does not start ODL with garbage collection logging, so GC pauses cannot be confirmed from the logs). To be clear, neither leader movement nor UnreachableMember events occur.

[0] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-periodic-benchmark-all-nitrogen/116/log.html.gz#s1-t2-k2-k6-k2
[1] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-periodic-benchmark-all-nitrogen/116/log.html.gz#s1-t2-k2-k1-k3-k5
[2] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-periodic-benchmark-all-nitrogen/116/odl2_karaf.log.gz



 Comments   
Comment by Tom Pantelis [ 02/Sep/17 ]

The message that timed out was ExecuteRpc which is related to RPCs and not the data store.
