[OPNFLWPLUG-876] Send RPC to non-owner after original owner is killed is not stable in Carbon Created: 07/Apr/17 Updated: 27/Sep/21 Resolved: 22/May/17 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Luis Gomez | Assignee: | Luis Gomez |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
|
| External issue ID: | 8185 |
| Description |
|
As tracked here: We get an error when trying to push a flow via RPC from a non-owner instance after the original owner is killed: {"errors":{"error":[ {"error-type":"application","error-tag":"operation-failed","error-message":"The operation encountered an unexpected error while executing.","error-info":"akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://opendaylight-cluster-data@10.29.12.203:2550/user/rpc/broker#-1313599470]] after [15000 ms]. Sender[null] sent message of type \"org.opendaylight.controller.remote.rpc.messages.ExecuteRpc\".\n\tat akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)\n\tat akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)\n\tat scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)\n\tat scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)\n\tat scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)\n\tat akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)\n\tat akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)\n\tat java.lang.Thread.run(Thread.java:745)\n"}]}} |
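For anyone triaging similar failures, the error body above can be checked programmatically. The following is a minimal sketch, not part of the original report: the function name `summarize_restconf_error` and the regular expressions are illustrative assumptions, and the sample body is an abbreviated copy of the response quoted above with the stack trace dropped.

```python
import json
import re


def summarize_restconf_error(body):
    """Return one summary dict per entry in a RESTCONF-style error body.

    Hypothetical helper for illustration; it only inspects the JSON text
    and makes no calls to the controller.
    """
    errors = json.loads(body).get("errors", {}).get("error", [])
    summaries = []
    for err in errors:
        info = err.get("error-info", "")
        # The Akka message reports the ask timeout as "after [<n> ms]".
        timeout = re.search(r"after \[(\d+) ms\]", info)
        # Capture the actor path up to the "#<uid>" suffix or closing bracket.
        actor = re.search(r"Actor\[([^\]#]+)", info)
        summaries.append({
            "tag": err.get("error-tag"),
            "ask_timeout": "AskTimeoutException" in info,
            "timeout_ms": int(timeout.group(1)) if timeout else None,
            "actor_path": actor.group(1) if actor else None,
        })
    return summaries


# Abbreviated copy of the error body from the description (no stack trace).
sample = json.dumps({"errors": {"error": [{
    "error-type": "application",
    "error-tag": "operation-failed",
    "error-message": "The operation encountered an unexpected error while executing.",
    "error-info": (
        "akka.pattern.AskTimeoutException: Ask timed out on "
        "[Actor[akka.tcp://opendaylight-cluster-data@10.29.12.203:2550"
        "/user/rpc/broker#-1313599470]] after [15000 ms]. Sender[null] sent "
        "message of type "
        "\"org.opendaylight.controller.remote.rpc.messages.ExecuteRpc\"."
    ),
}]}})

for summary in summarize_restconf_error(sample):
    print(summary)
```

The actor path in the summary identifies which cluster member the remote RPC broker was still trying to reach when the ask timed out.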
| Comments |
| Comment by Anil Vishnoi [ 14/Apr/17 ] |
|
Luis, do you see the device getting connected to another controller? |
| Comment by Luis Gomez [ 14/Apr/17 ] |
|
Yes, because the same RPC works from the new owner. The test:
BR/Luis |
| Comment by Anil Vishnoi [ 14/Apr/17 ] |
|
Does it fail even if you retry multiple times? |
| Comment by Luis Gomez [ 14/Apr/17 ] |
|
The add request takes 15 seconds to fail; after that, there is a delete request that also takes 15 seconds. |
| Comment by Luis Gomez [ 21/Apr/17 ] |
|
Attachment karaf-1-new-owner.txt has been added with description: new owner log |
| Comment by Luis Gomez [ 21/Apr/17 ] |
|
Attachment karaf-2-old-owner.txt has been added with description: old owner log |
| Comment by Luis Gomez [ 21/Apr/17 ] |
|
Attachment karaf-3-candidate-RPC.txt has been added with description: candidate-RPC-fails |
| Comment by Luis Gomez [ 21/Apr/17 ] |
|
Uploaded cluster logs. The switch being tested is openflow:1; you can ignore openflow:2 and openflow:3, as they are only there for topology discovery. |
| Comment by Luis Gomez [ 25/Apr/17 ] |
|
BTW, the same or a similar issue is observed in the controller cluster test: |
| Comment by Tomas Slusny [ 12/May/17 ] |
|
This should be fixed with this patch by Jozef: https://git.opendaylight.org/gerrit/#/c/56918/ |