[NETCONF-344] Code 500 in netconf cluster test (Master is down.) Created: 30/Jan/17  Updated: 15/Mar/19  Resolved: 21/Feb/17

Status: Resolved
Project: netconf
Component/s: netconf
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Vratko Polak Assignee: Miroslav Kovac
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 7705

 Description   

This is the same symptom as NETCONF-261 but karaf.log contains different errors. Similarly to 6499, this only happens occasionally.

Example of Robot failure: [0] (note the PUT took 20 seconds to return).
Response code: 500 [1].
Response text: <errors xmlns="urn:ietf:params:xml:ns:yang:ietf-restconf"><error><error-type>application</error-type><error-tag>operation-failed</error-tag><error-message>Problem while PUT operations</error-message></error></errors>

Karaf.log segment (without lines from unrelated projects):

2017-01-28 01:42:24,792 | WARN | ult-dispatcher-2 | BrokerFacade | 364 - org.opendaylight.netconf.sal-rest-connector - 1.5.0.SNAPSHOT | Exception by reading CONFIGURATION via Restconf: /(urn:opendaylight:test:netconf:crud?revision=2014-10-18)cont
NetconfDocumentedException{message=RemoteDevice

{netconf-test-device}:Master is down. Please try again., errorType=APPLICATION, errorTag=OPERATION_FAILED, errorSeverity=WARNING, errorInfo={}}
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfProxyDOMTransaction$1.onComplete(NetconfProxyDOMTransaction.java:70)[521:org.opendaylight.netconf.topology-singleton:1.2.0.SNAPSHOT]
at akka.dispatch.OnComplete.internal(Future.scala:258)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.OnComplete.internal(Future.scala:256)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
2017-01-28 01:42:24,795 | ERROR | ult-dispatcher-2 | ExecutionList | 65 - com.google.guava - 18.0.0 | RuntimeException while executing runnable com.google.common.util.concurrent.Futures$6@2eae3fa5 with executor INSTANCE
org.opendaylight.netconf.sal.restconf.impl.RestconfDocumentedException: errors: [RestconfError [error-type: application, error-tag: operation-failed, error-message: Problem to get data from transaction., error-info: NetconfDocumentedException{message=RemoteDevice{netconf-test-device}

:Master is down. Please try again., errorType=APPLICATION, errorTag=OPERATION_FAILED, errorSeverity=WARNING, errorInfo={}}
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfProxyDOMTransaction$1.onComplete(NetconfProxyDOMTransaction.java:70)
at akka.dispatch.OnComplete.internal(Future.scala:258)
at akka.dispatch.OnComplete.internal(Future.scala:256)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
]]
at org.opendaylight.netconf.sal.restconf.impl.BrokerFacade.handlingCallback(BrokerFacade.java:1259)[364:org.opendaylight.netconf.sal-rest-connector:1.5.0.SNAPSHOT]
at org.opendaylight.netconf.sal.restconf.impl.BrokerFacade$2.onFailure(BrokerFacade.java:537)[364:org.opendaylight.netconf.sal-rest-connector:1.5.0.SNAPSHOT]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)[65:com.google.guava:18.0.0]
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfReadOnlyTransaction$2.onComplete(NetconfReadOnlyTransaction.java:87)[521:org.opendaylight.netconf.topology-singleton:1.2.0.SNAPSHOT]
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfReadOnlyTransaction$2.onComplete(NetconfReadOnlyTransaction.java:68)[521:org.opendaylight.netconf.topology-singleton:1.2.0.SNAPSHOT]
at akka.dispatch.OnComplete.internal(Future.scala:258)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.OnComplete.internal(Future.scala:256)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
Caused by: NetconfDocumentedException{message=RemoteDevice

{netconf-test-device}:Master is down. Please try again., errorType=APPLICATION, errorTag=OPERATION_FAILED, errorSeverity=WARNING, errorInfo={}}
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfProxyDOMTransaction$1.onComplete(NetconfProxyDOMTransaction.java:70)[521:org.opendaylight.netconf.topology-singleton:1.2.0.SNAPSHOT]
... 17 more
2017-01-28 01:42:29,257 | WARN | lt-dispatcher-20 | ReliableDeliverySupervisor | 181 - com.typesafe.akka.slf4j - 2.4.16 | Association with remote system [akka.tcp://opendaylight-cluster-data@10.29.13.122:2550] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://opendaylight-cluster-data@10.29.13.122:2550]] Caused by: [Connection refused: /10.29.13.122:2550]
2017-01-28 01:42:34,459 | WARN | ult-dispatcher-5 | ReliableDeliverySupervisor | 181 - com.typesafe.akka.slf4j - 2.4.16 | Association with remote system [akka.tcp://opendaylight-cluster-data@10.29.13.122:2550] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://opendaylight-cluster-data@10.29.13.122:2550]] Caused by: [Connection refused: /10.29.13.122:2550]
2017-01-28 01:42:34,813 | ERROR | lt-dispatcher-18 | ExecutionList | 65 - com.google.guava - 18.0.0 | RuntimeException while executing runnable com.google.common.util.concurrent.Futures$6@38f5fe8a with executor INSTANCE
org.opendaylight.netconf.sal.restconf.impl.RestconfDocumentedException: errors: [RestconfError [error-type: application, error-tag: operation-failed, error-message: RemoteDevice{netconf-test-device}

:Master is down. Please try again., error-info: NetconfDocumentedException{message=RemoteDevice

{netconf-test-device}:Master is down. Please try again., errorType=APPLICATION, errorTag=OPERATION_FAILED, errorSeverity=WARNING, errorInfo={}}
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfProxyDOMTransaction$3.onComplete(NetconfProxyDOMTransaction.java:169)
at akka.dispatch.OnComplete.internal(Future.scala:258)
at akka.dispatch.OnComplete.internal(Future.scala:256)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
]]
at org.opendaylight.netconf.sal.restconf.impl.RestconfImpl.handlingLoggerPut(RestconfImpl.java:867)[364:org.opendaylight.netconf.sal-rest-connector:1.5.0.SNAPSHOT]
at org.opendaylight.netconf.sal.restconf.impl.RestconfImpl$1.onFailure(RestconfImpl.java:834)[364:org.opendaylight.netconf.sal-rest-connector:1.5.0.SNAPSHOT]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)[65:com.google.guava:18.0.0]
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfWriteOnlyTransaction$2.onComplete(NetconfWriteOnlyTransaction.java:100)[521:org.opendaylight.netconf.topology-singleton:1.2.0.SNAPSHOT]
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfWriteOnlyTransaction$2.onComplete(NetconfWriteOnlyTransaction.java:94)[521:org.opendaylight.netconf.topology-singleton:1.2.0.SNAPSHOT]
at akka.dispatch.OnComplete.internal(Future.scala:258)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.OnComplete.internal(Future.scala:256)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)[180:com.typesafe.akka.actor:2.4.16]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)[180:com.typesafe.akka.actor:2.4.16]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[176:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
Caused by: NetconfDocumentedException{message=RemoteDevice{netconf-test-device}

:Master is down. Please try again., errorType=APPLICATION, errorTag=OPERATION_FAILED, errorSeverity=WARNING, errorInfo={}}
at org.opendaylight.netconf.topology.singleton.impl.tx.NetconfProxyDOMTransaction$3.onComplete(NetconfProxyDOMTransaction.java:169)[521:org.opendaylight.netconf.topology-singleton:1.2.0.SNAPSHOT]
... 17 more

[0] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/165/archives/log.html.gz#s1-s6-t12-k2-k1-k3-k6
[1] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/165/archives/log.html.gz#s1-s6-t12-k2-k1-k3-k7-k1



 Comments   
Comment by Vratko Polak [ 09/Feb/17 ]

Note that the fact http status is 500 has separate NETCONF-348.

This Bug is specifically about "Master is down" for 20 seconds after netconf device owner has been killed.

Comment by Miroslav Kovac [ 16/Feb/17 ]

Restconf sends 2 requests when put -> first is read request and if master is down slave wait for 10 seconds for response which it doesn`t receive and then restconf receives null and decides to create new data cause it things that data do not exist. Fix will throw error imediately after it receives exception from read request.
https://git.opendaylight.org/gerrit/#/c/51902/

Comment by Miroslav Kovac [ 16/Feb/17 ]

Also I changed waiting time to 5 seconds for slave to wait for response. This time is now configurable if for some reason someone needs more time for one request to be processed.
https://git.opendaylight.org/gerrit/#/c/51949/

Comment by Miroslav Kovac [ 21/Feb/17 ]

https://git.opendaylight.org/gerrit/#/c/51792/

Generated at Wed Feb 07 20:14:46 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.