[OVSDB-299] unable to read topology after recovering a failed controller in cluster Created: 11/Feb/16  Updated: 19/Oct/17  Resolved: 12/Feb/16

Status: Resolved
Project: ovsdb
Component/s: Southbound.Open_vSwitch
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Jamo Luhrsen Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: Text File 5331_odl1_karaf.log     Text File 5331_odl2_karaf.log     Text File 5331_odl3_karaf.log    
Issue Links:
Duplicate
duplicates CONTROLLER-1479 Cluster HA test: 500 Server Error whe... Resolved
External issue ID: 5331

 Description   

The system test fails sporadically, but it has failed twice in a row in the most
recent runs:
https://jenkins.opendaylight.org/releng/view/ovsdb/job/ovsdb-csit-3node-clustering-only-beryllium

The test reports a failure because a GET on /restconf/operational/network-topology:network-topology/topology/ovsdb:1 returns 500.

The full error is below; it appears to indicate trouble reading from the topology shard.

This GET is done on a node that was first an Owner, then killed and restarted.
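
A minimal sketch of the failing read, for anyone who wants to poke at a recovered node by hand. This is not the CSIT code. The base URL, port, and admin/admin credentials are assumptions and need to be adjusted for the actual cluster; only the RESTCONF path comes from the test. The opener parameter exists only so the function can be exercised without a live controller.

```python
import base64
import urllib.error
import urllib.request

# Path taken from the failing test step.
TOPOLOGY_PATH = "/restconf/operational/network-topology:network-topology/topology/ovsdb:1"


def build_request(base_url, user, password):
    """Build the GET request for the ovsdb:1 operational topology."""
    req = urllib.request.Request(base_url.rstrip("/") + TOPOLOGY_PATH)
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    req.add_header("Accept", "application/json")
    return req


def read_topology(base_url, user="admin", password="admin",
                  opener=urllib.request.urlopen):
    """Return (status_code, body). user/password defaults are assumptions."""
    req = build_request(base_url, user, password)
    try:
        with opener(req, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # A 500 arrives here; the body carries the RESTCONF "errors" document.
        return err.code, err.read().decode("utf-8", "replace")
```

Calling read_topology("http://<controller-ip>:8181") against the restarted node should show whether the 500 is still being returned and give the raw error body.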

I only have the CSIT to go by at this point and have not reproduced this locally or with
simpler steps. This may not be OVSDB-specific and may belong in the controller project as a
generic clustering bug, but because it is seen in OVSDB CSIT, I'm putting it here first.

{"errors":{"error":[{"error-type":"application","error-tag":"operation-failed","error-message":"Problem to get data from transaction.","error-info":"ReadFailedException{message=Error executeRead ReadData for path /(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}], errorList=[RpcError [message=Error executeRead ReadData for path /(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}
], severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=java.lang.Exception: Error creating READ_ONLY transaction on shard topology]]}\n\tat org.opendaylight.controller.cluster.datastore.NoOpTransactionContext.executeRead(NoOpTransactionContext.java:71)\n\tat org.opendaylight.controller.cluster.datastore.TransactionProxy$1.invoke(TransactionProxy.java:92)\n\tat org.opendaylight.controller.cluster.datastore.TransactionContextWrapper.executePriorTransactionOperations(TransactionContextWrapper.java:132)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.createTransactionContext(RemoteTransactionContextSupport.java:237)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.onCreateTransactionComplete(RemoteTransactionContextSupport.java:200)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.access$000(RemoteTransactionContextSupport.java:40)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport$1.onComplete(RemoteTransactionContextSupport.java:135)\n\tat akka.dispatch.OnComplete.internal(Future.scala:247)\n\tat akka.dispatch.OnComplete.internal(Future.scala:245)\n\tat akka.dispatch.japi$CallbackBridge.apply(Future.scala:175)\n\tat akka.dispatch.japi$CallbackBridge.apply(Future.scala:172)\n\tat scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)\n\tat akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)\n\tat akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)\n\tat akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)\n\tat akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)\n\tat scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)\n\tat 
akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)\n\tat akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)\n\tat akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)\n\tat scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)\n\tat scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)\n\tat scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)\n\tat scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)\nCaused by: java.lang.Exception: Error creating READ_ONLY transaction on shard topology\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.createTransactionContext(RemoteTransactionContextSupport.java:222)\n\t... 20 more\nCaused by: akka.actor.InvalidActorNameException: actor name [shard-member-3-txn-28] is not unique!\n\tat akka.actor.dungeon.ChildrenContainer$NormalChildrenContainer.reserve(ChildrenContainer.scala:130)\n\tat akka.actor.dungeon.Children$class.reserveChild(Children.scala:76)\n\tat akka.actor.ActorCell.reserveChild(ActorCell.scala:369)\n\tat akka.actor.dungeon.Children$class.makeChild(Children.scala:201)\n\tat akka.actor.dungeon.Children$class.actorOf(Children.scala:37)\n\tat akka.actor.ActorCell.actorOf(ActorCell.scala:369)\n\tat org.opendaylight.controller.cluster.datastore.ShardTransactionActorFactory.newShardTransaction(ShardTransactionFactory.java:60)\n\tat org.opendaylight.controller.cluster.datastore.Shard.createTypedTransactionActor(Shard.java:538)\n\tat org.opendaylight.controller.cluster.datastore.Shard.createTransaction(Shard.java:570)\n\tat org.opendaylight.controller.cluster.datastore.Shard.createTransaction(Shard.java:549)\n\tat org.opendaylight.controller.cluster.datastore.Shard.handleCreateTransaction(Shard.java:521)\n\tat org.opendaylight.controller.cluster.datastore.Shard.onReceiveCommand(Shard.java:230)\n\tat 
akka.persistence.UntypedPersistentActor.onReceive(Eventsourced.scala:430)\n\tat org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:97)\n\tat akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:534)\n\tat akka.persistence.Recovery$State$class.process(Recovery.scala:30)\n\tat akka.persistence.ProcessorImpl$$anon$2.process(Processor.scala:103)\n\tat akka.persistence.ProcessorImpl$$anon$2.aroundReceive(Processor.scala:114)\n\tat akka.persistence.Recovery$class.aroundReceive(Recovery.scala:265)\n\tat akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(Eventsourced.scala:428)\n\tat akka.persistence.Eventsourced$$anon$2.doAroundReceive(Eventsourced.scala:82)\n\tat akka.persistence.Eventsourced$$anon$2.aroundReceive(Eventsourced.scala:78)\n\tat akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:369)\n\tat akka.persistence.UntypedPersistentActor.aroundReceive(Eventsourced.scala:428)\n\tat akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)\n\tat akka.actor.ActorCell.invoke(ActorCell.scala:487)\n\tat akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)\n\tat akka.dispatch.Mailbox.run(Mailbox.scala:220)\n\t... 5 more\n"}]}}
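
The interesting part of the error above is buried in the "Caused by:" lines of the error-info stack trace. A small sketch for pulling that chain out of a RESTCONF error body follows. The function name is hypothetical, and it assumes the raw response body as returned by the controller; the copy pasted in this report has extra line breaks and may not parse as-is.

```python
import json

CAUSED_BY = "Caused by: "


def cause_chain(restconf_error_body):
    """Return, for each error entry, the list of 'Caused by:' messages
    found in its error-info stack trace, outermost cause first."""
    doc = json.loads(restconf_error_body)
    chains = []
    for err in doc["errors"]["error"]:
        info = err.get("error-info", "")
        causes = [
            line[len(CAUSED_BY):]
            for line in info.splitlines()
            if line.startswith(CAUSED_BY)
        ]
        chains.append(causes)
    return chains
```

For the error in this report, the last entry of the chain would be the akka.actor.InvalidActorNameException for actor name [shard-member-3-txn-28].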



 Comments   
Comment by Jamo Luhrsen [ 11/Feb/16 ]

Attachment 5331_odl1_karaf.log has been added with description: controller1 karaf.log

Comment by Jamo Luhrsen [ 11/Feb/16 ]

Attachment 5331_odl2_karaf.log has been added with description: controller2 karaf.log

Comment by Jamo Luhrsen [ 11/Feb/16 ]

Attachment 5331_odl3_karaf.log has been added with description: controller3 karaf.log
