ovsdb / OVSDB-299

unable to read topology after recovering a failed controller in cluster

Details

    • Bug
    • Status: Resolved
    • Resolution: Duplicate
    • unspecified
    • None
    • None
    • Operating System: All
      Platform: All

    • 5331

    Description

      This system test fails sporadically, and it failed in the two most
      recent runs:
      https://jenkins.opendaylight.org/releng/view/ovsdb/job/ovsdb-csit-3node-clustering-only-beryllium

      The test fails because a GET on /restconf/operational/network-topology:network-topology/topology/ovsdb:1 returns HTTP 500.

      The full error is below; it appears to indicate a problem reading from the topology shard.

      The GET is issued against a node that was first an Owner, then killed and restarted.
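
      For anyone trying to reproduce this outside CSIT, the failing check can be sketched roughly as follows. This is only a sketch, not the CSIT code: the port (8181), the admin/admin credentials, and the helper names topology_url and get_status are assumptions, not taken from the job. Only the RESTCONF path comes from the failure above.

```python
import base64
import urllib.error
import urllib.request

# Path taken from the failing GET in this report.
TOPOLOGY_PATH = (
    "/restconf/operational/network-topology:network-topology"
    "/topology/ovsdb:1"
)


def topology_url(host, port=8181):
    """Build the URL of the operational ovsdb:1 topology (port is an assumption)."""
    return "http://{}:{}{}".format(host, port, TOPOLOGY_PATH)


def get_status(url, user="admin", password="admin", timeout=10):
    """Issue the GET and return (status_code, body), even for HTTP errors.

    The credentials are assumed defaults, not values from the CSIT job.
    """
    token = base64.b64encode("{}:{}".format(user, password).encode()).decode()
    request = urllib.request.Request(
        url,
        headers={"Authorization": "Basic " + token, "Accept": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # A 500 like the one in this report lands here; keep the body for inspection.
        return err.code, err.read().decode("utf-8", "replace")
```

      Calling get_status(topology_url(host)) against the restarted node, with host set to that node's address, would be expected to return 500 when the problem occurs.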

      I only have the CSIT results to go by at this point and have not reproduced this locally or with a
      simpler set of steps. This may not be OVSDB specific and may belong in the controller project as a
      generic clustering bug, but because it is seen in OVSDB CSIT, I'm filing it here first.

      {"errors":{"error":[{"error-type":"application","error-tag":"operation-failed","error-message":"Problem to get data from transaction.","error-info":"ReadFailedException{message=Error executeRead ReadData for path /(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}], errorList=[RpcError [message=Error executeRead ReadData for path /(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}
      ], severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=java.lang.Exception: Error creating READ_ONLY transaction on shard topology]]}\n\tat org.opendaylight.controller.cluster.datastore.NoOpTransactionContext.executeRead(NoOpTransactionContext.java:71)\n\tat org.opendaylight.controller.cluster.datastore.TransactionProxy$1.invoke(TransactionProxy.java:92)\n\tat org.opendaylight.controller.cluster.datastore.TransactionContextWrapper.executePriorTransactionOperations(TransactionContextWrapper.java:132)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.createTransactionContext(RemoteTransactionContextSupport.java:237)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.onCreateTransactionComplete(RemoteTransactionContextSupport.java:200)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.access$000(RemoteTransactionContextSupport.java:40)\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport$1.onComplete(RemoteTransactionContextSupport.java:135)\n\tat akka.dispatch.OnComplete.internal(Future.scala:247)\n\tat akka.dispatch.OnComplete.internal(Future.scala:245)\n\tat akka.dispatch.japi$CallbackBridge.apply(Future.scala:175)\n\tat akka.dispatch.japi$CallbackBridge.apply(Future.scala:172)\n\tat scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)\n\tat akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)\n\tat akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)\n\tat akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)\n\tat akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)\n\tat scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)\n\tat 
akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)\n\tat akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)\n\tat akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)\n\tat scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)\n\tat scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)\n\tat scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)\n\tat scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)\nCaused by: java.lang.Exception: Error creating READ_ONLY transaction on shard topology\n\tat org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.createTransactionContext(RemoteTransactionContextSupport.java:222)\n\t... 20 more\nCaused by: akka.actor.InvalidActorNameException: actor name [shard-member-3-txn-28] is not unique!\n\tat akka.actor.dungeon.ChildrenContainer$NormalChildrenContainer.reserve(ChildrenContainer.scala:130)\n\tat akka.actor.dungeon.Children$class.reserveChild(Children.scala:76)\n\tat akka.actor.ActorCell.reserveChild(ActorCell.scala:369)\n\tat akka.actor.dungeon.Children$class.makeChild(Children.scala:201)\n\tat akka.actor.dungeon.Children$class.actorOf(Children.scala:37)\n\tat akka.actor.ActorCell.actorOf(ActorCell.scala:369)\n\tat org.opendaylight.controller.cluster.datastore.ShardTransactionActorFactory.newShardTransaction(ShardTransactionFactory.java:60)\n\tat org.opendaylight.controller.cluster.datastore.Shard.createTypedTransactionActor(Shard.java:538)\n\tat org.opendaylight.controller.cluster.datastore.Shard.createTransaction(Shard.java:570)\n\tat org.opendaylight.controller.cluster.datastore.Shard.createTransaction(Shard.java:549)\n\tat org.opendaylight.controller.cluster.datastore.Shard.handleCreateTransaction(Shard.java:521)\n\tat org.opendaylight.controller.cluster.datastore.Shard.onReceiveCommand(Shard.java:230)\n\tat 
akka.persistence.UntypedPersistentActor.onReceive(Eventsourced.scala:430)\n\tat org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:97)\n\tat akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:534)\n\tat akka.persistence.Recovery$State$class.process(Recovery.scala:30)\n\tat akka.persistence.ProcessorImpl$$anon$2.process(Processor.scala:103)\n\tat akka.persistence.ProcessorImpl$$anon$2.aroundReceive(Processor.scala:114)\n\tat akka.persistence.Recovery$class.aroundReceive(Recovery.scala:265)\n\tat akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(Eventsourced.scala:428)\n\tat akka.persistence.Eventsourced$$anon$2.doAroundReceive(Eventsourced.scala:82)\n\tat akka.persistence.Eventsourced$$anon$2.aroundReceive(Eventsourced.scala:78)\n\tat akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:369)\n\tat akka.persistence.UntypedPersistentActor.aroundReceive(Eventsourced.scala:428)\n\tat akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)\n\tat akka.actor.ActorCell.invoke(ActorCell.scala:487)\n\tat akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)\n\tat akka.dispatch.Mailbox.run(Mailbox.scala:220)\n\t... 5 more\n"}]}}
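
      The stack trace is hard to scan in this form. A small helper along these lines could pull out just the "Caused by" chain from the error-info field. It is a sketch only: cause_chain is a hypothetical name, and it assumes the raw, unwrapped JSON body as returned by RESTCONF rather than the line-wrapped paste above.

```python
import json


def cause_chain(body):
    """Return the 'Caused by:' entries found in every error-info string.

    `body` is assumed to be the raw RESTCONF error response (valid JSON).
    """
    document = json.loads(body)
    causes = []
    for error in document["errors"]["error"]:
        info = error.get("error-info", "")
        # json.loads has already turned the escaped \n sequences into newlines.
        for line in info.split("\n"):
            if line.startswith("Caused by: "):
                causes.append(line[len("Caused by: "):])
    return causes
```

      For the response in this report, the last entry in the chain should be the akka.actor.InvalidActorNameException for actor name [shard-member-3-txn-28].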

      Attachments

        1. 5331_odl1_karaf.log
          539 kB
        2. 5331_odl2_karaf.log
          391 kB
        3. 5331_odl3_karaf.log
          770 kB

            People

              Assignee: Unassigned
              Reporter: Jamo Luhrsen (jluhrsen)