[CONTROLLER-1284] Clustered datastore failing with canCommit encountered an unexpected failure Created: 28/Apr/15  Updated: 19/Oct/17  Resolved: 19/May/15

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Post-Helium
Fix Version/s: None

Type: Bug
Reporter: Andrej Marcinek Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Duplicate
duplicates CONTROLLER-1246 Restarting controller after configuri... Resolved
External issue ID: 3079

 Description   

sal-netconf-connector failing to write data after it is connected

2015-04-28 07:25:00,340 | ERROR | CommitFutures-12 | NetconfDeviceDatastoreAdapter | 245 - org.opendaylight.controller.sal-netconf-connector - 1.2.0.SNAPSHOT | RemoteDevice{controller-config}: Transaction(init) DOM-CHAIN-0-0 FAILED!
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.controller.cluster.datastore.exceptions.NotInitializedException: Found primary shard member-1-shard-inventory-config but it's not initialized yet. Please try again later]]}

at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:37)[187:org.opendaylight.controller.sal-broker-impl:1.2.0.SNAPSHOT]
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:18)[187:org.opendaylight.controller.sal-broker-impl:1.2.0.SNAPSHOT]
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:96)[97:org.opendaylight.yangtools.util:0.7.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ConcurrentDOMDataBroker.handleException(ConcurrentDOMDataBroker.java:212)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ConcurrentDOMDataBroker.access$100(ConcurrentDOMDataBroker.java:43)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ConcurrentDOMDataBroker$1.onFailure(ConcurrentDOMDataBroker.java:122)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)[94:com.google.guava:18.0.0]
at org.opendaylight.controller.cluster.datastore.ConcurrentDOMDataBroker$SimpleSameThreadExecutor.execute(ConcurrentDOMDataBroker.java:334)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)[94:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)[94:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)[94:com.google.guava:18.0.0]
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)[94:com.google.guava:18.0.0]
at org.opendaylight.controller.cluster.datastore.SingleCommitCohortProxy$1.onComplete(SingleCommitCohortProxy.java:60)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at akka.dispatch.OnComplete.internal(Future.scala:246)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.OnComplete.internal(Future.scala:244)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:174)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:171)[197:com.typesafe.akka.actor:2.3.9]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)[197:com.typesafe.akka.actor:2.3.9]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)[197:com.typesafe.akka.actor:2.3.9]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
Caused by: org.opendaylight.controller.cluster.datastore.exceptions.NotInitializedException: Found primary shard member-1-shard-inventory-config but it's not initialized yet. Please try again later
at org.opendaylight.controller.cluster.datastore.ShardManager.createNotInitializedException(ShardManager.java:416)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardManager.onShardNotInitializedTimeout(ShardManager.java:224)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardManager.handleCommand(ShardManager.java:187)[212:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedPersistentActor.onReceiveCommand(AbstractUntypedPersistentActor.java:36)[204:org.opendaylight.controller.sal-clustering-commons:1.2.0.SNAPSHOT]
at akka.persistence.UntypedPersistentActor.onReceive(Eventsourced.scala:430)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:97)[204:org.opendaylight.controller.sal-clustering-commons:1.2.0.SNAPSHOT]
at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:534)[197:com.typesafe.akka.actor:2.3.9]
at akka.persistence.Recovery$State$class.process(Recovery.scala:30)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.ProcessorImpl$$anon$2.process(Processor.scala:103)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.ProcessorImpl$$anon$2.aroundReceive(Processor.scala:114)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.Recovery$class.aroundReceive(Recovery.scala:265)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(Eventsourced.scala:428)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.Eventsourced$$anon$2.doAroundReceive(Eventsourced.scala:82)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.Eventsourced$$anon$2.aroundReceive(Eventsourced.scala:78)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:369)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.persistence.UntypedPersistentActor.aroundReceive(Eventsourced.scala:428)[202:com.typesafe.akka.persistence.experimental:2.3.9]
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)[197:com.typesafe.akka.actor:2.3.9]
at akka.actor.ActorCell.invoke(ActorCell.scala:487)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.Mailbox.run(Mailbox.scala:221)[197:com.typesafe.akka.actor:2.3.9]
at akka.dispatch.Mailbox.exec(Mailbox.scala:231)[197:com.typesafe.akka.actor:2.3.9]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)[194:org.scala-lang.scala-library:2.10.4.v20140209-180020-VFINAL-b66a39653b]
... 2 more



 Comments   
Comment by Moiz Raja [ 28/Apr/15 ]

How do we reproduce this? Logs would also be helpful.

Comment by Maros Marsalek [ 29/Apr/15 ]

This failed when the loopback netconf connector tried to write its initial data into the datastore. To reproduce, just install odl-netconf-connector-all into an empty distribution.

Comment by Maros Marsalek [ 29/Apr/15 ]

We install this:
feature:install odl-restconf odl-netconf-connector-all

Comment by Andrej Marcinek [ 06/May/15 ]

It seems this bug occurs when ODL is using CDS with persistence turned on and ODL is killed. When it starts again, this error appears, and I'm also unable to do a GET through restconf, for example: /restconf/config/opendaylight-inventory:nodes/node/controller-config
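The failing GET above can be checked from the command line. This is a sketch only: the host, port (8181), and admin/admin credentials are assumptions based on a stock Karaf distribution, not taken from this report.

```shell
# Base RESTCONF URL and the inventory node path from the report.
# Host, port, and credentials are assumed defaults; adjust for your setup.
ODL_URL="http://localhost:8181/restconf"
NODE_PATH="config/opendaylight-inventory:nodes/node/controller-config"

# Issue a GET against the config datastore via RESTCONF.
restconf_get() {
    curl -s -u admin:admin "$ODL_URL/$1"
}

# On an affected node this returns an error body instead of the node data:
#   restconf_get "$NODE_PATH"
```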

Comment by Maros Marsalek [ 07/May/15 ]

We should try shutting down ODL properly using Karaf's shutdown command and see if the issue is still present.

Comment by Andrej Marcinek [ 12/May/15 ]

A correct shutdown did not help, but we found a workaround: delete controller.currentconfig.xml and the data, journal, and snapshots folders. After that everything works fine in the test and no timeouts occurred.
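The workaround above can be sketched as a small script. The exact file locations are assumptions based on a stock Helium-era Karaf layout (controller.currentconfig.xml under etc/opendaylight/current, and the data/journal/snapshots folders at the distribution root); adjust them to your install, and stop Karaf before running it.

```shell
# clear_odl_state: remove persisted state that can leave CDS shards
# uninitialized after an unclean restart (workaround from this report).
# Paths inside the distribution are assumptions based on a stock Karaf layout.
clear_odl_state() {
    karaf_home="$1"
    # Persisted config-subsystem snapshot (assumed location):
    rm -f  "$karaf_home/etc/opendaylight/current/controller.currentconfig.xml"
    # Karaf cache plus the CDS persistence journal and snapshots:
    rm -rf "$karaf_home/data" "$karaf_home/journal" "$karaf_home/snapshots"
}

# Usage (with Karaf stopped):
#   clear_odl_state /opt/opendaylight
```

Note this wipes all persisted datastore contents, so it is only acceptable when losing the persisted configuration is fine.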

Comment by Tony Tkacik [ 12/May/15 ]

Seems to be related to CONTROLLER-1246

Comment by Maros Marsalek [ 19/May/15 ]

Caused by unwanted reconfiguration of CDS

Generated at Wed Feb 07 19:55:08 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.