[NETVIRT-985] java.lang.OutOfMemoryError: Java heap space Created: 10/Nov/17 Updated: 08/May/18 Resolved: 30/Nov/17 |
|
| Status: | Resolved |
| Project: | netvirt |
| Component/s: | None |
| Affects Version/s: | Carbon |
| Fix Version/s: | Nitrogen, Carbon, Oxygen |
| Type: | Bug | Priority: | Medium |
| Reporter: | Sai Sindhur Malleni | Assignee: | Michael Vorburger |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
|
During perf/scale testing, we see OOM on ODL when a large number of neutron resources are created and deleted we see that ODL is killed due to OOM. Looking at the stdout when the JVM crashed due to OOM, we see: Heap dump file created [3089813876 bytes in 19.232 secs] |
| Comments |
| Comment by Michael Vorburger [ 13/Nov/17 ] |
|
Attached java_pid1_Leak_Suspects.zip from smalleni environment shows this another one like Attached is the usual MAT produced Leak Suspects report, from a HPROF from 1st of 3 cluster nodes; the 2nd one show basically the same - 800 MB on 1 GB of total 2 GB due to huge Map in ShardDataTree. PS: The 3rd node does an OOM with a slightly different HPROF, see attached controller-2__java_pid1_Leak_Suspects.zip, showing 745 MB used by the thread MonitorableThreadFactory$AkkaForkJoinWorkerThread local variables, and 310 MB in org.opendaylight.controller.cluster.datastore.FrontendMetadata. I don't know enough about clustering internals to judge if that is just an impact of above, but unless educated otherwise, will have to suspect it is so. |
| Comment by Michael Vorburger [ 13/Nov/17 ] |
|
The Karaf log on controller-0 and controller-1 (both with the ShardDataTree OOM) do not contain "IllegalStateException: Attempted to close chain with outstanding transaction" as in |
| Comment by Sai Sindhur Malleni [ 13/Nov/17 ] |
|
Michael, shouldn't you be looking at logs on controller-1 and controller-2 since that is where the ODLs hit OOM. |
| Comment by Michael Vorburger [ 13/Nov/17 ] |
|
smalleni yeah you're right, I'll pour more through controller-1 log... but FYI you're controller-0 WOULD HAVE (was about to) OOM as well anyway... |
| Comment by Michael Vorburger [ 13/Nov/17 ] |
|
Grep'ing for "chain" in the attached karaf.log.xz from controller-1 reveals a lot of the following kinds messages again from openflowplugin which I have a hunch may have something to do with this: 1. "Transaction chain failed, recreating chain due to ..." : 2017-11-08 20:26:13,390 | WARN | CommitFutures-1 | TransactionChainManager | 304 - org.opendaylight.openflowplugin.common - 0.4.2.Carbon-redhat-2 | Transaction chain failed, recreating chain due to
OptimisticLockFailedException{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was created by other transaction.]]}
at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:751)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPending(ShardDataTree.java:789)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPendingTransaction(ShardDataTree.java:736)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(ShardDataTree.java:819)
at org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:90)
at org.opendaylight.controller.cluster.datastore.ChainedCommitCohort.canCommit(ChainedCommitCohort.java:58)
(...)
Caused by: org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was created by other transaction.
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkWriteApplicable(SchemaAwareApplyOperation.java:174)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:128)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305)
(...)
2. "WARN: ConflictingModificationAppliedException" && "ERROR Exception during INITIAL transaction submitting. OptimisticLockFailedException" : 2017-11-09 13:28:33,002 | WARN | lt-dispatcher-41 | ConcurrentDOMDataBroker | 211 - org.opendaylight.controller.sal-distributed-datastore - 1.5.2.Carbon-redhat-2 | Tx: DOM-CHAIN-20-0 Error during phase CAN_COMMIT, starting Abort
OptimisticLockFailedException{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.ap
i.schema.tree.ConflictingModificationAppliedException: Node children was modified by other transaction]]}
at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:751)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPending(ShardDataTree.java:789)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPendingTransaction(ShardDataTree.java:736)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(ShardDataTree.java:819)
at org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:90)
at org.opendaylight.controller.cluster.datastore.ChainedCommitCohort.canCommit(ChainedCommitCohort.java:58)
at org.opendaylight.controller.cluster.datastore.CohortEntry.canCommit(CohortEntry.java:97)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleCanCommit(ShardCommitCoordinator.java:236)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleForwardedReadyTransaction(ShardCommitCoordinator.java:114)
at org.opendaylight.controller.cluster.datastore.Shard.handleForwardedReadyTransaction(Shard.java:699)
at org.opendaylight.controller.cluster.datastore.Shard.handleNonRaftCommand(Shard.java:314)
at org.opendaylight.controller.cluster.raft.RaftActor.handleCommand(RaftActor.java:270)
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedPersistentActor.onReceiveCommand(AbstractUntypedPersistentActor.java:44)
at akka.persistence.UntypedPersistentActor.onReceive(PersistentActor.scala:170)
at org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:104)
at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(PersistentActor.scala:168)
at akka.persistence.Eventsourced$$anon$1.stateReceive(Eventsourced.scala:727)
at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:183)
at akka.persistence.UntypedPersistentActor.aroundReceive(PersistentActor.scala:168)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node children was modified by other transaction
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkConflicting(SchemaAwareApplyOperation.java:78)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkNotConflicting(SchemaAwareApplyOperation.java:108)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkWriteApplicable(SchemaAwareApplyOperation.java:170)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:128)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkTouchApplicable(AbstractNodeContainerModificationStrategy.java:288)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.StructuralContainerModificationStrategy.checkApplicable(StructuralContainerModificationStrategy.java:99)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkTouchApplicable(AbstractNodeContainerModificationStrategy.java:288)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.RootModificationApplyOperation.checkApplicable(RootModificationApplyOperation.java:72)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractDataTreeTip.validate(AbstractDataTreeTip.java:35)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:743)
... 29 more
2017-11-09 13:28:33,003 | ERROR | ntLoopGroup-7-36 | TransactionChainManager | 304 - org.opendaylight.openflowplugin.common - 0.4.2.Carbon-redhat-2 | Exception during INITIAL transaction submitting.
java.util.concurrent.ExecutionException: OptimisticLockFailedException{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node children was modified by other transaction]]}
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:272)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:96)
at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:69)
at org.opendaylight.openflowplugin.common.txchain.TransactionChainManager.submitTransaction(TransactionChainManager.java:180)
at org.opendaylight.openflowplugin.common.txchain.TransactionChainManager.initialSubmitWriteTransaction(TransactionChainManager.java:85)
at org.opendaylight.openflowplugin.impl.device.DeviceContextImpl.initialSubmitTransaction(DeviceContextImpl.java:241)
at org.opendaylight.openflowplugin.impl.statistics.StatisticsContextImpl$4.onSuccess(StatisticsContextImpl.java:363)
at org.opendaylight.openflowplugin.impl.statistics.StatisticsContextImpl$4.onSuccess(StatisticsContextImpl.java:355)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1319)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:185)
at com.google.common.util.concurrent.Futures$ChainingListenableFuture$1.run(Futures.java:918)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:185)
at com.google.common.util.concurrent.Futures$ChainingListenableFuture$1.run(Futures.java:918)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:185)
at com.google.common.util.concurrent.Futures$ChainingListenableFuture$1.run(Futures.java:918)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.Futures$ImmediateFuture.addListener(Futures.java:106)
at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:914)
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:185)
at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:53)
at org.opendaylight.openflowplugin.impl.rpc.AbstractRequestContext.setResult(AbstractRequestContext.java:32)
at org.opendaylight.openflowplugin.impl.device.listener.MultiMsgCollectorImpl.endCollecting(MultiMsgCollectorImpl.java:70)
at org.opendaylight.openflowplugin.impl.device.listener.MultiMsgCollectorImpl.addMultipartMsg(MultiMsgCollectorImpl.java:58)
at org.opendaylight.openflowplugin.impl.services.AbstractMultipartRequestCallback.onSuccess(AbstractMultipartRequestCallback.java:54)
at org.opendaylight.openflowplugin.impl.services.AbstractMultipartRequestCallback.onSuccess(AbstractMultipartRequestCallback.java:22)
at org.opendaylight.openflowjava.protocol.impl.core.connection.OutboundQueueEntry.complete(OutboundQueueEntry.java:121)
at org.opendaylight.openflowjava.protocol.impl.core.connection.StackedSegment.completeEntry(StackedSegment.java:113)
at org.opendaylight.openflowjava.protocol.impl.core.connection.StackedSegment.pairRequest(StackedSegment.java:153)
at org.opendaylight.openflowjava.protocol.impl.core.connection.AbstractStackedOutboundQueue.pairRequest(AbstractStackedOutboundQueue.java:203)
at org.opendaylight.openflowjava.protocol.impl.core.connection.AbstractOutboundQueueManager.onMessage(AbstractOutboundQueueManager.java:208)
at org.opendaylight.openflowjava.protocol.impl.core.connection.ConnectionAdapterImpl.consumeDeviceMessage(ConnectionAdapterImpl.java:143)
at org.opendaylight.openflowjava.protocol.impl.core.connection.AbstractConnectionAdapterStatistics.consume(AbstractConnectionAdapterStatistics.java:66)
at org.opendaylight.openflowjava.protocol.impl.core.connection.ConnectionAdapterImpl.consume(ConnectionAdapterImpl.java:44)
at org.opendaylight.openflowjava.protocol.impl.core.DelegatingInboundHandler.channelRead(DelegatingInboundHandler.java:43)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at org.opendaylight.openflowjava.protocol.impl.core.connection.AbstractOutboundQueueManager.channelRead(AbstractOutboundQueueManager.java:195)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at org.opendaylight.openflowjava.protocol.impl.core.IdleHandler.channelRead(IdleHandler.java:39)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:565)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:479)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:433)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Thread.java:748)
Caused by: OptimisticLockFailedException{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node children was modified by other transaction]]}
at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:751)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPending(ShardDataTree.java:789)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPendingTransaction(ShardDataTree.java:736)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(ShardDataTree.java:819)
at org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:90)
at org.opendaylight.controller.cluster.datastore.ChainedCommitCohort.canCommit(ChainedCommitCohort.java:58)
at org.opendaylight.controller.cluster.datastore.CohortEntry.canCommit(CohortEntry.java:97)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleCanCommit(ShardCommitCoordinator.java:236)
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleForwardedReadyTransaction(ShardCommitCoordinator.java:114)
at org.opendaylight.controller.cluster.datastore.Shard.handleForwardedReadyTransaction(Shard.java:699)
at org.opendaylight.controller.cluster.datastore.Shard.handleNonRaftCommand(Shard.java:314)
at org.opendaylight.controller.cluster.raft.RaftActor.handleCommand(RaftActor.java:270)
at org.opendaylight.controller.cluster.common.actor.AbstractUntypedPersistentActor.onReceiveCommand(AbstractUntypedPersistentActor.java:44)
at akka.persistence.UntypedPersistentActor.onReceive(PersistentActor.scala:170)
at org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:104)
at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(PersistentActor.scala:168)
at akka.persistence.Eventsourced$$anon$1.stateReceive(Eventsourced.scala:727)
at akka.persistence.Eventsourced$class.aroundReceive(Eventsourced.scala:183)
at akka.persistence.UntypedPersistentActor.aroundReceive(PersistentActor.scala:168)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
at akka.actor.ActorCell.invoke(ActorCell.scala:495)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node children was modified by other transaction
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkConflicting(SchemaAwareApplyOperation.java:78)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkNotConflicting(SchemaAwareApplyOperation.java:108)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkWriteApplicable(SchemaAwareApplyOperation.java:170)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:128)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:313)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkTouchApplicable(AbstractNodeContainerModificationStrategy.java:288)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.StructuralContainerModificationStrategy.checkApplicable(StructuralContainerModificationStrategy.java:99)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:305)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkTouchApplicable(AbstractNodeContainerModificationStrategy.java:288)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.RootModificationApplyOperation.checkApplicable(RootModificationApplyOperation.java:72)
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractDataTreeTip.validate(AbstractDataTreeTip.java:35)
at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:743)
2017-11-09 13:28:33,003 | WARN | ntLoopGroup-7-36 | ContextChainHolderImpl | 307 - org.opendaylight.openflowplugin.impl - 0.4.2.Carbon-redhat-2 | Not able to set MASTER role on device openflow:123674267060747, reason: Initial transaction cannot be submitted.
2017-11-09 13:28:33,003 | WARN | ntLoopGroup-7-36 | ContextChainHolderImpl | 307 - org.opendaylight.openflowplugin.impl - 0.4.2.Carbon-redhat-2 | This mastering is mandatory, destroying context chain and closing connection for device openflow:123674267060747.
2017-11-09 13:28:33,003 | INFO | ntLoopGroup-7-36 | ContextChainImpl | 307 - org.opendaylight.openflowplugin.impl - 0.4.2.Carbon-redhat-2 | Closed clustering services registration for node openflow:123674267060747
3. "Failed to shut down transaction chain for device" (more in 4.): 2017-11-09 13:28:33,004 | WARN | ntLoopGroup-7-36 | DeviceContextImpl | 307 - org.opendaylight.openflowplugin.impl - 0.4.2.Carbon-redhat-2 | Failed to shut down transaction chain for device openflow:123674267060747: java.util.concurrent.ExecutionException: OptimisticLockFailedException{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node children was modified by other transaction]]}
|
| Comment by Michael Vorburger [ 13/Nov/17 ] |
|
4. "ShardLeaderNotRespondingException: Could not create a READ_WRITE transaction on shard default. The shard leader isn't responding." & "Failed to shut down transaction chain for device openflow:": 2017-11-09 20:14:30,886 | ERROR | lt-dispatcher-46 | TransactionChainProxy | 211 - org.opendaylight.controller.sal-distributed-datastore - 1.5.2.Carbon-redhat-2 | Tx: member-1-datastore-operational-fe-1-chn-122-txn-86-0 - ready future failed f
or previous Tx member-1-datastore-operational-fe-1-chn-122-txn-86-0
2017-11-09 20:14:30,886 | WARN | lt-dispatcher-78 | ConcurrentDOMDataBroker | 211 - org.opendaylight.controller.sal-distributed-datastore - 1.5.2.Carbon-redhat-2 | Tx: DOM-CHAIN-121-85 Error during phase CAN_COMMIT, starting Abort
akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@172.16.0.22:2550/), Path(/user/shardmanager-operational/member-0-shard-default-operational/shard-default-member-1:datastore-operational@1:122-8
5_4948293#1836430803)]] after [30000 ms]. Sender[null] sent message of type "org.opendaylight.controller.cluster.datastore.messages.BatchedModifications".
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)[185:com.typesafe.akka.actor:2.4.18]
at java.lang.Thread.run(Thread.java:748)[:1.8.0_151]
2017-11-09 20:14:30,886 | WARN | CommitFutures-64 | TransactionChainManager | 304 - org.opendaylight.openflowplugin.common - 0.4.2.Carbon-redhat-2 | Transaction chain failed, recreating chain due to
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@172.16.0.22:2550/), Path(/user/shardmanager-operational/member-0-shard-default-operational/shard-default-member-1:datastore-operational@1:122-85_4948293#1836430803)]] after [30000 ms]. Sender[null] sent message of type "org.opendaylight.controller.cluster.datastore.messages.BatchedModifications".]]}
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:37)[171:org.opendaylight.controller.sal-broker-impl:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:18)[171:org.opendaylight.controller.sal-broker-impl:1.5.2.Carbon-redhat-2]
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:102)[69:org.opendaylight.yangtools.util:1.1.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker.handleException(ConcurrentDOMDataBroker.java:231)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker.access$000(ConcurrentDOMDataBroker.java:46)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker$1.onFailure(ConcurrentDOMDataBroker.java:126)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)[65:com.google.guava:18.0.0]
at org.opendaylight.controller.cluster.datastore.SingleCommitCohortProxy$1.onComplete(SingleCommitCohortProxy.java:58)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at akka.dispatch.OnComplete.internal(Future.scala:258)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.OnComplete.internal(Future.scala:256)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@172.16.0.22:2550/), Path(/user/shardmanager-operational/member-0-shard-default-operational/shard-default-member-1:datastore-operational@1:122-85_4948293#1836430803)]] after [30000 ms]. Sender[null] sent message of type "org.opendaylight.controller.cluster.datastore.messages.BatchedModifications".
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)[185:com.typesafe.akka.actor:2.4.18]
at java.lang.Thread.run(Thread.java:748)[:1.8.0_151]
2017-11-09 20:14:30,886 | ERROR | CommitFutures-64 | TransactionChainManager | 304 - org.opendaylight.openflowplugin.common - 0.4.2.Carbon-redhat-2 | Transaction commit failed.
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@172.16.0.22:2550/), Path(/user/shardmanager-operational/member-0-shard-default-operational/shard-default-member-1:datastore-operational@1:1
122-85_4948293#1836430803)]] after [30000 ms]. Sender[null] sent message of type "org.opendaylight.controller.cluster.datastore.messages.BatchedModifications".]]}
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:37)[171:org.opendaylight.controller.sal-broker-impl:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:18)[171:org.opendaylight.controller.sal-broker-impl:1.5.2.Carbon-redhat-2]
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:102)[69:org.opendaylight.yangtools.util:1.1.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker.handleException(ConcurrentDOMDataBroker.java:231)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker.access$000(ConcurrentDOMDataBroker.java:46)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker$1.onFailure(ConcurrentDOMDataBroker.java:126)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)[65:com.google.guava:18.0.0]
at org.opendaylight.controller.cluster.datastore.SingleCommitCohortProxy$1.onComplete(SingleCommitCohortProxy.java:58)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at akka.dispatch.OnComplete.internal(Future.scala:258)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.OnComplete.internal(Future.scala:256)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@172.16.0.22:2550/), Path(/user/shardmanager-operational/member-0-shard-default-operational/shard-default-member-1:datastore-operational@1:122-85_4948293#1836430803)]] after [30000 ms]. Sender[null] sent message of type "org.opendaylight.controller.cluster.datastore.messages.BatchedModifications".
:
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)[185:com.typesafe.akka.actor:2.4.18]
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)[185:com.typesafe.akka.actor:2.4.18]
at java.lang.Thread.run(Thread.java:748)[:1.8.0_151]
2017-11-09 20:14:30,886 | ERROR | CommitFutures-64 | TransactionChainManager | 304 - org.opendaylight.openflowplugin.common - 0.4.2.Carbon-redhat-2 | Transaction commit failed.
TransactionCommitFailedException{message=canCommit encountered an unexpected failure, errorList=[RpcError [message=canCommit encountered an unexpected failure, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@172.16.0.22:2550/), Path(/user/shardmanager-operational/member-0-shard-default-operational/shard-default-member-1:datastore-operational@1:122-85_4948293#1836430803)]] after [30000 ms]. Sender[null] sent message of type "org.opendaylight.controller.cluster.datastore.messages.BatchedModifications".]]}
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:37)[171:org.opendaylight.controller.sal-broker-impl:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.md.sal.dom.broker.impl.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:18)[171:org.opendaylight.controller.sal-broker-impl:1.5.2.Carbon-redhat-2]
at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:102)[69:org.opendaylight.yangtools.util:1.1.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker.handleException(ConcurrentDOMDataBroker.java:231)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker.access$000(ConcurrentDOMDataBroker.java:46)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker$1.onFailure(ConcurrentDOMDataBroker.java:126)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)[65:com.google.guava:18.0.0]
at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:68)[65:com.google.guava:18.0.0]
at org.opendaylight.controller.cluster.datastore.SingleCommitCohortProxy$1.onComplete(SingleCommitCohortProxy.java:58)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at akka.dispatch.OnComplete.internal(Future.scala:258)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.OnComplete.internal(Future.scala:256)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)[185:com.typesafe.akka.actor:2.4.18]
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)[181:org.scala-lang.scala-library:2.11.11.v20170413-090219-8a413ba7cc]
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)[185:com.typesafe.akka.actor:2.4.18]
(...)
2017-11-09 20:14:32,467 | WARN | lt-dispatcher-19 | ConcurrentDOMDataBroker | 211 - org.opendaylight.controller.sal-distributed-datastore - 1.5.2.Carbon-redhat-2 | Tx: DOM-CHAIN-171-0 Error during phase CAN_COMMIT, starting Abort
org.opendaylight.controller.cluster.datastore.exceptions.ShardLeaderNotRespondingException: Could not create a READ_WRITE transaction on shard default. The shard leader isn't responding.
at org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.createTransactionContext(RemoteTransactionContextSupport.java:213)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.onCreateTransactionComplete(RemoteTransactionContextSupport.java:195)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport.access$000(RemoteTransactionContextSupport.java:39)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
at org.opendaylight.controller.cluster.datastore.RemoteTransactionContextSupport$1.onComplete(RemoteTransactionContextSupport.java:135)[211:org.opendaylight.controller.sal-distributed-datastore:1.5.2.Carbon-redhat-2]
(...)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka.tcp://opendaylight-cluster-data@172.16.0.22:2550/), Path(/user/shardmanager-operational/member-0-shard-default-operational)]] after [5000 ms]. Sender[null] sent message of type "org.opendaylight.controller.cluster.datastore.messages.CreateTransaction".
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)[185:com.typesafe.akka.actor:2.4.18]
(...)
2017-11-09 20:14:32,467 | WARN | CommitFutures-66 | PingPongTransactionChain | 171 - org.opendaylight.controller.sal-broker-impl - 1.5.2.Carbon-redhat-2 | Transaction chain org.opendaylight.controller.cluster.databroker.DOMBrokerTransactionChain@5770b26 failed with no pending transactions
(...)
2017-11-09 20:14:34,170 | WARN | ntLoopGroup-7-40 | DeviceContextImpl | 307 - org.opendaylight.openflowplugin.impl - 0.4.2.Carbon-redhat-2 | Failed to shut down transaction chain for device openflow:123674267060747: java.util.concurrent.TimeoutException: Timeout waiting for task.
I'll reach out to openflowplugin-dev & controller-dev again to see if any of these means anything to them... |
| Comment by Michael Vorburger [ 15/Nov/17 ] |
|
> I'll reach out to openflowplugin-dev & controller-dev again to see if any of these means anything to them... On https://lists.opendaylight.org/pipermail/controller-dev/2017-November/014064.html rovarga says "a large FrontendMetadata part could indicate dangling The attached open-transactions_controller-*.txt, produced by the trace:transaction CLI Command (see |
| Comment by Michael Vorburger [ 15/Nov/17 ] |
|
https://git.opendaylight.org/gerrit/#/q/topic:NETVIRT-985 ... |
| Comment by Michael Vorburger [ 17/Nov/17 ] |
|
Once everything that is open now, PLUS what skitt will help me with in |
| Comment by Michael Vorburger [ 28/Nov/17 ] |
|
Everything done here is now merged to master & carbon; and nitrogen should get merged in the coming days (stable/nitrogen was re-open'd today). In attached Controller[1-3]_open-transactions.txt from latest test with a build including these fixes shows, we can clearly see that all the "big" leaks (as in with hundreds of open TXs) have been plugged; as expected. Preliminary early feedback indicates that they are not hitting an OOM anymore. |
| Comment by Michael Vorburger [ 29/Nov/17 ] |
|
Testing of another scenario ("nova boot") has hit an OOM that looks exactly this again (another huge 857 MB of 1.6 GB Map inside the MD SAL ShardDataTree), so clogging these TX leaks is likely going to be more of an ongoing repetitive than a one time action.. Basically, any time we test new scenarios at scale on a path that hasn't been threaded before, if we hit an OOM that shows a blown up ShardDataTree, we have to run trace:transactions and clog more non-closed TXs - best by using the ManagedTransactionRunner, continuing on the repetitive pattern of fixes which the patches of the past 2 weeks on the https://git.opendaylight.org/gerrit/#/q/topic:NETVIRT-985 have shown. |
| Comment by Michael Vorburger [ 30/Nov/17 ] |
|
Closing this issue to avoid confusing, carrying on more similar work in different places in |