|
Should I use master or stable/lithium to reproduce this?
|
|
This must be happening with master as openflowplugin does not have a stable/lithium branch yet.
|
|
To reproduce this problem do this,
1. Build openflowplugin master
2. Run the openflowplugin karaf distribution
3. Run mininet - with --topo tree,3
4. ./flow_config_blaster.py --auth --threads 5 --flows 1000 --no-delete run from the integration project
|
|
Michal, I tried reproducing this issue but I wasn't able to. I'm probably still missing a step.
|
|
MORAJA-M-301N:clustering-performance-test moizr$ ./flow_config_blaster.py --auth --threads 5 --flows 5000 --no-delete
Cycle 0:
Thread 2:
Adding 5000 flows on 7 nodes
Thread 4:
Adding 5000 flows on 7 nodes
Thread 0:
Adding 5000 flows on 7 nodes
Thread 1:
Adding 5000 flows on 7 nodes
Thread 3:
Adding 5000 flows on 7 nodes
Thread 3 results (ADD):
Elapsed time: 60.13s,
Requests/s: 83.16 OK, 83.16 Total
Flows/s: 83.16 OK, 83.16 Total
Stats (
{Requests}, {Flows}): {200: 0, 204: 5000} {200: 0, 204: 5000}
Thread 4 results (ADD):
Elapsed time: 60.19s,
Requests/s: 83.07 OK, 83.07 Total
Flows/s: 83.07 OK, 83.07 Total
Stats ({Requests}
,
{Flows}): {200: 0, 204: 5000} {200: 0, 204: 5000}
Thread 1 results (ADD):
Elapsed time: 60.29s,
Requests/s: 82.93 OK, 82.93 Total
Flows/s: 82.93 OK, 82.93 Total
Stats ({Requests}, {Flows}
):
{200: 0, 204: 5000}
{200: 0, 204: 5000}
Thread 0 results (ADD):
Elapsed time: 60.30s,
Requests/s: 82.92 OK, 82.92 Total
Flows/s: 82.92 OK, 82.92 Total
Stats (
{Requests}, {Flows}): {200: 0, 204: 5000} {200: 0, 204: 5000}
Thread 2 results (ADD):
Elapsed time: 60.32s,
Requests/s: 82.89 OK, 82.89 Total
Flows/s: 82.89 OK, 82.89 Total
Stats ({Requests}
,
{Flows}
):
{200: 0, 204: 5000}
{200: 0, 204: 5000}
-
-
- Test summary:
Elapsed time: 60.52s
Peak requests/s: 414.97 OK, 414.97 Total
Peak flows/s: 414.97 OK, 414.97 Total
Avg. requests/s: 413.09 OK, 413.09 Total (99.55% of peak total)
Avg. flows/s: 413.09 OK, 413.09 Total (99.55% of peak total)
-
-
- Total flows added: 25000
HTTP[OK] results: 25000
|
|
By any chance would you have local changes that I do not have?
|
|
No. Everything is merged in master.
|
|
Martin, do you follow the same steps documented in this bug or do you do something else to reproduce this problem?
|
|
Yes Moiz I do the same as documented. Maybe try increasing number of flows for config blaster to 10 000 (or even more). :
./flow_config_blaster.py --auth --threads 5 --flows 10000 --no-delete
|
|
I am able now to observe this issue by using batching when writing flows. If I use command below, I am able to see only 5k flows in operational DS and many exceptions in log. With 100 fpr it works fine.
./flow_config_blaster.py --auth --fpr 400 --flows 1000 --threads 8 --no-delete
|
|
Finally I was able to reproduce the problem. The key step which I missed was to install odl-openflowplugin-all-li. I was installing odl-openflowplugin-all. Anyway I see the problem now.
I looked through the code in TransactionChainManager. Basically it seems that we are submitting chained transactions in one thread and may create them in a separate thread. Is this a supported use case for TransactionChains?
Just from examining the code in the databroker I can see why this exception could happen. There is no synchronization between creating a transaction on a chain and submitting that transaction which can result in some application thread attempting to create a transaction on a chain before the cohorts on the previous transaction being made ready.
|
|
Finally I was able to reproduce the problem. The key step which I missed was to install odl-openflowplugin-all-li. I was installing odl-openflowplugin-all. Anyway I see the problem now.
I looked through the code in TransactionChainManager. Basically it seems that we are submitting chained transactions in one thread and may create them in a separate thread. Is this a supported use case for TransactionChains?
Just from examining the code in the databroker I can see why this exception could happen. There is no synchronization between creating a transaction on a chain and submitting that transaction which can result in some application thread attempting to create a transaction on a chain before the cohorts on the previous transaction being made ready.
|
|
Openflowplugin's write to tx and commit are synchronized operations. That should not cause this issue.
|
|
You're right Martin, incorrect analysis on my part.
|
|
The primary exception reported in this bug is a red herring.
The root cause of this issue are some exceptions that happen before. I've captured those exceptions in this gist -> https://gist.github.com/moizr/4205d9c6b47a655e411f
Initially a ModifiedNodeDoesNotExist happens on a couple of transactions. This causes the transaction chains to fail. Because of this the TransactionChainManager's onTransactionFailed method is called. Over there we attempt to close the txChainFactory this results in the following exception. This exception does not allow a new transaction chain to get created. So any subsequent transaction that gets created on the old chain results in the "Previous transaction is not ready yet exception"
2015-05-08 21:43:37,872 | ERROR | ntLoopGroup-6-10 | Futures$CombinedFuture | 64 - com.google.guava - 18.0.0 | input future failed.
java.lang.IllegalStateException: Previous transaction is not ready yet
at org.opendaylight.controller.cluster.datastore.TransactionChainProxy.allocateWriteTransaction(TransactionChainProxy.java:134)[182:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.TransactionChainProxy.newReadWriteTransaction(TransactionChainProxy.java:105)[182:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.databroker.DOMBrokerReadWriteTransaction.createTransaction(DOMBrokerReadWriteTransaction.java:49)[182:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.databroker.DOMBrokerReadWriteTransaction.createTransaction(DOMBrokerReadWriteTransaction.java:22)[182:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.databroker.AbstractDOMBrokerTransaction.getSubtransaction(AbstractDOMBrokerTransaction.java:57)[182:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.cluster.databroker.DOMBrokerWriteOnlyTransaction.put(DOMBrokerWriteOnlyTransaction.java:78)[182:org.opendaylight.controller.sal-distributed-datastore:1.2.0.SNAPSHOT]
at org.opendaylight.controller.md.sal.dom.spi.ForwardingDOMDataReadWriteTransaction.put(ForwardingDOMDataReadWriteTransaction.java:49)[156:org.opendaylight.controller.sal-core-spi:1.2.0.SNAPSHOT]
at org.opendaylight.controller.md.sal.binding.impl.AbstractWriteTransaction.put(AbstractWriteTransaction.java:49)[160:org.opendaylight.controller.sal-binding-broker-impl:1.2.0.SNAPSHOT]
at org.opendaylight.controller.md.sal.binding.impl.BindingDOMWriteTransactionAdapter.put(BindingDOMWriteTransactionAdapter.java:40)[160:org.opendaylight.controller.sal-binding-broker-impl:1.2.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.device.TransactionChainManager.writeToTransaction(TransactionChainManager.java:85)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.device.DeviceContextImpl.writeToTransaction(DeviceContextImpl.java:166)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.statistics.StatisticsGatheringUtils.processNodeConnectorStatistics(StatisticsGatheringUtils.java:287)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.statistics.StatisticsGatheringUtils.access$300(StatisticsGatheringUtils.java:97)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.statistics.StatisticsGatheringUtils$1.apply(StatisticsGatheringUtils.java:137)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.statistics.StatisticsGatheringUtils$1.apply(StatisticsGatheringUtils.java:121)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at com.google.common.util.concurrent.Futures$2.apply(Futures.java:760)[64:com.google.guava:18.0.0]
at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:906)[64:com.google.guava:18.0.0]
at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)[64:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)[64:com.google.guava:18.0.0]
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)[64:com.google.guava:18.0.0]
at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:185)[64:com.google.guava:18.0.0]
at com.google.common.util.concurrent.SettableFuture.set(SettableFuture.java:53)[64:com.google.guava:18.0.0]
at org.opendaylight.openflowplugin.impl.device.DeviceContextImpl.processReply(DeviceContextImpl.java:283)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.device.listener.MultiMsgCollectorImpl$MultiCollectorObject.publishCollection(MultiMsgCollectorImpl.java:131)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.device.listener.MultiMsgCollectorImpl.addMultipartMsg(MultiMsgCollectorImpl.java:103)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowplugin.impl.device.listener.OpenflowProtocolListenerFullImpl.onMultipartReplyMessage(OpenflowProtocolListenerFullImpl.java:86)[198:org.opendaylight.openflowplugin.impl:0.1.0.SNAPSHOT]
at org.opendaylight.openflowjava.protocol.impl.core.connection.ConnectionAdapterImpl.consume(ConnectionAdapterImpl.java:297)[195:org.opendaylight.openflowjava.openflow-protocol-impl:0.6.0.SNAPSHOT]
at org.opendaylight.openflowjava.protocol.impl.core.DelegatingInboundHandler.channelRead(DelegatingInboundHandler.java:46)[195:org.opendaylight.openflowjava.openflow-protocol-impl:0.6.0.SNAPSHOT]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[118:io.netty.transport:4.0.26.Final]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)[144:io.netty.codec:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[118:io.netty.transport:4.0.26.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)[144:io.netty.codec:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[118:io.netty.transport:4.0.26.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)[144:io.netty.codec:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[118:io.netty.transport:4.0.26.Final]
at io.netty.handler.timeout.ReadTimeoutHandler.channelRead(ReadTimeoutHandler.java:150)[145:io.netty.handler:4.0.26.Final]
at org.opendaylight.openflowjava.protocol.impl.core.IdleHandler.channelRead(IdleHandler.java:39)[195:org.opendaylight.openflowjava.openflow-protocol-impl:0.6.0.SNAPSHOT]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)[118:io.netty.transport:4.0.26.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:349)[118:io.netty.transport:4.0.26.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)[119:io.netty.common:4.0.26.Final]
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)[119:io.netty.common:4.0.26.Final]
at java.lang.Thread.run(Thread.java:745)[:1.7.0_67]
|
|
Reassigning this to openflowplugin as this does not appear to be an issue with the datastore.
@Override
public void onTransactionChainFailed(final TransactionChain<?, ?> chain,
final AsyncTransaction<?, ?> transaction, final Throwable cause)
{
LOG.warn("txChain failed -> recreating", cause);
txChainFactory.close(); // <--- Exception caused by this line prevents the next line from executing
txChainFactory = dataBroker.createTransactionChain(TransactionChainManager.this);
}
|
|
Reassigning it to Martin (he thinks this is no longer an issue - and he will close it)
|
Generated at Wed Feb 07 20:32:26 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.