[OPNFLWPLUG-1107] OpenFlow longevity does not launch Created: 12/Jan/21  Updated: 22/Jan/21  Resolved: 22/Jan/21

Status: Resolved
Project: OpenFlowPlugin
Component/s: openflowplugin-impl
Affects Version/s: Silicon
Fix Version/s: Silicon

Type: Bug Priority: High
Reporter: Robert Varga Assignee: Sangwook Ha
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

After fixing the obvious problems with flow/group handling, longevity suite fails very quickly on:

2021-01-11T10:38:06,599 | INFO  | SystemReadyService-0 | OpenFlowPluginProviderImpl       | 302 - org.opendaylight.openflowplugin.impl - 0.12.0.SNAPSHOT | onSystemBootReady() received, starting the switch connections
2021-01-11T10:38:06,728 | INFO  | OFP-SwitchConnectionProvider-Udp/TcpHandler_6653-0 | TcpHandler                       | 309 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.12.0.SNAPSHOT | Switch listener started and ready to accept incoming tcp/tls connections on port: 6653
2021-01-11T10:38:06,728 | INFO  | OFP-SwitchConnectionProvider-Udp/TcpHandler_6633-0 | TcpHandler                       | 309 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.12.0.SNAPSHOT | Switch listener started and ready to accept incoming tcp/tls connections on port: 6633
2021-01-11T10:38:06,730 | INFO  | OFP-SwitchConnectionProvider-Udp/TcpHandler_6633-0 | OpenflowDiagStatusProviderImpl   | 309 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.12.0.SNAPSHOT | The report status is OPERATIONAL for OPENFLOW_SERVER_6633
2021-01-11T10:38:06,730 | INFO  | OFP-SwitchConnectionProvider-Udp/TcpHandler_6653-0 | OpenflowDiagStatusProviderImpl   | 309 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.12.0.SNAPSHOT | The report status is OPERATIONAL for OPENFLOW_SERVER_6653
2021-01-11T10:38:06,731 | INFO  | OFP-SwitchConnectionProvider-Udp/TcpHandler_6633-0 | OpenFlowPluginProviderImpl       | 302 - org.opendaylight.openflowplugin.impl - 0.12.0.SNAPSHOT | All switchConnectionProviders are up and running (2).
2021-01-11T10:38:06,731 | INFO  | OFP-SwitchConnectionProvider-Udp/TcpHandler_6633-0 | OpenflowDiagStatusProviderImpl   | 309 - org.opendaylight.openflowplugin.openflowjava.openflow-protocol-impl - 0.12.0.SNAPSHOT | The report status is OPERATIONAL for OPENFLOW
2021-01-11T10:38:13,918 | INFO  | sshd-SshServer[1a04469](port=8101)-nio2-thread-1 | OpenSSHKeyPairProvider           | 113 - org.apache.karaf.shell.ssh - 4.2.10 | Creating ssh server private key at /tmp/karaf-0.14.0-SNAPSHOT/etc/host.key
2021-01-11T10:38:13,922 | INFO  | sshd-SshServer[1a04469](port=8101)-nio2-thread-1 | OpenSSHKeyPairGenerator          | 113 - org.apache.karaf.shell.ssh - 4.2.10 | generateKeyPair(RSA) generating host key - size=2048
2021-01-11T10:38:14,553 | INFO  | sshd-SshServer[1a04469](port=8101)-nio2-thread-1 | ServerSessionImpl                | 121 - org.apache.sshd.osgi - 2.5.1 | Session karaf@/10.30.170.241:41624 authenticated
2021-01-11T10:38:14,982 | INFO  | CM Configuration Updater (Update: pid=org.ops4j.pax.logging) | EventAdminConfigurationNotifier  | 4 - org.ops4j.pax.logging.pax-logging-log4j2 - 1.11.7 | Logging configuration changed. (Event Admin service unavailable - no notification sent).
2021-01-11T10:38:15,089 | INFO  | pipe-logout      | LogoutAction                     | 110 - org.apache.karaf.shell.commands - 4.2.10 | Disconnecting from current session...
2021-01-11T10:38:15,139 | WARN  | sshd-SshServer[1a04469](port=8101)-nio2-thread-2 | ServerSessionImpl                | 121 - org.apache.sshd.osgi - 2.5.1 | exceptionCaught(ServerSessionImpl[karaf@/10.30.170.241:41624])[state=Opened] IOException: Connection reset by peer
2021-01-11T10:38:16,125 | INFO  | epollEventLoopGroup-9-1 | SystemNotificationsListenerImpl  | 302 - org.opendaylight.openflowplugin.impl - 0.12.0.SNAPSHOT | ConnectionEvent: Connection closed by device, Device:/10.30.170.59:45900, NodeId:null
2021-01-11T10:38:16,127 | WARN  | epollEventLoopGroup-9-1 | DefaultChannelPipeline           | 47 - io.netty.common - 4.1.56.Final | An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.lang.NullPointerException: null
	at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936) ~[?:?]
	at org.opendaylight.openflowplugin.impl.lifecycle.ContextChainHolderImpl.onDeviceDisconnected(ContextChainHolderImpl.java:271) ~[?:?]
	at org.opendaylight.openflowplugin.impl.connection.ConnectionContextImpl.propagateDeviceDisconnectedEvent(ConnectionContextImpl.java:178) ~[?:?]
	at org.opendaylight.openflowplugin.impl.connection.ConnectionContextImpl.disconnectDevice(ConnectionContextImpl.java:167) ~[?:?]
	at org.opendaylight.openflowplugin.impl.connection.ConnectionContextImpl.onConnectionClosed(ConnectionContextImpl.java:125) ~[?:?]
	at org.opendaylight.openflowplugin.impl.connection.listener.SystemNotificationsListenerImpl.onDisconnectEvent(SystemNotificationsListenerImpl.java:69) ~[?:?]
	at org.opendaylight.openflowjava.protocol.impl.core.connection.ConnectionAdapterImpl.consumeDeviceMessage(ConnectionAdapterImpl.java:126) ~[?:?]
	at org.opendaylight.openflowjava.protocol.impl.core.connection.AbstractConnectionAdapterStatistics.consume(AbstractConnectionAdapterStatistics.java:68) ~[?:?]
	at org.opendaylight.openflowjava.protocol.impl.core.connection.ConnectionAdapterImpl.consume(ConnectionAdapterImpl.java:65) ~[?:?]
	at org.opendaylight.openflowjava.protocol.impl.core.DelegatingInboundHandler.channelInactive(DelegatingInboundHandler.java:53) ~[?:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [bundleFile:4.1.56.Final]
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81) [bundleFile:4.1.56.Final]
	at io.netty.handler.timeout.IdleStateHandler.channelInactive(IdleStateHandler.java:277) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [bundleFile:4.1.56.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389) [bundleFile:4.1.56.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [bundleFile:4.1.56.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389) [bundleFile:4.1.56.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [bundleFile:4.1.56.Final]
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:81) [bundleFile:4.1.56.Final]
	at org.opendaylight.openflowjava.protocol.impl.core.connection.ChannelOutboundQueue.channelInactive(ChannelOutboundQueue.java:254) [bundleFile:?]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [bundleFile:4.1.56.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [bundleFile:4.1.56.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) [bundleFile:4.1.56.Final]
	at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:819) [bundleFile:4.1.56.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [bundleFile:4.1.56.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [bundleFile:4.1.56.Final]
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [bundleFile:4.1.56.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [bundleFile:4.1.56.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [bundleFile:4.1.56.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [bundleFile:4.1.56.Final]
	at java.lang.Thread.run(Thread.java:834) [?:?]

This looks like a mixup with initialization – this points towards DeviceInfo being null.



 Comments   
Comment by Luis Gomez [ 12/Jan/21 ]

The test does not progress because it detects an NPE exception in the log. The ERROR is thrown after this:

2021-01-11T10:38:16,125 | INFO  | epollEventLoopGroup-9-1 | SystemNotificationsListenerImpl  | 302 - org.opendaylight.openflowplugin.impl - 0.12.0.SNAPSHOT | ConnectionEvent: Connection closed by device, Device:/10.30.170.59:45900, NodeId:null

The above happens every time mininet is started and it is the result of a mininet controller check (telnet to the controller OFP port) as shown in the mininet log:

vagrant@mininet-vm-ubuntu-25:~\> sudo mn --controller=remote,ip=192.168.0.254 --topo linear,3 --switch=ovsk,protocols=OpenFlow13 -v debug
*** errRun: ['grep', '-c', 'processor', '/proc/cpuinfo'] 
4
  0*** Setting resource limits
*** Creating network
*** Adding controller
*** errRun: ['which', 'mnexec'] 
/usr/bin/mnexec
  0*** errRun: ['which', 'ifconfig'] 
/sbin/ifconfig
  0*** c0 : ('unset HISTFILE; stty -echo; set +m',)
unset HISTFILE; stty -echo; set +m
*** c0 : ('echo A | telnet -e A 192.168.0.254 6653',)
Telnet escape character is 'A'.
Trying 192.168.0.254...
Connected to 192.168.0.254.
Escape character is 'A'.

telnet> Connection closed.

The issue therefore is easy to reproduce by just doing a telnet to controller IP and OFP port and then exiting the connection:

vagrant@mininet-vm-ubuntu-25:~\> telnet 192.168.0.254 6653
Trying 192.168.0.254...
Connected to 192.168.0.254.
Escape character is '^]'.
^]
Connection closed by foreign host.
Comment by Luis Gomez [ 12/Jan/21 ]

BTW this bug only shows in Silicon.

Generated at Wed Feb 07 20:34:13 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.