<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 20:14:37 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[NETCONF-284] Deadlock between filterNotification of NetconfDevice and onSessionDown of NetconfDeviceCommunicator</title>
                <link>https://jira.opendaylight.org/browse/NETCONF-284</link>
                <project id="10142" key="NETCONF">netconf</project>
                    <description>&lt;p&gt;We recently hit a deadlock between filterNotification of NetconfDevice and onSessionDown of NetconfDeviceCommunicator. filterNotification was running in a thread from the remote-connector-processing-executor thread pool, filtering a NetconfCapabilityChange notification, which causes the netconf connector to disconnect from the ODL internal netconf server. onSessionDown was running in a netty thread, invoked when the netconf client&apos;s netty channel received the close operation issued by that disconnect. onSessionDown first acquires sessionLock and then calls NetconfDevice&apos;s onRemoteSessionDown, which calls NotificationHandler&apos;s onRemoteSchemaDown, a synchronized method. But the NotificationHandler monitor had been held by the filterNotification thread since the start of its execution, and that thread&apos;s next step was to call NetconfDeviceCommunicator&apos;s onSessionTerminated, which also tries to acquire sessionLock. Each thread therefore waits for the lock the other holds. The deadlock information from jstack is as follows:&lt;/p&gt;
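&lt;p&gt;The lock-order inversion described above can be sketched in isolation. The following is a minimal, hypothetical Java illustration, not code from the netconf project: sessionLock and handlerMonitor are stand-ins for the ReentrantLock in NetconfDeviceCommunicator and the NotificationHandler monitor, and the class and method names are invented for this example. It uses a timed tryLock so that the blocked acquisition is reported instead of hanging.&lt;/p&gt;

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderSketch {
    // Hypothetical stand-ins for the two locks named in the report.
    static final ReentrantLock sessionLock = new ReentrantLock();
    static final Object handlerMonitor = new Object();

    // Plays the executor-side path: hold the handler monitor, then try to take
    // sessionLock while another thread (the netty stand-in) already owns it.
    // Returns true only if sessionLock could be acquired within the timeout.
    static boolean executorPathAcquires(long timeoutMillis) {
        CountDownLatch nettyHoldsLock = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread netty = new Thread(() -> {
            sessionLock.lock();
            try {
                nettyHoldsLock.countDown();
                // In the reported trace this thread would now block on the
                // handler monitor; here it simply waits to be released.
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                sessionLock.unlock();
            }
        }, "netty-stand-in");
        netty.start();
        try {
            nettyHoldsLock.await();
            boolean acquired;
            synchronized (handlerMonitor) {
                acquired = sessionLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
                if (acquired) {
                    sessionLock.unlock();
                }
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Let the stand-in thread finish so the sketch always terminates.
            release.countDown();
            try {
                netty.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("acquired=" + executorPathAcquires(200));
    }
}
```

&lt;p&gt;With a plain lock() in place of the timed tryLock, and with the stand-in thread waiting on handlerMonitor instead of the latch, neither thread could proceed, which is the cycle jstack reports.&lt;/p&gt;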

&lt;p&gt;Found one Java-level deadlock:&lt;br/&gt;
=============================&lt;br/&gt;
&quot;nettyThreadgroupModule$NioEventLoopGroupCloseable-2-4&quot;:&lt;br/&gt;
  waiting to lock monitor 0x00007fda6408f4f8 (object 0x00000005c544ed48, a org.opendaylight.netconf.sal.connect.netconf.NotificationHandler),&lt;br/&gt;
  which is held by &quot;remote-connector-processing-executor-9&quot;&lt;br/&gt;
&quot;remote-connector-processing-executor-9&quot;:&lt;br/&gt;
  waiting for ownable synchronizer 0x00000005c5450158, (a java.util.concurrent.locks.ReentrantLock$NonfairSync),&lt;br/&gt;
  which is held by &quot;nettyThreadgroupModule$NioEventLoopGroupCloseable-2-4&quot;&lt;/p&gt;

&lt;p&gt;Java stack information for the threads listed above:&lt;br/&gt;
===================================================&lt;br/&gt;
&quot;nettyThreadgroupModule$NioEventLoopGroupCloseable-2-4&quot;:&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NotificationHandler.onRemoteSchemaDown(NotificationHandler.java:97)&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;waiting to lock &amp;lt;0x00000005c544ed48&amp;gt; (a org.opendaylight.netconf.sal.connect.netconf.NotificationHandler)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NetconfDevice.onRemoteSessionDown(NetconfDevice.java:249)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.listener.NetconfDeviceCommunicator.tearDown(NetconfDeviceCommunicator.java:175)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.listener.NetconfDeviceCommunicator.onSessionDown(NetconfDeviceCommunicator.java:206)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.listener.NetconfDeviceCommunicator.onSessionDown(NetconfDeviceCommunicator.java:47)&lt;br/&gt;
	at org.opendaylight.netconf.nettyutil.AbstractNetconfSession.endOfInput(AbstractNetconfSession.java:107)&lt;br/&gt;
	at org.opendaylight.protocol.framework.AbstractProtocolSession.channelInactive(AbstractProtocolSession.java:40)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:219)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:212)&lt;br/&gt;
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:360)&lt;br/&gt;
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:325)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:219)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:212)&lt;br/&gt;
	at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:360)&lt;br/&gt;
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:325)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:233)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:219)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:212)&lt;br/&gt;
	at org.opendaylight.netconf.nettyutil.handler.ssh.client.AsyncSshHandler.disconnect(AsyncSshHandler.java:243)&lt;/li&gt;
	&lt;li&gt;locked &amp;lt;0x00000005c9eb3300&amp;gt; (a org.opendaylight.netconf.nettyutil.handler.ssh.client.AsyncSshHandler)&lt;br/&gt;
	at org.opendaylight.netconf.nettyutil.handler.ssh.client.AsyncSshHandler.close(AsyncSshHandler.java:234)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:604)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:588)&lt;br/&gt;
	at io.netty.channel.ChannelOutboundHandlerAdapter.close(ChannelOutboundHandlerAdapter.java:71)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:604)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.close(AbstractChannelHandlerContext.java:588)&lt;br/&gt;
	at io.netty.channel.ChannelOutboundHandlerAdapter.close(ChannelOutboundHandlerAdapter.java:71)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.invokeClose(AbstractChannelHandlerContext.java:604)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext.access$1100(AbstractChannelHandlerContext.java:33)&lt;br/&gt;
	at io.netty.channel.AbstractChannelHandlerContext$13.run(AbstractChannelHandlerContext.java:593)&lt;br/&gt;
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:358)&lt;br/&gt;
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:394)&lt;br/&gt;
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)&lt;br/&gt;
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:145)&lt;br/&gt;
	at java.lang.Thread.run(Thread.java:745)&lt;br/&gt;
&quot;remote-connector-processing-executor-9&quot;:&lt;br/&gt;
	at sun.misc.Unsafe.park(Native Method)&lt;/li&gt;
	&lt;li&gt;parking to wait for  &amp;lt;0x00000005c5450158&amp;gt; (a java.util.concurrent.locks.ReentrantLock$NonfairSync)&lt;br/&gt;
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)&lt;br/&gt;
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)&lt;br/&gt;
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)&lt;br/&gt;
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)&lt;br/&gt;
	at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)&lt;br/&gt;
	at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.listener.NetconfDeviceCommunicator.tearDown(NetconfDeviceCommunicator.java:154)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.listener.NetconfDeviceCommunicator.onSessionTerminated(NetconfDeviceCommunicator.java:212)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.listener.NetconfDeviceCommunicator.onSessionTerminated(NetconfDeviceCommunicator.java:47)&lt;br/&gt;
	at org.opendaylight.netconf.nettyutil.AbstractNetconfSession.close(AbstractNetconfSession.java:58)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.listener.NetconfDeviceCommunicator.disconnect(NetconfDeviceCommunicator.java:147)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NetconfDevice$3.filterNotification(NetconfDevice.java:178)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NotificationHandler.passNotification(NotificationHandler.java:87)&lt;/li&gt;
	&lt;li&gt;locked &amp;lt;0x00000005c544ed48&amp;gt; (a org.opendaylight.netconf.sal.connect.netconf.NotificationHandler)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NotificationHandler.onRemoteSchemaUp(NotificationHandler.java:61)&lt;/li&gt;
	&lt;li&gt;locked &amp;lt;0x00000005c544ed48&amp;gt; (a org.opendaylight.netconf.sal.connect.netconf.NotificationHandler)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NetconfDevice.handleSalInitializationSuccess(NetconfDevice.java:216)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NetconfDevice$RecursiveSchemaSetup$2.onSuccess(NetconfDevice.java:457)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NetconfDevice$RecursiveSchemaSetup$2.onSuccess(NetconfDevice.java:449)&lt;br/&gt;
	at com.google.common.util.concurrent.Futures$6.run(Futures.java:1319)&lt;br/&gt;
	at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)&lt;br/&gt;
	at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)&lt;br/&gt;
	at com.google.common.util.concurrent.ExecutionList.add(ExecutionList.java:101)&lt;br/&gt;
	at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:170)&lt;br/&gt;
	at com.google.common.util.concurrent.ForwardingListenableFuture.addListener(ForwardingListenableFuture.java:47)&lt;br/&gt;
	at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1322)&lt;br/&gt;
	at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1258)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NetconfDevice$RecursiveSchemaSetup.setUpSchema(NetconfDevice.java:489)&lt;br/&gt;
	at org.opendaylight.netconf.sal.connect.netconf.NetconfDevice$RecursiveSchemaSetup.run(NetconfDevice.java:411)&lt;br/&gt;
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)&lt;br/&gt;
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)&lt;br/&gt;
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)&lt;br/&gt;
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)&lt;br/&gt;
	at java.lang.Thread.run(Thread.java:745)&lt;/li&gt;
&lt;/ul&gt;
</description>
                <environment>&lt;p&gt;Operating System: Linux&lt;br/&gt;
Platform: PC&lt;/p&gt;</environment>
        <key id="21297">NETCONF-284</key>
            <summary>Deadlock between filterNotification of NetconfDevice and onSessionDown of NetconfDeviceCommunicator</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                                <status id="5" iconUrl="https://jira.opendaylight.org/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10000">Done</resolution>
                                        <assignee username="yin.kangqian@zte.com.cn">Kangqian Yin</assignee>
                                    <reporter username="yin.kangqian@zte.com.cn">Kangqian Yin</reporter>
                        <labels>
                    </labels>
                <created>Sat, 24 Sep 2016 03:52:46 +0000</created>
                <updated>Fri, 15 Mar 2019 22:22:30 +0000</updated>
                            <resolved>Fri, 30 Sep 2016 11:27:14 +0000</resolved>
                                                                    <component>netconf</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>2</watches>
                                                                                                                <comments>
                            <comment id="39658" author="jmorvay@cisco.com" created="Tue, 27 Sep 2016 06:57:52 +0000"  >&lt;p&gt;Hi Kangqian,&lt;/p&gt;

&lt;p&gt;Thank you for your report. How many times have you run into this deadlock?&lt;/p&gt;</comment>
                            <comment id="39659" author="yin.kangqian@zte.com.cn" created="Tue, 27 Sep 2016 09:13:09 +0000"  >&lt;p&gt;It deadlocks 100% of the time when installing one of our own features.&lt;/p&gt;

&lt;p&gt;I&apos;ve found the condition that triggers the deadlock and a way to break it. I&apos;ll push a patch to fix it later.&lt;/p&gt;</comment>
                            <comment id="39660" author="jmorvay@cisco.com" created="Tue, 27 Sep 2016 15:55:01 +0000"  >&lt;p&gt;I can confirm this deadlock. For it to occur, the onCapabilityChange notification has to be cached and then, after a successful connection, processed by a remote-connector-processing-executor thread. Notifications that are not cached are processed in the netty thread, so they shouldn&apos;t deadlock.&lt;/p&gt;

&lt;p&gt;I simulated this behavior by always caching onCapabilityChange notification, but I wasn&apos;t able to reproduce this deadlock on every reconnect. &lt;/p&gt;

&lt;p&gt;So if I am missing something, please let me know what other condition has to be met. Patches are always welcome, so feel free to push your fix for this deadlock. Since you are working on this, you can also take this bug here in bugzilla.&lt;/p&gt;</comment>
                            <comment id="39661" author="yin.kangqian@zte.com.cn" created="Wed, 28 Sep 2016 03:14:24 +0000"  >&lt;p&gt;I hit this deadlock in the Beryllium-SR3 distribution. In this version, YangStoreService&apos;s notifiyListeners issues NetconfCapabilityChange notifications to registered remote netconf clients. &lt;/p&gt;

&lt;p&gt;As you say, if the NetconfCapabilityChange notification is processed in a netty thread, the deadlock won&apos;t happen. &lt;/p&gt;

&lt;p&gt;In all the deadlocks I hit, the NetconfCapabilityChange notification was processed in a remote-connector-processing-executor thread, in the call tree of NetconfDevice$RecursiveSchemaSetup.setUpSchema, which is submitted by NetconfDevice.onRemoteSessionUp, i.e., triggered when the netconf session is reconnected. Filtering the NetconfCapabilityChange notification then causes this netconf session to be disconnected again. &lt;/p&gt;

&lt;p&gt;The deadlock requires that a netty thread is executing the close of this netconf session at the same time. If so, the two threads race for NetconfDeviceCommunicator&apos;s tearDown operation. If the netty thread enters tearDown first, the deadlock happens because that tearDown cannot take the monitor of NotificationHandler. The monitor of NotificationHandler has been held by the remote-connector-processing-executor thread since it began filtering the NetconfCapabilityChange notification.&lt;/p&gt;
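&lt;p&gt;One general way to remove such a cycle is to make every path take the two locks in the same order, for example by making the disconnect decision under the monitor and calling tearDown only after the monitor has been released. The sketch below is a hypothetical illustration of that idea, not the actual netconf code or any submitted patch; the class, fields, and trace strings are invented for the example.&lt;/p&gt;

```java
import java.util.concurrent.locks.ReentrantLock;

public class DeferredDisconnectSketch {
    // Hypothetical stand-ins for the NotificationHandler monitor and sessionLock.
    private final Object handlerMonitor = new Object();
    private final ReentrantLock sessionLock = new ReentrantLock();
    private final StringBuilder trace = new StringBuilder();

    // Executor-side path: decide under the monitor, act after releasing it.
    public void passNotification(boolean capabilityChanged) {
        boolean mustDisconnect;
        synchronized (handlerMonitor) {
            trace.append("filtered;");
            mustDisconnect = capabilityChanged;
        }
        if (mustDisconnect) {
            tearDown(); // the monitor is no longer held at this point
        }
    }

    // Shared teardown path: always sessionLock first, then the monitor.
    public void tearDown() {
        sessionLock.lock();
        try {
            synchronized (handlerMonitor) {
                trace.append("tornDown;");
            }
        } finally {
            sessionLock.unlock();
        }
    }

    public String trace() {
        synchronized (handlerMonitor) {
            return trace.toString();
        }
    }

    public static void main(String[] args) {
        DeferredDisconnectSketch sketch = new DeferredDisconnectSketch();
        sketch.passNotification(true);
        System.out.println(sketch.trace()); // prints "filtered;tornDown;"
    }
}
```

&lt;p&gt;Because no thread in this sketch requests sessionLock while holding the monitor, the circular wait shown by jstack cannot form, whichever thread reaches tearDown first.&lt;/p&gt;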

&lt;p&gt;Nevertheless, reproducing this deadlock is hard. I&apos;ve tested several features, but installing only one of them causes this deadlock.&lt;/p&gt;

&lt;p&gt;However, the race on NetconfDeviceCommunicator&apos;s tearDown does exist and can be seen from the code of NetconfDeviceCommunicator alone. Both onSessionDown and onSessionTerminated call tearDown, and they can be executed in two independent threads.&lt;/p&gt;</comment>
                            <comment id="39662" author="yin.kangqian@zte.com.cn" created="Thu, 29 Sep 2016 03:51:08 +0000"  >&lt;p&gt;&lt;a href=&quot;https://git.opendaylight.org/gerrit/#/c/46270/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/#/c/46270/&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6797</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=6797]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i01xlb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>