<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 20:16:08 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[NETCONF-880] Netconf does not close device mountpoint properly</title>
                <link>https://jira.opendaylight.org/browse/NETCONF-880</link>
                <project id="10142" key="NETCONF">netconf</project>
                    <description>&lt;p&gt;We have an ODL-based app which manages netconf devices in a DC.&lt;br/&gt;
Recently, we&apos;ve been experiencing netconf connectivity/sync issues between ODL app and netconf devices in some of our prod environments.&lt;br/&gt;
So here&apos;s what has been happening:&lt;/p&gt;

&lt;p&gt;1. We start the ODL app which connects to devices and fetches data from them.&lt;br/&gt;
&#160; &#160; All is good, devices are connected properly and are in sync (data was fetched correctly).&lt;br/&gt;
2. After a short time, ODL loses its connection to some devices.&lt;br/&gt;
3. ODL netconf attempts automatic reconnect and the device gets reconnected.&lt;br/&gt;
4. ODL app tries to load data from device again using a netconf get RPC call.&lt;br/&gt;
5. ODL netconf (not the device) returns &quot;transport error&quot; and the ODL app now marks the device with sync-failed status. See the attachment &quot;netconf-rpc-transport-error.txt&quot;.&lt;br/&gt;
6. The user then calls the ODL app RPC for reconnecting the device.&lt;br/&gt;
&#160; &#160;This RPC first deletes the node from ODL netconf-topology and then recreates it.&lt;br/&gt;
&#160; &#160;We added a 5-second delay (for investigation purposes only) between the topology node deletion and recreation to be sure ODL netconf has enough time to dismount and mount the device again.&lt;br/&gt;
&#160; &#160;We also added a check (using DOMMountPointService) for whether the ODL netconf mountpoint exists after the topology node is deleted and before it is recreated.&lt;br/&gt;
7. ODL app asks ODL netconf to close the netconf session.&lt;br/&gt;
&#160; &#160;Then we see the ODL netconf logs as they are in the attachment &quot;netconf-logs-after-device-reconnect-rpc-call.txt&quot;.&lt;br/&gt;
8. ODL app asks ODL netconf to start the netconf session again, but it seems that the previous session is stuck.&lt;br/&gt;
&#160; &#160;Our check throws an exception because the mountpoint still exists, even though it should not at this point.&lt;br/&gt;
&#160; &#160;The customer&apos;s DevOps team also reported many hanging netconf sessions between the device and ODL.&lt;br/&gt;
9. Afterwards we see many &quot;netconf session established&quot; log entries followed by &quot;netconf session promise complete already&quot; errors at short intervals (a few seconds).&lt;br/&gt;
&#160; &#160;See the logs in the attachment &quot;netconf-session-promise-complete-already-error.txt&quot;&lt;/p&gt;</description>
                <environment></environment>
        <key id="35830">NETCONF-880</key>
            <summary>Netconf does not close device mountpoint properly</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.opendaylight.org/images/icons/priorities/major.svg">Medium</priority>
                        <status id="5" iconUrl="https://jira.opendaylight.org/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10002">Duplicate</resolution>
                                        <assignee username="rovarga">Robert Varga</assignee>
                                    <reporter username="ifoltin">Igor Foltin</reporter>
                        <labels>
                    </labels>
                <created>Fri, 27 May 2022 14:08:24 +0000</created>
                <updated>Tue, 31 May 2022 19:28:50 +0000</updated>
                            <resolved>Tue, 31 May 2022 19:27:53 +0000</resolved>
                                    <version>2.0.14</version>
                                                    <component>netconf</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="71115" author="ifoltin" created="Mon, 30 May 2022 11:34:45 +0000"  >&lt;p&gt;I added more logs:&lt;br/&gt;
&lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.opendaylight.org/secure/attachment/17718/17718_reproducer-debug-logs.txt&quot; title=&quot;reproducer-debug-logs.txt attached to NETCONF-880&quot;&gt;reproducer-debug-logs.txt&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.opendaylight.org/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;They represent a scenario in which a user starts the device reconnection via the ODL app RPC (mountpoint removal and recreation).&lt;/p&gt;

&lt;p&gt;The device name is: DCS:DCS:DCS-N9K-LEAF04&lt;br/&gt;
I tried to remove all parts of the logs which were related to other devices present in the DC, so hopefully there should not be any irrelevant logs there.&lt;/p&gt;</comment>
                            <comment id="71119" author="ivanhrasko" created="Tue, 31 May 2022 14:32:17 +0000"  >&lt;p&gt;Are you able to reproduce with netconf 3?&lt;/p&gt;</comment>
                            <comment id="71120" author="rovarga" created="Tue, 31 May 2022 16:56:41 +0000"  >&lt;p&gt;So if I am reading this right, the session goes up after the node has already been deleted:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[DEBUG] 2022-05-30 09:27:50.266 [opendaylight-cluster-data-notification-dispatcher-44] AbstractNetconfTopology - Disconnecting RemoteDevice{DCS:DCS:DCS-N9K-LEAF04}
[TRACE] 2022-05-30 09:27:50.267 [opendaylight-cluster-data-notification-dispatcher-44] NetconfDeviceSalProvider - RemoteDevice{DCS:DCS:DCS-N9K-LEAF04}: Not removing TOPOLOGY mountpoint from MD-SAL, mountpoint was not registered yet
[...]
[TRACE] 2022-05-30 09:27:50.269 [opendaylight-cluster-data-notification-dispatcher-44] NetconfDeviceSalProvider - RemoteDevice{DCS:DCS:DCS-N9K-LEAF04}: TransactionChain(org.opendaylight.mdsal.binding.dom.adapter.BindingDOMTransactionChainAdapter@7b6c02b7) SUCCESSFUL
[DEBUG] 2022-05-30 09:27:51.714 [nioEventLoopGroup-3-1] NetconfDeviceCommunicator - RemoteDevice{DCS:DCS:DCS-N9K-LEAF04}: Session established
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I think the problem lies somewhere between NetconfSessionPromise&apos;s retry logic and AsyncSshHandler. Most notably the former does not check for isFailure() and the latter calls explicit setFailure(). The setFailure() is problematic because it renders the Future.cancel() inoperative and hence reconnection attempts are not stopped when the node is deleted.&lt;/p&gt;

&lt;p&gt;We need debug logs for netconf-netty-util including initial connection setup to ascertain what went down.&lt;/p&gt;</comment>
                            <comment id="71121" author="rovarga" created="Tue, 31 May 2022 19:28:50 +0000"  >&lt;p&gt;The recorded failure in NetconfSessionPromise is the same as the one reported in &lt;a href=&quot;https://jira.opendaylight.org/browse/NETCONF-827&quot; title=&quot;Mount loop when setting too low connection-timeout&quot; class=&quot;issue-link&quot; data-issue-key=&quot;NETCONF-827&quot;&gt;&lt;del&gt;NETCONF-827&lt;/del&gt;&lt;/a&gt;, hence I believe these are duplicates.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10002">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="34819">NETCONF-827</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="17717" name="netconf-logs-after-device-reconnect-rpc-call.txt" size="2879" author="ifoltin" created="Fri, 27 May 2022 14:07:25 +0000"/>
                            <attachment id="17716" name="netconf-rpc-transport-error.txt" size="7489" author="ifoltin" created="Fri, 27 May 2022 14:07:25 +0000"/>
                            <attachment id="17715" name="netconf-session-promise-complete-already-error.txt" size="15215" author="ifoltin" created="Fri, 27 May 2022 14:07:25 +0000"/>
                            <attachment id="17718" name="reproducer-debug-logs.txt" size="50369" author="ifoltin" created="Mon, 30 May 2022 11:31:47 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i042hj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>