<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 19:56:24 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[CONTROLLER-1762] ODL is up and ports are listening but not functional</title>
                <link>https://jira.opendaylight.org/browse/CONTROLLER-1762</link>
                <project id="10113" key="CONTROLLER">controller</project>
                    <description>&lt;p&gt;Description of problem: On running longevity tests in a clustered ODL setup we see that one of the ODL instances seems to be up and running as reported by ps output, systemctl and netstat listening ports, however it doesn&apos;t seem to be functional. We could not even ssh into the karaf terminal using ssh -p 8101 karaf@172.16.0.16 until we restarted opendaylight. On performing a service restart we were able to get into the karaf shell and ODL seemed to come back up.&lt;br/&gt;
Out of the other two instances of ODL, one was killed due to OOM and the other seemed to be running fine. This happens after about 42 hours of running the tests.&lt;br/&gt;
Setup:&lt;br/&gt;
3 ODLs&lt;br/&gt;
3 OpenStack Controllers&lt;br/&gt;
3 Compute nodes&lt;/p&gt;

&lt;p&gt;Test:&lt;br/&gt;
Create 40 neutron resources (routers, networks, etc.) 2 at a time using Rally and delete them over and over again. This is a long-running, low-stress test.&lt;/p&gt;


&lt;p&gt;Entire Karaf Log: &lt;a href=&quot;http://8.43.86.1:8088/smalleni/karaf-controller-0.log.tar.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://8.43.86.1:8088/smalleni/karaf-controller-0.log.tar.gz&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ODL RPM from upstream: python-networking-odl-11.0.0-0.20170806093629.2e78dca.el7ost.noarch&lt;/p&gt;</description>
                <environment>&lt;p&gt;Operating System: All&lt;br/&gt;
Platform: All&lt;/p&gt;</environment>
        <key id="26316">CONTROLLER-1762</key>
            <summary>ODL is up and ports are listening but not functional</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                                <status id="5" iconUrl="https://jira.opendaylight.org/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10002">Duplicate</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="smalleni@redhat.com">Sai Sindhur Malleni</reporter>
                        <labels>
                    </labels>
                <created>Mon, 28 Aug 2017 21:25:12 +0000</created>
                <updated>Thu, 19 Oct 2017 21:25:01 +0000</updated>
                            <resolved>Tue, 12 Sep 2017 17:44:14 +0000</resolved>
                                    <version>Carbon</version>
                                                    <component>clustering</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                                                                <comments>
                            <comment id="52663" author="smalleni@redhat.com" created="Mon, 28 Aug 2017 21:52:37 +0000"  >&lt;p&gt;ODL became non-functional around 10:44 UTC 08/28/2017. This was confirmed because collectd, which talks to the Karaf JMX, suddenly stopped reporting values for heap size. Collectd was able to talk to Karaf JMX again after the service restart. The break can be clearly observed at: &lt;a href=&quot;https://snapshot.raintank.io/dashboard/snapshot/nf6OWq7jNSeT6vwjM71jlUSWc31E9LdW&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://snapshot.raintank.io/dashboard/snapshot/nf6OWq7jNSeT6vwjM71jlUSWc31E9LdW&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52664" author="jluhrsen" created="Mon, 28 Aug 2017 22:03:55 +0000"  >&lt;p&gt;The karaf.log file is ~1G, so it&apos;s hard to debug with it. I did see a ton of timed-out messages in genius.lockmanager-impl at one point, but that could just be a symptom. And in reality they make up less than 1% of the total log messages.&lt;/p&gt;</comment>
                            <comment id="52665" author="vorburger" created="Tue, 29 Aug 2017 09:06:15 +0000"  >&lt;p&gt;&amp;gt; it doesn&apos;t seem to be functional. We could not even ssh&lt;br/&gt;
&amp;gt; into the karaf terminal using ssh -p 8101 karaf@172.16.0.16&lt;/p&gt;

&lt;p&gt;Sai, when you hit this kind of situation, it would be interesting if we could have a &quot;thread stack dump&quot; of that JVM, to see if we can spot anything obvious (e.g. an obvious deadlock, or an extreme number of threads).  Use the JDK&apos;s &quot;jstack&quot; utility to obtain this.  Try &quot;jstack -h&quot; to learn it, if you&apos;ve never used it.  Use the -l flag.  You&apos;ll need to check which user you have to run jstack as for it to work - try first on a non-stuck instance?  If jstack still doesn&apos;t work, its -F flag helps sometimes, but having to use that is a bad sign.  If it still doesn&apos;t work, then we would probably need help from our openjdk team friends at Red Hat to be able to understand what horrible thing ODL code may be doing to get a JVM that badly stuck.&lt;/p&gt;</comment>
                            <comment id="52666" author="smalleni@redhat.com" created="Tue, 29 Aug 2017 11:20:40 +0000"  >
&lt;p&gt;Sure, Michael, next time I will do that.&lt;/p&gt;


&lt;p&gt;If it helps: The karaf thread count &lt;a href=&quot;https://snapshot.raintank.io/dashboard/snapshot/EgrJsRB7HJ6tl1pjLlSY4hb6wWvJS7nT&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://snapshot.raintank.io/dashboard/snapshot/EgrJsRB7HJ6tl1pjLlSY4hb6wWvJS7nT&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see that around 10:44 UTC the thread count suddenly spikes and falls back after a restart.&lt;/p&gt;</comment>
                            <comment id="52667" author="smalleni@redhat.com" created="Tue, 29 Aug 2017 11:21:24 +0000"  >&lt;p&gt;ODL RPM used was opendaylight-6.2.0-0.1.20170817rel1931.el7.noarch&lt;/p&gt;</comment>
                            <comment id="52668" author="vorburger" created="Tue, 29 Aug 2017 11:42:51 +0000"  >&lt;p&gt;&amp;gt; If it helps: The karaf thread count &lt;/p&gt;

&lt;p&gt;Sai, that actually is an interesting observation - it&apos;s certainly possible that, in addition to OOM problems due to Memory Leaks related to TransactionChain (&lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1756&quot; title=&quot;OOM due to huge Map in ShardDataTree&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1756&quot;&gt;&lt;del&gt;CONTROLLER-1756&lt;/del&gt;&lt;/a&gt;) and broken clustering (&lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1763&quot; title=&quot;On restarting ODL on one node, ODL on another node dies in a clustered setup&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1763&quot;&gt;&lt;del&gt;CONTROLLER-1763&lt;/del&gt;&lt;/a&gt;) we also have a &quot;Thread leak&quot; issue (i.e. unbounded creation of new threads, instead of correctly using a pool / Executor thing), somewhere in the code... could be anywhere - netvirt, genius, mdsal - who knows.  But before digging more into that, we would first need definitive proof that this is the underlying root cause here - maybe the systemctl status in &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1763&quot; title=&quot;On restarting ODL on one node, ODL on another node dies in a clustered setup&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1763&quot;&gt;&lt;del&gt;CONTROLLER-1763&lt;/del&gt;&lt;/a&gt; will give us that, let&apos;s see.&lt;/p&gt;</comment>
                            <comment id="52669" author="vorburger" created="Wed, 6 Sep 2017 16:35:28 +0000"  >&lt;p&gt;Wondering if &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1755&quot; title=&quot;RaftActor lastApplied index moves backwards&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1755&quot;&gt;&lt;del&gt;CONTROLLER-1755&lt;/del&gt;&lt;/a&gt; may have helped fix this - let&apos;s re-test and confirm whether it is still seen.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10000">
                    <name>Blocks</name>
                                                                <inwardlinks description="is blocked by">
                                        <issuelink>
            <issuekey id="26309">CONTROLLER-1755</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                            <issuelinktype id="10002">
                    <name>Duplicate</name>
                                            <outwardlinks description="duplicates">
                                        <issuelink>
            <issuekey id="26310">CONTROLLER-1756</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9063</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=9063]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10204" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>ODL SR Target Milestone</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10320"><![CDATA[Nitrogen]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10202" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Priority</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10313"><![CDATA[Highest]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i02skn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>