<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 20:36:22 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[OVSDB-428] Suspected memory leak in TransactionInvokerImpl</title>
                <link>https://jira.opendaylight.org/browse/OVSDB-428</link>
                <project id="10158" key="OVSDB">ovsdb</project>
                    <description>&lt;p&gt;HPROF analysis with MAT in the context of &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1756&quot; title=&quot;OOM due to huge Map in ShardDataTree&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1756&quot;&gt;&lt;del&gt;CONTROLLER-1756&lt;/del&gt;&lt;/a&gt; is showing 50 MB used up in org.opendaylight.ovsdb.southbound.transactions.md.TransactionInvokerImpl, at a stage when (according to Sridhar/Sai) the system should be &quot;at rest&quot; - so this looks suspicious. I&apos;m guessing something is not done right there.&lt;/p&gt;

&lt;p&gt;See also &lt;a href=&quot;https://jira.opendaylight.org/browse/OVSDB-423&quot; title=&quot;TransactionChain created in TransactionInvokerImpl.&amp;lt;init&amp;gt; line 53 is never closed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;OVSDB-423&quot;&gt;&lt;del&gt;OVSDB-423&lt;/del&gt;&lt;/a&gt; in related code, but that only cleaned up shutdown and did not fix this.&lt;/p&gt;</description>
                <environment>&lt;p&gt;Operating System: All&lt;br/&gt;
Platform: All&lt;/p&gt;</environment>
        <key id="22120">OVSDB-428</key>
            <summary>Suspected memory leak in TransactionInvokerImpl</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.opendaylight.org/images/icons/priorities/critical.svg">High</priority>
                        <status id="1" iconUrl="https://jira.opendaylight.org/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="blue-gray"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="Avishnoi">Anil Vishnoi</assignee>
                                    <reporter username="vorburger">Michael Vorburger</reporter>
                        <labels>
                    </labels>
                <created>Thu, 7 Sep 2017 12:43:38 +0000</created>
                <updated>Wed, 24 Feb 2021 12:40:09 +0000</updated>
                                                                            <component>Southbound.Open_vSwitch</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="41690" author="vorburger" created="Fri, 8 Sep 2017 14:55:11 +0000"  >&lt;p&gt;&lt;a href=&quot;https://drive.google.com/open?id=0B7gTXYDlI5sLcDdaQTFZUDlyX2s&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://drive.google.com/open?id=0B7gTXYDlI5sLcDdaQTFZUDlyX2s&lt;/a&gt; has the HPROF where I&apos;ve seen this 50 MB in TransactionInvokerImpl. (Please just completely IGNORE the md.sal.trace memory consumption you&apos;ll also see in this HPROF; that&apos;s just a debug thing, not real production memory consumption.)&lt;/p&gt;</comment>
                            <comment id="63068" author="thapar" created="Wed, 23 May 2018 04:24:18 +0000"  >&lt;p&gt;&lt;a href=&quot;https://jira.opendaylight.org/secure/ViewProfile.jspa?name=vorburger&quot; class=&quot;user-hover&quot; rel=&quot;vorburger&quot;&gt;vorburger&lt;/a&gt; Is this still an issue or has it been fixed by subsequent patches?&lt;/p&gt;</comment>
                            <comment id="63076" author="vorburger" created="Wed, 23 May 2018 08:57:59 +0000"  >&lt;p&gt;I am not aware of any subsequent patches which could have fixed this, so unless proven otherwise, most likely still an issue.&lt;/p&gt;</comment>
                            <comment id="63077" author="thapar" created="Wed, 23 May 2018 09:16:10 +0000"  >&lt;p&gt;Was this a leak or actual usage? OVS sends status updates every 5 seconds unless disabled. Each update from OVS will result in Operational DS read for data on that switch. Stephen fixed transaction leaks in OVSDB, so wondering if we still have actual leaks or just high usage.&lt;/p&gt;

&lt;p&gt;Would it be possible to run this analysis with more recent code, with some of the tweaks we have made recently in configuration to reduce data needed at &apos;rest&apos; by cutting down on unnecessary status updates?&lt;/p&gt;</comment>
                            <comment id="63078" author="vorburger" created="Wed, 23 May 2018 09:36:25 +0000"  >&lt;p&gt;Suspected leak, not actual usage (50 MB is A LOT even for&#160;actual usage) because, see above, &quot;seen&#160;at a stage when (according to Sridhar/Sai) the system should be &quot;at rest&quot; - so this looks suspicious&quot;. &lt;a href=&quot;https://jira.opendaylight.org/secure/ViewProfile.jspa?name=skitt&quot; class=&quot;user-hover&quot; rel=&quot;skitt&quot;&gt;skitt&lt;/a&gt;&#160;fixed transaction leaks in OVSDB: are there references to JIRA/Gerrits? Master only, or stable/oxygen also? Regarding a re-run, we need &lt;a href=&quot;https://jira.opendaylight.org/secure/ViewProfile.jspa?name=SaiMarapaReddy&quot; class=&quot;user-hover&quot; rel=&quot;SaiMarapaReddy&quot;&gt;SaiMarapaReddy&lt;/a&gt; and/or &lt;a href=&quot;https://jira.opendaylight.org/secure/ViewProfile.jspa?name=JankiChhatbar&quot; class=&quot;user-hover&quot; rel=&quot;JankiChhatbar&quot;&gt;JankiChhatbar&lt;/a&gt; to produce a new HPROF heap dump whenever the next scale lab runs so that we can check there if this is still visible.&lt;/p&gt;</comment>
                            <comment id="67540" author="rovarga" created="Thu, 5 Dec 2019 19:42:26 +0000"  >&lt;p&gt;Looking at the dump, it is a single TransactionInvokerImpl instance, which breaks down as follows:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;transactionToCommand -&amp;gt; holding two OvsdbOperationalCommandAggregators, one weighing 9MB, the other 10K&lt;/li&gt;
	&lt;li&gt;inputQueue, which has 4248 elements, for a total of 41MB retained size. Again, these are OvsdbOperationalCommandAggregators, &amp;lt;10K each&lt;/li&gt;
&lt;/ul&gt;
</comment>
                            <comment id="67541" author="rovarga" created="Thu, 5 Dec 2019 19:47:44 +0000"  >&lt;p&gt;Looking at the retained objects, &lt;a href=&quot;https://git.opendaylight.org/gerrit/c/ovsdb/+/86105&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/c/ovsdb/+/86105&lt;/a&gt; should help a bit (by making RowUpdates smaller). Other than that, I am not sure &#8211; the queue certainly should be drained at some point.&lt;/p&gt;</comment>
                            <comment id="67542" author="rovarga" created="Thu, 5 Dec 2019 20:37:58 +0000"  >&lt;p&gt;Okay, so looking at TransactionInvokerImpl, it should be left bereft of life.&lt;/p&gt;

&lt;p&gt;What it does is provide an MPSC queue to make things asynchronous, so all OVSDB nodes submit all commands through this queue, which ends up dancing with a transaction chain and a transaction-per-command scheme.&lt;/p&gt;

&lt;p&gt;Now ... each OvsdbConnectionInstance is inherently bound to a particular InstanceIdentifier&amp;lt;Node&amp;gt;, which (I think, needs to be confirmed) essentially binds what subtrees each instance actually updates through its commands. If we have an overlap, we need to understand where and why (and what can we do about it).&lt;/p&gt;

&lt;p&gt;The invoker is always invoked from the context of a particular netty thread, which is also bound to an instance (the channel, each is serviced by at most one thread).&lt;/p&gt;

&lt;p&gt;Hence we really want to have a per-OvsdbConnectionInstance invoker, which is synchronous (in terms of interaction with other OVSDB code) and does the transaction thing with commands &#8211; no command queue, just talk directly to the transaction chain. Note the access needs to be synchronized to guard against concurrent failures (i.e. chain failing when we are touching it).&lt;/p&gt;

&lt;p&gt;The transaction chain should then be switched to pingpong &#8211; we do not care if transactions are failing in bunches, that&apos;s completely fine, as we will just retry them.&lt;/p&gt;

&lt;p&gt;Finally, a command&apos;s access to the transaction should be mediated by a memoizing supplier &#8211; we pass that to the invoker and if it asks for a transaction, we know we need to allocate it, etc. etc.&lt;/p&gt;

&lt;p&gt;With that, we:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;will eliminate 41MiB out of the 50MiB overhead reported here&lt;/li&gt;
	&lt;li&gt;will exert natural backpressure on incoming channels&lt;/li&gt;
	&lt;li&gt;will eliminate unnecessary synchronization on the MPSC queue&lt;/li&gt;
	&lt;li&gt;will elide transaction allocation when not needed&lt;/li&gt;
	&lt;li&gt;will be submitting things into DS in parallel, while also taking advantage of its batching capabilities&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&#160;&lt;/p&gt;</comment>
                            <comment id="67543" author="rovarga" created="Thu, 5 Dec 2019 20:43:31 +0000"  >&lt;p&gt;Based on the analysis above, I think this is not a leak, but the tail end of a workload: the system was under pressure, which has since ceased, but we have a queued backlog we are still processing. Dark buffers :)&lt;/p&gt;

&lt;p&gt;Eliminating the queue will improve latency and while it may impact test case time negatively (by keeping backpressure towards OVSes), the tail end will essentially be non-existent or it will be pushed down to DS processing queue (which I think is pretty efficient).&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9114</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=9114]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10202" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Priority</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10312"><![CDATA[High]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i022o7:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>