<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 20:33:04 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[OPNFLWPLUG-668] [Clustering] Switch state resync after cluster restart.</title>
                <link>https://jira.opendaylight.org/browse/OPNFLWPLUG-668</link>
                <project id="10155" key="OPNFLWPLUG">OpenFlowPlugin</project>
                    <description>&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;I was testing the cluster restart scenario with the latest Beryllium code and the Helium plugin.&lt;br/&gt;
Tested in a 3-node cluster with stable Beryllium + Helium plugin + JDK8 with G1GC. Tested with OVS 2.3.2.&lt;/p&gt;

&lt;p&gt;Build used:&lt;br/&gt;
===================&lt;br/&gt;
Karaf distro from the latest ODL stable Beryllium code&lt;/p&gt;

&lt;p&gt;Objective of test:&lt;br/&gt;
===================&lt;br/&gt;
To validate the cluster restart and see if all the flows get configured in the switches.&lt;/p&gt;

&lt;p&gt;Configuration and topology:&lt;br/&gt;
===========================&lt;/p&gt;

&lt;p&gt;i. Controller (c1, c2 and c3) VMs are running on a Dell machine, say h1; each VM has 8 vCPUs and 16 GB RAM.&lt;/p&gt;

&lt;p&gt;ii. Mininet (m1, m2 and m3) VMs with OVS version 2.3.2 are running on a different Dell machine, say h2; each VM has 8 vCPUs and 16 GB RAM.&lt;/p&gt;

&lt;p&gt;m1 with 5 switches (1 to 5) connected to c1&lt;br/&gt;
m2 with 5 switches (6 to 10) connected to c2&lt;br/&gt;
m3 with 5 switches (11 to 15) connected to c3&lt;/p&gt;

&lt;p&gt;Test Steps:&lt;br/&gt;
============&lt;br/&gt;
Pre-requisite: Pushed 10K flows to each of the 15 switches (150K flows in the cluster); the config DS shows 150K flows.&lt;br/&gt;
Note: all 15 switches were connected first, and then the flows were pushed.&lt;br/&gt;
So the cluster has 10K flows per switch and 5 switches per node: 15 switches in total, 150K flows configured in the 3-node cluster.&lt;/p&gt;
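The flow push in the pre-requisite can be sketched as a RESTCONF call against the config datastore. This is only an illustrative sketch: the controller address, credentials, node id, and the (minimal) flow body are assumptions, not taken from the attached logs.

```shell
# Illustrative sketch: push one flow into the config datastore via the
# Beryllium-era RESTCONF URL. Controller IP, admin:admin credentials,
# node id "openflow:1" and the flow body are all assumed for the example.
curl -u admin:admin -X PUT \
  -H "Content-Type: application/json" \
  -d '{"flow-node-inventory:flow": [{"id": "1", "table_id": 0, "priority": 100,
       "match": {}, "instructions": {"instruction": []}}]}' \
  http://10.0.0.1:8181/restconf/config/opendaylight-inventory:nodes/node/openflow:1/table/0/flow/1
```

Repeating such a call with ids 1..10000 per switch would reproduce the 10K-flows-per-switch pre-requisite.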

&lt;p&gt;Steps:&lt;br/&gt;
i. Stop all the nodes c1, c2, c3.&lt;br/&gt;
ii. Disconnect the 15 switches from nodes c1, c2, c3.&lt;br/&gt;
iii. Reconnect the 15 switches to nodes c1, c2, c3.&lt;br/&gt;
iv. Start the nodes c1, c2, c3.&lt;/p&gt;
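Steps ii and iii (disconnecting and reconnecting a switch) can be sketched with `ovs-vsctl`; the bridge name `s1` and the controller address are assumptions for illustration.

```shell
# Illustrative sketch of steps ii and iii for one OVS bridge.
# Assumes a Mininet bridge named s1 and controller c1 at 10.0.0.1.

# Step ii: detach the bridge from its controller.
ovs-vsctl del-controller s1

# Step iii: re-attach the bridge to the controller on the OpenFlow port.
ovs-vsctl set-controller s1 tcp:10.0.0.1:6633
```

The same pair of commands would be repeated for each of the 15 bridges across m1, m2 and m3.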

&lt;p&gt;Observations&lt;br/&gt;
============&lt;br/&gt;
After the cluster restart we see that some switches have fewer than 10K flows. Out of the 15 switches, 6 have 0 flows and the remaining 9 show 10K flows.&lt;/p&gt;

&lt;p&gt;Attaching the logs for more clarity.&lt;/p&gt;

&lt;p&gt;Thanks &amp;amp; Regards,&lt;br/&gt;
Saibal Roy.&lt;/p&gt;</description>
                <environment>&lt;p&gt;Operating System: All&lt;br/&gt;
Platform: All&lt;/p&gt;</environment>
        <key id="27936">OPNFLWPLUG-668</key>
            <summary>[Clustering] Switch state resync after cluster restart.</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                                <status id="5" iconUrl="https://jira.opendaylight.org/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10003">Cannot Reproduce</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="saibal.roy@ericsson.com">Saibal Roy</reporter>
                        <labels>
                    </labels>
                <created>Tue, 5 Apr 2016 11:41:57 +0000</created>
                <updated>Mon, 27 Sep 2021 09:01:47 +0000</updated>
                            <resolved>Fri, 29 Jul 2016 10:34:03 +0000</resolved>
                                                                    <component>General</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                                                                <comments>
                            <comment id="57800" author="saibal.roy@ericsson.com" created="Tue, 5 Apr 2016 11:41:57 +0000"  >&lt;p&gt;Attachment 15switches.zip has been added with description: logs for Switch state resync after cluster restart&lt;/p&gt;</comment>
                            <comment id="57798" author="muthukumaran.k@ericsson.com" created="Tue, 5 Apr 2016 14:01:13 +0000"  >&lt;p&gt;Hi Saibal, &lt;/p&gt;

&lt;p&gt;Looking at the logs and the symptoms you have observed, this could be a case where the datastore is not yet fully available (i.e. all persisted data restored consistently across the cluster) when the switches reconnect. &lt;/p&gt;

&lt;p&gt;Specifically in this case, the switches are constantly probing for port 6633 to open on all cluster nodes. So, as soon as the ports open (rather prematurely), the switches pounce upon the controller nodes like hungry tigers. &lt;/p&gt;

&lt;p&gt;But at that juncture the datastore is perhaps still coming up (restoring persisted data, etc.). &lt;/p&gt;

&lt;p&gt;One quick way to verify is to add a Linux firewall rule, as part of karaf.sh, that blocks port 6633 for 3-5 minutes. &lt;/p&gt;

&lt;p&gt;This can prevent switches from connecting prematurely before datastore becomes fully &quot;ready&quot;. &lt;/p&gt;

&lt;p&gt;If we can confirm clean behavior with this hack, we can discuss a cleaner solution for keeping 6633 closed until all backends are in the &quot;ready&quot; state.&lt;/p&gt;
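A minimal sketch of the hack described above, assuming iptables is available and using an illustrative 5-minute hold (the port number comes from the report; the duration and script name are assumptions):

```shell
#!/bin/sh
# block-openflow.sh - illustrative sketch of the workaround above.
# Blocks the OpenFlow port while the datastore restores its state,
# then reopens it. Run as root, in the background alongside karaf.sh,
# so it does not delay the karaf startup itself.

OF_PORT=6633     # OpenFlow listen port the switches connect to
HOLD_SECS=300    # hold for 5 minutes; tune to how long restoration takes

# Drop incoming switch connections until the datastore is ready.
iptables -I INPUT -p tcp --dport "$OF_PORT" -j DROP

sleep "$HOLD_SECS"

# Remove the rule so the switches can reconnect.
iptables -D INPUT -p tcp --dport "$OF_PORT" -j DROP
```

Because the rule is at the kernel level, karaf can still bind port 6633 during the hold; only the switches' inbound connections are deferred.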

&lt;p&gt;Regards&lt;br/&gt;
Muthu&lt;/p&gt;</comment>
                            <comment id="57799" author="muthukumaran.k@ericsson.com" created="Fri, 29 Jul 2016 10:34:03 +0000"  >&lt;p&gt;To be retested on the latest Boron master with the Lithium plugin combination to re-establish this, mainly because the Boron release is moving with the Lithium plugin.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="14056" name="15switches.zip" size="215415" author="saibal.roy@ericsson.com" created="Tue, 5 Apr 2016 11:41:57 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5659</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=5659]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    <customfield id="customfield_10202" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Priority</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10312"><![CDATA[High]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i032kn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>