<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 19:55:38 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary, append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
<language>en-us</language>
    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[CONTROLLER-1473] Node exists in entity-owner data even after killing the instance</title>
                <link>https://jira.opendaylight.org/browse/CONTROLLER-1473</link>
                <project id="10113" key="CONTROLLER">controller</project>
                    <description>&lt;p&gt;In ovsdb southbound 3 nodes clustering csit, &lt;/p&gt;

&lt;p&gt;ovsdb-csit-3node-clustering-only-beryllium&lt;/p&gt;

&lt;p&gt;when a node goes down and if i check the entity owner among the other two nodes&lt;br/&gt;
 the member 1/2/3 which is killed should not exist but its there in the candidates list.&lt;/p&gt;

&lt;p&gt;Added more delay of 20s to get the entity owner after killing one node. But still the down node exist in operational ds for entity candidates list.&lt;/p&gt;

&lt;p&gt;Here goes the log for the observed behaviour:&lt;/p&gt;

&lt;p&gt;KEYWORD ${data} = Utils . Get Data From URI controller@{controller_index_list}[0], /restconf/operational/entity-owners:entity-owners&lt;br/&gt;
Documentation:&lt;br/&gt;
Issue a GET request and return the data obtained or on error log the error and fail.&lt;br/&gt;
Start / End / Elapsed: 20160118 20:06:49.764 / 20160118 20:06:49.792 / 00:00:00.028&lt;br/&gt;
00:00:00.020 KEYWORD ${response} = RequestsLibrary . Get Request ${session}, ${uri}, ${headers}&lt;br/&gt;
00:00:00.005 KEYWORD BuiltIn . Return From Keyword If ${response.status_code} == 200, ${response.text}&lt;br/&gt;
Documentation:&lt;br/&gt;
Returns from the enclosing user keyword if condition is true.&lt;br/&gt;
Start / End / Elapsed: 20160118 20:06:49.787 / 20160118 20:06:49.792 / 00:00:00.005&lt;br/&gt;
20:06:49.791 INFO Returning from the enclosing user keyword.&lt;br/&gt;
20:06:49.792 INFO ${data} = {&quot;entity-owners&quot;:{&quot;entity-type&quot;:[{&quot;type&quot;:&quot;ovsdb&quot;,&quot;entity&quot;:[{&quot;id&quot;:&quot;/network-topology:network-topology/network-topology:topology[network-topology:topology-id='ovsdb:1']/network-topology:node[network-top...&lt;br/&gt;
00:00:00.001 KEYWORD BuiltIn . Log ${data}&lt;br/&gt;
Documentation:&lt;br/&gt;
Logs the given message with the given level.&lt;br/&gt;
Start / End / Elapsed: 20160118 20:06:49.793 / 20160118 20:06:49.794 / 00:00:00.001&lt;br/&gt;
20:06:49.793 INFO {&quot;entity-owners&quot;:{&quot;entity-type&quot;:[{&quot;type&quot;:&quot;ovsdb&quot;,&quot;entity&quot;:[{&quot;id&quot;:&quot;/network-topology:network-topology/network-topology:topology[network-topology:topology-id='ovsdb:1']/network-topology:node[network-topology:node-id='ovsdb://uuid/a96ec4e2-c457-4a2c-963c-1e6300210032']&quot;,&quot;candidate&quot;:[{&quot;name&quot;:&quot;member-1&quot;},{&quot;name&quot;:&quot;member-2&quot;},{&quot;name&quot;:&quot;member-3&quot;}],&quot;owner&quot;:&quot;member-2&quot;}]},{&quot;type&quot;:&quot;ovsdb-southbound-provider&quot;,&quot;entity&quot;:[{&quot;id&quot;:&quot;/general-entity:entity[general-entity:name='ovsdb-southbound-provider']&quot;,&quot;candidate&quot;:[{&quot;name&quot;:&quot;member-1&quot;},{&quot;name&quot;:&quot;member-3&quot;},{&quot;name&quot;:&quot;member-2&quot;}],&quot;owner&quot;:&quot;member-3&quot;}]}]}}&lt;/p&gt;

&lt;p&gt;KEYWORD ${data} = Utils . Get Data From URI controller@{controller_index_list}[0], /restconf/operational/entity-owners:entity-owners&lt;br/&gt;
00:00:00.001 KEYWORD BuiltIn . Log ${data}&lt;br/&gt;
Documentation:&lt;br/&gt;
Logs the given message with the given level.&lt;br/&gt;
Start / End / Elapsed: 20160118 17:51:08.456 / 20160118 17:51:08.457 / 00:00:00.001&lt;br/&gt;
17:51:08.457 INFO {&quot;entity-owners&quot;:{&quot;entity-type&quot;:[{&quot;type&quot;:&quot;ovsdb&quot;,&quot;entity&quot;:[{&quot;id&quot;:&quot;/network-topology:network-topology/network-topology:topology[network-topology:topology-id='ovsdb:1']/network-topology:node[network-topology:node-id='ovsdb://uuid/a96ec4e2-c457-4a2c-963c-1e6300210032']&quot;,&quot;candidate&quot;:[{&quot;name&quot;:&quot;member-3&quot;},{&quot;name&quot;:&quot;member-1&quot;},{&quot;name&quot;:&quot;member-2&quot;}],&quot;owner&quot;:&quot;member-1&quot;}]},{&quot;type&quot;:&quot;ovsdb-southbound-provider&quot;,&quot;entity&quot;:[{&quot;id&quot;:&quot;/general-entity:entity[general-entity:name='ovsdb-southbound-provider']&quot;,&quot;candidate&quot;:[{&quot;name&quot;:&quot;member-3&quot;},{&quot;name&quot;:&quot;member-1&quot;},{&quot;name&quot;:&quot;member-2&quot;}],&quot;owner&quot;:&quot;member-1&quot;}]}]}}&lt;/p&gt;



&lt;p&gt;The full log is available at the link below:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://jenkins.opendaylight.org/sandbox/job/ovsdb-csit-3node-clustering-only-beryllium/11/robot/report/log.html&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jenkins.opendaylight.org/sandbox/job/ovsdb-csit-3node-clustering-only-beryllium/11/robot/report/log.html&lt;/a&gt;&lt;/p&gt;</description>
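<!-- The check the suite performs can be sketched as a small helper against the
entity-owners JSON shape shown in the log above (illustrative only; the entity
id in the sample is a shortened placeholder, not the real instance identifier):

```python
# Sketch of the CSIT check: given entity-owners operational data, find
# entities that still list a (killed) member as a candidate. The sample
# below is trimmed from the log; the entity id is a shortened placeholder.
import json

def candidates_containing(entity_owners, member):
    """Return ids of entities that still list `member` as a candidate."""
    stale = []
    for etype in entity_owners["entity-owners"]["entity-type"]:
        for entity in etype["entity"]:
            if member in (c["name"] for c in entity.get("candidate", [])):
                stale.append(entity["id"])
    return stale

sample = json.loads("""
{"entity-owners":{"entity-type":[
  {"type":"ovsdb","entity":[
    {"id":"/network-topology:...node",
     "candidate":[{"name":"member-1"},{"name":"member-2"},{"name":"member-3"}],
     "owner":"member-2"}]}]}}
""")

# member-2 was killed, yet it still shows up as a candidate:
print(candidates_containing(sample, "member-2"))
```

Run against the full payload from the log, this reports the killed member as a
stale candidate, which is the failure the test observes. -->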
                <environment>&lt;p&gt;Operating System: All&lt;br/&gt;
Platform: All&lt;/p&gt;</environment>
        <key id="26027">CONTROLLER-1473</key>
            <summary>Node exists in entity-owner data even after killing the instance</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                                <status id="5" iconUrl="https://jira.opendaylight.org/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10003">Cannot Reproduce</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="priya.ramasubbu@hcl.com">Priya Ramasubbu</reporter>
                        <labels>
                    </labels>
                <created>Mon, 18 Jan 2016 14:44:11 +0000</created>
                <updated>Thu, 19 Oct 2017 22:33:46 +0000</updated>
                            <resolved>Fri, 9 Sep 2016 01:10:25 +0000</resolved>
                                    <version>Beryllium</version>
                                                    <component>clustering</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                                                                <comments>
                            <comment id="51199" author="vishnoianil@gmail.com" created="Mon, 18 Jan 2016 21:13:42 +0000"  >&lt;p&gt;It seems to be an issue with the EntityOwnerShipService at clustering level. Moving it to controller/clustering project to get feedback from clustering experts.&lt;/p&gt;</comment>
                            <comment id="51200" author="ecelgp" created="Sat, 23 Jan 2016 23:32:08 +0000"  >&lt;p&gt;This is duplicate of &lt;a href=&quot;https://bugs.opendaylight.org/show_bug.cgi?id=4992&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugs.opendaylight.org/show_bug.cgi?id=4992&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="51201" author="ecelgp" created="Sat, 23 Jan 2016 23:32:48 +0000"  >&lt;p&gt;I meant: &lt;a href=&quot;https://bugs.opendaylight.org/show_bug.cgi?id=5004&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://bugs.opendaylight.org/show_bug.cgi?id=5004&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="51202" author="ecelgp" created="Sun, 24 Jan 2016 00:56:53 +0000"  >&lt;p&gt;This issue is seen in both OpenFlow and OVSDB cluster tests.&lt;/p&gt;</comment>
                            <comment id="51203" author="ecelgp" created="Tue, 26 Jan 2016 19:22:28 +0000"  >&lt;p&gt;Rising priority, it would be good to fix this by Be release.&lt;/p&gt;</comment>
                            <comment id="51204" author="moraja@cisco.com" created="Tue, 26 Jan 2016 19:41:51 +0000"  >&lt;p&gt;Controller logs with loglevel set to DEBUG for org.opendaylight.controller.cluster.datastore.entityownership would be useful.&lt;/p&gt;

&lt;p&gt;Based on the code it seems like we would run into this situation if the switch master was also the leader of the entity-ownership shard.&lt;/p&gt;</comment>
                            <comment id="51205" author="tpantelis" created="Tue, 26 Jan 2016 23:27:32 +0000"  >&lt;p&gt;Patch &lt;a href=&quot;https://git.opendaylight.org/gerrit/#/c/26808/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/#/c/26808/&lt;/a&gt; added the code to onPeerDown to remove the downed node as a candidate if the first node is the leader. Previously we didn&apos;t remove it b/c of the possibility of network partition as opposed to the process actually being down but 26808 addressed that as well. &lt;/p&gt;

&lt;p&gt;In onLeaderChanged, it searches for entities owned by the old leader and re-assigns them but it doesn&apos;t remove the old leader as candidate. Patch 26808 didn&apos;t modify this behavior. &lt;/p&gt;

&lt;p&gt;When the leader is taken down, if a follower becomes the leader before PeerDown then the old leader gets removed from the candidate list. However if PeerDown occurs before the follower becomes the leader then the old leader candidate isn&apos;t removed. The former scenario will occur if the leader is taken down gracefully as it will transfer leadership on shut down. The latter scenario can occur due to timing if the leader process is killed since immediate leadership transfer won&apos;t occur. It sounds like the integration test kills the leader so it will be hit or miss if the candidate is removed (due to the 10s election timeout I think PeerDown will occur first most of the time).&lt;/p&gt;

&lt;p&gt;I created a unit test which illustrates both scenarios. &lt;/p&gt;

&lt;p&gt;It seems we have a disconnect between the onLeaderChanged and onPeerDown behaviors. In onLeaderChanged, I think we need to remove the old leader as a candidate if it is deemed down (i.e. if PeerDown was previously received). We don&apos;t need the logic of re-assigning entities previously owned by the old leader - that&apos;s a remnant from prior to 26808.&lt;/p&gt;

&lt;p&gt;Moiz - what do you think?&lt;/p&gt;</comment>
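<!-- The timing race described in this comment can be modeled with a toy sketch
(not the actual controller code; the method names and the removal rule are
simplified assumptions made for illustration):

```python
# Toy model of the race: onPeerDown removes a downed member's candidacy only
# when the local node is already the shard leader, while onLeaderChanged
# re-assigns ownership but does NOT remove the old leader's candidacy.
# The order of the two events therefore decides whether a stale candidate
# is left behind.
class ToyOwnershipShard:
    def __init__(self, candidates, owner):
        self.candidates = set(candidates)
        self.owner = owner
        self.is_leader = False
        self.down = set()

    def on_peer_down(self, member):
        self.down.add(member)
        if self.is_leader:               # leader removes the downed candidate
            self.candidates.discard(member)
            if self.owner == member:
                self.owner = self._pick_new_owner()

    def on_leader_changed(self, old_leader):
        self.is_leader = True            # this follower just became leader
        if self.owner == old_leader:     # re-assign ownership only; the old
            self.owner = self._pick_new_owner()  # candidate is NOT removed

    def _pick_new_owner(self):
        live = sorted(self.candidates - self.down - {self.owner})
        return live[0] if live else None

def run(order):
    shard = ToyOwnershipShard({"member-1", "member-2", "member-3"}, "member-1")
    for event in order:
        event(shard)
    return sorted(shard.candidates)

# Graceful shutdown: leadership transfers first, then PeerDown, so the old
# leader's candidacy is removed.
graceful = run([lambda s: s.on_leader_changed("member-1"),
                lambda s: s.on_peer_down("member-1")])
# Kill: PeerDown arrives while this node is still a follower, so the stale
# candidate survives.
killed = run([lambda s: s.on_peer_down("member-1"),
              lambda s: s.on_leader_changed("member-1")])
print(graceful)  # ['member-2', 'member-3']
print(killed)    # ['member-1', 'member-2', 'member-3']
```

The second ordering is the hit-or-miss failure the integration test observes
when the leader process is killed rather than shut down gracefully. -->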
                            <comment id="51206" author="moraja@cisco.com" created="Wed, 27 Jan 2016 00:33:04 +0000"  >&lt;p&gt;That confirms what I was thinking that in this case onPeerDown did not happen on the leader and thus the candidates were not removed. LeaderChanged and onPeerDown can both indicate that a partition has occurred so I agree with you that we should treat both these cases similarly. &lt;/p&gt;

&lt;p&gt;One thing I would do differently is not assume that onPeerDown will happen before LeaderChanged. The reason is that PeerDown is controlled by the Akka heartbeat configuration, and in some cases people may want to set that heartbeat to be greater than the Raft heartbeat, so Akka might not be the first to notice the partition.&lt;/p&gt;</comment>
                            <comment id="51207" author="tpantelis" created="Wed, 27 Jan 2016 01:07:10 +0000"  >&lt;p&gt;The behavior of onPeerDown and onLeaderChanged should be the same and, prior to 26808, they were, i.e. call selectNewOwnerForEntitiesOwnedBy. 26808 should&apos;ve also changed onLeaderChanged - my bad. I&apos;ll push a patch (I have the unit test already).&lt;/p&gt;

&lt;p&gt;(In reply to Moiz Raja from comment #9)&lt;br/&gt;
&amp;gt; That confirms what I was thinking that in this case onPeerDown did not&lt;br/&gt;
&amp;gt; happen on the leader and thus the candidates were not removed. LeaderChanged&lt;br/&gt;
&amp;gt; and onPeerDown can both indicate that a partition has occurred so I agree&lt;br/&gt;
&amp;gt; with you that we should treat both these cases similarly. &lt;br/&gt;
&amp;gt; &lt;br/&gt;
&amp;gt; One thing I would differently is not assume that onPeerDown will happen&lt;br/&gt;
&amp;gt; before LeaderChanged. The reason being that peerDown is controlled by the&lt;br/&gt;
&amp;gt; akka heartbeat configuration and in some cases people may want to set that&lt;br/&gt;
&amp;gt; heartbeat to be greater than the Raft heartbeat and so akka might not be the&lt;br/&gt;
&amp;gt; first to notice the partition.&lt;/p&gt;</comment>
                            <comment id="51208" author="tpantelis" created="Wed, 27 Jan 2016 02:12:55 +0000"  >&lt;p&gt;Submitted &lt;a href=&quot;https://git.opendaylight.org/gerrit/#/c/33601/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/#/c/33601/&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="51209" author="ecelgp" created="Fri, 29 Jan 2016 02:39:07 +0000"  >&lt;p&gt;Right this works now according to our test suites.&lt;/p&gt;</comment>
                            <comment id="51210" author="priya.ramasubbu@hcl.com" created="Wed, 7 Sep 2016 13:13:58 +0000"  >&lt;p&gt;The issue still exists in boron.&lt;br/&gt;
The owner node which is killed, does exists in the entity candidate list.&lt;/p&gt;

&lt;p&gt;Current sandbox link which reproduces the bug:&lt;br/&gt;
&lt;a href=&quot;https://jenkins.opendaylight.org/sandbox/job/ovsdb-csit-3node-clustering-only-boron/14/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jenkins.opendaylight.org/sandbox/job/ovsdb-csit-3node-clustering-only-boron/14/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;-Priya&lt;/p&gt;</comment>
                            <comment id="51211" author="tpantelis" created="Wed, 7 Sep 2016 13:23:32 +0000"  >&lt;p&gt;This is expected behavior now with &lt;a href=&quot;https://git.opendaylight.org/gerrit/#/c/45271/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/#/c/45271/&lt;/a&gt;. On PeerDown, the leader cannot assume the &quot;down&quot; member node is actually down - it might be isolated. Thus we leave it as candidate but re-assign ownership. If the node was actually down, on restart it will first remove itself as a candidate in case no client on the new incarnation registers a candidate.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10002">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="26028">CONTROLLER-1474</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>4992</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=4992]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i02qsf:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>