<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 19:56:19 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[CONTROLLER-1738] RequestTimeoutException due to &quot;Shard has no current leader&quot; after shutdown-shard-replica with ShardLeaderStateChanged not delivered</title>
                <link>https://jira.opendaylight.org/browse/CONTROLLER-1738</link>
                <project id="10113" key="CONTROLLER">controller</project>
                    <description>&lt;p&gt;This is similar to &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1717&quot; title=&quot;RequestTimeoutException due to &amp;quot;Failed to transfer leadership&amp;quot; after become-prefix-leader with RoleChangeNotification not delivered&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1717&quot;&gt;&lt;del&gt;CONTROLLER-1717&lt;/del&gt;&lt;/a&gt; but a different message has been lost here.&lt;br/&gt;
I believe &lt;a href=&quot;https://jira.opendaylight.org/browse/CONTROLLER-1714&quot; title=&quot;RequestTimeoutException after remove-shard-replica (without apparent cause)&quot; class=&quot;issue-link&quot; data-issue-key=&quot;CONTROLLER-1714&quot;&gt;&lt;del&gt;CONTROLLER-1714&lt;/del&gt;&lt;/a&gt; was exactly this issue, but with less evidence in the karaf log.&lt;/p&gt;

&lt;p&gt;The Robot failure &lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; is the usual 120s timeout that several different bugs can cause (seen here from a transaction writer for a module-based shard, using the tell-based protocol):&lt;br/&gt;
  RequestTimeoutException: Timed out after 120.029805238seconds&lt;/p&gt;

&lt;p&gt;Looking at the karaf log &lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; of member-1 (writer, old leader), we can see that leadership was successfully transferred at 04:04:59,250, but the message announcing the new leader was lost:&lt;br/&gt;
2017-07-04 04:04:59,252 | INFO  | lt-dispatcher-42 | LocalActorRef                    | 174 - com.typesafe.akka.slf4j - 2.4.18 | Message &lt;span class=&quot;error&quot;&gt;&amp;#91;org.opendaylight.controller.cluster.datastore.messages.ShardLeaderStateChanged&amp;#93;&lt;/span&gt; from Actor&amp;#91;akka://opendaylight-cluster-data/user/shardmanager-config/member-1-shard-default-config#145361760&amp;#93; to Actor&amp;#91;akka://opendaylight-cluster-data/user/shardmanager-config/member-1-shard-default-config/member-1-shard-default-config-notifier#-591265397&amp;#93; was not delivered. &lt;span class=&quot;error&quot;&gt;&amp;#91;5&amp;#93;&lt;/span&gt; dead letters encountered. This logging can be turned off or adjusted with configuration settings &apos;akka.log-dead-letters&apos; and &apos;akka.log-dead-letters-during-shutdown&apos;.&lt;/p&gt;

&lt;p&gt;So the member knew it was a Follower, but it was unable to tell the client who the new leader is.&lt;br/&gt;
2017-07-04 04:05:19,248 | WARN  | monPool-worker-2 | AbstractShardBackendResolver     | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.1.Carbon | Failed to resolve shard&lt;br/&gt;
java.util.concurrent.TimeoutException: Shard has no current leader&lt;/p&gt;

&lt;p&gt;Perhaps there is a common underlying bug that causes occasional undelivered messages, and we see different symptoms depending on which message gets lost.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;0&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/771/log.html.gz#s1-s20-t1-k2-k8&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/771/log.html.gz#s1-s20-t1-k2-k8&lt;/a&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;1&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/771/odl1_karaf.log.gz&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/771/odl1_karaf.log.gz&lt;/a&gt;&lt;/p&gt;</description>
                <environment>&lt;p&gt;Operating System: All&lt;br/&gt;
Platform: All&lt;/p&gt;</environment>
        <key id="26292">CONTROLLER-1738</key>
            <summary>RequestTimeoutException due to &quot;Shard has no current leader&quot; after shutdown-shard-replica with ShardLeaderStateChanged not delivered</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                                <status id="10003" iconUrl="https://jira.opendaylight.org/images/icons/status_generic.gif" description="">Confirmed</status>
                    <statusCategory id="2" key="new" colorName="blue-gray"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="-1">Unassigned</assignee>
                                    <reporter username="vrpolak">Vratko Polak</reporter>
                        <labels>
                    </labels>
                <created>Tue, 4 Jul 2017 08:53:03 +0000</created>
                <updated>Tue, 25 Jul 2023 08:24:44 +0000</updated>
                                                                            <component>clustering</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                                                                <comments>
                            <comment id="52504" author="tpantelis" created="Tue, 4 Jul 2017 11:05:14 +0000"  >&lt;p&gt;Can you describe the detailed sequence of the test? You mention shutdown-shard-replica - I assume this is the &quot;backdoor&quot; RPC to gracefully shut down a shard actor unbeknownst to the ShardManager. However, the ShardManager does not know the shard actor was shut down, so it still has a record of it. I believe the ShardLeaderStateChanged goes to dead letters because the notifier actor goes away with the shard actor, which wouldn&apos;t matter if the ShardManager had initiated the shutdown through the normal code paths.&lt;/p&gt;

&lt;p&gt;I&apos;ve warned about using that backdoor for testing - I still think it&apos;s better to use normal code paths.&lt;/p&gt;</comment>
                            <comment id="52505" author="rovarga" created="Tue, 4 Jul 2017 14:17:02 +0000"  >&lt;p&gt;Right, since the notifier is a child of the shard actor, it gets reaped along with it. &lt;a href=&quot;https://git.opendaylight.org/gerrit/59210&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/59210&lt;/a&gt; adds a deathwatch, which should solve this case, I think.&lt;/p&gt;</comment>
                            <comment id="52506" author="vrpolak" created="Thu, 6 Jul 2017 13:05:34 +0000"  >&lt;p&gt;&amp;gt; You mention shutdown-shard-replica&lt;/p&gt;

&lt;p&gt;Yes, it is an RPC implemented here &lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;Interestingly, the test passes most of the time on RelEng.&lt;br/&gt;
Here is a failure encountered on a Sandbox run with debug logs &lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt;.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;2&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://git.opendaylight.org/gerrit/gitweb?p=controller.git;a=blob;f=opendaylight/md-sal/samples/clustering-test-app/provider/src/main/java/org/opendaylight/controller/clustering/it/provider/MdsalLowLevelTestProvider.java;h=e0e8d99d1aab3eac8df847bf7075de6e15b0257e;hb=refs/heads/stable/carbon#l574&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/gitweb?p=controller.git;a=blob;f=opendaylight/md-sal/samples/clustering-test-app/provider/src/main/java/org/opendaylight/controller/clustering/it/provider/MdsalLowLevelTestProvider.java;h=e0e8d99d1aab3eac8df847bf7075de6e15b0257e;hb=refs/heads/stable/carbon#l574&lt;/a&gt;&lt;br/&gt;
&lt;span class=&quot;error&quot;&gt;&amp;#91;3&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-cls-only-carbon/1/log.html.gz#s1-s2-t1-k2-k8&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-clustering-cls-only-carbon/1/log.html.gz#s1-s2-t1-k2-k8&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52507" author="vrpolak" created="Tue, 11 Jul 2017 14:17:46 +0000"  >&lt;p&gt;As a workaround, we can change the tests &lt;span class=&quot;error&quot;&gt;&amp;#91;4&amp;#93;&lt;/span&gt; to use remove-shard-replica instead.&lt;br/&gt;
This bug would stay open, but its importance would be lower.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;4&amp;#93;&lt;/span&gt; &lt;a href=&quot;https://git.opendaylight.org/gerrit/60140&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/60140&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="52508" author="vrpolak" created="Wed, 12 Jul 2017 07:22:34 +0000"  >&lt;p&gt;&lt;span class=&quot;error&quot;&gt;&amp;#91;4&amp;#93;&lt;/span&gt; merged, importance lowered.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10002">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="26268">CONTROLLER-1714</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>8794</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=8794]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10206" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Issue Type</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10300"><![CDATA[Bug]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i02sfb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>