<!-- 
RSS generated by JIRA (8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d) at Wed Feb 07 19:55:51 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>OpenDaylight JIRA</title>
    <link>https://jira.opendaylight.org</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>8.20.10</version>
        <build-number>820010</build-number>
        <build-date>22-06-2022</build-date>
    </build-info>


<item>
            <title>[CONTROLLER-1558] Routed RPCs in cluster breaks after isolation/heal</title>
                <link>https://jira.opendaylight.org/browse/CONTROLLER-1558</link>
                <project id="10113" key="CONTROLLER">controller</project>
                    <description>&lt;p&gt;If a routed RPC is registered on one node in the cluster, it is routed to that node from any other cluster node (using restconf-rpc).&lt;br/&gt;
But after isolation and heal (this applies to both leader and follower) the routing breaks. The most common result is that one of the surviving nodes, which never owned the service, is unable to deliver the RPC call while the other two work. &lt;/p&gt;

&lt;p&gt;Restconf output:&lt;br/&gt;
&amp;lt;errors xmlns=&quot;urn:ietf:params:xml:ns:yang:ietf-restconf&quot;&amp;gt;&amp;lt;error&amp;gt;&amp;lt;error-type&amp;gt;application&amp;lt;/error-type&amp;gt;&amp;lt;error-tag&amp;gt;operation-not-supported&amp;lt;/error-tag&amp;gt;&amp;lt;error-message&amp;gt;Rpc implementation for {} was removed during processing.&amp;lt;/error-message&amp;gt;&amp;lt;/error&amp;gt;&amp;lt;/errors&amp;gt;&lt;/p&gt;

&lt;p&gt;or&lt;/p&gt;

&lt;p&gt;&amp;lt;errors xmlns=&quot;urn:ietf:params:xml:ns:yang:ietf-restconf&quot;&amp;gt;&amp;lt;error&amp;gt;&amp;lt;error-type&amp;gt;application&amp;lt;/error-type&amp;gt;&amp;lt;error-tag&amp;gt;operation-not-supported&amp;lt;/error-tag&amp;gt;&amp;lt;error-message&amp;gt;No local or remote implementation available for rpc AbsoluteSchemaPath{path=[(urn:opendaylight:groupbasedpolicy:base_endpoint?revision=2016-04-27)register-endpoint]}&amp;lt;/error-message&amp;gt;&amp;lt;/error&amp;gt;&amp;lt;/errors&amp;gt;&lt;/p&gt;



&lt;p&gt;Tested on a 3-node cluster, branch: master (mvn -U @ 2016-10-13).&lt;br/&gt;
Note: while node#3 was isolated we got this on node#2 (and after the heal node#3 was broken):&lt;br/&gt;
2016-10-13 09:41:06,842 | DEBUG | lt-dispatcher-18 | QuarantinedMonitorActor          | 212 - org.opendaylight.controller.sal-clustering-commons - 1.5.0.SNAPSHOT | received AssociationErrorEvent&lt;br/&gt;
akka.remote.EndpointAssociationException: Association failed with &lt;span class=&quot;error&quot;&gt;&amp;#91;akka.tcp://opendaylight-cluster-data@10.25.2.13:2550&amp;#93;&lt;/span&gt;&lt;br/&gt;
Caused by: java.util.concurrent.TimeoutException: No response from remote for outbound association. Associate timed out after &lt;span class=&quot;error&quot;&gt;&amp;#91;15000 ms&amp;#93;&lt;/span&gt;.&lt;br/&gt;
        at akka.remote.transport.ProtocolStateActor$$anonfun$2.applyOrElse(AkkaProtocolTransport.scala:362)&lt;span class=&quot;error&quot;&gt;&amp;#91;210:com.typesafe.akka.remote:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.remote.transport.ProtocolStateActor$$anonfun$2.applyOrElse(AkkaProtocolTransport.scala:336)&lt;span class=&quot;error&quot;&gt;&amp;#91;210:com.typesafe.akka.remote:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)&lt;span class=&quot;error&quot;&gt;&amp;#91;196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.actor.FSM$class.processEvent(FSM.scala:662)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.remote.transport.ProtocolStateActor.processEvent(AkkaProtocolTransport.scala:283)&lt;span class=&quot;error&quot;&gt;&amp;#91;210:com.typesafe.akka.remote:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.actor.FSM$class.akka$actor$FSM$$processMsg(FSM.scala:656)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:628)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.actor.Actor$class.aroundReceive(Actor.scala:484)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.remote.transport.ProtocolStateActor.aroundReceive(AkkaProtocolTransport.scala:283)&lt;span class=&quot;error&quot;&gt;&amp;#91;210:com.typesafe.akka.remote:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.actor.ActorCell.invoke(ActorCell.scala:495)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at akka.dispatch.Mailbox.exec(Mailbox.scala:234)&lt;span class=&quot;error&quot;&gt;&amp;#91;200:com.typesafe.akka.actor:2.4.7&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)&lt;span class=&quot;error&quot;&gt;&amp;#91;196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)&lt;span class=&quot;error&quot;&gt;&amp;#91;196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)&lt;span class=&quot;error&quot;&gt;&amp;#91;196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8&amp;#93;&lt;/span&gt;&lt;br/&gt;
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)&lt;span class=&quot;error&quot;&gt;&amp;#91;196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8&amp;#93;&lt;/span&gt;&lt;/p&gt;</description>
                <environment>&lt;p&gt;Operating System: All&lt;br/&gt;
Platform: All&lt;/p&gt;</environment>
        <key id="26112">CONTROLLER-1558</key>
            <summary>Routed RPCs in cluster breaks after isolation/heal</summary>
                <type id="10104" iconUrl="https://jira.opendaylight.org/secure/viewavatar?size=xsmall&amp;avatarId=10303&amp;avatarType=issuetype">Bug</type>
                                                <status id="5" iconUrl="https://jira.opendaylight.org/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="green"/>
                                    <resolution id="10000">Done</resolution>
                                        <assignee username="tcere">Tomas Cere</assignee>
                                    <reporter username="michal.rehak">Michal Rehak</reporter>
                        <labels>
                    </labels>
                <created>Thu, 13 Oct 2016 14:05:29 +0000</created>
                <updated>Tue, 25 Jul 2023 08:24:14 +0000</updated>
                            <resolved>Wed, 8 Mar 2017 19:20:40 +0000</resolved>
                                                                    <component>clustering</component>
                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                                                                <comments>
                            <comment id="51587" author="mirehak@cisco.com" created="Thu, 13 Oct 2016 14:05:29 +0000"  >&lt;p&gt;Attachment cluster-isolationX2-20161013.zip has been added with description: logs, scenario overview, restconf outputs&lt;/p&gt;</comment>
                            <comment id="51579" author="mirehak@cisco.com" created="Tue, 18 Oct 2016 16:13:45 +0000"  >&lt;p&gt;What happened in a nutshell:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;started a 3-node cluster&lt;/li&gt;
	&lt;li&gt;registered a routed RPC on node#3 (election winner for DomainSpecificRegistryInstance application owner)&lt;br/&gt;
-! invoked the RPC provided by that service through RESTCONF on each node - 3x success&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;isolated node#3&lt;br/&gt;
-! invoked the same rpc in the same way - nodes #1,#2 = success, got timeout for node#3&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;cluster healed - node#3 rejoined&lt;br/&gt;
-! invoked the same rpc in the same way - nodes #1,#2 = success, but for node#3 got this:&lt;br/&gt;
&amp;lt;errors xmlns=&quot;urn:ietf:params:xml:ns:yang:ietf-restconf&quot;&amp;gt;&amp;lt;error&amp;gt;&amp;lt;error-type&amp;gt;application&amp;lt;/error-type&amp;gt;&amp;lt;error-tag&amp;gt;operation-not-supported&amp;lt;/error-tag&amp;gt;&amp;lt;error-message&amp;gt;No local or remote implementation available for rpc AbsoluteSchemaPath{path=[(urn:opendaylight:groupbasedpolicy:base_endpoint?revision=2016-04-27)register-endpoint]}&amp;lt;/error-message&amp;gt;&amp;lt;/error&amp;gt;&amp;lt;/errors&amp;gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Further attempts then returned the response &quot;Rpc implementation for {} was removed during processing.&quot;&lt;/p&gt;</comment>
                            <comment id="51588" author="mirehak@cisco.com" created="Tue, 18 Oct 2016 16:27:37 +0000"  >&lt;p&gt;Attachment cluster-isolationX2-20161018.zip has been added with description: logs, scenario overview, restconf outputs + DEBUG remoterpc&lt;/p&gt;</comment>
                            <comment id="51580" author="rovarga" created="Fri, 13 Jan 2017 15:52:09 +0000"  >&lt;p&gt;Is this still reproducible on current Carbon? The reported exception seems to be impossible with the current codebase (and if it does occur, it points to a classpath problem).&lt;/p&gt;</comment>
                            <comment id="51581" author="mirehak@cisco.com" created="Tue, 17 Jan 2017 10:53:43 +0000"  >&lt;p&gt;This is still broken on current master (mvn -U @ 20170117 08:00 UTC).&lt;/p&gt;</comment>
                            <comment id="51582" author="rovarga" created="Tue, 17 Jan 2017 21:00:33 +0000"  >&lt;p&gt;I think this will be addressed with the patch in BUG-3128, at least partially.&lt;/p&gt;

&lt;p&gt;The first RESTCONF output points to a sal-remoterpc-connector routing loop, i.e. the RPC request is being invoked on a remote node, but that node tracks that RPC as remote &amp;#8211; as if the surviving node is pointing to a previous (but not current) owner. Patch to correct the format string: &lt;a href=&quot;https://git.opendaylight.org/gerrit/50574&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/50574&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The second output points to a failure to find a router for something we are registered for, but which is no longer present in RpcRegistry&apos;s RoutingTable. This codepath is eliminated in BUG-3128.&lt;/p&gt;

&lt;p&gt;The logs are pointing towards akka cluster not reforming (failing to associate). We need to reproduce this in CSIT and make sure akka cluster is working as expected.&lt;/p&gt;

&lt;p&gt;A thread dump could be useful (maybe there is something stuck).&lt;/p&gt;

&lt;p&gt;Is the data store working okay?&lt;/p&gt;</comment>
                            <comment id="51583" author="mirehak@cisco.com" created="Thu, 19 Jan 2017 16:26:54 +0000"  >&lt;p&gt;Yes, the data store worked, although I expected not to get any response from the isolated node.&lt;/p&gt;</comment>
                            <comment id="51584" author="rovarga" created="Thu, 26 Jan 2017 17:56:43 +0000"  >&lt;p&gt;master: &lt;a href=&quot;https://git.opendaylight.org/gerrit/51079&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://git.opendaylight.org/gerrit/51079&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="51585" author="tcere" created="Fri, 27 Jan 2017 10:13:03 +0000"  >&lt;p&gt;The above patch should fix the isolated node not having any RPCs registered after the cluster heals. There is still an issue where one of the nodes in the cluster loses remotely registered RPCs once the cluster is healed; this will need further analysis and most likely some additional logging in Gossiper.&lt;/p&gt;

&lt;p&gt;The EndpointAssociationException in the first post is harmless; all it actually says is that the node cannot communicate with a cluster member, which is expected on node isolation.&lt;/p&gt;</comment>
                            <comment id="51586" author="tcere" created="Fri, 10 Feb 2017 13:44:18 +0000"  >&lt;p&gt;Seems to be fixed now according to CSIT. &lt;a href=&quot;https://jenkins.opendaylight.org/releng/view/controller/job/controller-csit-3node-clustering-only-carbon/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://jenkins.opendaylight.org/releng/view/controller/job/controller-csit-3node-clustering-only-carbon/&lt;/a&gt;&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="13592" name="cluster-isolationX2-20161013.zip" size="85351" author="michal.rehak" created="Thu, 13 Oct 2016 14:05:29 +0000"/>
                            <attachment id="13593" name="cluster-isolationX2-20161018.zip" size="98633" author="michal.rehak" created="Tue, 18 Oct 2016 16:27:37 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                            <customfield id="customfield_11400" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                        <customfield id="customfield_10208" key="com.atlassian.jira.plugin.system.customfieldtypes:textfield">
                        <customfieldname>External issue ID</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>6937</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10201" key="com.atlassian.jira.plugin.system.customfieldtypes:url">
                        <customfieldname>External issue URL</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue><![CDATA[https://bugs.opendaylight.org/show_bug.cgi?id=6937]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10206" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Issue Type</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10300"><![CDATA[Bug]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10204" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>ODL SR Target Milestone</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10344"><![CDATA[Boron-3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10000" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>0|i02rbb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>