  1. controller
  2. CONTROLLER-1558

Routed RPCs in cluster break after isolation/heal


Details

    • Bug
    • Status: Resolved
    • Resolution: Done
    • None
    • None
    • clustering
    • None
    • Operating System: All
      Platform: All

    • 6937

    Description

      If a routed RPC is registered on one node in a cluster, calls to it are routed to that node from any other cluster node (using restconf-rpc).
      After isolation and heal (of either the leader or a follower), however, routing breaks. The most common result is that one of the surviving nodes, one that never owned the service, is unable to deliver the RPC call, while the other two work.
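
To see which members still deliver the call after the heal, the same RPC can be invoked against every node in turn. The following is a minimal sketch only, not taken from this report: it assumes the default RESTCONF port 8181 and the `/restconf/operations/<module>:<rpc>` URL layout, the module name `base-endpoint` and the host names `node1`..`node3` are hypothetical, and authentication and the RPC input payload are left out.

```python
import urllib.error
import urllib.request


def operations_url(host, module, rpc, port=8181):
    """Build the RESTCONF operations URL (port and path layout are assumptions)."""
    return "http://%s:%d/restconf/operations/%s:%s" % (host, port, module, rpc)


def http_post_status(url, payload=b"", timeout=10):
    """POST to the URL and return the HTTP status code.

    Connection failures (urllib.error.URLError) are not caught here, so an
    unreachable node raises instead of being reported as a routing failure.
    """
    req = urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/xml"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as operation-not-supported) end up here.
        return err.code


def probe_cluster(hosts, module, rpc, invoke=http_post_status):
    """Return {host: True if the RPC call got a 2xx answer from that node}."""
    return {
        host: 200 <= invoke(operations_url(host, module, rpc)) < 300
        for host in hosts
    }
```

Calling `probe_cluster(["node1", "node2", "node3"], "base-endpoint", "register-endpoint")` after the heal would then show which member fails. The `invoke` parameter exists so that the HTTP call can be swapped out, for example to add credentials.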

      Restconf output:
      <errors xmlns="urn:ietf:params:xml:ns:yang:ietf-restconf"><error><error-type>application</error-type><error-tag>operation-not-supported</error-tag><error-message>Rpc implementation for {} was removed during processing.</error-message></error></errors>

      or

      <errors xmlns="urn:ietf:params:xml:ns:yang:ietf-restconf"><error><error-type>application</error-type><error-tag>operation-not-supported</error-tag><error-message>No local or remote implementation available for rpc AbsoluteSchemaPath{path=[(urn:opendaylight:groupbasedpolicy:base_endpoint?revision=2016-04-27)register-endpoint]}</error-message></error></errors>
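
Both responses above carry the `operation-not-supported` error tag, so a test script could detect the broken node by parsing the error body. This is a rough sketch using only the Python standard library; the function names are illustrative and not part of any OpenDaylight API.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the error bodies quoted above.
RESTCONF_NS = "urn:ietf:params:xml:ns:yang:ietf-restconf"


def parse_restconf_errors(body):
    """Return a list of (error-type, error-tag, error-message) tuples."""
    ns = {"rc": RESTCONF_NS}
    root = ET.fromstring(body)
    results = []
    for err in root.findall("rc:error", ns):
        fields = []
        for name in ("error-type", "error-tag", "error-message"):
            node = err.find("rc:" + name, ns)
            # A missing or empty element is reported as an empty string.
            fields.append((node.text or "").strip() if node is not None else "")
        results.append(tuple(fields))
    return results


def is_routing_failure(body):
    """True if any error in the body is tagged operation-not-supported."""
    return any(tag == "operation-not-supported"
               for _, tag, _ in parse_restconf_errors(body))


sample = (
    '<errors xmlns="urn:ietf:params:xml:ns:yang:ietf-restconf"><error>'
    '<error-type>application</error-type>'
    '<error-tag>operation-not-supported</error-tag>'
    '<error-message>Rpc implementation for {} was removed during processing.'
    '</error-message></error></errors>'
)
print(is_routing_failure(sample))
```

The check only matches on the error tag; the message text differs between the two observed variants and is returned unchanged for logging.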

      Tested on a 3-node cluster, branch master (mvn -U @ 2016-10-13).
      Note: while node#3 was isolated, we got this at node#2 (and after the heal, node#3 was broken):
      2016-10-13 09:41:06,842 | DEBUG | lt-dispatcher-18 | QuarantinedMonitorActor | 212 - org.opendaylight.controller.sal-clustering-commons - 1.5.0.SNAPSHOT | received AssociationErrorEvent
      akka.remote.EndpointAssociationException: Association failed with [akka.tcp://opendaylight-cluster-data@10.25.2.13:2550]
      Caused by: java.util.concurrent.TimeoutException: No response from remote for outbound association. Associate timed out after [15000 ms].
      at akka.remote.transport.ProtocolStateActor$$anonfun$2.applyOrElse(AkkaProtocolTransport.scala:362)[210:com.typesafe.akka.remote:2.4.7]
      at akka.remote.transport.ProtocolStateActor$$anonfun$2.applyOrElse(AkkaProtocolTransport.scala:336)[210:com.typesafe.akka.remote:2.4.7]
      at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
      at akka.actor.FSM$class.processEvent(FSM.scala:662)[200:com.typesafe.akka.actor:2.4.7]
      at akka.remote.transport.ProtocolStateActor.processEvent(AkkaProtocolTransport.scala:283)[210:com.typesafe.akka.remote:2.4.7]
      at akka.actor.FSM$class.akka$actor$FSM$$processMsg(FSM.scala:656)[200:com.typesafe.akka.actor:2.4.7]
      at akka.actor.FSM$$anonfun$receive$1.applyOrElse(FSM.scala:628)[200:com.typesafe.akka.actor:2.4.7]
      at akka.actor.Actor$class.aroundReceive(Actor.scala:484)[200:com.typesafe.akka.actor:2.4.7]
      at akka.remote.transport.ProtocolStateActor.aroundReceive(AkkaProtocolTransport.scala:283)[210:com.typesafe.akka.remote:2.4.7]
      at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)[200:com.typesafe.akka.actor:2.4.7]
      at akka.actor.ActorCell.invoke(ActorCell.scala:495)[200:com.typesafe.akka.actor:2.4.7]
      at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)[200:com.typesafe.akka.actor:2.4.7]
      at akka.dispatch.Mailbox.run(Mailbox.scala:224)[200:com.typesafe.akka.actor:2.4.7]
      at akka.dispatch.Mailbox.exec(Mailbox.scala:234)[200:com.typesafe.akka.actor:2.4.7]
      at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
      at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
      at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]
      at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)[196:org.scala-lang.scala-library:2.11.8.v20160304-115712-1706a37eb8]

          People

            tcere Tomas Cere
            michal.rehak Michal Rehak