[CONTROLLER-1583] sal-remoterpc-connector: install remote death watch Created: 17/Jan/17  Updated: 25/Jul/23  Resolved: 02/Feb/17

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Robert Varga Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
is blocked by CONTROLLER-1292 Clustering : If only one member have ... Resolved
External issue ID: 7573

 Description   

Gossiper exchanges data with its peers, caching the buckets. When a node leaves the cluster, it now (after BUG-3128) removes the buckets for disconnected nodes.

It does not handle the case of a hung or terminated peer Gossiper, in which case it will leave (and propagate) stale Buckets.

To handle this case, Gossiper needs to install a remote death watch (http://doc.akka.io/docs/akka/2.4/scala/remoting.html#Watching_Remote_Actors) so that it is notified when the advertising actor dies. When such an event occurs, it needs to invalidate the corresponding remote bucket and send a corresponding message to RpcRegistry.

That way remote RPCs will be correctly unregistered, and any requests to those RPCs will fail fast instead of timing out.



 Comments   
Comment by Robert Varga [ 17/Jan/17 ]

As it turns out, DeathWatch needs an ActorRef, which points towards BucketStore/RpcRegistry as the place to do the actual monitoring.

RoutingTable (i.e. RpcRegistry-level logic) already contains an ActorRef which is useful to monitor. Hence the BucketStore should perform monitoring based on information provided via a common interface (BucketData extends Copier).

When DeathWatch triggers, the normal bucket removal operation can be performed.
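
A rough sketch of the bookkeeping described above, in plain Java without Akka, may help illustrate the idea. It is not the actual patch. WatchableData, RoutingTableModel, BucketStoreModel and all member/actor names are hypothetical stand-ins; a String replaces ActorRef, and onTerminated(String) stands in for handling akka.actor.Terminated after getContext().watch(ref).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the common interface: bucket data may name an
// actor worth monitoring. A String replaces akka.actor.ActorRef here.
interface WatchableData {
    Optional<String> watchActor();
}

// Hypothetical RoutingTable-like payload: the advertising actor plus its RPCs.
final class RoutingTableModel implements WatchableData {
    private final String invoker;
    final List<String> rpcs;

    RoutingTableModel(String invoker, List<String> rpcs) {
        this.invoker = invoker;
        this.rpcs = Collections.unmodifiableList(new ArrayList<>(rpcs));
    }

    @Override
    public Optional<String> watchActor() {
        return Optional.ofNullable(invoker);
    }
}

// Models only the bookkeeping. Real code would call getContext().watch(ref)
// and react to akka.actor.Terminated instead of onTerminated(String).
final class BucketStoreModel<T extends WatchableData> {
    // member address -> cached remote bucket data
    private final Map<String, T> remoteBuckets = new HashMap<>();
    // watched actor -> member address that advertised it
    private final Map<String, String> watched = new HashMap<>();
    // stands in for the removal messages sent on to RpcRegistry
    final List<String> removalsReported = new ArrayList<>();

    void updateRemoteBucket(String member, T data) {
        // Forget whatever actor was previously watched for this member.
        watched.values().removeIf(member::equals);
        remoteBuckets.put(member, data);
        data.watchActor().ifPresent(actor -> watched.put(actor, member));
    }

    void onTerminated(String actor) {
        // Same path as normal bucket removal, triggered by the death watch.
        String member = watched.remove(actor);
        if (member != null && remoteBuckets.remove(member) != null) {
            removalsReported.add(member);
        }
    }

    boolean hasBucket(String member) {
        return remoteBuckets.containsKey(member);
    }
}

final class DeathWatchSketch {
    public static void main(String[] args) {
        BucketStoreModel<RoutingTableModel> store = new BucketStoreModel<>();
        store.updateRemoteBucket("member-2",
            new RoutingTableModel("member-2/rpc-invoker", Arrays.asList("get-foo")));
        store.onTerminated("member-2/rpc-invoker");
        System.out.println(store.hasBucket("member-2")); // prints false
    }
}
```

The point of keeping the watched-actor lookup inside the store is that the store itself never needs to understand the bucket contents; it only asks the data for an actor to monitor.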

Comment by Robert Varga [ 18/Jan/17 ]

master: https://git.opendaylight.org/gerrit/50585

Comment by Robert Varga [ 02/Feb/17 ]

boron: https://git.opendaylight.org/gerrit/51258

Generated at Wed Feb 07 19:55:55 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.