[CONTROLLER-1667] Failure in singleton isolation longevity test Created: 11/May/17  Updated: 25/Jul/23

Status: Confirmed
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Vratko Polak Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
blocks CONTROLLER-1645 shard moved during 1M bgp prefix adve... Confirmed
External issue ID: 8420

 Description   

The recent run [0] has failed; this time it really looks like a bug in ODL.

One scenario iteration looks like this [1], but the fourth iteration failed [2]: the rejoining member-1 was not reporting the value from the new singleton instance (on member-2).

Looking at the karaf.log of member-1 [3], I see the sequence below.
My current hypothesis is that when the isolated leader rejoins and learns the new state of entity ownership, the corresponding data tree change notification is (lost and) not re-generated, so the rejoining member does not know it should close its singleton instance (an illustrative sketch of that lifecycle is at the end of this description).
(It should probably close it already after detecting its isolated status, but that is another bug, one not tested by this suite.)

2017-05-10 19:07:36,747 | WARN | lt-dispatcher-20 | ShardDataTree | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | member-1-shard-entity-ownership-operational: Current transaction member-1-entity-ownership-internal-fe-0-txn-37-0 has timed out after 15000 ms in state COMMIT_PENDING
2017-05-10 19:07:36,747 | WARN | lt-dispatcher-20 | ShardDataTree | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | member-1-shard-entity-ownership-operational: Transaction member-1-entity-ownership-internal-fe-0-txn-37-0 is still committing, cannot abort
2017-05-10 19:07:37,689 | INFO | lt-dispatcher-22 | EntityOwnershipShard | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.0.SNAPSHOT | member-1-shard-entity-ownership-operational (IsolatedLeader): Term 21 in "AppendEntriesReply [term=21, success=false, followerId=member-3-shard-entity-ownership-operational, logLastIndex=60, logLastTerm=21, forceInstallSnapshot=false, payloadVersion=5, raftVersion=3]" message is greater than leader's term 20 - switching to Follower
2017-05-10 19:07:37,689 | INFO | lt-dispatcher-22 | EntityOwnershipShard | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.0.SNAPSHOT | member-1-shard-entity-ownership-operational (IsolatedLeader) :- Switching from behavior IsolatedLeader to Follower, election term: 21
2017-05-10 19:07:37,690 | INFO | ult-dispatcher-6 | RoleChangeNotifier | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.0.SNAPSHOT | RoleChangeNotifier for member-1-shard-entity-ownership-operational , received role change from IsolatedLeader to Follower
2017-05-10 19:07:37,690 | INFO | ult-dispatcher-2 | ShardManager | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | shard-manager-operational: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-1-shard-entity-ownership-operational, leaderId=null, leaderPayloadVersion=-1]
2017-05-10 19:07:37,690 | INFO | ult-dispatcher-2 | ShardManager | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | shard-manager-operational: Received role changed for member-1-shard-entity-ownership-operational from IsolatedLeader to Follower
2017-05-10 19:07:37,846 | INFO | lt-dispatcher-17 | Shard | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.0.SNAPSHOT | member-1-shard-inventory-config (Candidate): Cannot append entries because sender's term 6 is less than 20
2017-05-10 19:07:37,876 | INFO | ult-dispatcher-2 | ShardManager | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | shard-manager-operational: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-1-shard-entity-ownership-operational, leaderId=member-3-shard-entity-ownership-operational, leaderPayloadVersion=5]
2017-05-10 19:07:37,880 | INFO | ult-dispatcher-7 | ShardManager | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | shard-manager-operational Received follower initial sync status for member-1-shard-entity-ownership-operational status sync done false
2017-05-10 19:07:37,880 | INFO | lt-dispatcher-17 | EntityOwnershipShard | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.0.SNAPSHOT | member-1-shard-entity-ownership-operational (Follower): Removing entries from log starting at 55
2017-05-10 19:07:37,882 | INFO | lt-dispatcher-17 | ShardManager | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | shard-manager-operational Received follower initial sync status for member-1-shard-entity-ownership-operational status sync done true

[0] https://jenkins.opendaylight.org/releng/job/controller-csit-3node-cs-partnheal-longevity-only-carbon/4/
[1] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-partnheal-longevity-only-carbon/4/archives/log.html.gz#s1-t1-k3-k1-k1-k1-k1-k1-k1-k1-k1-k1
[2] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-partnheal-longevity-only-carbon/4/archives/log.html.gz#s1-t1-k3-k1-k1-k1-k1-k1-k1-k2-k1-k1-k5-k3-k1-k2
[3] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-partnheal-longevity-only-carbon/4/archives/odl1_karaf.log.gz
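
My hypothesis above concerns the ClusterSingletonService lifecycle. The snippet below is only a minimal sketch of that contract as I remember it from mdsal (class name, group name and exact signatures are mine, and it is not the actual get-singleton-constant test code); it just illustrates why a lost ownership-change notification would leave the old instance running.

import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;
import org.opendaylight.mdsal.singleton.common.api.ClusterSingletonService;
import org.opendaylight.mdsal.singleton.common.api.ServiceGroupIdentifier;

// Illustration only, not the real test service.
public final class ExampleSingletonService implements ClusterSingletonService {
    private static final ServiceGroupIdentifier IDENTIFIER =
            ServiceGroupIdentifier.create("example-service-group");

    @Override
    public void instantiateServiceInstance() {
        // Called when this member becomes the owner; the test service would
        // register its get-singleton-constant RPC implementation here.
    }

    @Override
    public ListenableFuture<Void> closeServiceInstance() {
        // Called only when the provider learns that ownership moved away.
        // If the entity-ownership change notification is lost on rejoin,
        // this is never invoked and the stale instance keeps reporting
        // the old value, which matches the observed failure.
        return Futures.immediateFuture(null);
    }

    @Override
    public ServiceGroupIdentifier getIdentifier() {
        return IDENTIFIER;
    }
}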



 Comments   
Comment by Vratko Polak [ 11/May/17 ]

> change notification is (lost and) not re-generated

On second thought, that would cause a failure in previous iterations as well.

So I now suspect the key is the "is still committing, cannot abort" message, hinting at something getting stuck.

Comment by Tom Pantelis [ 11/May/17 ]

I assume all those Jenkins links will disappear soon. I would suggest not linking to Jenkins jobs but copying/pasting/attaching all the info to the bug and also explaining exactly what the test does, unless you or a colleague of yours intends to look into this right away.

Comment by Tomas Cere [ 12/May/17 ]

Checked this out from a sandbox run that has debug logs enabled:
https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-cs-partnheal-longevity-only-carbon/2/archives/log.html.gz#s1-t1-k3-k1-k1-k1-k1-k1-k1-k2-k1-k1-k5

Seems like the same cause as in bug 4830: member-1 is briefly unreachable from member-3, which causes the singleton service to instantiate the get-singleton-constant RPC.

Comment by Vratko Polak [ 16/May/17 ]

The job also failed this week [0] (this time after 12 iterations), and karaf.log [1] contains many messages; it is not clear which unreachability events are due to the isolation scenario we are testing and which are the additional ones causing this bug.

On the test side, we can use the tell-based protocol and write messages about isolation (and rejoin) into karaf.log (see the sketch at the end of this comment).

[0] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-partnheal-longevity-only-carbon/5/archives/log.html.gz#s1-t1-k3-k1-k1-k1-k1-k1-k1-k2-k1-k1-k5-k3-k1-k2
[1] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-partnheal-longevity-only-carbon/5/archives/odl1_karaf.log.gz
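
To make the proposal concrete, it would look roughly like this (option names, file paths and the exact karaf command are from memory, so treat them as assumptions to verify):

# etc/org.opendaylight.controller.cluster.datastore.cfg on each member:
use-tell-based-protocol=true

# and from the karaf console around isolating / rejoining a member,
# so that a marker line ends up in karaf.log:
log:log "ROBOT MESSAGE: Isolating node1."
log:log "ROBOT MESSAGE: Rejoining node1."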

Comment by Peter Gubka [ 16/May/17 ]

The same test with failure-detector.acceptable-heartbeat-pause = 5 s had the same results.

https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-cs-partnheal-longevity-only-carbon/1/archives/log.html.gz#s1-t1-k3-k1-k1-k1-k1-k1-k1-k2
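
For the record, the knob being tuned here is the Akka cluster failure detector setting; in our deployment it should correspond to something like the following fragment in each member's akka.conf (the exact file location varies per install, so take this as a sketch):

akka {
  cluster {
    failure-detector {
      # how long to tolerate missing heartbeats before marking a member unreachable
      acceptable-heartbeat-pause = 5 s
    }
  }
}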

Comment by Peter Gubka [ 17/May/17 ]

The same test with failure-detector.acceptable-heartbeat-pause = 10 s again had the same result.

https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-cs-partnheal-longevity-only-carbon/3/archives/log.html.gz#s1-t1-k3-k1-k1-k1-k1-k1-k1-k2-k1-k1-k5-k3-k1-k2

Comment by Vratko Polak [ 18/May/17 ]

This just happened [9] in a functional (as opposed to longevity) test which was running with the tell-based protocol.

[9] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/694/archives/log.html.gz#s1-s42-t5-k2-k3-k1-k2

Comment by Tomas Cere [ 02/Jun/17 ]

With additional debug logging enabled, this looks like very weird behavior in Akka.

The rejoin of node1 seems to trigger an unreachable event for node2.

2017-05-31 12:57:31,375 | INFO | h for user karaf | command | 46 - org.apache.karaf.log.command - 3.0.8 | ROBOT MESSAGE: Rejoining node1.

First we get a couple of heartbeats where we don't get a response from member-1:

2017-05-31 12:57:31,920 | DEBUG | lt-dispatcher-37 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550]
2017-05-31 12:57:31,920 | DEBUG | lt-dispatcher-37 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:31,921 | DEBUG | lt-dispatcher-38 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat response from [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:32,919 | DEBUG | lt-dispatcher-60 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550]
2017-05-31 12:57:32,920 | DEBUG | lt-dispatcher-60 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:32,921 | DEBUG | lt-dispatcher-30 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat response from [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:33,919 | DEBUG | lt-dispatcher-57 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550]
2017-05-31 12:57:33,920 | DEBUG | lt-dispatcher-57 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:33,921 | DEBUG | lt-dispatcher-60 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat response from [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:34,920 | DEBUG | ult-dispatcher-3 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550]
2017-05-31 12:57:34,920 | DEBUG | ult-dispatcher-3 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:34,922 | DEBUG | lt-dispatcher-37 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat response from [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]

Then we finally get a heartbeat response from member-1:

2017-05-31 12:57:35,919 | DEBUG | ult-dispatcher-3 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550]
2017-05-31 12:57:35,920 | DEBUG | ult-dispatcher-3 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat to [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:35,921 | DEBUG | lt-dispatcher-30 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat response from [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550]
2017-05-31 12:57:35,922 | DEBUG | lt-dispatcher-30 | ClusterHeartbeatSender | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Heartbeat response from [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550]
2017-05-31 12:57:36,550 | INFO | lt-dispatcher-59 | kka://opendaylight-cluster-data) | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Marking node(s) as REACHABLE [Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.4:2550, status = Up)]. Node roles [member-3]

And then we receive a gossip message from member-1, which seems to trigger the following message and an unreachable event for member-2:

2017-05-31 12:57:36,967 | DEBUG | ult-dispatcher-3 | ClusterCoreDaemon | 201 - com.typesafe.akka.slf4j - 2.4.17 | Cluster Node [akka.tcp://opendaylight-cluster-data@172.17.0.6:2550] - Receiving gossip from [UniqueAddress(akka.tcp://opendaylight-cluster-data@172.17.0.4:2550,20791887)]
2017-05-31 12:57:36,969 | DEBUG | ult-dispatcher-3 | ClusterCoreDaemon | 201 - com.typesafe.akka.slf4j - 2.4.17 | Couldn't establish a causal relationship between "remote" gossip and "local" gossip - Remote[Gossip(members = [Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.4:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.5:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.6:2550, status = Up)], overview = GossipOverview(reachability = [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550 -> akka.tcp://opendaylight-cluster-data@172.17.0.5:2550: Unreachable [Unreachable] (7), akka.tcp://opendaylight-cluster-data@172.17.0.4:2550 -> akka.tcp://opendaylight-cluster-data@172.17.0.6:2550: Reachable [Reachable] (9)], seen = [UniqueAddress(akka.tcp://opendaylight-cluster-data@172.17.0.4:2550,20791887)]), version = VectorClock(950de35185ba8f3de15d6e342fd7dbf5 -> 11, ed3be0f6d0835e2fa024f22583639b6c -> 6, f0c01f773a4236a03ed6299dcab410f6 -> 4))] - Local[Gossip(members = [Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.4:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.5:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.6:2550, status = Up)], overview = GossipOverview(reachability = [akka.tcp://opendaylight-cluster-data@172.17.0.5:2550 -> akka.tcp://opendaylight-cluster-data@172.17.0.4:2550: Unreachable [Unreachable] (7)], seen = [UniqueAddress(akka.tcp://opendaylight-cluster-data@172.17.0.6:2550,-602086493), UniqueAddress(akka.tcp://opendaylight-cluster-data@172.17.0.5:2550,-1976714323)]), version = VectorClock(950de35185ba8f3de15d6e342fd7dbf5 -> 9, ed3be0f6d0835e2fa024f22583639b6c -> 7, f0c01f773a4236a03ed6299dcab410f6 -> 6))] - merged them into [Gossip(members = [Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.4:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.5:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@172.17.0.6:2550, status = Up)], overview = GossipOverview(reachability = [akka.tcp://opendaylight-cluster-data@172.17.0.4:2550 -> akka.tcp://opendaylight-cluster-data@172.17.0.5:2550: Unreachable [Unreachable] (7), akka.tcp://opendaylight-cluster-data@172.17.0.4:2550 -> akka.tcp://opendaylight-cluster-data@172.17.0.6:2550: Reachable [Reachable] (9), akka.tcp://opendaylight-cluster-data@172.17.0.5:2550 -> akka.tcp://opendaylight-cluster-data@172.17.0.4:2550: Unreachable [Unreachable] (7)], seen = []), version = VectorClock(950de35185ba8f3de15d6e342fd7dbf5 -> 11, ed3be0f6d0835e2fa024f22583639b6c -> 7, f0c01f773a4236a03ed6299dcab410f6 -> 6))]
2017-05-31 12:57:36,969 | INFO | ult-dispatcher-3 | ShardManager | 226 - org.opendaylight.controller.sal-distributed-datastore - 1.5.1.SNAPSHOT | Received UnreachableMember: memberName MemberName{name=member-2}, address: akka.tcp://opendaylight-cluster-data@172.17.0.5:2550
2017-05-31 12:57:36,969 | INFO | lt-dispatcher-60 | ShardManager

Overall this seems very strange and is possibly a bug on the Akka side, since there are no missed heartbeats from either member in this case.

Comment by Vratko Polak [ 25/Jul/17 ]

> And then we receive a gossip message from member-1, which seems to trigger
> the following message and an unreachable event for member-2

This is the same phenomenon as in CONTROLLER-1670 [10].

> First we get a couple of heartbeats where we don't get a response from member-1

This is additional suspicious behavior; it does not happen in CONTROLLER-1670.
Thus it might be easier to focus on CONTROLLER-1670, hoping that fixing it will also fix this issue.

Here is the critical part of the huge karaf log [11] which made the suite detect leader movement [12] on the Sandbox:

2017-07-25 12:24:39,103 | DEBUG | lt-dispatcher-20 | EndpointWriter | 174 - com.typesafe.akka.slf4j - 2.4.18 | received local message RemoteMessage: [ActorSelectionMessage(akka.cluster.GossipEnvelope@350706c1,Vector(system, cluster, core, daemon),false)] to [Actor[akka://opendaylight-cluster-data/]]<+[akka://opendaylight-cluster-data/] from [Actor[akka.tcp://opendaylight-cluster-data@10.29.15.173:2550/system/cluster/core/daemon#-465045840]()]
2017-07-25 12:24:39,104 | DEBUG | lt-dispatcher-20 | ClusterCoreDaemon | 174 - com.typesafe.akka.slf4j - 2.4.18 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.29.12.156:2550] - Receiving gossip from [UniqueAddress(akka.tcp://opendaylight-cluster-data@10.29.15.173:2550,-1961660878)]
2017-07-25 12:24:39,105 | DEBUG | lt-dispatcher-19 | ClusterCoreDaemon | 174 - com.typesafe.akka.slf4j - 2.4.18 | Couldn't establish a causal relationship between "remote" gossip and "local" gossip - Remote[Gossip(members = [Member(address = akka.tcp://opendaylight-cluster-data@10.29.12.150:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@10.29.12.156:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@10.29.15.173:2550, status = Up)], overview = GossipOverview(reachability = [akka.tcp://opendaylight-cluster-data@10.29.15.173:2550 -> akka.tcp://opendaylight-cluster-data@10.29.12.150:2550: Unreachable [Unreachable] (37), akka.tcp://opendaylight-cluster-data@10.29.15.173:2550 -> akka.tcp://opendaylight-cluster-data@10.29.12.156:2550: Reachable [Reachable] (39)], seen = [UniqueAddress(akka.tcp://opendaylight-cluster-data@10.29.15.173:2550,-1961660878)]), version = VectorClock(0509e936b03b785faf066e69a04e621a -> 38, 5b6c3568af1c83fa01ae443deac17614 -> 29, c50fc393e7d7dd3654a21a7880f96d0a -> 25))] - Local[Gossip(members = [Member(address = akka.tcp://opendaylight-cluster-data@10.29.12.150:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@10.29.12.156:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@10.29.15.173:2550, status = Up)], overview = GossipOverview(reachability = [akka.tcp://opendaylight-cluster-data@10.29.12.150:2550 -> akka.tcp://opendaylight-cluster-data@10.29.15.173:2550: Unreachable [Unreachable] (37)], seen = [UniqueAddress(akka.tcp://opendaylight-cluster-data@10.29.12.156:2550,615998638), UniqueAddress(akka.tcp://opendaylight-cluster-data@10.29.12.150:2550,-504912273)]), version = VectorClock(0509e936b03b785faf066e69a04e621a -> 36, 5b6c3568af1c83fa01ae443deac17614 -> 30, c50fc393e7d7dd3654a21a7880f96d0a -> 27))] - merged them into [Gossip(members = [Member(address = akka.tcp://opendaylight-cluster-data@10.29.12.150:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@10.29.12.156:2550, status = Up), Member(address = akka.tcp://opendaylight-cluster-data@10.29.15.173:2550, status = Up)], overview = GossipOverview(reachability = [akka.tcp://opendaylight-cluster-data@10.29.12.150:2550 -> akka.tcp://opendaylight-cluster-data@10.29.15.173:2550: Unreachable [Unreachable] (37), akka.tcp://opendaylight-cluster-data@10.29.15.173:2550 -> akka.tcp://opendaylight-cluster-data@10.29.12.150:2550: Unreachable [Unreachable] (37), akka.tcp://opendaylight-cluster-data@10.29.15.173:2550 -> akka.tcp://opendaylight-cluster-data@10.29.12.156:2550: Reachable [Reachable] (39)], seen = []), version = VectorClock(0509e936b03b785faf066e69a04e621a -> 38, 5b6c3568af1c83fa01ae443deac17614 -> 30, c50fc393e7d7dd3654a21a7880f96d0a -> 27))]
2017-07-25 12:24:39,105 | INFO | rd-dispatcher-34 | ShardManager | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.2.SNAPSHOT | Received UnreachableMember: memberName MemberName{name=member-2}, address: akka.tcp://opendaylight-cluster-data@10.29.12.150:2550
2017-07-25 12:24:39,105 | DEBUG | lt-dispatcher-19 | ClusterRemoteWatcher | 174 - com.typesafe.akka.slf4j - 2.4.18 | Unwatching: [akka://opendaylight-cluster-data/user/rpc/registry -> akka.tcp://opendaylight-cluster-data@10.29.12.150:2550/user/rpc/broker]
2017-07-25 12:24:39,105 | DEBUG | lt-dispatcher-19 | ClusterRemoteWatcher | 174 - com.typesafe.akka.slf4j - 2.4.18 | Cleanup self watch of [akka.tcp://opendaylight-cluster-data@10.29.12.150:2550/user/rpc/broker]
2017-07-25 12:24:39,105 | INFO | rd-dispatcher-34 | ShardManager | 199 - org.opendaylight.controller.sal-distributed-datastore - 1.5.2.SNAPSHOT | Received UnreachableMember: memberName MemberName{name=member-2}, address: akka.tcp://opendaylight-cluster-data@10.29.12.150:2550
2017-07-25 12:24:39,105 | INFO | rd-dispatcher-30 | EntityOwnershipShard | 192 - org.opendaylight.controller.sal-clustering-commons - 1.5.2.SNAPSHOT | member-3-shard-entity-ownership-operational: onPeerDown: PeerDown [memberName=member-2, peerId=member-2-shard-entity-ownership-operational]
2017-07-25 12:24:39,105 | DEBUG | lt-dispatcher-21 | EndpointWriter | 174 - com.typesafe.akka.slf4j - 2.4.18 | sending message RemoteMessage: [Unwatch(Actor[akka.tcp://opendaylight-cluster-data@10.29.12.150:2550/user/rpc/broker#-944022860],Actorakka://opendaylight-cluster-data/system/remote-watcher#1339314764)] to [Actor[akka.tcp://opendaylight-cluster-data@10.29.12.150:2550/user/rpc/broker#-944022860]]<+[akka.tcp://opendaylight-cluster-data@10.29.12.150:2550/user/rpc/broker] from [Actor[akka://opendaylight-cluster-data/deadLetters]]

[10] https://bugs.opendaylight.org/show_bug.cgi?id=8430#c8
[11] https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-cs-partnheal-longevity-only-carbon/2/odl3_karaf.log.gz
[12] https://logs.opendaylight.org/sandbox/jenkins091/controller-csit-3node-cs-partnheal-longevity-only-carbon/2/log.html.gz#s1-s2-t1-k3-k1-k1-k1-k1-k1-k1-k2-k1-k1-k7-k2-k1-k2-k1-k2-k1-k6
