Details
-
Bug
-
Status: Resolved
-
Resolution: Done
-
Operating System: All
Platform: All
-
3049
Description
I'm seeing strange behavior when a node in a 3-node cluster is restarted. It somehow causes the Shard and ShardManager actors on the other nodes in the cluster to terminate. I see these messages in the log:
2015-04-22 13:11:08,317 | INFO | lt-dispatcher-19 | Shard | 177 | 227 - org.opendaylight.controller.sal-akka-raft - 1.2.0.SNAPSHOT | | Stopping Shard member-1-shard-topology-operational
2015-04-22 13:11:08,323 | INFO | lt-dispatcher-18 | ShardManager | 159 | 234 - org.opendaylight.controller.sal-distributed-datastore - 1.2.0.SNAPSHOT | | Stopping ShardManager
There are no other messages in the log except the usual akka INFO messages about node addresses being gated and nodes leaving and joining the cluster.
Note that this occurs after the node is started up, not after it is shut down. Right after the Shard stopping messages above I see the akka message that the downed node is now re-joining:
2015-04-22 13:14:08,412 | INFO | lt-dispatcher-22 | receive$1$$anonfun$applyOrElse$3 | 74 | 220 - com.typesafe.akka.slf4j - 2.3.9 | | Cluster Node [akka.tcp://odl-cluster-rpc@127.0.0.1:2551] - Node [akka.tcp://odl-cluster-rpc@127.0.0.1:2555] is JOINING, roles []
I have all 3 nodes running in the same VM on different ports, so I'm not sure whether that's a factor, but I've been running with this setup for a while without seeing this issue.