  1. controller
  2. CONTROLLER-1572

An oversized ReadDataReply message can result in "Received UnreachableMember" in the cluster

Details

    • Bug
    • Status: Resolved
    • Resolution: Done
    • None
    • None
    • clustering
    • None
    • Operating System: All
      Platform: All

    • 7449

    Description

      Serializing a very large ReadDataReply message is time consuming and can result in "Received UnreachableMember" in the cluster.

      The default Akka cluster failure detector triggers if no heartbeats arrive within 5.5 s, so "Received UnreachableMember" will occur if serialization takes longer than 5.5 s:

      2016-12-27 20:06:30,999 | WARN | t-dispatcher-199 | ClusterCoreDaemon | 179 - com.typesafe.akka.slf4j - 2.4.12 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.46.60.132:2550] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://opendaylight-cluster-data@10.46.60.139:2550, status = Up)]. Node roles [member-1]
      2016-12-27 20:06:31,002 | INFO | t-dispatcher-293 | ShardManager | 214 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Received UnreachableMember: memberName MemberName{name=member-2}, address: akka.tcp://opendaylight-cluster-data@10.46.60.139:2550
      2016-12-27 20:06:31,002 | INFO | t-dispatcher-296 | ShardManager | 214 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Received UnreachableMember: memberName MemberName{name=member-2}, address: akka.tcp://opendaylight-cluster-data@10.46.60.139:2550
      2016-12-27 20:06:31,002 | INFO | t-dispatcher-199 | EntityOwnershipShard | 209 - org.opendaylight.controller.sal-akka-raft - 1.5.0.SNAPSHOT | member-1-shard-entity-ownership-operational: onPeerDown: PeerDown [memberName=member-2, peerId=member-2-shard-entity-ownership-operational]
      2016-12-27 20:06:31,362 | INFO | lt-dispatcher-33 | kka://opendaylight-cluster-data) | 179 - com.typesafe.akka.slf4j - 2.4.12 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.46.60.132:2550] - Ignoring received gossip status from unreachable [UniqueAddress(akka.tcp://opendaylight-cluster-data@10.46.60.139:2550,2101232233)]
      2016-12-27 20:06:32,362 | INFO | lt-dispatcher-86 | kka://opendaylight-cluster-data) | 179 - com.typesafe.akka.slf4j - 2.4.12 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.46.60.132:2550] - Ignoring received gossip status from unreachable [UniqueAddress(akka.tcp://opendaylight-cluster-data@10.46.60.139:2550,2101232233)]
      2016-12-27 20:06:32,979 | WARN | t-dispatcher-188 | Shard | 209 - org.opendaylight.controller.sal-akka-raft - 1.5.0.SNAPSHOT | member-1-shard-default-config: At least 1 followers need to be active, Switching member-1-shard-default-config from Leader to IsolatedLeader
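
The timing argument above can be checked with a rough measurement. The following is a minimal sketch, not the datastore's actual serialization path: it uses plain Java serialization on a byte array as a stand-in for a large ReadDataReply, and the class name `SerializationTimer`, its methods, and the 16 MiB payload are hypothetical. The 5.5 s window is the figure quoted in this description and may differ for other failure-detector settings.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.time.Duration;

public final class SerializationTimer {
    // Window quoted in the issue description; an assumption for any other configuration.
    public static final Duration DETECTION_WINDOW = Duration.ofMillis(5500);

    private SerializationTimer() {
    }

    // Serializes the payload into memory and returns the elapsed wall-clock time.
    public static Duration timeSerialization(Serializable payload) {
        long start = System.nanoTime();
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Duration.ofNanos(System.nanoTime() - start);
    }

    // True when the elapsed time is strictly longer than the detection window.
    public static boolean exceedsWindow(Duration elapsed) {
        return elapsed.compareTo(DETECTION_WINDOW) > 0;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in payload; a real ReadDataReply wraps a data tree node.
        byte[] payload = new byte[16 * 1024 * 1024];
        Duration elapsed = timeSerialization(payload);
        System.out.println("serialization took " + elapsed.toMillis()
                + " ms, exceeds window: " + exceedsWindow(elapsed));
    }
}
```

A measurement like this only indicates whether serialization time is in the same range as the detection window; it does not reproduce how Akka schedules heartbeats.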

      ------------------------------------------------------------------------------------

      In addition, the Akka EndpointWriter throws OversizedPayloadException if the ReadDataReply size exceeds maximumPayloadBytes:

      2016-12-27 20:06:41,571 | ERROR | lt-dispatcher-18 | EndpointWriter | 179 - com.typesafe.akka.slf4j - 2.4.12 | Transient association error (association remains live)
      akka.remote.OversizedPayloadException: Discarding oversized payload sent to Actor[akka.tcp://opendaylight-cluster-data@10.46.60.139:2550/temp/$J]: max allowed size 419430400 bytes, actual size of encoded class org.opendaylight.controller.cluster.datastore.messages.ReadDataReply was 523432856 bytes.
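
The size comparison reported in the exception can be expressed as a simple guard. This is an illustrative sketch only: `PayloadSizeGuard` and its methods are hypothetical names, not part of Akka or the controller, and the strict greater-than comparison is an assumption based on the log message. The two byte counts are copied from the log line above.

```java
public final class PayloadSizeGuard {
    // "max allowed size" from the log line above: 400 * 1024 * 1024 bytes.
    public static final long MAX_PAYLOAD_BYTES = 419_430_400L;

    private PayloadSizeGuard() {
    }

    // True when the encoded message is larger than the allowed payload size.
    public static boolean isOversized(long encodedSize, long maxPayloadBytes) {
        return encodedSize > maxPayloadBytes;
    }

    // Number of bytes by which the message exceeds the limit, or 0 if it fits.
    public static long excessBytes(long encodedSize, long maxPayloadBytes) {
        return Math.max(0L, encodedSize - maxPayloadBytes);
    }

    public static void main(String[] args) {
        long actual = 523_432_856L; // "actual size" from the log line above
        System.out.println("oversized: " + isOversized(actual, MAX_PAYLOAD_BYTES));
        System.out.println("excess bytes: " + excessBytes(actual, MAX_PAYLOAD_BYTES));
    }
}
```

With the values from the log, the reply exceeds the limit by 104002456 bytes, so raising the limit slightly would not be enough for replies of this size.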

          People

            tpantelis Tom Pantelis
            he.yunbo@zte.com.cn HeYunBo
