  1. controller
  2. CONTROLLER-2074

Follower can retain more than 1 copy of the same Snapshot in memory leading to OOM


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Highest
    • clustering

      There is a bug in the current InstallSnapshot process caused by the delayed reply for the last chunk. The Follower sends the InstallSnapshotReply for the last chunk only after the Snapshot has been applied and persisted. The larger the Snapshot, the longer this takes, and in the meantime the Leader can time out and resend the Snapshot. Here's the flow leading to the issue:

      • Follower receives the last chunk of the Snapshot.
      • Follower wraps the Snapshot in an ApplySnapshot message and sends it to RaftActorSnapshotMessageSupport to be applied and persisted. Number of Snapshots in Follower's memory = 1
      • Follower closes SnapshotTracker
      • During this time the Leader awaits the reply for the last chunk. When the chunk-timeout runs out, the Leader re-sends the last chunk.
      • The Follower receives the last chunk again.
      • Follower creates new SnapshotTracker (previous one was closed)
      • Follower tries to add last chunk to the SnapshotTracker which ends in InvalidChunkException("expected chunk 1, adding chunk n")
      • Follower closes SnapshotTracker and sends InstallSnapshotReply(chunkIndex = -1, success = false)
      • Leader receives InstallSnapshotReply(chunkIndex=-1) and resets LeaderInstallSnapshotState
      • Leader starts sending chunks of the same Snapshot again from chunk 1
      • Follower starts collecting chunks
      • Follower receives the last chunk of the Snapshot
      • Follower wraps the Snapshot in an ApplySnapshot message and sends it to RaftActorSnapshotMessageSupport to be applied and persisted. Number of Snapshots in Follower's memory = 2
      • If the previous Snapshot is still being applied/persisted, this ApplySnapshot message is simply dropped.
      • After chunk-timeout the Leader sends the last chunk again and starts the whole cycle over again.
      • And here is the issue. If, for example,
      • the GC takes longer to clear the large Snapshot object
      • or the ActorSystem gets overwhelmed and the ApplySnapshot message is stuck in RaftActorSnapshotMessageSupport's mailbox,
        then the Leader can start sending a third copy of the same Snapshot, and the Follower will attempt to collect it as well.
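
      The cycle described above can be reproduced with a toy model. This is only a sketch, not the controller code: ToyFollower, copiesAfterTimeouts and the other names are hypothetical stand-ins, and the model assumes the worst case in which no assembled Snapshot is released (slow GC or a blocked mailbox) while the Leader keeps timing out.

```java
/** Toy model of the flow above. None of these types are the real controller classes. */
final class DuplicateSnapshotSketch {
    static final int INVALID_CHUNK = -1;

    /** Hypothetical stand-in for the Follower's chunk collection state. */
    static final class ToyFollower {
        final int totalChunks;
        boolean trackerOpen = false;
        int nextExpected = 1;
        int copiesInMemory = 0; // assembled Snapshots, assumed never released here

        ToyFollower(int totalChunks) {
            this.totalChunks = totalChunks;
        }

        /**
         * Returns the acknowledged chunk index, INVALID_CHUNK on a mismatch,
         * or null when the reply is delayed until the Snapshot is applied.
         */
        Integer onChunk(int chunkIndex) {
            if (!trackerOpen) {          // previous tracker was closed: start a new one
                trackerOpen = true;
                nextExpected = 1;
            }
            if (chunkIndex != nextExpected) {
                trackerOpen = false;     // "expected chunk 1, adding chunk n"
                return INVALID_CHUNK;
            }
            nextExpected++;
            if (chunkIndex == totalChunks) {
                trackerOpen = false;     // tracker closed after the last chunk
                copiesInMemory++;        // Snapshot handed off to be applied
                return null;             // no reply yet
            }
            return chunkIndex;
        }
    }

    static void sendAll(ToyFollower follower, int totalChunks) {
        for (int i = 1; i <= totalChunks; i++) {
            follower.onChunk(i);
        }
    }

    /** Simulates the Leader hitting the chunk-timeout `timeouts` times while the apply is slow. */
    static int copiesAfterTimeouts(int totalChunks, int timeouts) {
        ToyFollower follower = new ToyFollower(totalChunks);
        sendAll(follower, totalChunks);
        for (int t = 0; t < timeouts; t++) {
            Integer reply = follower.onChunk(totalChunks); // Leader re-sends the last chunk
            if (reply != null && reply == INVALID_CHUNK) {
                sendAll(follower, totalChunks);            // Leader resets and restarts from chunk 1
            }
        }
        return follower.copiesInMemory;
    }

    public static void main(String[] args) {
        System.out.println(copiesAfterTimeouts(3, 2));
    }
}
```

      In this model every chunk-timeout that expires during a slow apply adds one more assembled copy on the Follower, which matches the growth from 1 to 2 to 3 copies in the steps above.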

      A possible solution is to make the Follower reject InstallSnapshot messages from the Leader while a Snapshot is being applied. If the Follower just drops the InstallSnapshot message and doesn't reply, the Leader will simply re-send the last chunk each time the chunk-timeout elapses. When the Follower finally finishes the ApplySnapshot, it sends the InstallSnapshotReply for the last chunk, and the Leader closes the LeaderInstallSnapshotState as if there had been no extra delay.
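
      A minimal sketch of that guard, under the same toy assumptions: GuardedFollower, its applying flag and onApplyFinished are hypothetical names, not the actual Follower API, and the real change would have to hook into wherever the Follower learns that the apply has completed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the proposed guard. Hypothetical names, not the controller implementation. */
final class ApplyGuardSketch {
    static final class GuardedFollower {
        final int totalChunks;
        final List<Integer> replies = new ArrayList<>(); // chunk indexes sent back to the Leader
        int nextExpected = 1;
        boolean applying = false;   // true while a Snapshot is being applied/persisted
        int snapshotsAssembled = 0; // how many full copies were ever collected

        GuardedFollower(int totalChunks) {
            this.totalChunks = totalChunks;
        }

        void onChunk(int chunkIndex) {
            if (applying) {
                return;             // drop silently: no new tracker, no reply
            }
            if (chunkIndex != nextExpected) {
                nextExpected = 1;
                replies.add(-1);    // invalid chunk, Leader would restart
                return;
            }
            if (chunkIndex == totalChunks) {
                nextExpected = 1;
                applying = true;    // reply is deferred until the apply finishes
                snapshotsAssembled++;
                return;
            }
            nextExpected++;
            replies.add(chunkIndex);
        }

        /** Called once the Snapshot has been applied and persisted. */
        void onApplyFinished() {
            applying = false;
            replies.add(totalChunks); // the delayed reply for the last chunk
        }
    }

    public static void main(String[] args) {
        GuardedFollower follower = new GuardedFollower(3);
        for (int i = 1; i <= 3; i++) {
            follower.onChunk(i);
        }
        follower.onChunk(3);          // Leader re-sends after chunk-timeout: dropped
        follower.onApplyFinished();
        System.out.println(follower.snapshotsAssembled + " " + follower.replies);
    }
}
```

      With the guard in place the re-sent last chunk never reaches a fresh tracker, so no chunkIndex = -1 reply is produced and only one copy is ever assembled in this model.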

            ivanhrasko Ivan Hrasko
            tibor.kral Tibor Král