controller / CONTROLLER-1927

Transaction can become stuck in COMMIT_PENDING when a node flaps leader -> follower -> leader


    • Type: Bug
    • Resolution: Done
    • Priority: High
    • Sodium SR3, Magnesium SR1, 2.0.0
    • Oxygen SR4, Fluorine SR3, Sodium SR1, Neon SR3
    • None
    • None

      Normally, an entry is applied as follows:

      1. The leader sends an append entry off to persistence and replicates it to followers.
      2. The leader creates its ClientRequestTracker.
      3. When the entry is done with persistence and replication, the leader moves its commit index.
      4. Part of moving the commit index is sending an ApplyState message, which finalizes the entry application in the DataTree.
      5. Creating the ApplyState message involves checking whether a ClientRequestTracker is present; if it is, an identifier is added to the message.
        • This determines how the entry application is finalized in the DataTree.
        • If the tracker is present, the entry is applied as if it originated on the leader.
        • If it is not present, the entry is applied as if the node is a follower.
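The decision in step 5 can be sketched roughly as below. This is a minimal, hypothetical model and not the actual controller code: the class `ApplyStateSketch`, its methods, and the string identifiers are assumptions made up for illustration; only the names ClientRequestTracker and ApplyState come from the description above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model; it does not mirror the real controller classes.
final class ApplyStateSketch {
    /** Simplified stand-in for the ApplyState message. */
    static final class ApplyState {
        final long logIndex;
        // Present only when a tracker was found for this log index.
        final Optional<String> identifier;

        ApplyState(long logIndex, Optional<String> identifier) {
            this.logIndex = logIndex;
            this.identifier = identifier;
        }
    }

    // Stand-in for the ClientRequestTracker lookup: log index -> request identifier.
    private final Map<Long, String> trackers = new HashMap<>();

    // Step 2: the leader records a tracker for the entry it is replicating.
    void trackClientRequest(long logIndex, String identifier) {
        trackers.put(logIndex, identifier);
    }

    // Step 5: the identifier is attached only if a tracker exists.
    ApplyState buildApplyState(long logIndex) {
        return new ApplyState(logIndex, Optional.ofNullable(trackers.get(logIndex)));
    }

    // The presence of the identifier selects how the entry is finalized.
    static String finalizePath(ApplyState state) {
        return state.identifier.isPresent() ? "leader-originated" : "follower";
    }
}
```

In this model, an entry with a tracked request takes the "leader-originated" path, while any entry without a tracker takes the "follower" path, regardless of the node's current role.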

      The problem arises when the leader flaps through a leader -> follower -> leader transition after step 2 and before step 4. The new leader state no longer has the ClientRequestTracker that was created in the previous leader state, so when it reaches step 5 it creates the ApplyState without an identifier, and the entry finishes its application as if the node is a follower.

      As a result, the entry is applied without finishCommit, so the transaction stays stuck in COMMIT_PENDING until the node is restarted.
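The failure can be reproduced in a toy model. The sketch below is hypothetical and heavily simplified: `FlapSketch`, `submit`, `flapLeadership`, and `applyEntry` are invented names, and the assumption that trackers are simply discarded on the role change is a modelling shortcut, not a statement about how the controller implements leadership transitions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the flap; names do not mirror the real controller code.
final class FlapSketch {
    enum TxState { COMMIT_PENDING, COMMITTED }

    // In this model, trackers live only as long as one leader state.
    private Map<Long, String> trackers = new HashMap<>();
    private final Map<String, TxState> transactions = new HashMap<>();

    // Steps 1-2: the leader starts persistence/replication and tracks the request.
    void submit(long logIndex, String txId) {
        transactions.put(txId, TxState.COMMIT_PENDING);
        trackers.put(logIndex, txId);
    }

    // leader -> follower -> leader: the new leader state starts without the old trackers.
    void flapLeadership() {
        trackers = new HashMap<>();
    }

    // Steps 3-5: the commit index moves and the entry is applied.
    void applyEntry(long logIndex) {
        String txId = trackers.remove(logIndex);
        if (txId != null) {
            // Stand-in for finishCommit on the leader-originated path.
            transactions.put(txId, TxState.COMMITTED);
        }
        // Otherwise the entry is applied as on a follower and the
        // pending transaction is never finished.
    }

    TxState stateOf(String txId) {
        return transactions.get(txId);
    }
}
```

Calling `submit`, then `applyEntry`, leaves the transaction COMMITTED in this model; inserting `flapLeadership` between the two calls leaves it in COMMIT_PENDING, which corresponds to the stuck state described above.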

            tcere Tomas Cere
            tcere Tomas Cere