[CONTROLLER-1067] Clustering : Removing all journal entries from a Follower's in-memory journal causes Leader to send an InstallSnapshot Created: 13/Dec/14  Updated: 25/Jul/23  Resolved: 07/Jan/15

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Moiz Raja Assignee: Moiz Raja
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 2509

 Description   

Let's say we have a Follower with 5 applied entries in its in-memory journal and a snapshot is triggered on the Follower. With the current logic the in-memory journal will be emptied and all 5 entries will be put into the snapshot.

If the Leader were to send this Follower a 6th entry, the Follower would respond with a failure, because it cannot look up the 5th entry to verify that its term matches the prevLogTerm sent by the Leader. The Leader then falls back to sending an InstallSnapshot.

To fix this problem, snapshotting should never clear all the entries from the in-memory journal. At least one entry (or more) needs to be left there so that the more efficient previous-entry comparison can still be done.
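
A minimal sketch of the idea, not the controller's actual code: the class `RetainingJournal` and its methods are hypothetical names invented for illustration. It trims the journal after a snapshot but keeps the entry at the last applied index, so its term can still be looked up.

```java
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical, simplified in-memory journal (illustration only). */
final class RetainingJournal {
    // Maps a log index to the term of the entry stored at that index.
    private final TreeMap<Long, Long> termByIndex = new TreeMap<>();

    void append(long index, long term) {
        termByIndex.put(index, term);
    }

    /**
     * Called once a snapshot covering everything up to lastApplied exists.
     * Removes entries strictly before lastApplied and keeps the entry at
     * lastApplied itself, so the journal is never emptied by a snapshot.
     */
    void trimAfterSnapshot(long lastApplied) {
        termByIndex.headMap(lastApplied).clear();
    }

    /** Term of the entry at the given index, if it is still in memory. */
    OptionalLong termAt(long index) {
        Long term = termByIndex.get(index);
        return term == null ? OptionalLong.empty() : OptionalLong.of(term);
    }

    int size() {
        return termByIndex.size();
    }
}
```

With 5 applied entries and a snapshot at index 5, this sketch leaves exactly one entry (index 5) in memory, which is enough to answer the prevLogTerm check for a 6th entry.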



 Comments   
Comment by Moiz Raja [ 13/Dec/14 ]

Similarly, if a Follower is sent a snapshot with 0 unapplied entries present, it is likely to require a further InstallSnapshot. Something must be done to prevent this situation as well.

Comment by Kamal Rameshan [ 13/Dec/14 ]

Rather than keeping a few entries in the follower journal even after snapshot, we can also check if the AE.previousLogIndex is <= snapshotIndex. If yes, then we can assume that the previous index is part of the snapshot.
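
A rough sketch of that check, for illustration only: `SnapshotCoverageCheck` and `coveredBySnapshot` are hypothetical names, not part of the controller code. Note that it looks only at the index and compares no terms.

```java
/** Hypothetical helper illustrating the proposed index-only check. */
final class SnapshotCoverageCheck {
    private SnapshotCoverageCheck() {
    }

    /**
     * Returns true when the previous log index named in an AppendEntries
     * message falls at or before the snapshot index, i.e. the entry is
     * assumed to be contained in the snapshot.
     */
    static boolean coveredBySnapshot(long previousLogIndex, long snapshotIndex) {
        return previousLogIndex <= snapshotIndex;
    }
}
```

For example, with snapshotIndex 5, an AppendEntries carrying previousLogIndex 5 would be treated as covered, while previousLogIndex 6 would not.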

Didn't get the 2nd comment.
My understanding was that if a snapshot with 0 unapplied entries is applied, the snapshot index is set. In the AppendEntries reply we do send lastIndex(), which is the snapshotIndex if the journal is empty.
So we should only be getting AppendEntries, unless I am missing something very obvious.

Comment by Moiz Raja [ 13/Dec/14 ]

To ensure that the log is being replicated in the right order, we need to check that

AE.prevLogIndex is present in the journal and that journal.logEntry(prevLogIndex).term == AE.prevLogTerm.

Just because a certain index is in the snapshot does not confirm that it matches the prev log entry that the leader has.
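
The check described above can be sketched as follows. This is a simplified illustration under stated assumptions: `PrevLogCheck` and its methods are hypothetical names, and the journal is reduced to a map from index to term.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified follower-side consistency check. */
final class PrevLogCheck {
    // Term of each entry still held in the in-memory journal, keyed by index.
    private final Map<Long, Long> termByIndex = new HashMap<>();

    void append(long index, long term) {
        termByIndex.put(index, term);
    }

    /**
     * True only if the entry at prevLogIndex is present in the journal
     * and its term equals the prevLogTerm sent by the leader.
     */
    boolean matches(long prevLogIndex, long prevLogTerm) {
        Long localTerm = termByIndex.get(prevLogIndex);
        return localTerm != null && localTerm.longValue() == prevLogTerm;
    }
}
```

In this sketch an entry that is missing from the journal fails the check in the same way as an entry whose term differs, which is why an emptied journal leads to a failure reply.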

Comment by Moiz Raja [ 06/Jan/15 ]

https://git.opendaylight.org/gerrit/#/c/13675/ - master

To be cherry-picked to helium

Comment by Moiz Raja [ 06/Jan/15 ]

https://git.opendaylight.org/gerrit/#/c/13940/ - helium

Generated at Wed Feb 07 19:54:36 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.