[CONTROLLER-1183] Clustering : When handling an append entry Follower should apply log to state machine only till its current commit index Created: 05/Mar/15 Updated: 06/Jun/15 Resolved: 06/Jun/15 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | mdsal |
| Affects Version/s: | Post-Helium |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Moiz Raja | Assignee: | Tom Pantelis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| External issue ID: | 2787 |
| Priority: | Normal |
| Description |
|
In Follower we apply the log to the state machine as follows (line 219): applyLogToStateMachine(appendEntries.getLeaderCommit()); For a slow follower, the leader's commit index may be far ahead of what is in its own log. This can cause the following message to be printed over and over in the logs: LOG.warn( A simple fix may be to apply the log to the state machine only up to the follower's own commit index. |
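A minimal sketch of the proposed fix, assuming the follower can report the index of the last entry in its own log. The class and method names here (CommitIndexClamp, indexToApply) are hypothetical and are not taken from the controller code base; only the idea of not applying past what the follower actually holds comes from this issue.

```java
// Hypothetical helper for illustration; not the actual Follower implementation.
final class CommitIndexClamp {
    private CommitIndexClamp() {
    }

    /**
     * Returns the highest log index that is safe to apply to the state machine:
     * never beyond what the leader has committed, and never beyond the last
     * entry that actually exists in the follower's own log.
     */
    static long indexToApply(long leaderCommit, long followerLastIndex) {
        return Math.min(leaderCommit, followerLastIndex);
    }
}
```

With a leader commit of 1213 and a follower whose log ends at index 990, indexToApply returns 990, so the follower would not try to apply entries it has not yet received.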
| Comments |
| Comment by Kamal Rameshan [ 09/Mar/15 ] |
|
This happens for a slow follower. Whenever it happens, it means that the follower is trailing behind. Although we can remove this log and pass min(leadercommit, follower-lastindex) to the applystate, the presence of this log indicates an issue. We need to come up with a strategy to make a slow follower catch up faster, possibly by sending multiple entries in one AE message. As I write this, I have an idle system with a slow inv-topology-follower at index 990 catching up to a leader (index 1213), and an AE is coming in every 1 min!! |
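The idea of sending multiple entries in one AE message could be sketched roughly as follows. This is an illustration only: AppendEntriesBatcher, nextBatch and maxEntries are hypothetical names, and the sketch assumes for simplicity that the leader's log is an in-memory list whose positions coincide with log indexes. It does not describe how the controller actually builds AppendEntries messages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the actual leader implementation.
final class AppendEntriesBatcher {
    private AppendEntriesBatcher() {
    }

    /**
     * Picks the entries to place in a single AppendEntries message: starting at
     * the follower's next index, take at most maxEntries entries from the
     * leader's log. Returns an empty list when the follower is already caught up
     * or the arguments are out of range.
     */
    static <T> List<T> nextBatch(List<T> leaderLog, int followerNextIndex, int maxEntries) {
        if (followerNextIndex < 0 || maxEntries <= 0 || followerNextIndex >= leaderLog.size()) {
            return new ArrayList<>();
        }
        // Number of entries still missing on the follower, capped at maxEntries.
        int count = Math.min(leaderLog.size() - followerNextIndex, maxEntries);
        return new ArrayList<>(leaderLog.subList(followerNextIndex, followerNextIndex + count));
    }
}
```

With a cap of, say, 100 entries per message, a follower that is a couple of hundred entries behind would need a handful of messages to catch up instead of one message per entry. The cap keeps any single message bounded in size.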
| Comment by Tom Pantelis [ 31/May/15 ] |
|
This can also happen when a node is (re)started with an empty journal, either when adding a new node to the cluster or when re-installing a node after a catastrophic failure. I think at the very least we can change the log level from warn to debug - the message looks a little ominous in the log at warn level. I agree we should look into sending multiple entries in an AE message. |
| Comment by Tom Pantelis [ 02/Jun/15 ] |
|
Submitted draft https://git.opendaylight.org/gerrit/#/c/21701 to batch AppendEntries. |