[CONTROLLER-1184] Clustering : When it is detected that a follower is too far behind send it a snapshot instead of append entries Created: 05/Mar/15  Updated: 19/Jul/18  Resolved: 19/Jul/18

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: Post-Helium
Fix Version/s: None

Type: Bug
Reporter: Moiz Raja Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 2788
Priority: Low

 Description   

This is an optimization to get a slow follower to catch up a little faster.

The basic idea is as follows. Suppose a Follower is about n entries behind the Leader, and those n entries have a combined dataSize that is greater than the size of the state snapshot. Then it would be more optimal to send the Follower a snapshot than to send it the n append entries (either one by one or in batches).
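
A minimal sketch of the comparison described above, to make the proposal concrete. The class name, method name, and parameters are all hypothetical and are not taken from the controller code base. It also assumes the leader already has an estimate of the snapshot size available, which it may not.

```java
import java.util.List;

// Hypothetical illustration only: none of these names exist in the controller code.
final class CatchUpPolicy {
    private CatchUpPolicy() {
    }

    /**
     * Returns true when the combined size of the entries the follower is missing
     * exceeds the (assumed known) estimate of the snapshot size.
     *
     * @param missingEntrySizes     serialized size in bytes of each missing log entry
     * @param estimatedSnapshotSize estimated serialized size in bytes of the state snapshot
     */
    static boolean preferSnapshot(List<Integer> missingEntrySizes, long estimatedSnapshotSize) {
        long total = 0;
        for (int size : missingEntrySizes) {
            total += size;
            // Stop summing as soon as the entries outweigh the snapshot.
            if (total > estimatedSnapshotSize) {
                return true;
            }
        }
        return false;
    }
}
```

For example, three missing entries of 400 bytes each against a 1000-byte snapshot estimate would favor the snapshot, while two such entries would not.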



 Comments   
Comment by Robert Varga [ 13/Apr/16 ]

Natarajan, are you still working on this or should we return this back to the backlog?

Comment by Tom Pantelis [ 21/Dec/16 ]

I'm not sure this is really needed or that we want to do it. Re: "it would be more optimal to send the Follower a snapshot than to send it the n append entries": I've found this is not necessarily true. The data tree state could be much larger than the n append entries, in which case sending a very large snapshot can be costly, both in serializing it and in sending it over the wire, especially over a slow link. I think in most cases we want to avoid sending a snapshot. Currently we'll mainly install a snapshot if the leader has progressed enough to compress its log via snapshot or if a follower's log is empty when it comes (back) online.

Comment by Robert Varga [ 21/Dec/16 ]

Moiz's point was that the sum of append entries may end up being larger than the full snapshot – which is valid, I think.

The trouble is determining the snapshot size: for that we need to serialize the entire data tree, which is costly. Some sort of heuristic may be able to help, but I don't think it's worth the effort at this time.

Comment by Tom Pantelis [ 21/Dec/16 ]

I don't think it's worth the effort at all. The sum of the append entries could be larger, but if it is, then neither will typically be that large, so it doesn't really matter. From what I've seen in testing and production environments, installing a snapshot is costly with a large data tree and is typically only needed if a follower starts up with an empty log, which we handle with a snapshot now anyway.

(In reply to Robert Varga from comment #3)
> Moiz's point was that the sum of append entries may end up being larger than
> the full snapshot – which is valid, I think.
>
> The trouble is that determining the snapshot size – for that we need to
> serialize the entire data tree, which costly. Some sort of heuristic may be
> able to help, but I don't think it's worth the effort at this time.
