[CONTROLLER-1589] Current raft implementation seems to be unstable when dynamically adding peers when new nodes come up. Created: 27/Jan/17 Updated: 25/Jul/23 Resolved: 13/Apr/17 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | clustering |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Tomas Cere | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Issue Links: |
|
||||||||
| External issue ID: | 7696 | ||||||||
| Description |
|
Scenario we're trying to achieve: we want a shard started with replicas on all cluster members (including ones added at any time in the future). Currently the raft implementation seems to be quite fragile when dynamically adding peers. When you send an AddServer message to the actor on MemberUp/Reachable events, the followers seem to always ignore it. Each peer then ends up with a different set of peers, which leads to never-ending elections. |
| Comments |
| Comment by Robert Varga [ 31/Jan/17 ] |
|
Based on the discussion on the Clustering Hackers' call, the configuration update should work as follows:
So sending AddServer to followers should not be necessary. |
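A minimal sketch of why a follower-side AddServer is redundant, assuming the usual Raft single-server membership change: only the leader applies the change, and followers pick up the new configuration through replication. The `Node` class, its method names and the status strings are hypothetical illustrations, not the controller's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy Raft member. All names here are hypothetical, not the controller's API."""
    name: str
    is_leader: bool = False
    peers: set = field(default_factory=set)  # this node's view of the cluster configuration
    leader: "Node | None" = None              # the leader this node currently knows about

    def add_server(self, new_peer: str) -> str:
        # Assumption: a follower never edits its own configuration;
        # it hands the request to the leader it knows about.
        if not self.is_leader:
            if self.leader is None:
                return "NO_LEADER"
            return self.leader.add_server(new_peer)
        if new_peer in self.peers:
            return "ALREADY_EXISTS"
        # Only the leader changes the configuration.
        self.peers = self.peers | {new_peer}
        return "OK"

    def replicate_config(self, followers) -> None:
        # Stand-in for log replication: followers adopt the leader's configuration.
        for follower in followers:
            follower.peers = set(self.peers)


leader = Node("member-1", is_leader=True, peers={"member-1", "member-2"})
follower = Node("member-2", peers={"member-1", "member-2"}, leader=leader)

# The request arrives at a follower, but the change is applied by the leader.
status = follower.add_server("member-3")
leader.replicate_config([follower])
```

In this toy model every node ends up with the same peer set because there is a single writer for the configuration, which avoids the diverging peer sets described in the report.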
| Comment by Tom Pantelis [ 16/Feb/17 ] |
|
Is there an actual issue here or can we close this? |
| Comment by Jakub Morvay [ 16/Feb/17 ] |
|
Hi Tom, I have tried the approach mentioned above and I have been able to add new shards. However, I have seen some problems with creating shards in some scenarios, possibly because of bugs in our shard starting logic. I will try to fix them and then see whether this is an actual issue. If not, I will close the bug. |