[CONTROLLER-1501] Two leaders for the same term emerged in a cluster which was restarted Created: 23/Mar/16 Updated: 19/Oct/17 Resolved: 05/Jul/17 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | clustering |
| Affects Version/s: | Beryllium |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Moiz Raja | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| External issue ID: | 5599 |
| Description |
|
This issue was seen on a system with a 5-node cluster. At some point the system ran out of disk space; this was fixed and the nodes in the cluster were restarted. After the restart, two Leaders were seen for the default-config shard.

Here is some information collected from the system:

Current term on both nodes: 22
Committed Transaction Count on member-3: 22
Leadership Change count on both member-3 and member-4: 1

This seems to indicate that they both requested to become Leader on startup and both got enough votes to become Leader.

Member Name Last Voted For Missing info
|
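For context on why the observation above is unexpected, here is a minimal sketch of vote counting under the standard Raft rules, assuming each member grants at most one vote per term and a candidate needs a strict majority. The function names and the vote record are hypothetical illustrations, not taken from the controller code base or from this system's data. Under those assumptions, a 5-member cluster cannot produce two Leaders for the same term, which suggests that one of the assumptions did not hold after the restart.

```python
from collections import Counter


def majority(cluster_size: int) -> int:
    """Smallest number of votes that is more than half the cluster."""
    return cluster_size // 2 + 1


def tally(votes: dict, cluster_size: int) -> list:
    """Return the candidates that reached a majority in a single term.

    `votes` maps voter name -> candidate name. A dict key holds exactly
    one value, which models "at most one vote per member per term".
    """
    needed = majority(cluster_size)
    counts = Counter(votes.values())
    return sorted(c for c, n in counts.items() if n >= needed)


# Hypothetical vote record for term 22 (the real one is missing from the report).
term_22_votes = {
    "member-1": "member-3",
    "member-2": "member-3",
    "member-3": "member-3",
    "member-4": "member-4",
    "member-5": "member-4",
}

leaders = tally(term_22_votes, cluster_size=5)
print(leaders)  # ['member-3']
```

With 5 members the majority is 3, so two candidates would need at least 6 votes between them in one term. Only 5 votes exist, so at most one candidate can win unless a member voted twice in the same term or the members disagreed about cluster membership.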
| Comments |
| Comment by Tom Pantelis [ 05/Jul/17 ] |
|
Unfortunately there is not enough info to determine post-mortem what might have happened; at a minimum the log files are needed. Perhaps a split-brain occurred. We now have more info logging with respect to elections and Raft state changes, which will help if this is ever reported again. |