[CONTROLLER-1501] Two leaders for the same term emerged in a cluster which was restarted
Created: 23/Mar/16  Updated: 19/Oct/17  Resolved: 05/Jul/17

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: Beryllium
Fix Version/s: None

Type: Bug
Reporter: Moiz Raja
Assignee: Unassigned
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 5599

 Description   

This issue was seen on a system with a 5-node cluster. At some point the system was running out of disk space; once that was fixed, the nodes in the cluster were restarted. After the restart, two Leaders were seen for the default-config shard.

Here is some information collected from the system:

Current term on both Leaders (member-3 and member-4): 22
Commit Index on all nodes other than member-3 seems to match the commit index on member-4
Commit Index on member-3 is less than the commit index on member-4

Committed Transaction Count on member-3: 22
Committed Transaction Count on member-4: 32
Leadership Change Time on member-3: 19:10:26.476
Leadership Change Time on member-4: 19:10:26.463

Leadership Change Count on both member-3 and member-4 is 1. This seems to indicate that both started an election on startup and both got enough votes to become Leader.

Member Name   Last Voted For
member-1      member-3
member-2      member-3
member-3      member-3
member-4      member-4
member-5      member-4
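
For reference, the table above can be checked against the majority a 5-node cluster requires. The following is only a minimal sketch: the helper names (majority, tally) are illustrative and not part of the controller code, and "Last Voted For" records only each node's most recent vote, so it cannot show votes that may have been granted earlier in term 22.

```python
from collections import Counter

# "Last Voted For" values exactly as reported in the table above.
last_voted_for = {
    "member-1": "member-3",
    "member-2": "member-3",
    "member-3": "member-3",
    "member-4": "member-4",
    "member-5": "member-4",
}


def majority(cluster_size):
    """Smallest vote count that is a strict majority of the cluster."""
    return cluster_size // 2 + 1


def tally(votes):
    """Count how many nodes last voted for each candidate."""
    return Counter(votes.values())


quorum = majority(len(last_voted_for))
counts = tally(last_voted_for)
# Candidates whose recorded votes reach the majority threshold.
winners = [candidate for candidate, n in counts.items() if n >= quorum]

print("quorum:", quorum)
print("votes:", dict(counts))
print("candidates with a majority:", winners)
```

By this tally only member-3 reaches the majority of 3, while member-4 has 2 recorded votes. Since a Raft node grants at most one vote per term, one possible reading is that some node's recorded vote changed during term 22, which the data collected here cannot confirm.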

Missing info

Debug logging could have shown the following information:

  • Who voted for whom, and when
  • Who changed their vote in term 22
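
If such vote information had been available, the second question could be answered mechanically. The sketch below assumes a hypothetical record format of (voter, term, candidate) tuples; this is not the controller's log format, and the node names and term in the sample are invented purely for illustration, not taken from this incident.

```python
from collections import defaultdict


def find_changed_votes(records):
    """Return {(voter, term): [candidates]} for every voter that granted
    its vote to more than one candidate within a single term.

    records: iterable of (voter, term, candidate) tuples in the order the
    votes were granted (hypothetical format, assumed for this sketch).
    """
    seen = defaultdict(list)
    for voter, term, candidate in records:
        if candidate not in seen[(voter, term)]:
            seen[(voter, term)].append(candidate)
    return {key: cands for key, cands in seen.items() if len(cands) > 1}


# Invented sample data: node-a votes for two different candidates in term 7.
example = [
    ("node-a", 7, "node-x"),
    ("node-b", 7, "node-x"),
    ("node-a", 7, "node-y"),
    ("node-c", 7, "node-y"),
]

changed = find_changed_votes(example)
print(changed)
```

Any entry returned by this check would point to a node that voted twice in one term, which is the condition that would allow two candidates to each collect a majority.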


 Comments   
Comment by Tom Pantelis [ 05/Jul/17 ]

Unfortunately not enough info to go on to determine what might have happened post-mortem - need at least the log files. Perhaps split-brain occurred. We now have more info logging wrt elections and raft state changes which will help if this is ever reported again.
