[MDSAL-195] ClusterSingletonService is not closed after Leader moves to IsolatedLeader Created: 26/Aug/16 Updated: 09/Mar/18 Resolved: 13/Oct/16 |
|
| Status: | Resolved |
| Project: | mdsal |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Martin Mihálek | Assignee: | Vaclav Demcak |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Operating System: All |
| Attachments: | member3_karaf.log, member2_karaf.log, member1_karaf.log, members.zip, members_1.zip, member_logs.zip |
| Issue Links: |
|
| External issue ID: | 6540 |
| Priority: | High |
| Description |
|
After isolating the leader node, a new leader is elected and starts its services, but the old leader does not close its services and becomes unresponsive:
2016-08-25 11:30:46,838 | WARN | ult-dispatcher-3 | ShardManager | 169 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Supervisor Strategy caught unexpected exception - resuming |
| Comments |
| Comment by Vaclav Demcak [ 26/Aug/16 ] |
|
EntityOwnershipChangeState doesn't support the combination wasOwner=true, isOwner=true, hasOwner=true. This message occurs because we would like to signal the inJeopardy=true flag. A hotfix could add a new EntityOwnershipChangeState, LOCAL_OWNERSHIP_GRANTED_NOT_CHANGE, but we have to think about a way to move the inJeopardy flag into EntityOwnershipChangeState. |
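The failure mode can be illustrated with a minimal model of the state mapping. This is an illustrative sketch, not the actual EntityOwnershipChangeState source: the enum values and the `valueOf` mapping below are simplified assumptions. The point is that every supported (wasOwner, isOwner, hasOwner) triple maps to a state, while the retained-ownership triple true/true/true falls through to the IllegalArgumentException reported later in this thread.

```java
// Simplified model (NOT the real mdsal code) of mapping the
// (wasOwner, isOwner, hasOwner) triple to an ownership-change state.
public class OwnershipStateModel {
    enum State {
        LOCAL_OWNERSHIP_GRANTED,        // gained ownership
        LOCAL_OWNERSHIP_LOST_NEW_OWNER, // lost it to another node
        LOCAL_OWNERSHIP_LOST_NO_OWNER,  // lost it, nobody owns it
        REMOTE_OWNERSHIP_CHANGED        // never owned it locally
    }

    static State valueOf(boolean wasOwner, boolean isOwner, boolean hasOwner) {
        if (!wasOwner && isOwner && hasOwner) {
            return State.LOCAL_OWNERSHIP_GRANTED;
        }
        if (wasOwner && !isOwner && hasOwner) {
            return State.LOCAL_OWNERSHIP_LOST_NEW_OWNER;
        }
        if (wasOwner && !isOwner && !hasOwner) {
            return State.LOCAL_OWNERSHIP_LOST_NO_OWNER;
        }
        if (!wasOwner && !isOwner) {
            return State.REMOTE_OWNERSHIP_CHANGED;
        }
        // wasOwner=true, isOwner=true (ownership retained, e.g. a pure
        // inJeopardy notification) has no mapping -- this is the failure
        // seen in the logs below.
        throw new IllegalArgumentException("Invalid combination of wasOwner: "
            + wasOwner + ", isOwner: " + isOwner + ", hasOwner: " + hasOwner);
    }

    public static void main(String[] args) {
        System.out.println(valueOf(false, true, true)); // LOCAL_OWNERSHIP_GRANTED
        try {
            valueOf(true, true, true);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A separate "granted, no change" state (or carrying inJeopardy outside the state enum, as suggested above) would give the retained-ownership triple a legal mapping.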
| Comment by Martin Mihálek [ 26/Aug/16 ] |
|
While using a 3-node cluster with the cluster singleton service, it may reach an unresponsive election state after these steps: 1. Isolate the current leader (member-3). After this, no new leader can be elected, and thus no services will be brought up from this point. |
| Comment by Vaclav Demcak [ 26/Aug/16 ] |
|
(In reply to Vaclav Demcak from comment #1) |
| Comment by Vaclav Demcak [ 30/Aug/16 ] |
|
Attachment member3_karaf.log has been added with description: IsolatedLeader |
| Comment by Vaclav Demcak [ 30/Aug/16 ] |
|
Attachment member2_karaf.log has been added with description: member2_follower |
| Comment by Vaclav Demcak [ 30/Aug/16 ] |
|
Attachment member1_karaf.log has been added with description: member1_follower |
| Comment by Andrej Leitner [ 31/Aug/16 ] |
|
(In reply to Martin Mihálek from comment #2) Observed the same while testing openflowplugin. |
| Comment by A H [ 02/Sep/16 ] |
|
Is there an ETA for this bug, and is someone assigned to fix it? |
| Comment by Colin Dixon [ 02/Sep/16 ] |
|
Talking with TomP, he said that this issue is complicated and he's already pushed one patch, but he's not certain it can be fixed by the release. The silver lining is that this bug only occurs when nodes are isolated, which means we could consider changing this to a critical bug instead of a blocker if we are willing to note that behavior during network partitions might be undefined. |
| Comment by Colin Dixon [ 02/Sep/16 ] |
|
I'm reducing this to critical as TomP also says that this is not a regression from Beryllium. |
| Comment by Colin Dixon [ 02/Sep/16 ] |
|
Do we think this bug is causing BUG-6554? |
| Comment by Luis Gomez [ 02/Sep/16 ] |
|
No, that bug occurs without cluster member isolation and even in single-instance tests. The bug you point to seems to be related to https://bugs.opendaylight.org/show_bug.cgi?id=6177, which is not a blocker (it is a major bug). |
| Comment by Luis Gomez [ 02/Sep/16 ] |
|
Sorry, the last comment was intended for the other bug. What I mean is this bug can only be responsible for the major (not critical) bug: https://bugs.opendaylight.org/show_bug.cgi?id=6177. |
| Comment by Tom Pantelis [ 02/Sep/16 ] |
|
I've found several issues looking at the logs.

One is that listeners (DTCL and DCL) are not notified when a snapshot is installed by the leader on a follower. This resulted in member-3 losing (or not re-gaining) its candidate for the ServiceEntityType when the network partition was healed. This looks like a regression in Boron. I have submitted https://git.opendaylight.org/gerrit/#/c/45028/ to fix this.

The fact that the new leader, member-2, tried to install a snapshot is the result of another, orthogonal bug that has been there all along (not a regression). The old leader, member-3, had a transaction journal entry at index 6, term 1 that was appended while it was isolated and thus wasn't replicated. Meanwhile, on the other side of the partition, member-2 became leader and committed its own entry at index 6, but with term 2. When the partition healed, member-3 switched to follower as it should. When syncing with the new leader, member-2, member-3's entry at index 6 should have immediately been deemed a conflict with member-2's index 6 entry, since the terms don't match, and member-3's entry should have been removed and replaced by member-2's entry. This did eventually happen (via the snapshot); however, member-3's entry was first committed and applied to the state. This violates Raft: an entry was committed without being replicated to a majority of the followers. I have an idea on how to fix this.

The major issue that led to all this was a result of the leader removing a member as candidate when Akka notifies it that the member node is down. This results in another member being selected as owner. I believe initially we just selected a new owner, ignoring the down member. However, if the down member's process was actually down, this would leave a stale candidate when the member is restarted and no client on that member actually re-registers a candidate. Therefore the leader now removes the down member as a candidate.
To handle the partition case where the member process is still running, when it reconnects it gets the update that its candidate was removed and re-registers it if a local candidate still exists. However, this behavior is problematic when the shard leader is isolated. The majority partition elects a new leader, which temporarily results in split-brain: two leaders that independently attempt to remove the other side's candidates. When the partition is healed, all hell breaks loose trying to reconcile their differences. This is compounded with the singleton service because it uses two entities that are related to one another. I'm working on ideas to alleviate this issue.

Also, from the logs, it looks like the internal DTCLs used by the EOS shard didn't get notified of all state transitions. This is because of the batching of updates into a single transaction when one is already in flight. This was done for efficiency, but I think state transitions can be lost when combined in the same transaction because of compression in the DataTreeModification. So if a leaf is initially empty, then set to "foo", and then set to "bar" by another operation in the same transaction, the end result is that listeners will only observe the transition from "" -> "bar". I have to verify this. |
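The batching/compression hypothesis in the last paragraph can be sketched with a toy model. The class and method names below are illustrative stand-ins, not the actual controller code: a transaction is modeled as a map keyed by path, so two writes to the same leaf within one batch collapse to the net change, and a listener never observes the intermediate value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (NOT the real DataTreeModification code) of why batching
// two writes into one transaction hides an intermediate state transition.
public class BatchedTransactionModel {
    final Map<String, String> data = new LinkedHashMap<>();
    final List<String> observedTransitions = new ArrayList<>();

    // Commit a batch of writes as one transaction: only the net
    // old -> new change per path is recorded and notified.
    void commitBatch(Map<String, String> writes) {
        for (Map.Entry<String, String> write : writes.entrySet()) {
            String oldValue = data.getOrDefault(write.getKey(), "");
            data.put(write.getKey(), write.getValue());
            observedTransitions.add(oldValue + " -> " + write.getValue());
        }
    }

    public static void main(String[] args) {
        BatchedTransactionModel store = new BatchedTransactionModel();

        // Two updates to the same leaf batched into one transaction:
        // the map collapses them, so the "foo" transition is lost and
        // the listener only ever sees "" -> "bar".
        Map<String, String> batch = new LinkedHashMap<>();
        batch.put("owner", "foo");
        batch.put("owner", "bar"); // overwrites the pending "foo" write

        store.commitBatch(batch);
        System.out.println(store.observedTransitions);
    }
}
```

If the EOS shard state machine depends on seeing every transition (not just the net result), batching updates into an in-flight transaction is unsafe, which matches the "have to verify" concern above.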
| Comment by Tom Pantelis [ 04/Sep/16 ] |
|
Submitted https://git.opendaylight.org/gerrit/#/c/45129/ to rework the behavior of re-assigning owners on peer node down. This also fixes a couple other issues revealed by the changes and unit tests. |
| Comment by Viera Zelcamova [ 05/Sep/16 ] |
|
Martin, can you try the patch? Thanks. |
| Comment by Martin Mihálek [ 05/Sep/16 ] |
|
Used updated sal-distributed-datastore and sal-akka-raft from the provided patch. |
| Comment by Martin Mihálek [ 05/Sep/16 ] |
|
Attachment members.zip has been added with description: Logs using patch https://git.opendaylight.org/gerrit/#/c/45129/ |
| Comment by Tom Pantelis [ 06/Sep/16 ] |
|
(In reply to Martin Mihálek from comment #17) It failed because of the original error: java.lang.IllegalArgumentException: Invalid combination of wasOwner: true, isOwner: true, hasOwner: true. You must be missing https://git.opendaylight.org/gerrit/#/c/44673/. With these patches, the scenario of isolating the EOS shard leader when the singleton service entity is owned by the shard leader should work. In this case, the EOS shard leader and entity owner is member-3. After isolation, either member-1 or member-2 should become owner. After the isolation is removed, nothing should happen and member-1 or member-2 should remain owner. However, the scenario where member-3 is the EOS shard leader but either member-1 or member-2 is the singleton service entity owner will probably run into similar issues as before and may not work correctly. This requires more work. Also, in the scenario where an EOS shard follower is the singleton service entity owner and is isolated, it will not get notified of inJeopardy. This was never implemented. |
| Comment by Ryan Goulding [ 06/Sep/16 ] |
|
Per discussion in Core Projects call, this is being escalated to a blocker status. These patches should be included in Boron. |
| Comment by Martin Mihálek [ 07/Sep/16 ] |
|
I updated features: sal-akka-raft, sal-distributed-datastore, mdsal-eos-common-api. Tested situation: 1. Cluster start, member 3 elected as leader. Logs attached, using debug on: |
| Comment by Martin Mihálek [ 07/Sep/16 ] |
|
Attachment members_1.zip has been added with description: Cluster members logs |
| Comment by Tom Pantelis [ 07/Sep/16 ] |
|
> 1. Cluster start, member 3 elected as EOS shard leader

These steps worked correctly - this is what the patches were intended to address.

> 7. Isolate member 2

Since member 2 is a follower, we don't currently notify of inJeopardy in this case (it was never implemented). Member 3 was actually selected as the new owner of the ServiceEntityType and the DOMClusterSingletonServiceProviderImpl was notified:

2016-09-07 09:56:28,155 | DEBUG | on-dispatcher-55 | EntityOwnerChangeListener | 217 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | member-3-shard-entity-ownership-operational: New owner: member-3, Original owner: member-2
2016-09-07 09:56:28,155 | DEBUG | lt-dispatcher-17 | EntityOwnershipListenerActor | 217 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | Notifying EntityOwnershipListener org.opendaylight.mdsal.singleton.dom.impl.DOMClusterSingletonServiceProviderImpl@289870b7: DOMEntityOwnershipChange [entity=DOMEntity [type=org.opendaylight.mdsal.ServiceEntityType, id=/(urn:opendaylight:params:xml:ns:yang:mdsal:core:general-entity?revision=2015-09-30)entity/entity[ {(urn:opendaylight:params:xml:ns:yang:mdsal:core:general-entity?revision=2015-09-30)name=org.opendaylight.controller.config.yang.sxp.controller.conf.SxpControllerInstance}]], state=LOCAL_OWNERSHIP_GRANTED [wasOwner=false, isOwner=true, hasOwner=true], inJeopardy=false]

However, the ClusterSingletonServiceGroupImpl was not notified. It seems that when it loses ownership it stops receiving updates via the DOMClusterSingletonServiceProviderImpl. I think this is due to it removing itself from the allServiceGroups in newAsyncCloseCallback. So this is a bug in the ClusterSingletonServiceGroupImpl - not sure what the intent is - I'll punt to Vaclav... |
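The self-deregistration pattern described above can be sketched as follows. The Provider class and the callback shape here are simplified stand-ins, not the actual ClusterSingletonServiceGroupImpl/allServiceGroups code: a group that removes itself from the provider's registry when it loses ownership silently misses a later re-grant notification.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch (NOT the real mdsal singleton code) of the bug
// pattern: a listener that deregisters itself in its close path never
// hears about subsequent ownership changes.
public class SingletonGroupSketch {
    static class Provider {
        final Map<String, Consumer<Boolean>> allServiceGroups = new ConcurrentHashMap<>();

        void notifyOwnershipChange(String group, boolean isOwner) {
            Consumer<Boolean> g = allServiceGroups.get(group);
            if (g != null) {
                g.accept(isOwner);
            }
            // else: the group removed itself earlier and the update is dropped
        }
    }

    public static void main(String[] args) {
        Provider provider = new Provider();
        provider.allServiceGroups.put("sxp", isOwner -> {
            if (!isOwner) {
                // Bug pattern: deregistering on ownership loss means the
                // later "granted" notification has no listener to reach.
                provider.allServiceGroups.remove("sxp");
                System.out.println("sxp: ownership lost, group closed");
            } else {
                System.out.println("sxp: ownership granted, services started");
            }
        });

        provider.notifyOwnershipChange("sxp", false); // group closes and deregisters
        provider.notifyOwnershipChange("sxp", true);  // dropped: no listener remains
    }
}
```

Keeping the group registered through a close/reopen cycle (or re-registering before the next ownership grant can arrive) would avoid the dropped notification.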
| Comment by A H [ 08/Sep/16 ] |
|
Has this bug been verified as fixed in the latest Boron RC 3.1 Build? |
| Comment by Tom Pantelis [ 08/Sep/16 ] |
|
(In reply to A H from comment #23) It seems what we intended to fix for the RC with the patches has been verified. There are still other scenarios that don't work correctly, as I outlined in my comments, but we'll address those in SR1. We can downgrade this to critical for now. |
| Comment by Tom Pantelis [ 13/Sep/16 ] |
|
Submitted patches to address the other issues outlined in previous comments. |
| Comment by Martin Ciglan [ 19/Sep/16 ] |
|
Is this ready for review or still in progress? Thanks for the update. |
| Comment by Tom Pantelis [ 19/Sep/16 ] |
|
(In reply to Martin Ciglan from comment #26) The patches above have been in review for a few days but no one has reviewed them yet. If you want to review, that would be great. I intend to backport to stable/boron. |
| Comment by Martin Mihálek [ 19/Sep/16 ] |
|
I updated features: sal-akka-raft, sal-distributed-datastore with change https://git.opendaylight.org/gerrit/#/c/45638/1. Tested situation: 1. Cluster start, member 1 hosts services. Logs attached, using debug on: |
| Comment by Martin Mihálek [ 19/Sep/16 ] |
|
Attachment member_logs.zip has been added with description: Cluster member logs |
| Comment by Tom Pantelis [ 20/Sep/16 ] |
|
(In reply to Martin Mihálek from comment #28) This is the same scenario tested in Comment 21 (https://bugs.opendaylight.org/show_bug.cgi?id=6540#c21) and same issue I noted in Comment 22 (https://bugs.opendaylight.org/show_bug.cgi?id=6540#c22). Re-assigning to Vaclav. |
| Comment by Tom Pantelis [ 20/Sep/16 ] |
|
I should mention that https://git.opendaylight.org/gerrit/#/c/45638/ and related patches address this scenario: 1. Cluster start, member 1 elected as EOS shard leader. So in this case, isolating/un-isolating leader member-1 shouldn't result in any service owner changes. |
| Comment by Michal Rehak [ 27/Sep/16 ] |
|
Vaclav added a fix to mdsal. Testing results:
In the latter scenario, the sxp service was running simultaneously on 2 cluster nodes during isolation. We also exercised the first scenario by killing karaf instead of touching iptables; the result is the same. So I guess we need to focus on the situation where a follower hosting the service owner gets isolated and keeps the service active. |
| Comment by Andrej Leitner [ 10/Oct/16 ] |
|
Hi all, Thanks. |