[CONTROLLER-1904] DistributedEntityOwnershipService may silently lose registrations Created: 10/Jul/19 Updated: 16/Nov/21 Resolved: 01/Jul/21 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | eos |
| Affects Version/s: | Sodium, Neon SR1, Fluorine SR3 |
| Fix Version/s: | 4.0.0 |
| Type: | Bug | Priority: | Medium |
| Reporter: | Robert Varga | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
When DistributedEntityOwnershipService dispatches a request to EntityOwnershipShard, it does not report errors (such as AskTimeoutExceptions) except at debug level. This means the application believes it has a candidate registered when in fact the registration was never propagated to the backend. At the very least, such failures should be logged as errors to bring attention to the problem, but really the frontend should retry forwarding the registration (and unregistration) events. Furthermore, the backend reports success as soon as the request is enqueued to BatchedModifications, which does not guarantee that the candidate has been propagated to all participants.
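
The retry-and-report behaviour suggested above could look roughly like the following. This is a minimal sketch, not the controller's code: `RetryingForwarder` and its parameters are hypothetical names, it uses only `java.util.concurrent` rather than the Akka ask pattern the real frontend relies on, and it assumes the request can safely be re-sent.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: re-sends a request until it succeeds or attempts run out,
// logging every failure at a visible level instead of debug.
final class RetryingForwarder {
    private static final Logger LOG = Logger.getLogger(RetryingForwarder.class.getName());

    private final ScheduledExecutorService scheduler;
    private final int maxAttempts;
    private final long delayMillis;

    RetryingForwarder(ScheduledExecutorService scheduler, int maxAttempts, long delayMillis) {
        this.scheduler = scheduler;
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    // 'request' stands in for dispatching a (un)registration message to the shard.
    <T> CompletableFuture<T> forward(String description, Supplier<CompletionStage<T>> request) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(description, request, 1, result);
        return result;
    }

    private <T> void attempt(String description, Supplier<CompletionStage<T>> request,
            int attemptNo, CompletableFuture<T> result) {
        CompletionStage<T> stage;
        try {
            stage = request.get();
        } catch (RuntimeException e) {
            // Treat a synchronous failure like an asynchronous one.
            CompletableFuture<T> failed = new CompletableFuture<>();
            failed.completeExceptionally(e);
            stage = failed;
        }
        stage.whenComplete((value, failure) -> {
            if (failure == null) {
                result.complete(value);
            } else if (attemptNo >= maxAttempts) {
                // Give up, but make the loss visible to the operator and the caller.
                LOG.log(Level.SEVERE, description + " failed after " + attemptNo + " attempts", failure);
                result.completeExceptionally(failure);
            } else {
                LOG.log(Level.WARNING, description + " failed on attempt " + attemptNo + ", retrying", failure);
                scheduler.schedule(() -> attempt(description, request, attemptNo + 1, result),
                        delayMillis, TimeUnit.MILLISECONDS);
            }
        });
    }
}
```

A caller would wrap each dispatch in `forward(...)` and treat the returned future as the outcome of the registration. The sketch does not address the second point in the description: a successful reply here still only means whatever the backend chooses to acknowledge.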
|
| Comments |
| Comment by Oleksii Mozghovyi [ 01/Apr/21 ] |
|
I think this issue will no longer apply once the ongoing rework of the eos module is complete. |
| Comment by Robert Varga [ 01/Jul/21 ] |
|
The implementation has been completely rewritten in |