[CONTROLLER-1904] DistributedEntityOwnershipService may silently lose registrations Created: 10/Jul/19  Updated: 16/Nov/21  Resolved: 01/Jul/21

Status: Resolved
Project: controller
Component/s: eos
Affects Version/s: Sodium, Neon SR1, Fluorine SR3
Fix Version/s: 4.0.0

Type: Bug Priority: Medium
Reporter: Robert Varga Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

When DistributedEntityOwnershipService dispatches a request to EntityOwnershipShard, it does not report errors (such as AskTimeoutException) except at debug level. As a result, the application believes it has a candidate registered when the registration was in fact never propagated to the backend. At the very least, such failures should be logged as errors to draw attention to the problem; ideally, the frontend should also retry forwarding the registration (and unregistration) events.
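
The suggested behaviour (log each failure at error level, then retry) could look roughly like the sketch below. It is illustrative only and is not taken from the controller code base: RetryingDispatcher, dispatch and the maxAttempts parameter are hypothetical names, java.util.logging stands in for the project's logger, and a production version would likely add a backoff delay between attempts.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper; not part of DistributedEntityOwnershipService. */
final class RetryingDispatcher {
    private static final Logger LOG = Logger.getLogger(RetryingDispatcher.class.getName());

    private RetryingDispatcher() {
    }

    /**
     * Issues the request up to maxAttempts times. Every failure is logged at
     * error (SEVERE) level; the last failure is propagated to the caller.
     */
    static <T> CompletableFuture<T> dispatch(Supplier<CompletableFuture<T>> request, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(request, 1, maxAttempts, result);
        return result;
    }

    private static <T> void attempt(Supplier<CompletableFuture<T>> request, int attemptNo,
            int maxAttempts, CompletableFuture<T> result) {
        request.get().whenComplete((value, failure) -> {
            if (failure == null) {
                result.complete(value);
                return;
            }
            // Report loudly instead of hiding the failure at debug level
            LOG.log(Level.SEVERE, "Attempt " + attemptNo + " of " + maxAttempts + " failed", failure);
            if (attemptNo < maxAttempts) {
                attempt(request, attemptNo + 1, maxAttempts, result);
            } else {
                result.completeExceptionally(failure);
            }
        });
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated backend: the first two requests time out, the third succeeds
        CompletableFuture<String> outcome = dispatch(() -> {
            CompletableFuture<String> reply = new CompletableFuture<>();
            if (calls.incrementAndGet() < 3) {
                reply.completeExceptionally(new TimeoutException("simulated ask timeout"));
            } else {
                reply.complete("registered");
            }
            return reply;
        }, 5);
        System.out.println(outcome.join() + " after " + calls.get() + " attempts");
    }
}
```

With this shape the caller either sees the registration succeed or receives the final failure, so a lost registration can no longer go unnoticed.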

Furthermore, the backend reports success as soon as the request is enqueued to BatchedModifications, which does not guarantee that the candidate has been propagated to all participants.
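
To illustrate the difference between acknowledging on enqueue and acknowledging on propagation, here is a minimal sketch, assuming each participant exposes some future that completes once it has applied the change. AckAfterPropagation and whenAllAcked are hypothetical names and do not reflect how EntityOwnershipShard is actually structured.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical illustration; not the EntityOwnershipShard implementation. */
final class AckAfterPropagation {
    private AckAfterPropagation() {
    }

    /**
     * Returns a future that completes only once every participant
     * acknowledgement has completed, rather than at enqueue time.
     */
    static CompletableFuture<Void> whenAllAcked(List<CompletableFuture<Void>> participantAcks) {
        return CompletableFuture.allOf(participantAcks.toArray(new CompletableFuture<?>[0]));
    }
}
```

A caller that reports success from the returned future, rather than from the enqueue step, would not claim success while any participant is still outstanding.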

 



 Comments   
Comment by Oleksii Mozghovyi [ 01/Apr/21 ]

I think this issue will not be applicable after the rework happening on the eos module.

Comment by Robert Varga [ 01/Jul/21 ]

The implementation has been completely rewritten in CONTROLLER-1982.
