[BGPCEP-785] OptimisticLockFailedException when closing multiple sessions Created: 06/Apr/18 Updated: 14/Jun/18 Resolved: 14/Jun/18 |
|
| Status: | Verified |
| Project: | bgpcep |
| Component/s: | BGP |
| Affects Version/s: | Fluorine |
| Fix Version/s: | Fluorine |
| Type: | Bug | Priority: | Medium |
| Reporter: | Tomas Markovic | Assignee: | Claudio David Gasparini |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Description |
|
org.opendaylight.mdsal.common.api.OptimisticLockFailedException: Optimistic lock failed for path /(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2018-03-29)bgp-rib/rib/rib[ {(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2018-03-29)id=example-bgp-rib}]/peer/peer[ {(urn:opendaylight:params:xml:ns:yang:bgp-rib?revision=2018-03-29)peer-id=bgp://127.0.0.11}]/adj-rib-in/tables
This seems to be an internal race condition between datastore (DS) modification submissions.
Steps to replicate the bug
Summary of observations
Changes coming from 1 [0] clash with changes coming from 4. The interesting part is that even though 1 is done and submitted before 4 (the main reason for this conclusion is that 4 does not fail to close), it seems that 4 is somehow executed before 1 has been fully executed. Logs attached.
[0] 2018-06-01T15:28:26,038 | ERROR | CommitFutures-4 | AdjRibInWriter | 224 - org.opendaylight.bgpcep.bgp-rib-impl - 0.10.0.SNAPSHOT | Write routes failed ]/peer/peer[ ]/adj-rib-in/tables
|
| Comments |
| Comment by Claudio David Gasparini [ 04/Jun/18 ] |
|
Replicated using the mdsal trace tool. We can confirm that no changes come from anywhere other than the identified sources (1, 4). The conclusion seems to be that there is a race condition between chain1 (update of Peer/Adj-rib-in) and chain2 (deletion of the peer).
Logs added |
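The kind of clash described above can be illustrated with a minimal, self-contained sketch. It is not the MD-SAL API: `Store`, `Tx`, and the single global version counter are hypothetical simplifications (the real datastore detects conflicts per subtree), and the two transactions merely stand in for chain1 (update under adj-rib-in) and chain2 (delete of the peer). Both take their snapshot before either commits, so whichever commits second is rejected.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class OptimisticLockSketch {
    /** Hypothetical transaction: remembers the store version it was created against. */
    static final class Tx {
        final long baseVersion;
        final Consumer<Map<String, String>> change;

        Tx(long baseVersion, Consumer<Map<String, String>> change) {
            this.baseVersion = baseVersion;
            this.change = change;
        }
    }

    /** Hypothetical stand-in for the datastore, with one global version counter. */
    static final class Store {
        private final Map<String, String> data = new HashMap<>();
        private long version;

        synchronized Tx newTx(Consumer<Map<String, String>> change) {
            return new Tx(version, change);
        }

        /** Returns false when another commit landed after this transaction's snapshot. */
        synchronized boolean commit(Tx tx) {
            if (tx.baseVersion != version) {
                return false; // optimistic lock failed
            }
            tx.change.accept(data);
            version++;
            return true;
        }
    }

    /** Returns {deleteCommitted, updateCommitted}. */
    static boolean[] run() {
        Store store = new Store();
        // "chain1": update under the peer's adj-rib-in
        Tx update = store.newTx(d -> d.put("peer/adj-rib-in/tables", "routes"));
        // "chain2": delete of the whole peer
        Tx delete = store.newTx(d -> d.keySet().removeIf(k -> k.startsWith("peer")));

        boolean deleteOk = store.commit(delete); // lands first
        boolean updateOk = store.commit(update); // stale snapshot, rejected
        return new boolean[] {deleteOk, updateOk};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("delete committed: " + r[0]);
        System.out.println("update committed: " + r[1]);
    }
}
```

In this sketch the update was created first but is committed second, which mirrors the observation that submission order does not determine which change is applied first.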
| Comment by Claudio David Gasparini [ 13/Jun/18 ] |
|
Multiple chains can be submitted at different times, but that does not imply that the one submitted first will finish first. What is happening here is that the rib-out removal done by locRibWriter is sometimes applied after the peer has been closed and removed from the datastore, ending in an OptimisticLockFailedException. To solve this, the Peer must be the only one in charge of handling and submitting changes under the Peer ribs and, vice versa, the Rib must apply through its chains only changes related to the Rib ribs (loc-rib-in,..). In this way, when a Peer session is closed and the cleanup is done, it won't conflict with changes made by another Rib thread handling changes from different peers. |
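The ownership idea in this comment can be sketched as follows. This is an illustration only, not the actual bgpcep change: `PeerWriter` is a hypothetical class, and a single-threaded executor stands in for a transaction chain owned by the peer. Because every write under the peer goes through the same queue, the peer's own cleanup is ordered after all earlier writes, and anything handed over after close is rejected instead of conflicting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

public class PeerOwnedWrites {
    /** Hypothetical sole owner of every write under one peer's subtree. */
    static final class PeerWriter {
        // One thread, FIFO queue: plays the role of the peer's own chain.
        private final ExecutorService chain = Executors.newSingleThreadExecutor();
        // Only touched from the chain thread.
        private final List<String> applied = new ArrayList<>();

        /** Other components hand changes to the peer instead of writing themselves. */
        boolean submit(String change) {
            try {
                chain.execute(() -> applied.add(change));
                return true;
            } catch (RejectedExecutionException e) {
                return false; // peer already closed, nothing left to conflict with
            }
        }

        /** Queues the peer removal behind all pending writes, then stops accepting work. */
        List<String> close() throws Exception {
            Future<List<String>> done = chain.submit(() -> {
                applied.add("delete peer");
                return new ArrayList<>(applied);
            });
            chain.shutdown();
            return done.get();
        }
    }

    static List<String> run() throws Exception {
        PeerWriter peer = new PeerWriter();
        peer.submit("update adj-rib-in");
        peer.submit("remove adj-rib-out");
        List<String> log = peer.close();
        // A late hand-off after close is refused rather than applied.
        log.add("late write accepted=" + peer.submit("late adj-rib-out removal"));
        return log;
    }

    public static void main(String[] args) throws Exception {
        run().forEach(System.out::println);
    }
}
```

The design choice shown here is that ordering comes from a single owner rather than from the time of submission, so the peer removal can never overtake an earlier write under the same peer.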