Details
Bug
Status: Resolved
Resolution: Done
Operating System: All
Platform: All
8636
Description
This is probably a duplicate of CONTROLLER-1679, but that one was opened as a blocker, and the current response is different.
This affects test cases where the prefix-based shard leader is isolated while the single transaction producer is on a different node (if it is on the same node, CONTROLLER-1687 happens instead).
Response from transaction producer [0] starts with:
{"errors":{"error":[{"error-type":"application","error-tag":"operation-failed","error-message":"Unexpected-exception","error-info":"TransactionCommitFailedException
\n\tat org.opendaylight.mdsal.dom.broker.TransactionCommitFailedExceptionMapper.newWithCause(TransactionCommitFailedExceptionMapper.java:37)\n\tat
This may happen if a transaction was opened ~3 seconds before isolation but the backend took longer than that to process it, so the final confirmation was blocked by the isolation.
The rate is 1000 transactions per second (implemented by the producer waiting 1 millisecond after each submit), and the warmup period is 5 seconds. We may need to lower the transaction rate (considering that these are writes to the config datastore) and subtract a few seconds from the period in which failures are not tolerated.
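
To put rough numbers on the pacing described above, here is a minimal Python sketch. The function names `sleep_interval` and `max_submits` are hypothetical and not part of the test suite. It assumes the producer is paced only by the sleep after each submit, so the counts are upper bounds that ignore the time spent in the submit call itself.

```python
def sleep_interval(rate_per_second):
    """Sleep after each submit needed to approach the given rate (hypothetical helper)."""
    return 1.0 / rate_per_second


def max_submits(window_seconds, interval_seconds):
    """Upper bound on submits in a window, ignoring the cost of the submit call itself."""
    return round(window_seconds / interval_seconds)


# Current settings from the report: 1000 transactions per second, 5 second warmup.
interval = sleep_interval(1000)          # 0.001 s, i.e. 1 millisecond
warmup_submits = max_submits(5, interval)

# Transactions opened in the ~3 seconds before isolation are the ones at risk.
at_risk_submits = max_submits(3, interval)

# Assumed lower rate of 100 per second, chosen only for illustration.
at_risk_lower_rate = max_submits(3, sleep_interval(100))

print(warmup_submits, at_risk_submits, at_risk_lower_rate)
```

Under these assumptions, up to about 3000 transactions could be opened in the 3 seconds before isolation at the current rate, which is why lowering the rate shrinks the set of transactions whose final confirmation can be cut off.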