[CONTROLLER-703] Handle remote failures for write Tx recording operations async Created: 18/Aug/14  Updated: 27/Aug/14  Resolved: 27/Aug/14

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Tom Pantelis Assignee: Tom Pantelis
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 1576

 Description   

In the distributed data store's TransactionProxy, the write Tx put/merge/delete recording operations send their messages using Akka's "fire and forget" tell mechanism. This sends the message asynchronously without blocking, but we get no error feedback.

This can lead to unpredictable results. For example, a write operation to a remote shard could fail due to a temporary network failure while the subsequent commit succeeds. In that case, because the write message never reached the remote shard, the write would not actually be committed, yet we would erroneously report that it was.

To verify that put/merge/delete operations succeed, we need to use Akka's "ask" pattern, which returns a Future from which we can ascertain failures. However, ideally we don't want to block put/merge/delete while waiting for the result. Furthermore, the put/merge/delete methods on the public API don't throw any checked exceptions, and we don't want to throw unchecked exceptions because that would technically violate the API contract. These methods were not intended to throw exceptions in the API design; the intention was to perform validation and report failures at commit time.

Therefore, after discussing with Moiz and Basheer, the proposed solution is to cache the Futures from all put/merge/delete operations and check them at commit time, specifically in the canCommit phase. If any of the Futures failed, we fail the commit.
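
The caching approach described above could be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions, not the code from the submitted change: java.util.concurrent.CompletableFuture stands in for the Future returned by Akka's ask, and the names RecordingTxSketch, recordOperation and canCommit (as a method on this class) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: CompletableFuture stands in for the Future returned by Akka's ask.
class RecordingTxSketch {
    // Reply Futures from every put/merge/delete recorded so far in this transaction.
    private final List<CompletableFuture<Void>> recordedOps = new ArrayList<>();

    // Called by put/merge/delete: remember the reply Future and return without blocking.
    void recordOperation(CompletableFuture<Void> replyFuture) {
        recordedOps.add(replyFuture);
    }

    // canCommit phase: completes with true only if every recorded operation succeeded,
    // and with false if any of them failed.
    CompletableFuture<Boolean> canCommit() {
        CompletableFuture<Void> all =
                CompletableFuture.allOf(recordedOps.toArray(new CompletableFuture<?>[0]));
        return all.handle((ignored, failure) -> failure == null);
    }
}
```

In this sketch the write methods never throw and never wait; a failure only becomes visible when canCommit is evaluated. Reporting the failure as a false vote rather than as a failed Future is an arbitrary choice made here for brevity.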

In addition, to honor the uncommitted read semantics, a read operation also needs to check any prior cached put/merge/delete Futures to ensure consistency. If any prior Futures failed, we fail the read.
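
The read-side check could look something like the sketch below. It is illustrative only and rests on several assumptions: a local map stands in for the remote shard, CompletableFuture stands in for the Akka Future, the `delivered` flag simulates whether the write message reached the shard, and the class and method names (ReadAfterWriteSketch, put, read) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a local map stands in for the remote shard.
class ReadAfterWriteSketch {
    private final Map<String, String> shard = new ConcurrentHashMap<>();
    // Reply Futures of the writes recorded before any later read.
    private final List<CompletableFuture<Void>> priorWrites = new ArrayList<>();

    // Records the write's reply Future without throwing; 'delivered' simulates
    // whether the message reached the shard.
    void put(String path, String value, boolean delivered) {
        CompletableFuture<Void> reply = new CompletableFuture<>();
        if (delivered) {
            shard.put(path, value);
            reply.complete(null);
        } else {
            reply.completeExceptionally(
                    new IllegalStateException("write to " + path + " was not delivered"));
        }
        priorWrites.add(reply);
    }

    // Reads only after all previously recorded writes have completed;
    // if any of them failed, the returned Future fails as well.
    CompletableFuture<Optional<String>> read(String path) {
        CompletableFuture<?>[] snapshot = priorWrites.toArray(new CompletableFuture<?>[0]);
        return CompletableFuture.allOf(snapshot)
                .thenApply(ignored -> Optional.ofNullable(shard.get(path)));
    }
}
```

Chaining the read onto the prior write Futures keeps the read itself non-blocking while still ensuring it never reports data from a transaction whose earlier writes were lost.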



 Comments   
Comment by Tom Pantelis [ 20/Aug/14 ]

Submitted https://git.opendaylight.org/gerrit/#/c/10109/
