Details
Bug
Status: Resolved
Resolution: Done
Helium
None
None
Operating System: All
Platform: All
1576
Description
In the distributed data store's TransactionProxy, the write Tx put/merge/delete recording operations send their messages using Akka's fire-and-forget "tell" mechanism. This sends the message asynchronously without blocking, but we get no error feedback.
This can lead to unpredictable results. For example, a write operation to a remote shard could fail due to a temporary network failure while the subsequent commit succeeds. In that case the write message never reached the remote shard, so the write would not actually be committed, yet we would erroneously report that it was.
To verify that put/merge/delete operations succeed, we need to use Akka's "ask" pattern, which returns a Future from which we can detect failures. Ideally, however, we don't want to block put/merge/delete while waiting for the result. Furthermore, the put/merge/delete methods on the public API don't throw any checked exceptions, and we don't want to throw unchecked exceptions because that would technically violate the API contract. These methods were not intended to throw exceptions in the API design; the intention was to perform validation and report failures at commit time.
Therefore, after discussion with Moiz and Basheer, the proposed solution is to cache the Futures from all put/merge/delete operations and check them at commit time, specifically in the canCommit phase. If any Future failed, we fail the commit.
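A minimal sketch of the proposed caching, for illustration only. It uses java.util.concurrent.CompletableFuture as a stand-in for the Futures returned by Akka's ask, and the class and method names (RecordedOperations, record, canCommit) are hypothetical, not the actual TransactionProxy code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch, not the real TransactionProxy.
class RecordedOperations {
    // One reply Future per put/merge/delete sent to a shard.
    private final List<CompletableFuture<Void>> recorded = new ArrayList<>();

    // Called by put/merge/delete: cache the reply Future and return
    // immediately, without blocking and without throwing.
    void record(CompletableFuture<Void> reply) {
        recorded.add(reply);
    }

    // Called in the canCommit phase: completes with true only if every
    // cached operation succeeded, and with false if any of them failed.
    CompletableFuture<Boolean> canCommit() {
        CompletableFuture<Void> all =
                CompletableFuture.allOf(recorded.toArray(new CompletableFuture[0]));
        return all.handle((ignored, failure) -> failure == null);
    }
}
```

With this shape, put/merge/delete stay non-blocking and exception-free, and a failed remote write surfaces only when canCommit is evaluated.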
In addition, to honor the uncommitted read semantics, a read operation also needs to check any previously cached put/merge/delete Futures to ensure consistency. If any prior Future failed, we fail the read.
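The read-side check could be sketched as follows. As before, CompletableFuture stands in for Akka Futures, and the names (ReadAfterWrites, recordWrite, read) are hypothetical. One assumption made here is that the read is only issued once all prior writes have completed; the description above only requires that a failed prior write fails the read.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch, not the real TransactionProxy.
class ReadAfterWrites {
    // Reply Futures of the put/merge/delete operations recorded so far.
    private final List<CompletableFuture<Void>> priorWrites = new ArrayList<>();

    void recordWrite(CompletableFuture<Void> reply) {
        priorWrites.add(reply);
    }

    // Issues the read only after every prior write has succeeded. If any
    // prior write failed, the returned Future fails and remoteRead is
    // never invoked.
    <T> CompletableFuture<T> read(Supplier<CompletableFuture<T>> remoteRead) {
        // Snapshot, so writes recorded after this read do not affect it.
        CompletableFuture<?>[] snapshot =
                priorWrites.toArray(new CompletableFuture<?>[0]);
        return CompletableFuture.allOf(snapshot)
                .thenCompose(ignored -> remoteRead.get());
    }
}
```

The caller still receives a Future, so the failure is reported through the read result rather than by blocking or throwing from the read call itself.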