[CONTROLLER-996] Clustering : Exception thrown in Shard because a transaction was created on a chain when previous transaction was not yet ready
Created: 05/Nov/14 Updated: 25/Jul/23 Resolved: 15/Nov/14 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | mdsal |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Moiz Raja | Assignee: | Tom Pantelis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Issue Links: |
|
| External issue ID: | 2318 |
| Description |
|
2014-10-31 11:29:44,814 | WARN | lt-dispatcher-42 | ShardManager | 248 - com.typesafe.akka.slf4j - 2.3.4 | Supervisor Strategy of resume applied
2014-10-31 11:29:44,815 | WARN | lt-dispatcher-42 | OneForOneStrategy | 248 - com.typesafe.akka.slf4j - 2.3.4 | Previous transaction member-1-shard-inventory-operational-40 is not ready yet |
| Comments |
| Comment by Tom Pantelis [ 05/Nov/14 ] |
|
I think the Tx chain is coming from the FlowCapableInventoryProvider and this error is related to

The FlowCapableInventoryProvider uses a Tx chain to continuously batch and submit modification operations on a separate thread. After it submits a Tx batch, it creates a new read-write Tx from the chain and starts a new batch. It does not wait for the Future from the previous Tx to complete. This is valid: the semantics of a Tx chain are such that the modifications from a previously submitted Tx in the chain are visible to the next Tx without the client having to wait for the previous Tx to be committed. So as soon as the previous Tx is readied, the next Tx can be created and its snapshot will contain the modifications made by the previous Tx.

With the IMDS, i.e. w/o clustering, Tx's are readied synchronously and then submitted on a thread to be committed. With the CDS, however, Tx's are readied async, i.e. the TransactionProxy sends a message to the ShardTransaction actor. So I think it's this timing difference that can cause issues with Tx chains and break the semantics.

Here's a scenario that can break with the CDS:

    ReadWriteTransaction tx1 = txChain.newReadWriteTransaction();
    tx1.submit();
    ReadWriteTransaction tx2 = txChain.newReadWriteTransaction();

On tx1.submit(), the ready operation is done async and may not complete before tx2 is created. If so, tx2 creation fails. With the IMDS, tx1 is readied by the time submit returns to the caller, so this issue does not occur.

In the CDS, we need to ensure the previous Tx in a chain completes its ready operation before the chain attempts to create the next Tx.

I think this is the root cause of
This issue may also be the cause of |
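The timing problem described above can be reproduced outside the controller with a small stand-alone sketch. Everything in it is hypothetical and for illustration only: `TxChainSketch`, `Tx`, `Chain`, `submitAsync` and `run` are not MD-SAL or CDS classes, and the 100 ms delay merely stands in for the async ready message sent to the ShardTransaction actor. The "wait for the previous ready" branch is one possible way to restore the chain semantics and is not necessarily what the patches do.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a Tx chain whose "ready" step completes asynchronously.
class TxChainSketch {

    /** Stand-in for a chained transaction; 'ready' completes once the Tx is readied. */
    static final class Tx {
        final String id;
        final CompletableFuture<Void> ready = new CompletableFuture<>();

        Tx(String id) {
            this.id = id;
        }
    }

    /** Stand-in for a Tx chain; optionally waits for the previous Tx to be readied. */
    static final class Chain {
        private final boolean waitForPreviousReady;
        private Tx previous;
        private int counter;

        Chain(boolean waitForPreviousReady) {
            this.waitForPreviousReady = waitForPreviousReady;
        }

        Tx newReadWriteTransaction() throws Exception {
            if (previous != null) {
                if (waitForPreviousReady) {
                    // Sketch of a fix: block until the previous Tx has been readied.
                    previous.ready.get(5, TimeUnit.SECONDS);
                } else if (!previous.ready.isDone()) {
                    // Mirrors the failure reported in the log above.
                    throw new IllegalStateException(
                        "Previous transaction " + previous.id + " is not ready yet");
                }
            }
            previous = new Tx("tx" + (++counter));
            return previous;
        }
    }

    /** Simulates submit(): the ready operation completes later on another thread. */
    static void submitAsync(Tx tx, ExecutorService executor, long delayMillis) {
        executor.execute(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            tx.ready.complete(null);
        });
    }

    /** Creates tx1, submits it, then immediately asks the chain for tx2. */
    static String run(boolean waitForPreviousReady) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Chain chain = new Chain(waitForPreviousReady);
            Tx tx1 = chain.newReadWriteTransaction();
            submitAsync(tx1, executor, 100);
            try {
                Tx tx2 = chain.newReadWriteTransaction();
                return "created " + tx2.id;
            } catch (IllegalStateException e) {
                return "failed: " + e.getMessage();
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(false)); // chain does not wait for the previous ready
        System.out.println(run(true));  // chain waits for the previous ready
    }
}
```

With `run(false)` the second `newReadWriteTransaction()` call happens while tx1's ready is still pending, so it fails with the "is not ready yet" message. With `run(true)` the chain blocks briefly and tx2 is created. A real implementation would more likely chain on the ready Future than block the caller's thread; blocking is used here only to keep the sketch short.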
| Comment by Tom Pantelis [ 06/Nov/14 ] |
|
https://git.opendaylight.org/gerrit/#/c/12535/ for master
https://git.opendaylight.org/gerrit/#/c/12537/ for Helium. |
| Comment by Tom Pantelis [ 08/Nov/14 ] |
|
Submitted follow-up patch https://git.opendaylight.org/gerrit/#/c/12582/ to master. Need to cherry pick to stable/helium so keeping this bug open for now. |
| Comment by Tom Pantelis [ 15/Nov/14 ] |
|
Merged follow-up patch https://git.opendaylight.org/gerrit/#/c/12878/ to helium |