[CONTROLLER-625] Potential issues with OptimisticLockFailedException Created: 14/Jul/14  Updated: 24/Jul/14  Due: 21/Jul/14  Resolved: 24/Jul/14

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Tom Pantelis Assignee: Tony Tkacik
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 1363

 Description   

An OptimisticLockFailedException can be thrown when a transaction's changes conflict with another concurrent transaction, or when putting a node whose parents don't exist yet. In the former case, the client can retry the failed transaction with a high probability of success. In the latter case, the same retry will not succeed.
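To make the difference between the two cases concrete, here is a toy model. It is only a sketch under stated assumptions: `ToyOptimisticStore`, its version counter, and the `fails` helper are invented for illustration, the nested exception is a local stand-in for the MD-SAL class of the same name, and none of this reflects how InMemoryDOMDataStore is actually implemented.

```java
import java.util.HashSet;
import java.util.Set;

final class ToyOptimisticStore {
    /** Local stand-in for the MD-SAL exception of the same name. */
    static class OptimisticLockFailedException extends Exception {
        OptimisticLockFailedException(String message) {
            super(message);
        }
    }

    private final Set<String> paths = new HashSet<>();
    private int version = 0;

    ToyOptimisticStore() {
        paths.add("/");
    }

    /** A "transaction" here is just the store version it was created against. */
    int begin() {
        return version;
    }

    /** Commits a put of path if nothing was committed since begin() and the parent exists. */
    void commitPut(int startVersion, String path) throws OptimisticLockFailedException {
        if (startVersion != version) {
            throw new OptimisticLockFailedException("conflict with a concurrent transaction");
        }
        int slash = path.lastIndexOf('/');
        String parent = slash <= 0 ? "/" : path.substring(0, slash);
        if (!paths.contains(parent)) {
            throw new OptimisticLockFailedException("parent does not exist: " + parent);
        }
        paths.add(path);
        version++;
    }

    /** Returns true if the commit attempt throws. */
    static boolean fails(ToyOptimisticStore store, int startVersion, String path) {
        try {
            store.commitPut(startVersion, path);
            return false;
        } catch (OptimisticLockFailedException e) {
            return true;
        }
    }

    public static void main(String[] args) throws OptimisticLockFailedException {
        ToyOptimisticStore store = new ToyOptimisticStore();
        int txA = store.begin();
        int txB = store.begin();
        store.commitPut(txA, "/a");

        System.out.println(fails(store, txB, "/b"));             // true: conflicts with txA
        System.out.println(fails(store, store.begin(), "/b"));   // false: a fresh transaction succeeds
        System.out.println(fails(store, store.begin(), "/x/y")); // true: parent /x is missing
        System.out.println(fails(store, store.begin(), "/x/y")); // true: the retry fails the same way
    }
}
```

In this model both failures surface as the same exception type, so a caller cannot tell the retriable conflict from the missing-parent case.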

We should ensure, as much as possible, that an OptimisticLockFailedException is thrown only for a transient condition, where a retry can be attempted and has a possibility of succeeding. That doesn't seem to be the case today, as evidenced by this trace from the controller-dev mailing list on Jul 8th (titled "Lots of error trying to run ODL at the moment..."):

2014-07-08 14:46:36.990 PDT [pool-8-thread-1] WARN o.o.c.m.s.d.b.i.DOMDataCommitCoordinatorImpl - Tx: DOM-13 Error during phase CAN_COMMIT, starting Abort
org.opendaylight.controller.md.sal.common.api.data.OptimisticLockFailedException: Optimistic lock failed.
    at org.opendaylight.controller.md.sal.dom.store.impl.InMemoryDOMDataStore$ThreePhaseCommitImpl$1.call(InMemoryDOMDataStore.java:321) ~[bundlefile:na]
    at org.opendaylight.controller.md.sal.dom.store.impl.InMemoryDOMDataStore$ThreePhaseCommitImpl$1.call(InMemoryDOMDataStore.java:311) ~[bundlefile:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293) ~[bundlefile:na]
    at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61) ~[bundlefile:na]
    at org.opendaylight.controller.md.sal.dom.store.impl.InMemoryDOMDataStore$ThreePhaseCommitImpl.canCommit(InMemoryDOMDataStore.java:311) ~[bundlefile:na]
    at org.opendaylight.controller.md.sal.dom.broker.impl.DOMDataCommitCoordinatorImpl$CommitCoordinationTask.canCommitAll(DOMDataCommitCoordinatorImpl.java:345) [bundlefile:na]
    at org.opendaylight.controller.md.sal.dom.broker.impl.DOMDataCommitCoordinatorImpl$CommitCoordinationTask.canCommitBlocking(DOMDataCommitCoordinatorImpl.java:187) [bundlefile:na]
    at org.opendaylight.controller.md.sal.dom.broker.impl.DOMDataCommitCoordinatorImpl$CommitCoordinationTask.call(DOMDataCommitCoordinatorImpl.java:164) [bundlefile:na]
    at org.opendaylight.controller.md.sal.dom.broker.impl.DOMDataCommitCoordinatorImpl$CommitCoordinationTask.call(DOMDataCommitCoordinatorImpl.java:144) [bundlefile:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_55]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_55]
    at java.lang.Thread.run(Thread.java:744) [na:1.7.0_55]
Caused by: org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was deleted by other transaction.
    at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkConflicting(SchemaAwareApplyOperation.java:80) ~[na:na]
    at org.opendaylight.yangtools.yang.data.impl.schema.tree.NormalizedNodeContainerModificationStrategy.checkSubtreeModificationApplicable(NormalizedNodeContainerModificationStrategy.java:167) ~[na:na]
    at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) ~[na:na]

The underlying exception's message, "Node was deleted by other transaction", seems to indicate that further retries wouldn't succeed.

If it's a case where a retry of the same changes definitely would not succeed, another type of TransactionCommitFailedException should be thrown to prevent callers from retrying.
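From the caller's side, the proposed split could look roughly like the sketch below. The nested exception classes are local stand-ins that mirror the names in this report so the snippet compiles on its own. `NonRetriableCommitException` is a hypothetical name for "another type of TransactionCommitFailedException", and `RetryDecision` is an illustrative helper, not an MD-SAL API.

```java
final class RetryDecision {
    /** Local stand-in for the MD-SAL base commit failure. */
    static class TransactionCommitFailedException extends Exception {
        TransactionCommitFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in: transient conflict with a concurrent transaction, so a retry may succeed. */
    static class OptimisticLockFailedException extends TransactionCommitFailedException {
        OptimisticLockFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical type for failures where resubmitting the same changes cannot succeed. */
    static class NonRetriableCommitException extends TransactionCommitFailedException {
        NonRetriableCommitException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    private RetryDecision() {
    }

    /** Returns true only for the failure type that signals a transient condition. */
    static boolean shouldRetry(Throwable failure) {
        return failure instanceof OptimisticLockFailedException;
    }

    public static void main(String[] args) {
        Throwable conflict = new OptimisticLockFailedException("Optimistic lock failed.", null);
        Throwable permanent = new NonRetriableCommitException("Parent node does not exist.", null);

        System.out.println(shouldRetry(conflict));  // true
        System.out.println(shouldRetry(permanent)); // false
    }
}
```

With distinct types, the retry decision is a single type check and callers never need to inspect the cause chain or message text.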

Also, the javadocs for commit state that for an OptimisticLockFailedException, "it is the responsibility of the caller to create a new transaction and submit the same modification again in order to update data tree." It is likely that clients will simply call into the same method recursively to retry. We should add a warning to the docs about limiting the number of retries, so that a client doesn't loop forever in a case where a retry will never succeed. Clients should probably stop after 2 tries. Adding an example would also help, e.g.:

private void doWrite( final int tries ) {
    // A new transaction is created for every attempt.
    WriteTransaction writeTx = dataBroker.newWriteTransaction();

    DataObject data = ...
    writeTx.put(LogicalDatastoreType.OPERATIONAL, PATH, data);

    Futures.addCallback(writeTx.commit(),
        new FutureCallback<RpcResult<TransactionStatus>>() {
            @Override
            public void onSuccess(RpcResult<TransactionStatus> result) {
                // committed
            }

            @Override
            public void onFailure(Throwable t) {
                if(t instanceof OptimisticLockFailedException) {
                    // 'tries' is final, so pass the remaining count instead of decrementing it.
                    if( tries - 1 > 0 ) {
                        // do retry
                        doWrite( tries - 1 );
                    } else {
                        // out of retries
                    }
                }
            }
        });
}

...
doWrite( 2 );
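The same bounded-retry idea as the example above can also be exercised without MD-SAL on the classpath. The sketch below is an assumption-laden stand-in: it uses `java.util.concurrent.CompletableFuture` in place of the Guava future returned by commit, a `Supplier` that represents "create a new transaction and submit it", and a local `OptimisticLockFailedException` class that only mirrors the real name. `BoundedRetry` and `commitWithRetry` are hypothetical names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class BoundedRetry {
    /** Local stand-in for the MD-SAL exception of the same name. */
    static class OptimisticLockFailedException extends Exception {
        OptimisticLockFailedException(String message) {
            super(message);
        }
    }

    private BoundedRetry() {
    }

    /** Runs attempt at most maxTries times; each attempt.get() should submit a fresh transaction. */
    static CompletableFuture<Void> commitWithRetry(Supplier<CompletableFuture<Void>> attempt, int maxTries) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        tryOnce(attempt, maxTries, result);
        return result;
    }

    private static void tryOnce(Supplier<CompletableFuture<Void>> attempt, int triesLeft,
            CompletableFuture<Void> result) {
        attempt.get().whenComplete((ignored, failure) -> {
            // Dependent stages wrap failures in CompletionException, so unwrap if needed.
            Throwable cause = failure instanceof CompletionException && failure.getCause() != null
                    ? failure.getCause() : failure;
            if (cause == null) {
                result.complete(null);
            } else if (cause instanceof OptimisticLockFailedException && triesLeft > 1) {
                tryOnce(attempt, triesLeft - 1, result);
            } else {
                // Out of retries, or a failure that a retry would not fix.
                result.completeExceptionally(cause);
            }
        });
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated commit: conflicts on the first call, succeeds on the second.
        Supplier<CompletableFuture<Void>> flaky = () -> {
            CompletableFuture<Void> future = new CompletableFuture<>();
            if (calls.incrementAndGet() == 1) {
                future.completeExceptionally(new OptimisticLockFailedException("Optimistic lock failed."));
            } else {
                future.complete(null);
            }
            return future;
        };

        commitWithRetry(flaky, 2).join();
        System.out.println("attempts: " + calls.get()); // attempts: 2
    }
}
```

Because the remaining count is passed down rather than looped on, an attempt that always conflicts stops after maxTries submissions and the last failure is handed back to the caller.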



 Comments   
Comment by Tom Pantelis [ 15/Jul/14 ]

I added the javadoc changes and example with CONTROLLER-624.

Comment by Tony Tkacik [ 16/Jul/14 ]

remote: https://git.opendaylight.org/gerrit/9058
