[NETVIRT-971] ElanUtils...Error writing to datastore...OptimisticLockFailedException...InterfaceRemoveWorkerOnElanInterface Created: 30/Oct/17  Updated: 05/Apr/18  Resolved: 05/Apr/18

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Sam Hague Assignee: Swati Niture
Resolution: Done Votes: 0
Labels: csit:3node, csit:exception
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to NETVIRT-942 Conflicting modification for path...e... Resolved
Epic Link: Clustering Stability

 Description   
{{2017-10-30 03:00:49,072 | ERROR | nPool-1-worker-0 | ElanUtils                        | 341 - org.opendaylight.netvirt.elanmanager-impl - 0.4.3.SNAPSHOT | Error writing to datastore {}
java.util.concurrent.ExecutionException: OptimisticLockFailedException{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was deleted by other transaction.]]}
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:64)[42:org.opendaylight.yangtools.util:1.1.3.SNAPSHOT]
	at org.opendaylight.yangtools.util.concurrent.MappingCheckedFuture.get(MappingCheckedFuture.java:77)[42:org.opendaylight.yangtools.util:1.1.3.SNAPSHOT]
	at org.opendaylight.netvirt.elan.utils.ElanUtils.waitForTransactionToComplete(ElanUtils.java:1446)[341:org.opendaylight.netvirt.elanmanager-impl:0.4.3.SNAPSHOT]
	at org.opendaylight.netvirt.elan.internal.ElanInterfaceManager.removeEntriesForElanInterface(ElanInterfaceManager.java:441)[341:org.opendaylight.netvirt.elanmanager-impl:0.4.3.SNAPSHOT]
	at org.opendaylight.netvirt.elan.internal.InterfaceRemoveWorkerOnElanInterface.call(InterfaceRemoveWorkerOnElanInterface.java:55)[341:org.opendaylight.netvirt.elanmanager-impl:0.4.3.SNAPSHOT]
	at org.opendaylight.netvirt.elan.internal.InterfaceRemoveWorkerOnElanInterface.call(InterfaceRemoveWorkerOnElanInterface.java:21)[341:org.opendaylight.netvirt.elanmanager-impl:0.4.3.SNAPSHOT]
	at org.opendaylight.genius.datastoreutils.DataStoreJobCoordinator$MainTask.run(DataStoreJobCoordinator.java:285)[292:org.opendaylight.genius.mdsalutil-api:0.2.3.SNAPSHOT]}}


 Comments   
Comment by Swati Niture [ 14/Nov/17 ]

https://git.opendaylight.org/gerrit/#/c/64753/6

Comment by Sam Hague [ 15/Nov/17 ]

The logs below are the only reason I thought 971 was related to 942/945. The first WARN is similar to 942/945: it concerns "elan-forwarding-tables/mac-table/mac-table". The second WARN is the exception for 971. This case is a little different in that the 942/945 exception does not appear at this point; 942/945 does appear in the log, but many minutes earlier and without 971. So maybe they are not really related.

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-ocata-upstream-stateful-oxygen/369/odl_1/odl1_karaf.log.gz

2017-11-15 08:30:50,503 | WARN  | rd-dispatcher-44 | ShardDataTree                    | 217 - org.opendaylight.controller.sal-distributed-datastore - 1.7.0.SNAPSHOT | member-1-shard-default-operational: Store Tx member-1-datastore-operational-fe-0-txn-69033-0: Conflicting modification for path /(urn:opendaylight:netvirt:elan?revision=2015-06-02)elan-forwarding-tables/mac-table/mac-table[{(urn:opendaylight:netvirt:elan?revision=2015-06-02)elan-instance-name=267f6df6-9cf8-4696-aa36-53ea595c171d}].
2017-11-15 08:30:50,503 | WARN  | ult-dispatcher-2 | ConcurrentDOMDataBroker          | 217 - org.opendaylight.controller.sal-distributed-datastore - 1.7.0.SNAPSHOT | Tx: DOM-194871 Error during phase CAN_COMMIT, starting Abort
OptimisticLockFailedException{message=Optimistic lock failed., errorList=[RpcError [message=Optimistic lock failed., severity=ERROR, errorType=APPLICATION, tag=resource-denied, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was deleted by other transaction.]]}
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:731)[217:org.opendaylight.controller.sal-distributed-datastore:1.7.0.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPending(ShardDataTree.java:769)[217:org.opendaylight.controller.sal-distributed-datastore:1.7.0.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPendingTransaction(ShardDataTree.java:716)[217:org.opendaylight.controller.sal-distributed-
Comment by Sam Hague [ 21/Nov/17 ]

Still in carbon: https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/196/odl_1/odl1_karaf.log.gz

Comment by Michael Vorburger [ 22/Nov/17 ]

https://git.opendaylight.org/gerrit/#/c/65839/ is a counter-proposal to https://git.opendaylight.org/gerrit/#/c/64753/. It should fix this OptimisticLockFailedException: removeEntriesForElanInterface() is changed to take WriteTransaction interfaceTx and WriteTransaction flowTx parameters, and its caller, at the end of removeElanInterface(), now runs it inside a RetryingManagedNewTransactionRunner's callWithNewWriteOnlyTransactionAndSubmit(), which performs a few retries in case of an OptimisticLockFailedException.
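
For readers unfamiliar with the retry idea described above, here is a minimal, self-contained sketch of it in plain Java. It does not use the OpenDaylight API: RetrySketch, TxBody, OptimisticLockFailure and callWithRetries are hypothetical names invented for illustration, and the real RetryingManagedNewTransactionRunner may well behave differently (number of retries, which exceptions it retries, how it reports failure).

```java
/** Illustration only: every name in this class is hypothetical, not an ODL class. */
final class RetrySketch {
    private RetrySketch() {}

    /** Stand-in for an optimistic-lock conflict reported at commit time. */
    static final class OptimisticLockFailure extends Exception {
        OptimisticLockFailure(String message) {
            super(message);
        }
    }

    /** Work that is executed against a fresh transaction on every attempt. */
    @FunctionalInterface
    interface TxBody<T> {
        T run() throws OptimisticLockFailure;
    }

    /**
     * Runs txBody up to maxAttempts times. Only an optimistic-lock conflict
     * triggers another attempt; once the attempts are used up, the last
     * conflict is rethrown wrapped in an IllegalStateException.
     */
    static <T> T callWithRetries(int maxAttempts, TxBody<T> txBody) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        OptimisticLockFailure last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return txBody.run();
            } catch (OptimisticLockFailure e) {
                // Another writer changed the same data first; try again.
                last = e;
            }
        }
        throw new IllegalStateException(
                "Gave up after " + maxAttempts + " attempts", last);
    }
}
```

A caller would wrap the whole unit of work, for example RetrySketch.callWithRetries(3, () -> removeEntries()), where removeEntries() is again a hypothetical method. The point of the sketch is that the body is re-executed from scratch on each attempt, so it has to be safe to run more than once.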

Comment by Swati Niture [ 24/Nov/17 ]

@Michael, thanks.
The patch I raised, https://git.opendaylight.org/gerrit/#/c/64753/, was supposed to fix:
[1] Conflicting modification for path :elan-forwarding-tables/mac-table/mac-table
(https://jira.opendaylight.org/browse/NETVIRT-942) and
[2] Error writing to datastore
(https://jira.opendaylight.org/browse/NETVIRT-971)
However, my approach uses DJC for the remove method, which will conflict with your changes. I'll wait until your changes are merged; if exception [1] still appears in CSIT, I'll rework my patch on top of your changes.

Comment by Sam Hague [ 09/Jan/18 ]

oxygen failure: https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-oxygen/599/odl_1/odl1_karaf.log.gz

Comment by Sam Hague [ 05/Apr/18 ]

https://git.opendaylight.org/gerrit/70252
