[NETVIRT-1466] OptimisticLockFailedException in VpnFootprintService$DpnEnterExitVpnWorker Created: 17/Oct/18  Updated: 27/Nov/19

Status: Open
Project: netvirt
Component/s: vpnmanager
Affects Version/s: None
Fix Version/s: Fluorine-SR1, Neon, Oxygen-SR4

Type: Bug Priority: High
Reporter: Jamo Luhrsen Assignee: Surendar Raju
Resolution: Unresolved Votes: 0
Labels: csit:exception
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to NETVIRT-1153 Conflicting modification for path ..... Resolved

 Description   

CSIT jobs are having sporadic failures due to this exception showing up.

2018-09-17T10:18:35,990 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-31 | LocalThreePhaseCommitCohort      | 235 - org.opendaylight.controller.sal-distributed-datastore - 1.7.4.SNAPSHOT | Failed to prepare transaction member-1-datastore-config-fe-0-txn-60945-0 on backend
org.opendaylight.controller.md.sal.common.api.data.OptimisticLockFailedException: Optimistic lock failed for path /(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[{(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:264196233440272}]/AugmentationIdentifier{childNames=[(urn:opendaylight:flow:inventory?revision=2013-08-19)description, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-actions, (urn:opendaylight:flow:inventory?revision=2013-08-19)hardware, (urn:opendaylight:flow:inventory?revision=2013-08-19)switch-features, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-meter, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-instructions, (urn:opendaylight:flow:inventory?revision=2013-08-19)meter, (urn:opendaylight:flow:inventory?revision=2013-08-19)serial-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-group, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-match-types, (urn:opendaylight:flow:inventory?revision=2013-08-19)port-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)table, (urn:opendaylight:flow:inventory?revision=2013-08-19)group, (urn:opendaylight:flow:inventory?revision=2013-08-19)manufacturer, (urn:opendaylight:flow:inventory?revision=2013-08-19)table-features, (urn:opendaylight:flow:inventory?revision=2013-08-19)software, (urn:opendaylight:flow:inventory?revision=2013-08-19)ip-address]}/(urn:opendaylight:flow:inventory?revision=2013-08-19)table/table[{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=20}]/flow/flow[{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=L3.264196233440272.20.100230.10}]
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:749) ~[235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPending(ShardDataTree.java:787) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPendingTransaction(ShardDataTree.java:734) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(ShardDataTree.java:888) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:99) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.CohortEntry.canCommit(CohortEntry.java:99) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleCanCommit(ShardCommitCoordinator.java:239) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleReadyLocalTransaction(ShardCommitCoordinator.java:201) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.Shard.handleReadyLocalTransaction(Shard.java:739) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.datastore.Shard.handleNonRaftCommand(Shard.java:336) [235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.raft.RaftActor.handleCommand(RaftActor.java:270) [220:org.opendaylight.controller.sal-akka-raft:1.7.4.SNAPSHOT]
	at org.opendaylight.controller.cluster.common.actor.AbstractUntypedPersistentActor.onReceiveCommand(AbstractUntypedPersistentActor.java:44) [228:org.opendaylight.controller.sal-clustering-commons:1.7.4.SNAPSHOT]
	at akka.persistence.UntypedPersistentActor.onReceive(PersistentActor.scala:275) [45:com.typesafe.akka.persistence:2.5.11]
	at org.opendaylight.controller.cluster.common.actor.MeteringBehavior.apply(MeteringBehavior.java:104) [228:org.opendaylight.controller.sal-clustering-commons:1.7.4.SNAPSHOT]
	at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:608) [42:com.typesafe.akka.actor:2.5.11]
	at akka.actor.Actor.aroundReceive(Actor.scala:517) [42:com.typesafe.akka.actor:2.5.11]
	at akka.actor.Actor.aroundReceive$(Actor.scala:515) [42:com.typesafe.akka.actor:2.5.11]
	at akka.persistence.UntypedPersistentActor.akka$persistence$Eventsourced$$super$aroundReceive(PersistentActor.scala:273) [45:com.typesafe.akka.persistence:2.5.11]
	at akka.persistence.Eventsourced$$anon$1.stateReceive(Eventsourced.scala:691) [45:com.typesafe.akka.persistence:2.5.11]
	at akka.persistence.Eventsourced.aroundReceive(Eventsourced.scala:192) [45:com.typesafe.akka.persistence:2.5.11]
	at akka.persistence.Eventsourced.aroundReceive$(Eventsourced.scala:191) [45:com.typesafe.akka.persistence:2.5.11]
	at akka.persistence.UntypedPersistentActor.aroundReceive(PersistentActor.scala:273) [45:com.typesafe.akka.persistence:2.5.11]
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:590) [42:com.typesafe.akka.actor:2.5.11]
	at akka.actor.ActorCell.invoke(ActorCell.scala:559) [42:com.typesafe.akka.actor:2.5.11]
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) [42:com.typesafe.akka.actor:2.5.11]
	at akka.dispatch.Mailbox.run(Mailbox.scala:224) [42:com.typesafe.akka.actor:2.5.11]
	at akka.dispatch.Mailbox.exec(Mailbox.scala:234) [42:com.typesafe.akka.actor:2.5.11]
	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [42:com.typesafe.akka.actor:2.5.11]
	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [42:com.typesafe.akka.actor:2.5.11]
	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [42:com.typesafe.akka.actor:2.5.11]
	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [42:com.typesafe.akka.actor:2.5.11]
Caused by: org.opendaylight.yangtools.yang.data.api.schema.tree.ConflictingModificationAppliedException: Node was created by other transaction.
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkConflicting(SchemaAwareApplyOperation.java:80) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkWriteApplicable(SchemaAwareApplyOperation.java:182) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:134) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:315) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:137) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:315) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:137) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:315) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:137) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:315) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:137) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:315) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:137) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:315) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:137) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkMergeApplicable(AbstractNodeContainerModificationStrategy.java:315) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:137) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.StructuralContainerModificationStrategy.checkApplicable(StructuralContainerModificationStrategy.java:100) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:304) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkTouchApplicable(AbstractNodeContainerModificationStrategy.java:286) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:131) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.RootModificationApplyOperation.checkApplicable(RootModificationApplyOperation.java:71) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractDataTreeTip.validate(AbstractDataTreeTip.java:38) ~[428:org.opendaylight.yangtools.yang-data-impl:2.0.6.2]
	at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:741) ~[235:org.opendaylight.controller.sal-distributed-datastore:1.7.4.SNAPSHOT]
	... 30 more

The above example happened in the netvirt/vnibasedl3forwarding/vnibasedl3forwarding.robot suite.

This job

but, it was also seen here in a gate job in a different suite.



 Comments   
Comment by Sam Hague [ 30/Oct/18 ]

Still hitting this in multiple branches:

https://jenkins.opendaylight.org/releng/user/shague/my-views/view/netvirt-fluorine-queens/job/netvirt-csit-1node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-fluorine/644/robot/

 

https://jenkins.opendaylight.org/releng/me/my-views/view/netvirt-oxygen-queens/job/netvirt-csit-1node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-oxygen/643/robot/

Comment by Kiran Upadhyaya [ 02/Nov/18 ]

This bug is very much related to and has the same cause as NETVIRT-972, and other related bugs for which we have whitelisted the OptimisticLockFaliedException for table 21 in excepts.py :

{"issue": "https://jira.opendaylight.org/browse/NETVIRT-972",
"id": "ConflictingModificationAppliedException",
"context": [
"Node was created by other transaction",
"OptimisticLockFailedException: Optimistic lock failed."
"Conflicting modification for path /(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node" +
"[

{(urn:opendaylight:inventory?revision=2013-08-19)id=", "table/table[\{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=21}

]/flow/flow" +
"[

{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=L3.", ".21.", ".42." ]}

,

 

The part of code that throws the above exception also adds the subnet-routing flows as well when a VPN establishes its footprint on a new DPN. For subnet-routing, table-20 and 36 flows are also added, in addition to table-21.

I guess we missed adding the ".20." and ".36." keywords as well in the above whitelisted exception.

Is this acceptable until we find a way to clean up this exception from excepts.py?

Comment by Sam Hague [ 02/Nov/18 ]

Are we sure that NETVIRT-972 is the same problem? The issue with the whitelist on the flow exceptions is that the flows are for all different tables and the signature of the flowid is different for each. This leads me to think these are just different enough that they are not the same.

I think also in this case the vpnFootprintService was something new coming out when this new jira was filed, so here it also seemed like a new exception which is why we didn't want to add it to the whitelist.

Comment by Kiran Upadhyaya [ 08/Nov/18 ]

Sam, your observation is right that this exception coming for table 21, 36 is a new one, and I am yet to identify this new issue in timing that has cropped up But yes, the root cause for this issue is the same as NETVIRT-972. The same pair of DJCs is responsible for both the bugs.

The DJCs actually create/remove flows in table 21, 20, 36, and the 21 and 36 are purely for subnet-route IPs and IPs imported through the bgpvpn RT values. In our case, the exceptions are being hit for the subnet-route FIBs.

Generated at Wed Feb 07 20:24:08 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.