[YANGTOOLS-559] Delete transactions fail in MinMaxElementsValidation under load Created: 26/Nov/15  Updated: 10/Apr/22  Resolved: 09/Feb/16

Status: Resolved
Project: yangtools
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Stephen Kitt Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Duplicate
is duplicated by OVSDB-236 southbound fails to delete OvsdbNode ... Resolved
External issue ID: 4684
Priority: High

 Description   

In OVSDB we're seeing transaction failures from MinMaxElementsValidation under load; the log is (with the extra info from 30247 and 30249):

2015-11-26 15:40:12,794 | ERROR | lt-dispatcher-24 | LocalThreePhaseCommitCohort | 178 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Failed to prepare transaction member-1-chn-1-txn-38 on backend
com.google.common.base.VerifyException: Total child count is -1 (from NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node, modificationType=TOUCH, childModification={(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640/bridge/br-int}

]=NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640/bridge/br-int}

], modificationType=DELETE, childModification={}], (urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

]=NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

], modificationType=DELETE, childModification={}]}] and Optional.of(MaterializedContainerNode{subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@7cb06828, children={(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640/bridge/br-int}

]=LazyContainerNode{subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@398378c4, children={AugmentationIdentifier

{childNames=[(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)controller-entry, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-external-ids, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-other-configs, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)datapath-id, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)fail-mode, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)protocol-entry, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)flow-node, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-uuid, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-name, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-openflow-node-ref, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)managed-by, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)datapath-type]}

=LazyContainerNode{subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@398378c4, children={(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-external-ids=MaterializedContainerNode{subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@398378c4, children={(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-external-ids[

{(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-external-id-key=opendaylight-iid}

]=MaterializedContainerNode{subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@398378c4, children={(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-external-id-value=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@6e9ea50b, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-external-id-key=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@74c4f53a}}}}, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-other-configs=MaterializedContainerNode{subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@398378c4, children={(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-other-configs[

{(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-other-config-key=disable-in-band}

]=MaterializedContainerNode{subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@398378c4, children={(urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-other-config-value=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@bca32a2, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-other-config-key=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@29ad2e47}}}}, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-name=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@32806084, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)fail-mode=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@613e3ecb, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)bridge-uuid=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@3b3b4f6c, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)managed-by=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@7ba4815a, (urn:opendaylight:params:xml:ns:yang:ovsdb?revision=2015-01-05)datapath-type=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@4a914aea}}, (urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.ValueNode@5ac9fd15}}}}))
at com.google.common.base.Verify.verify(Verify.java:123)[79:com.google.guava:18.0.0]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkMinMaxElements(MinMaxElementsValidation.java:86)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkTouchApplicable(MinMaxElementsValidation.java:146)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]

Regards,

Stephen



 Comments   
Comment by Filip Gregor [ 27/Nov/15 ]

Can you please update and rebuild yangtools, and retry again. thank you

Comment by Stephen Kitt [ 27/Nov/15 ]

It's still failing:

2015-11-27 16:35:40,215 | WARN | lt-dispatcher-15 | SimpleShardDataTreeCohort | 178 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Store Tx member-1-chn-1-txn-238: Data validation failed for path /(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=ovsdb:1}

]/node.
org.opendaylight.yangtools.yang.data.api.schema.tree.DataValidationFailedException: (urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node does not have enough elements (-1), needs at least 0
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkMinMaxElements(MinMaxElementsValidation.java:81)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]

If you want to be able to reproduce, clone ovsdb, checkout 1ae9f08, then

mvn clean install -DskipTests -DskipIT
sudo docker-compose up -d
cd southbound/southbound-it
mvn verify -Pintegrationtest -Dovsdbserver.ipaddress=127.0.0.1

(the README file has the instructions for setting up the docker-compose image).

Comment by Stephen Kitt [ 27/Nov/15 ]

The next logged ERROR (rather than just WARN) is

2015-11-27 16:35:40,216 | ERROR | lt-dispatcher-17 | LocalThreePhaseCommitCohort | 178 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Failed to prepare transaction member-1-chn-1-txn-238 on backend
TransactionCommitFailedException

{message=Data did not pass validation., errorList=[RpcError [message=Data did not pass validation., severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.yangtools.yang.data.api.schema.tree.DataValidationFailedException: (urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node does not have enough elements (-1), needs at least 0]]}

at org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:62)[178:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$CohortEntry.canCommit(ShardCommitCoordinator.java:601)[178:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]

Comment by Stephen Kitt [ 01/Dec/15 ]

Is https://git.opendaylight.org/gerrit/30389 supposed to fix this on its own, or does it need to be combined with https://git.opendaylight.org/gerrit/30266 and https://git.opendaylight.org/gerrit/30265 to work?

I tried a rebuild of 30389, and I still get the same issue in ovsdb.

Comment by Stephen Kitt [ 08/Dec/15 ]

As requested, I just tried with current yangtools (as rebuilt last night); I'm still seeing failures:

2015-12-08 09:57:25,387 | WARN | ult-dispatcher-4 | SimpleShardDataTreeCohort | 178 - org.opendaylight.controller.sal-distr
ibuted-datastore - 1.3.0.SNAPSHOT | Unexpected failure in validation phase
com.google.common.base.VerifyException: Total child count is -1 (from NodeModification [identifier=(urn:TBD:params:xml:ns:yang:netw
ork-topology?revision=2013-10-21)node, modificationType=TOUCH, childModification={(urn:TBD:params:xml:ns:yang:network-topology?revi
sion=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

]=NodeModific
ation [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topolo gy?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

], modificationType=DELETE, childModification={}]}] and Optional.of(Materiali
zedContainerNode{version=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@7f39a732, subtreeVersion=org.opendaylight
.yangtools.yang.data.api.schema.tree.spi.Version@4eab30f1, children={}}))
at com.google.common.base.Verify.verify(Verify.java:123)[79:com.google.guava:18.0.0]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkMinMaxElements(MinMaxElementsValidat
ion.java:86)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkTouchApplicable(MinMaxElementsValida
tion.java:146)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperatio
n.java:125)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]

followed by

2015-12-08 09:57:25,389 | ERROR | lt-dispatcher-27 | LocalThreePhaseCommitCohort | 178 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Failed to prepare transaction member-1-txn-362 on backend
com.google.common.base.VerifyException: Total child count is -1 (from NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node, modificationType=TOUCH, childModification={(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

]=NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

], modificationType=DELETE, childModification={}]}] and Optional.of(MaterializedContainerNode{version=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@7f39a732, subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@4eab30f1, children={}}))
at com.google.common.base.Verify.verify(Verify.java:123)[79:com.google.guava:18.0.0]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkMinMaxElements(MinMaxElementsValidation.java:86)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkTouchApplicable(MinMaxElementsValidation.java:146)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkChildPreconditions(AbstractNodeContainerModificationStrategy.java:204)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.AbstractNodeContainerModificationStrategy.checkTouchApplicable(AbstractNodeContainerModificationStrategy.java:188)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]

I'll try a rebuild of the current yangtools git master.

Comment by Stephen Kitt [ 08/Dec/15 ]

The reproducer branch is https://git.opendaylight.org/gerrit/30965

Comment by Stephen Kitt [ 08/Dec/15 ]

Same with current yangtools git master:

2015-12-08 10:17:16,007 | WARN | lt-dispatcher-20 | SimpleShardDataTreeCohort | 178 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Unexpected failure in validation phase
com.google.common.base.VerifyException: Total child count is -1 (from NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node, modificationType=TOUCH, childModification={(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

]=NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

], modificationType=DELETE, childModification={}]}] and Optional.of(MaterializedContainerNode{version=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@7c77d69e, subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@64c8ba95, children={}}))
at com.google.common.base.Verify.verify(Verify.java:123)[79:com.google.guava:18.0.0]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkMinMaxElements(MinMaxElementsValidation.java:86)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkTouchApplicable(MinMaxElementsValidation.java:146)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]

followed by

2015-12-08 10:17:16,014 | ERROR | lt-dispatcher-29 | LocalThreePhaseCommitCohort | 178 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Failed to prepare transaction member-1-chn-1-txn-8 on backend
com.google.common.base.VerifyException: Total child count is -1 (from NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node, modificationType=TOUCH, childModification={(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

]=NodeModification [identifier=(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node[

{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://127.0.0.1:6640}

], modificationType=DELETE, childModification={}]}] and Optional.of(MaterializedContainerNode{version=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@7c77d69e, subtreeVersion=org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@64c8ba95, children={}}))
at com.google.common.base.Verify.verify(Verify.java:123)[79:com.google.guava:18.0.0]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkMinMaxElements(MinMaxElementsValidation.java:86)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.MinMaxElementsValidation.checkTouchApplicable(MinMaxElementsValidation.java:146)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:125)[95:org.opendaylight.yangtools.yang-data-impl:0.8.0.SNAPSHOT]

Comment by Anil Vishnoi [ 15/Jan/16 ]

I am also facing the same issue with openflowplugin project as well.

Comment by Colin Dixon [ 19/Jan/16 ]

During the MD-SAL call today, Robert said that the problem was understood, but the fix was more complex than hoped and it was likely going to target Beryllium-RC1.

Comment by Sam Hague [ 19/Jan/16 ]

(In reply to Colin Dixon from comment #10)
> During the MD-SAL call today, Robert said that the problem was understood,
> but the fix was more complex than hoped and it was likely going to target
> Beryllium-RC1.

Is there a workaround or any way to mitigate the issue? The test in the bug says "under load" but we see the error when there really isn't a load. This causes random failures and our IT tests to fail.

Comment by Robert Varga [ 20/Jan/16 ]

The 'under load' means the problem is more likely to happen because of how transactions are sequenced. It is not a race condition in the code, but rather failure to handle all the cases correctly.

If we do not solve this week we will disable min/max validation for RC1 and reintroduce only after it is fixed.

Comment by Robert Varga [ 25/Jan/16 ]

https://git.opendaylight.org/gerrit/#/c/30266/

Comment by Tony Tkacik [ 26/Jan/16 ]

Patches were merged to Boron and stable/beryllium.

Comment by Stephen Kitt [ 26/Jan/16 ]

Our reproducer no longer trips up with the fix, so this looks good to me. Thanks!

Comment by Stephen Kitt [ 08/Feb/16 ]

Unfortunately we appear to be seeing this again in CSIT. https://jenkins.opendaylight.org/releng/job/ovsdb-csit-1node-southbound-only-beryllium/868/console fails in a setup designed to reproduce this (or rather, duplicate OVSDB-236).

Comment by Stephen Kitt [ 09/Feb/16 ]

The CSIT karaf.log doesn't show the relevant exception, so the failure is likely due to something else. (Thanks to Robert for investigating!)

Generated at Wed Feb 07 20:53:36 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.