[CONTROLLER-1001] Datastore write that automatically creates children broken Created: 06/Nov/14  Updated: 27/Nov/14  Resolved: 27/Nov/14

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Reinaldo Penno
Assignee: Tony Tkacik
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 2328
Priority: High

 Description   

A change in the last 24-48 hrs broke a datastore write that asks for intermediate children to be created automatically. This has broken SFC and probably other projects.

The following code has always worked (notice the “true”)

writeTx.merge(LogicalDatastoreType.CONFIGURATION,
sftentryIID, sftServiceFunctionName, true);

And now it barfs with the following.

Caused by: org.opendaylight.yangtools.yang.data.api.schema.tree.ModifiedNodeDoesNotExistException: Node /(urn:cisco:params:xml:ns:yang:sfc-sft?revision=2014-07-01)service-function-types does not exist. Cannot apply modification to its children.
at org.opendaylight.yangtools.yang.data.impl.schema.tree.NormalizedNodeContainerModificationStrategy.checkSubtreeModificationApplicable(NormalizedNodeContainerModificationStrategy.java:164)[90:org.opendaylight.yangtools.yang-data-impl:0.7.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:135)[90:org.opendaylight.yangtools.yang-data-impl:0.7.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.NormalizedNodeContainerModificationStrategy.checkChildPreconditions(NormalizedNodeContainerModificationStrategy.java:178)[90:org.opendaylight.yangtools.yang-data-impl:0.7.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.NormalizedNodeContainerModificationStrategy.checkSubtreeModificationApplicable(NormalizedNodeContainerModificationStrategy.java:168)[90:org.opendaylight.yangtools.yang-data-impl:0.7.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.SchemaAwareApplyOperation.checkApplicable(SchemaAwareApplyOperation.java:135)[90:org.opendaylight.yangtools.yang-data-impl:0.7.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.impl.schema.tree.RootModificationApplyOperation.checkApplicable(RootModificationApplyOperation.java:72)[90:org.opendaylight.yangtools.yang-data-impl:0.7.0.SNAPSHOT]



 Comments   
Comment by Tony Tkacik [ 06/Nov/14 ]

Moved to Controller MD-SAL since this functionality is provided by DataBroker, not the underlying DataTree (which only reported the error).

The error you got is possible only if there were two transactions like:

delTx = newReadWriteTransaction() // started from same point (service-function-types exists in transaction)
wrTx = newReadWriteTransaction() // started from same point (service-function-types exists in transaction)

delTx.delete(CONFIGURATION,"service-function-types") // deletes container service-function-types
wrTx.merge(CONFIGURATION,sfentryId,sfName,true) // service-function-types exists in transaction, no need to create it
delTx.commit().get() // delTx is committed, service-function-types is removed
wrTx.commit().get() // wrTx is submitted, but service-function-types no longer exists (the transaction contains no instruction to create it, since it existed when the transaction started; this leads to the error)

So, in short, while you were configuring the SFC data, some other code external to MD-SAL deleted it.

The behaviour probably should be that these nodes are introduced again, but the underlying problem is that two executions in the system modified the same subtree in different ways: a race condition.

No code has changed in the createParents = true implementation since Helium, so if this is a bug in MD-SAL, it has been present since createParents was introduced.
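
The interleaving described above can be reproduced with a small toy model. This is only a sketch under stated assumptions: ToyStore, ToyTx, CreateParentsRace and the "/service-function-types/entry" path are hypothetical names invented for illustration, not the MD-SAL or yangtools API, and the model assumes that createParents decides which parents to create by looking at the transaction's own snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of snapshot-based transactions. Hypothetical; NOT the MD-SAL API. */
public class CreateParentsRace {
    static final String CONTAINER = "/service-function-types";
    static final String ENTRY = CONTAINER + "/entry"; // illustrative child path

    /** Committed state, modelled as the set of paths that currently exist. */
    static final class ToyStore {
        final Set<String> paths = new TreeSet<>();

        ToyTx newTransaction() {
            // Each transaction sees a private copy of the state at its start.
            return new ToyTx(this, new TreeSet<>(paths));
        }
    }

    static final class ToyTx {
        private final ToyStore store;
        private final Set<String> snapshot;
        private final List<String[]> ops = new ArrayList<>(); // {"write" | "delete", path}

        ToyTx(ToyStore store, Set<String> snapshot) {
            this.store = store;
            this.snapshot = snapshot;
        }

        void merge(String path, boolean createParents) {
            if (createParents) {
                // Assumption: parents are created only if missing from the snapshot.
                for (int i = path.indexOf('/', 1); i > 0; i = path.indexOf('/', i + 1)) {
                    String parent = path.substring(0, i);
                    if (snapshot.add(parent)) {
                        ops.add(new String[] {"write", parent});
                    }
                }
            }
            snapshot.add(path);
            ops.add(new String[] {"write", path});
        }

        void delete(String path) {
            snapshot.removeIf(p -> p.equals(path) || p.startsWith(path + "/"));
            ops.add(new String[] {"delete", path});
        }

        /** Replays the recorded operations against the current committed state. */
        void commit() {
            Set<String> next = new TreeSet<>(store.paths);
            for (String[] op : ops) {
                String path = op[1];
                if (op[0].equals("delete")) {
                    next.removeIf(p -> p.equals(path) || p.startsWith(path + "/"));
                    continue;
                }
                int slash = path.lastIndexOf('/');
                if (slash > 0 && !next.contains(path.substring(0, slash))) {
                    throw new IllegalStateException("Node " + path.substring(0, slash)
                            + " does not exist. Cannot apply modification to its children.");
                }
                next.add(path);
            }
            store.paths.clear();
            store.paths.addAll(next);
        }
    }

    static ToyStore storeWithContainer() {
        ToyStore store = new ToyStore();
        ToyTx init = store.newTransaction();
        init.merge(CONTAINER, false);
        init.commit();
        return store;
    }

    /** Both transactions start from the same state; the delete commits first. */
    public static String runRace() {
        ToyStore store = storeWithContainer();
        ToyTx delTx = store.newTransaction();
        ToyTx wrTx = store.newTransaction();
        delTx.delete(CONTAINER);
        wrTx.merge(ENTRY, true); // parent exists in the snapshot: nothing to create
        delTx.commit();
        try {
            wrTx.commit();
            return "committed";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    /** The write transaction starts only after the delete has been committed. */
    public static String runSequential() {
        ToyStore store = storeWithContainer();
        ToyTx delTx = store.newTransaction();
        delTx.delete(CONTAINER);
        delTx.commit();
        ToyTx wrTx = store.newTransaction();
        wrTx.merge(ENTRY, true); // parent missing from the snapshot: creation is recorded
        wrTx.commit();
        return "committed";
    }

    public static void main(String[] args) {
        System.out.println("race:       " + runRace());
        System.out.println("sequential: " + runSequential());
    }
}
```

In this model the racing case fails at commit with a "does not exist" message, while the sequential case commits, because only there does the write transaction record the creation of the missing parent. Whether the real DataBroker should re-create the parent at commit time is the open question raised above; the sketch does not answer it.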

Comment by Reinaldo Penno [ 06/Nov/14 ]

thanks Tony,

I will investigate further. Lowered the priority.

Comment by Tony Tkacik [ 06/Nov/14 ]

Stable/helium patch; once reviewed and merged, it will be cherry-picked to helium:

https://git.opendaylight.org/gerrit/#/c/12544/

Comment by Reinaldo Penno [ 06/Nov/14 ]

I finished the investigation and it is as you hinted.

The Python regression script and an external app perform transactions at the same time, simulating multiple real external apps.

I'm not sure why I'm seeing these messages now since this test has been going on for a while and I never saw them (I'm positive on this). I'm still wondering if there was a change recently.

Regardless, this is very important, since in real-world situations many apps will operate on the datastore at the same time, deleting, adding, etc.

thanks,

Comment by RichardHill [ 10/Nov/14 ]

Regarding "I'm not sure why I'm seeing these messages now since this test has been going on for a while and I never saw them"

Has the timing or order of the test changed? That could uncover a race condition.

Comment by Reinaldo Penno [ 14/Nov/14 ]

No, nothing changed. Some background:

I have an SFC regression script that runs every 24 hrs or so. The script does the same thing every time: it uses Python to commit things through RESTCONF, simulating two different clients at the same time. No changes there.

The SFC project pulls all other controller components and builds an SFC distribution, so if something changes upstream, we are affected (for better or worse).

The script was running clean, and within a period of 24-48 hrs those logs started appearing. Therefore my suspicion is that some commit went in that changed things. Maybe it was something that a developer would not expect to affect the datastore but did. I have no other explanation at this point.
