[MDSAL-456] Inconsistent list handling in Binding datastore operations Created: 16/Nov/18  Updated: 01/May/20  Resolved: 22/Apr/20

Status: Resolved
Project: mdsal
Component/s: None
Affects Version/s: 3.0.8, 4.0.2
Fix Version/s: 6.0.0

Type: Bug Priority: Highest
Reporter: Luis Gomez Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Blocks
is blocked by MDSAL-449 Empty lists/maps have no semantic mea... Resolved
Relates
relates to YANGTOOLS-585 Data tree: automatic structural lifec... Resolved

 Description   

Before neon MRI upgrade, the output of oper inventory when no switch is connected was:

GET /restconf/operational/opendaylight-inventory:nodes
200 OK {}

After neon MRI upgrade, this did not change when the controller starts for the first time, however when switch connects and disconnects, the above changes to:

GET /restconf/operational/opendaylight-inventory:nodes

404 Request could not be completed because the relevant data model content does not exist

 

Because of the above we had to remove some OFP test.



 Comments   
Comment by Luis Gomez [ 23/Nov/18 ]

This actually explains, however it is weird it does not work the first time the controller boots up:
https://wiki.opendaylight.org/view/Neon_platform_upgrade#Datastore_lifecycle

Comment by Luis Gomez [ 18/Mar/19 ]

I would say either we close this because the change impact is minimal or we move the issue to yangtools projects for further investigation.

Comment by Anil Vishnoi [ 02/May/19 ]

rovarga any thoughts on this issue?

Comment by Robert Varga [ 02/May/19 ]

As documented here: https://wiki.opendaylight.org/view/Neon_platform_upgrade#DataTree_removes_empty_lists.2C_choices_and_augmentations the list will disappear, so the outcome is correct.

Comment by Luis Gomez [ 14/May/19 ]

We understand the change in behavior but we do not understand why this change does not apply the first time the controller boots up: if I boot the controller for the first time and the switches are not connected I get 200 OK + empty list, now if I connect a switch and disconnect the same, then I get 404, so it seems to me the API is not consistent or there is a bug during the boot up process.

Comment by Robert Varga [ 14/May/19 ]

Ah, now I understand the inconsistency. I suspect this is due to how the topology is initialized. What are the transactions OFP executes during init?

Comment by Luis Gomez [ 14/May/19 ]

Avishnoi, can you answer Roberts's question?

Comment by Anil Vishnoi [ 15/May/19 ]

rovarga It's not topology, but inventory. During initialization it initialize the operational datastore with empty list

 

final WriteTransaction tx = dataBroker.newWriteOnlyTransaction();

try {
tx.merge(LogicalDatastoreType.OPERATIONAL, InstanceIdentifier.create(Nodes.class), new NodesBuilder()
.setNode(Collections.emptyList())
.build());
tx.commit().get();
} catch (ExecutionException | InterruptedException e) {
LOG.error("Creation of node failed.", e);
throw new IllegalStateException(e);
}

After the controller starts and i fetch GET http://\controller-ip:8181/restconf/operational/opendaylight-inventory:nodes

it shows me 

{}

than i connect the device and it shows me device list in following json for the same above url

{
"nodes": {
   "node": ...... //all nodes details
  }
}

After devices are added to operational datastore it shows "nodes" list but in the previous case (Fresh restart), it was just showing empty list {} ( i thought it should have shown

{
"nodes" :{}
}

Then i disconnect all the switches (it removes individual nodes)

@Override
public ListenableFuture<?> removeDeviceFromOperationalDS(@Nonnull final KeyedInstanceIdentifier<Node, NodeKey> ii) {

final WriteTransaction delWtx = dataBroker.newWriteOnlyTransaction();
delWtx.delete(LogicalDatastoreType.OPERATIONAL, ii);
return delWtx.commit();

}

and when i fetch the operational nodes again it throws error (404)

{
"errors": {
"error": [
{
"error-type": "application",
"error-tag": "data-missing",
"error-message": "Request could not be completed because the relevant data model content does not exist"
}
]
}
}

 

Comment by Robert Varga [ 16/May/19 ]

So if you do not do .setNode(Collection.emptyList()), things should be consistent. The reason why this is happening is that merge() will turned into a write() of nodes, which contains an empty list. We do not examine the contents of written data to eliminate empty lists etc.

Comment by Anil Vishnoi [ 18/May/19 ]

I think that's where the inconsistency in behavior is, so if user is adding element to that empty list and than removing all of it, which results in the same empty list, why does it throws error rather than responding with empty list?

Comment by Robert Varga [ 20/May/19 ]

Since the list becomes empty, it is removed (along with any of its insignificant parents). The initial state of having the empty list present is wrong, hence the simplest solution is https://git.opendaylight.org/gerrit/82167 .

Comment by Anil Vishnoi [ 20/May/19 ]

In my opinion, we have a API inconsistency here. Either the empty list/ container, should not be allowed while writing to the data store or the empty container/list should not be removed when all the elements are removed from it, but we are seeing different behavior for write and delete API's here. To make thing worst, these workarounds are something user will have to keep in mind while they program, where they don't have clear guidline on do's and don'ts. 

Comment by Robert Varga [ 21/May/19 ]

Well yes, but that inconsistency does not lie solely in yangtools, because:

  • yangtools is providing DataTree API and an implementation
  • you are talking to mdsal-binding-api, which is a completely unrelated API
  • there is a stack of components (and since it's Operational store, which is internal, OFP is a privileged application, hence mitigations there are a fair game)

And the end of the day, this boils down to OFP writing and empty List through Binding APIs, which translates do an empty map, which is retained by InMemoryDataTree.

From yangtools perspective, it is an operational write, which we are expecting to be consistent – hence we trust the data to be minimaly-conformant. Unrestricted empty lists are okay from validation perspective, hence there is not much we are doing wrong. Fixing the issue at this level would require an additional check (incurred at every write) and possible NormalizedNode rebuild (if we find an empty list). If at all possible, we want to solve this problem at a different layer.

Note that lists are addressable in DOM, but not addressable in Binding – hence Binding has nonnullFoo() bridges, as we are fully expecting people to treat empty and null lists the same (and anyway, that's an MD-SAL issue).

That brings a possible solution: empty lists should be pruned completely when translated from binding. That obviously has impacts, so needs to be considered carefully. If you can provide an estimate of breakage this would bring, that would be appreciated.

Whatever we do, the empty list merged by OFP will become a no-op and therefore, as a matter of efficiency, proposed OFP patch is an obvious simplification and is therefore the right thing to remove it.

As for DO's and DON'T's, there are always going to be some. Some of them are "this is really a bad idea", and as genius shows, that will not dissuade people. Some of them are "yeah, okay, this really means nothing", which is the case here (once solved).

Comment by Robert Varga [ 21/May/19 ]

Oh, and btw. the end result is better experience for applications – they have even less to worry about. Existing applications should simplify where possible, i.e. eliminate every single line of code, which is not required.

Comment by Robert Varga [ 18/Jun/19 ]

https://git.opendaylight.org/gerrit/82558 implements the binding-level change: we do not emit empty lists. ecelgp can you give it a test-drive across all of CSIT?

Comment by Luis Gomez [ 25/Jun/19 ]

OK, I created distribution with multi-patch: https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/integration/integration/distribution/karaf/0.11.0-SNAPSHOT/karaf-0.11.0-20190625.163317-26.zip

And run few OFP tests and saw no problem. Should we run more tests?

Comment by Anil Vishnoi [ 27/Jun/19 ]

ecelgp i think we need to revert https://git.opendaylight.org/gerrit/#/c/77908/ and run the test again to confirm the fix?

Comment by Luis Gomez [ 27/Jun/19 ]

Yes, the revert patch + MDSAL change works here:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-gate-clustering-bulkomatic-only-sodium/76/

Comment by Robert Varga [ 22/Sep/19 ]

What is actually breaking is this: https://github.com/opendaylight/ovsdb/blob/master/hwvtepsouthbound/hwvtepsouthbound-impl/src/main/java/org/opendaylight/ovsdb/hwvtepsouthbound/transact/McastMacsRemoteRemoveCommand.java#L125

I.e. the change here is that we get null vs. empty list, which ends up being compared as non-equal. This will be addressed in Aluminium as part of MDSAL-449.

Generated at Wed Feb 07 20:09:51 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.