[NETCONF-86] Real world network devices NETCONF mounted Lithium won't mount Beryllium Created: 16/Oct/15  Updated: 15/Mar/19  Resolved: 22/Jul/16

Status: Resolved
Project: netconf
Component/s: netconf
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: James Gregory Hall Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Duplicate
duplicates NETCONF-232 Exclude flawed models from mount point Resolved
External issue ID: 4492

 Description   

With Helium and Lithium, I can mount a set of network devices running in real networks today and use them through RESTCONF/NETCONF yang-ext:mount. Commercial business has been built on this capability under the "OpenDaylight" brand.

I am unable to mount the same network devices with the Beryllium controller.

The root cause is a NullPointerException, now thrown because LeafEffectiveStatementImpl does not check whether typeEffectiveSubstatement is null.

Yes, there is an error in the yang file: specifically, a leaf node has no "type" child. It may be tempting to blame the real world network device and leave this be as a "feature" of strict yang tools. Please don't make that mistake.

To be useful, our controller must mount real world devices which have been purchased and provisioned into real networks. These real devices can't be guaranteed to implement the spec perfectly, especially if they passed QA with an earlier version of ODL.

I'm proposing a specific fix to check for null in this case where a leaf fails to declare a type.

> this.type = TypeUtils.getTypeFromEffectiveStatement(typeEffectiveSubstatement);

< this.type = (typeEffectiveSubstatement == null) ? null : TypeUtils.getTypeFromEffectiveStatement(typeEffectiveSubstatement);

I don't pretend to fully understand the side effects of properly avoiding a null pointer exception, however I'll point out that the missing "type" didn't cause any known issues in He/Li.
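The proposed guard can be illustrated in isolation. This is a minimal stand-alone sketch, not the yangtools code: `TypeStatementSketch`, `LeafSketch`, and `resolveType` are hypothetical stand-ins for the parsed "type" substatement, LeafEffectiveStatementImpl, and TypeUtils.getTypeFromEffectiveStatement respectively.

```java
import java.util.Optional;

// Hypothetical stand-in for a parsed "type" substatement; not a yangtools class.
final class TypeStatementSketch {
    private final String typeName;

    TypeStatementSketch(String typeName) {
        this.typeName = typeName;
    }

    String typeName() {
        return typeName;
    }
}

// Hypothetical stand-in for a leaf's effective statement.
final class LeafSketch {
    private final String name;
    private final String type; // stays null when the model omits "type"

    LeafSketch(String name, TypeStatementSketch typeSubstatement) {
        this.name = name;
        // Same shape as the proposed fix: dereference only when the substatement exists.
        this.type = (typeSubstatement == null) ? null : resolveType(typeSubstatement);
    }

    // Assumed helper; it would throw a NullPointerException if handed null.
    private static String resolveType(TypeStatementSketch stmt) {
        return stmt.typeName();
    }

    String name() {
        return name;
    }

    // Callers must handle the missing-type case explicitly.
    Optional<String> type() {
        return Optional.ofNullable(type);
    }
}
```

With this guard, construction no longer throws for a leaf without a type, but every consumer of the leaf then has to cope with an absent type, which is the side effect the paragraph above leaves open.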



 Comments   
Comment by Vratko Polak [ 16/Oct/15 ]

TypeUtils.getTypeFromEffectiveStatement() was edited in Change 28478.
Was that Change a cause (or possibly a fix) of this Bug?

Comment by Robert Varga [ 17/Oct/15 ]

Please talk to the device vendor to fix their models, as type statement is absolutely critical to correct interpretation of the data. It is also critical that invalid models get fixed, otherwise we will end up in interoperability hell, as different systems will end up fixing flawed models in different ways, thus leading to incompatible behavior.

In the meantime, you can deploy a fixed-up version of the model in the NETCONF cache, so it will not get pulled from the device.

Downgrading priority to normal, since this is not blocking an ODL project as far as I can tell.

Comment by James Gregory Hall [ 19/Oct/15 ]

Eventually the vendor will fix the issue, and then eventually all customers will upgrade all devices to make this all go away ... but that timeline is beyond our control.

This is blocking the use of the controller to mount network devices which are being mounted by Helium & Lithium based controllers.

Comment by Tony Tkacik [ 20/Oct/15 ]

> This is blocking the use of the controller to mount network devices
> which are being mounted by Helium & Lithium based controllers.

That statement is not correct: this blocks only a specific vendor, and only when the broken model is downloaded from the device.

> In the meantime, you can deploy a fixed-up version of the model in
> the NETCONF cache, so it will not get pulled from the device.

The work-around for such devices (without introducing a validation regression in the parser)
is to load modified models into the cache/schema folder of the OpenDaylight controller,
which is used to provide models without downloading them from the device.

If such an "exception" for non-strictness became part of the parser, not enforcing
correct behaviour would allow additional models like the one you mentioned to be created.

A cleaner way would be to introduce STRICT and COMPATIBILITY modes into the parser, where
one is used at compile time and the other at runtime, but introducing such APIs and rule-sets (and choosing which parts of YANG not to enforce in COMPATIBILITY mode) would require a lot of work and a redesign of some API contracts.

Also, the behaviour of COMPATIBILITY fixes is open to dispute: how should the parser render the effective model from which NETCONF, RESTCONF and MD-SAL derive their behaviour, and are these choices really right?
E.g. in your use-case falling back to string may be the right choice, but what if another system with the same issue was implemented to expect it as type empty or type identityref?
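To make the trade-off concrete, here is a minimal sketch of what such a mode switch might look like. Nothing here exists in the parser: `ParserModeSketch`, `LeafTypeResolverSketch`, and the configurable fallback type are all hypothetical, and types are plain strings for brevity.

```java
// Hypothetical parser modes; no such API exists in the parser today.
enum ParserModeSketch { STRICT, COMPATIBILITY }

// Hypothetical resolver that decides what to do with a leaf lacking "type".
final class LeafTypeResolverSketch {
    private final ParserModeSketch mode;
    private final String fallbackType; // assumption: one global fallback per resolver

    LeafTypeResolverSketch(ParserModeSketch mode, String fallbackType) {
        this.mode = mode;
        this.fallbackType = fallbackType;
    }

    // declaredType is null when the model omits the type statement.
    String resolve(String leafName, String declaredType) {
        if (declaredType != null) {
            return declaredType;
        }
        if (mode == ParserModeSketch.STRICT) {
            // Strict mode rejects the model outright.
            throw new IllegalStateException("leaf " + leafName + " has no type statement");
        }
        // Compatibility mode guesses; the guess is a policy decision, not a fact.
        return fallbackType;
    }
}
```

Two resolvers configured with "string" and "empty" as fallbacks would both accept the same flawed model yet interpret its data differently, which is the interoperability concern raised above.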

Robert's idea allows users, integrators and vendors to provide fixed models and to sideload them into OpenDaylight without modifying the actual device, and to specify such behaviour on a per-model basis, instead of creating a "default" rule for not honoring specific parts of YANG.

Comment by Robert Varga [ 20/Oct/15 ]

As discussed at the MD-SAL meeting, the problem here is that a single model which fails to parse renders the entire mount point unusable. This problem can be solved acceptably by excluding the model from the resulting schema context, which is something the NETCONF SB plugin can do based on its configuration (either global or for that particular element).
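The exclusion approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the NETCONF SB plugin's implementation: `MountSchemaSketch` is a hypothetical class, models are represented as plain strings keyed by name, and the `validator` callback stands in for the real YANG parser.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical result of assembling a mount point's schema from device models.
final class MountSchemaSketch {
    private final List<String> accepted = new ArrayList<>();
    private final Map<String, String> excluded = new LinkedHashMap<>();

    // Validate each model on its own; a failure excludes that model only,
    // instead of failing the whole mount point.
    static MountSchemaSketch build(Map<String, String> sources, Consumer<String> validator) {
        MountSchemaSketch result = new MountSchemaSketch();
        for (Map.Entry<String, String> entry : sources.entrySet()) {
            try {
                validator.accept(entry.getValue());
                result.accepted.add(entry.getKey());
            } catch (RuntimeException ex) {
                // Record the reason so operators can see what was left out.
                result.excluded.put(entry.getKey(), String.valueOf(ex.getMessage()));
            }
        }
        return result;
    }

    List<String> accepted() {
        return accepted;
    }

    Map<String, String> excluded() {
        return excluded;
    }
}
```

The sketch treats models as independent. Real YANG modules can import one another, so a fuller version would presumably also have to exclude modules that depend on an excluded one.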

The other issue raised is the fact the mount point access from RESTCONF NB to NETCONF SB forces data validation, which breaks with devices which do not enforce their own models (like allowing an out-of-range value to be configured) – leading to get-config failing. This is needed for the ability to project the data on the NB in both JSON and XML, but could potentially be addressed by re-architecting the RESTCONF/NETCONF interactions.

An idea of 'sloppy' mount points was discussed, which requires some level of design and changes to the MD-SAL APIs to expose how sloppy the mount point is being, so its interactions with the rest of the system can be properly guarded – for example blocking MD-SAL applications from seeing non-conformant data, so they do not break on it.

Since the two immediate problems can (and should) be addressed within the NETCONF project, I am moving this issue there.
