[CONTROLLER-1475] Supervisor Strategy caught unexpected exception because of NPE in NormalizedNodeInputStreamReader Created: 20/Jan/16  Updated: 25/Jul/23  Resolved: 01/Mar/16

Status: Resolved
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Anil Vishnoi Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Duplicate
is duplicated by CONTROLLER-1452 3 # cluster : Pre-provision is not ha... Resolved
External issue ID: 5019
Priority: Highest

 Description   

I am getting following exception during net-virt clustering test.

2016-01-20 03:55:32,386 | WARN | lt-dispatcher-27 | ShardManager | 161 - org.opendaylight.controller.sal-distributed-datastore - 1.3.0.SNAPSHOT | Supervisor Strategy caught unexpected exception - resuming
java.lang.NullPointerException
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:210)[61:com.google.guava:18.0.0]
at org.opendaylight.yangtools.yang.data.api.YangInstanceIdentifier$AbstractPathArgument.<init>(YangInstanceIdentifier.java:413)[76:org.opendaylight.yangtools.yang-data-api:0.8.0.SNAPSHOT]
at org.opendaylight.yangtools.yang.data.api.YangInstanceIdentifier$NodeWithValue.<init>(YangInstanceIdentifier.java:584)[76:org.opendaylight.yangtools.yang-data-api:0.8.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.node.utils.stream.NormalizedNodeInputStreamReader.readNormalizedNodeInternal(NormalizedNodeInputStreamReader.java:138)[157:org.opendaylight.controller.sal-clustering-commons:1.3.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.node.utils.stream.NormalizedNodeInputStreamReader.readNormalizedNode(NormalizedNodeInputStreamReader.java:97)[157:org.opendaylight.controller.sal-clustering-commons:1.3.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.DataTreeCandidatePayload.readNode(DataTreeCandidatePayload.java:184)[161:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.DataTreeCandidatePayload.readChildren(DataTreeCandidatePayload.java:146)[161:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]
at org.opendaylight.controller.cluster.datastore.DataTreeCandidatePayload.readModifiedNode(DataTreeCandidatePayload.java:161)[161:org.opendaylight.controller.sal-distributed-datastore:1.3.0.SNAPSHOT]

Dumping debug message in NormalizedNodeInputStreamReader, shows that it's throwing exception while reading ip_address leaf of the flow.
Following gist has full exception and debug messages before this exception occures

https://gist.githubusercontent.com/vishnoianil/0fab6f95c5f6824891cd/raw/f46ae667f9a8aef94f789a81c2480c4480cfac22/gistfile1.txt

I am seeing this exception in inventory shard leader controller instance log as well as one of the follower.
But the flow that is causing this exception, successfully getting installed on the switch.



 Comments   
Comment by Tom Pantelis [ 21/Jan/16 ]

ip-address was the last node that was read so it's not the problem. The problem occurs on the next DataTreeCandidatePayload which appears to contain a standalone LeafSetEntryNode w/o the parent LeafSetNode. In this case only the entry's value is outputted to the stream as the NormalizedNodeWriter doesn't pass NodeIdentifier in the leafSetEntryNode method. Presumably this was done for efficiency so the NodeIdentifier QName isn't duplicated for each entry when writing out a leaf set. However it assumes the parent LeafSetNode was previously written to the stream. The NormalizedNodeInputStreamReader makes the same assumption. In order to create a LeafSetEntryNode it needs the leaf set QName. This is obtained when the parent LeafSetNode is parsed and it's cached when the child LeafSetEntryNode's are parsed. However if the stream contains only a LeafSetEntryNode then the leaf set QName isn't available which leads to the NPE.

I'm not familiar with the OF data model and which leaf set is causing the issue and which component is writing it.

It seems the the NormalizedNodeWriter needs to detect that a standalone/parentless LeafSetEntryNode is being written and pass the NodeIdentifier to the stream writer, either as a @Nullable param to the existing method or a new method.

Comment by Tom Pantelis [ 21/Jan/16 ]

Submitted https://git.opendaylight.org/gerrit/#/c/33192 which adds a unit test that reproduces the NPE in NormalizedNodeInputStreamReader.

Comment by Robert Varga [ 21/Jan/16 ]

Looks like a change to NormalizedNodeStreamWriter's methods will be required to expose the QName.

Comment by Tom Pantelis [ 21/Jan/16 ]

Yes unfortunately - I don't see any other way around it. Adding QName (or NodeWithValue) to leafSetEntryNode would be enough on the yangtools side. The CDS NormalizedNodeOutputStreamWriter already keeps track of whether the parent was seen.

(In reply to Robert Varga from comment #3)
> Looks like a change to NormalizedNodeStreamWriter's methods will be required
> to expose the QName.

Comment by Tom Pantelis [ 21/Jan/16 ]

Submitted yangtools patch https://git.opendaylight.org/gerrit/#/c/33222/ to add the QName param.

Do we need an API waiver?

I'm working on the controller side patch.

I don't know if any other projects need changes.

Comment by Robert Varga [ 21/Jan/16 ]

Yes we do. I am actually working on a patch to add NodeWithValue, which is consistent with the rest of the API. I will go through the projects – it has not been two weeks since last waiver

Comment by Tom Pantelis [ 21/Jan/16 ]

I added a QName param with my patch. I didn't know you were working on it. NodeWithValue is fine too - we can go with your patch - either way - I'll leave it up to you.

I have the controller side patch https://git.opendaylight.org/gerrit/#/c/33228/ working (it's a draft right now since it won't build in jenkins) with changes to AbstractNormalizedNodeDataOutput and NormalizedNodePruner. Added more unit tests to DataTreeCandidatePayloadTest to cover standalone leaf node and ordered leaf set (recently added - I fixed an issue with it).

Note that AnyXML doesn't work at all with the CDS streaming (it doesn't know how to serialize a DOMSource) but that's a separate issue.

(In reply to Robert Varga from comment #6)
> Yes we do. I am actually working on a patch to add NodeWithValue, which is
> consistent with the rest of the API. I will go through the projects – it
> has not been two weeks since last waiver

Comment by Robert Varga [ 21/Jan/16 ]

mdsal: https://git.opendaylight.org/gerrit/33231
controller: https://git.opendaylight.org/gerrit/33230
netconf: https://git.opendaylight.org/gerrit/33229

Waiting for initial review, will send out an API freeze waiver once initial review has been done.

Comment by Tom Pantelis [ 21/Jan/16 ]

I already submitted controller patch https://git.opendaylight.org/gerrit/#/c/33228/ which fixes the issue so https://git.opendaylight.org/gerrit/#/c/33230/ isn't needed.

(In reply to Robert Varga from comment #8)
> mdsal: https://git.opendaylight.org/gerrit/33231
> controller: https://git.opendaylight.org/gerrit/33230
> netconf: https://git.opendaylight.org/gerrit/33229
>
> Waiting for initial review, will send out an API freeze waiver once initial
> review has been done.

Comment by Robert Varga [ 27/Jan/16 ]

Boron has been merged, still working on the API waiver for Beryllium.

Comment by Anil Vishnoi [ 02/Feb/16 ]

Tom/Robert,

I believe all the relevant patches are merged for this bug, or do we have any pending patches that needs to be merged ?

Comment by Robert Varga [ 02/Feb/16 ]

Beryllium/Boron are fixed, but this needs to backported to Lithium, too.

Comment by A H [ 29/Feb/16 ]

If the bug is a blocker, has it been fixed for Lithium SR4? If this bug is NOT a blocker, can we downgrade the severity to “critical”?

Comment by Robert Varga [ 01/Mar/16 ]

Lithium does not have all the dependencies need to fix this.

Generated at Wed Feb 07 19:55:38 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.