[GENIUS-58] DataStoreJobCoordinator ModifiedNodeDoesNotExistException: Node /(urn:ietf:params:xml:ns:yang:ietf-interfaces?revision=2014-05-08)interfaces-state/interface does not exist. Cannot apply modification to its children. Created: 07/Mar/17 Updated: 19/Oct/17 Resolved: 09/Mar/17 |
|
| Status: | Resolved |
| Project: | genius |
| Component/s: | General |
| Affects Version/s: | (unspecified) |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Michael Vorburger | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| External issue ID: | 7917 |
| Description |
|
[Not sure which project to file this in; perhaps genius rather than controller?]

I'm hitting a bunch of the following exceptions after simply doing:

cd netvirt/vpnservice/distribution/karaf
cd target/assembly/bin
opendaylight-user@root>feature:install odl-netvirt-openstack

I've also hit http://localhost:8181/controller/nb/v2/neutron/ in a browser (which 404s), and I'm not sure whether that or the mere startup is what causes the exceptions.

2017-03-07 14:03:03,878 | WARN | CommitFutures-0 | DataStoreJobCoordinator | 328 - org.opendaylight.genius.mdsalutil-api - 0.2.0.SNAPSHOT | Job: JobEntry{key='119556424750016:br-int', mainWorker=InterfaceStateRemoveWorker{nodeConnectorIdNew=Uri [_value=openflow:119556424750016:LOCAL], nodeConnectorIdOld=null, fcNodeConnectorOld=FlowCapableNodeConnector{getAdvertisedFeatures=PortFeatures [_tenMbHd=false, _tenMbFd=false, _hundredMbHd=false, _hundredMbFd=false, _oneGbHd=false, _oneGbFd=false, _tenGbFd=false, _fortyGbFd=false, _hundredGbFd=false, _oneTbFd=false, _other=false, _copper=false, _fiber=false, _autoeng=false, _pause=false, _pauseAsym=false], getConfiguration=PortConfig [_pORTDOWN=true, _nORECV=false, _nOFWD=false, _nOPACKETIN=false], getCurrentFeature=PortFeatures [_tenMbHd=false, _tenMbFd=false, _hundredMbHd=false, _hundredMbFd=false, _oneGbHd=false, _oneGbFd=false, _tenGbFd=false, _fortyGbFd=false, _hundredGbFd=false, _oneTbFd=false, _other=false, _copper=false, _fiber=false, _autoeng=false, _pause=false, _pauseAsym=false], getCurrentSpeed=0, getHardwareAddress=MacAddress [_value=6c:bc:66:3a:53:c0], getMaximumSpeed=0, getName=br-int, getPeerFeatures=PortFeatures [_tenMbHd=false, _tenMbFd=false, _hundredMbHd=false, _hundredMbFd=false, _oneGbHd=false, _oneGbFd=false, _tenGbFd=false, _fortyGbFd=false, _hundredGbFd=false, _oneTbFd=false, _other=false, _copper=false, _fiber=false, _autoeng=false, _pause=false, _pauseAsym=false], getPortNumber=PortNumberUni [_uint32=4294967294], getQueue=[], getState=State{isBlocked=false, isLinkDown=true, isLive=false, augmentations={}}, getSupported=PortFeatures [_tenMbHd=false, _tenMbFd=false, _hundredMbHd=false, _hundredMbFd=false, _oneGbHd=false, _oneGbFd=false, _tenGbFd=false, _fortyGbFd=false, _hundredGbFd=false, _oneTbFd=false, _other=false, _copper=false, _fiber=false, _autoeng=false, _pause=false, _pauseAsym=false]}, interfaceName='119556424750016:br-int'}, rollbackWorker=null, retryCount=6, futures=[org.opendaylight.controller.cluster.databroker.ConcurrentDOMDataBroker$AsyncNotifyingSettableFuture@b31a18f]} failed at org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree.java:691)[220:org.opendaylight.controller.sal-distributed-datastore:1.5.0.SNAPSHOT] |
| Comments |
| Comment by Tom Pantelis [ 09/Mar/17 ] |
|
Looks like someone is trying to write a node where at least part of the parent path doesn't exist. This indicates an application-side issue. |
| Comment by Michael Vorburger [ 09/Mar/17 ] |
|
> write a node where at least part of the parent path doesn't exist

Agreed, and I would love to move the issue and dig further in a project other than controller, BUT don't we have a "traceability issue" here? It's impossible to tell from this stack trace where this originally came from. I do understand this is related to the async lambda handling, but there must be a solution to this: how do async frameworks typically deal with it? Do they capture the stack of the caller submitting the lambda and set it as the root (or as an addition, via setStackTrace()) of such exceptions? This common problem must have a general solution, no? |
| Comment by Tom Pantelis [ 09/Mar/17 ] |
|
The InterfaceStateRemoveWorker toString gives a clue as to the originator: it's OpenFlow-related, so the bug should be moved there. Adding appropriate info in the caller's mainWorker toString would help to identify the originator. However, I think capturing the caller's stack trace would be too expensive in production, although it could be done in a debug mode. |
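The debug-mode idea discussed above could look roughly like the following sketch. It is not code from genius or controller: the class `CallerTracing`, the marker exception `SubmittedFrom`, and the `jobs.debug` system property are all hypothetical names chosen for illustration. The wrapper records the submitting thread's stack only when the debug switch is on, and attaches it as a suppressed exception if the job later fails on another thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper for illustration; not part of genius or controller. */
final class CallerTracing {

    /** Marker exception whose stack trace records where a job was submitted. */
    static final class SubmittedFrom extends Exception {
        SubmittedFrom(String jobKey) {
            super("job '" + jobKey + "' was submitted from here");
        }
    }

    // Assumed debug switch: capturing a stack trace per job is costly,
    // so it stays off unless -Djobs.debug=true is set.
    static volatile boolean debugEnabled = Boolean.getBoolean("jobs.debug");

    /** Wraps a job so that a failure carries the submitter's stack trace. */
    static <T> Callable<T> traced(String jobKey, Callable<T> job) {
        if (!debugEnabled) {
            return job; // production path: no wrapper, no extra cost
        }
        // Created on the submitting thread, so its stack is the caller's stack.
        SubmittedFrom origin = new SubmittedFrom(jobKey);
        return () -> {
            try {
                return job.call();
            } catch (Exception e) {
                e.addSuppressed(origin); // shows up as "Suppressed: ..." in the trace
                throw e;
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        debugEnabled = true;
        Callable<Void> failing = () -> {
            throw new IllegalStateException("parent node does not exist");
        };
        Callable<Void> wrapped = traced("119556424750016:br-int", failing);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Void> future = executor.submit(wrapped);
            try {
                future.get();
            } catch (ExecutionException e) {
                // The printed trace includes the suppressed SubmittedFrom entry.
                e.getCause().printStackTrace();
            }
        } finally {
            executor.shutdown();
        }
    }
}
```

Using a suppressed exception rather than setStackTrace() keeps the original failure's own trace intact, and returning the job unwrapped when the switch is off means the production path pays nothing.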
| Comment by Michael Vorburger [ 09/Mar/17 ] |
|
> The InterfaceStateRemoveWorker toString gives a clue as to the originator

OK, thanks for the tip; moved bug from project controller to openflowplugin.

> Adding appropriate info in the caller's mainWorker toString would help to

PS: I have now realized that the DataStoreJobCoordinator JobEntry mainWorker belongs to genius/infrautils, not controller. |
| Comment by Robert Varga [ 09/Mar/17 ] |
|
The only way to capture caller identity is to capture it via a Throwable. That is going to hurt performance a lot. The clean way of achieving this is to route the failure back to the requestor, which can then identify itself and provide any useful context. I mean, at the end of the day, the requestor needs to know about the failure, right? |
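Routing the failure back to the requestor can be sketched with plain java.util.concurrent types. Everything here is an assumption made for illustration: `FailureRouting`, `submitWrite`, and `removeInterfaceState` are hypothetical, and `submitWrite` merely stands in for an asynchronous datastore commit that fails. The point is that the requestor, which knows which worker and which interface are involved, adds that context when the returned future fails, so no stack capture is needed at submission time.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

/** Hypothetical sketch; the names do not come from genius or controller. */
final class FailureRouting {

    /** Stand-in for an asynchronous commit; in this sketch it always fails. */
    static CompletableFuture<Void> submitWrite(String path) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        result.completeExceptionally(
            new IllegalStateException("Node " + path + " does not exist"));
        return result;
    }

    /** The requestor wraps any failure with its own identity and context. */
    static CompletableFuture<Void> removeInterfaceState(String interfaceName) {
        return submitWrite("interfaces-state/interface")
            .handle((ok, failure) -> {
                if (failure != null) {
                    // Context only the requestor knows: worker name and job key.
                    throw new CompletionException(
                        "InterfaceStateRemoveWorker failed for " + interfaceName,
                        failure);
                }
                return ok;
            });
    }

    public static void main(String[] args) {
        try {
            removeInterfaceState("119556424750016:br-int").join();
        } catch (CompletionException e) {
            // The message names the requestor; the cause is the original failure.
            System.err.println(e.getMessage() + " <- " + e.getCause().getMessage());
        }
    }
}
```

Compared with capturing a Throwable per submitted job, this costs nothing on the success path; the trade-off is that every requestor has to attach a failure handler to the future it gets back.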
| Comment by Michael Vorburger [ 09/Mar/17 ] |
|
Robert, thanks for feedback! More discussions about this in |
| Comment by Anil Vishnoi [ 09/Mar/17 ] |
|
Please don't read this as me wanting to push it off my plate. @Genius-project, please have a look at the issue, and if you think it's an issue with the plugin, please queue it back to the openflowplugin project. |
| Comment by Michael Vorburger [ 09/Mar/17 ] |
|
Anil, no, this is (almost) certainly not an issue in genius; genius just provides this DataStoreJobCoordinator infra, but something else (we suspect openflowplugin, or perhaps openflowjava? I admit I barely understand the difference) actually submitted a job doing this write. |
| Comment by Michael Vorburger [ 09/Mar/17 ] |
|
> as of f6bda1bdfd01a44948fe154883140538c2558151

I've just tried to reproduce it with a fresh Karaf rebuilt today (with mvn -U), and currently don't see it anymore, so either I did additional steps yesterday, or it has just been solved. Closing for now; anyone seeing this again should re-open it. |
| Comment by Anil Vishnoi [ 09/Mar/17 ] |
|
(In reply to Michael Vorburger from comment #8)

Neither openflowplugin nor openflowjava uses anything from the genius project. The task that was submitted to DataStoreJobCoordinator was InterfaceStateRemoveWorker (which belongs to genius), and it looks like it is trying to remove an interface from the data store and encountering this issue at:

List<InterfaceChildEntry> interfaceChildEntries = getInterfaceChildEntries(dataBroker, interfaceName);

That's why I believe it would be good if the genius project looked at this issue first and then queued it to whoever is submitting this job. |