[NETVIRT-717] CSIT Sporadic failures - arp learning suite - MIP not in /config/odl-fib:fibEntries/ Created: 07/Jun/17  Updated: 20/Jun/17  Resolved: 20/Jun/17

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Carbon
Fix Version/s: None

Type: Bug
Reporter: Jamo Luhrsen Assignee: Gobinath Suganthan
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 8630

 Description   

This may be a new sporadic failure seen once in releng, and once in a
sandbox job.

releng:
https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-carbon/656/log.html.gz#s1-s6

a small discussion in email:

https://lists.opendaylight.org/pipermail/netvirt-dev/2017-June/004624.html



 Comments   
Comment by Jamo Luhrsen [ 12/Jun/17 ]

This is now very consistent in CSIT, seemingly happening on every job,
every time; i.e., it is not sporadic.

Our high-frequency job (every ~4 hours) passed in job 420 [a] and failed
in 421 [b]. Job 420 used the distro from autorelease 1842, which was posted
Mon Jun 05 04:53:33 UTC 2017, and 421 used the distro from autorelease 1845,
which was posted Wed Jun 07 04:41:48 UTC 2017.

The distcompare tool does not work on distros created in autorelease, so I
took two snapshot distributions created slightly before and slightly after
those two timestamps. Note: these distributions are ephemeral and will not
exist much longer on nexus.

The patches that were merged between the two distros are listed below. There
are 42 patches, 9 of which are in netvirt or genius, which is where
I would look first. However, the bulk of the remaining changes are in
relevant projects that netvirt depends on, like openflowplugin, mdsal, controller...

Patch differences:
------------------
openflowplugin https://git.opendaylight.org/gerrit/58308 OPNFLWPLUG-830: Cleanup queue after switch disconnect
openflowplugin https://git.opendaylight.org/gerrit/58100 Optimize port status and hello message handling
openflowplugin https://git.opendaylight.org/gerrit/58256 Stop reschedule stat. after device disconnected
lispflowmapping https://git.opendaylight.org/gerrit/58192 LISPMAP-158: Add knob to disable authentication
mdsal https://git.opendaylight.org/gerrit/58332 Binding2 runtime - Codecs impl #1
mdsal https://git.opendaylight.org/gerrit/58331 Binding v2 DOM Codec - codecs API - Part 2
mdsal https://git.opendaylight.org/gerrit/58330 Binding v2 runtime context
mdsal https://git.opendaylight.org/gerrit/58325 Binding v2 DOM Codec - codecs API - Part 1
mdsal https://git.opendaylight.org/gerrit/58324 Binding spec runtime v2 - TreeNodeSerializer & relatives
mdsal https://git.opendaylight.org/gerrit/58323 Binding2 runtime - API #7
mdsal https://git.opendaylight.org/gerrit/58322 Binding2 runtime - API #6
mdsal https://git.opendaylight.org/gerrit/58321 Binding2 runtime - API #5
mdsal https://git.opendaylight.org/gerrit/58320 Binding2 runtime - API #4
mdsal https://git.opendaylight.org/gerrit/58338 Binding2 runtime - API #3
mdsal https://git.opendaylight.org/gerrit/58319 Binding2 runtime - API #2
mdsal https://git.opendaylight.org/gerrit/58318 Binding2 runtime - API #1
mdsal https://git.opendaylight.org/gerrit/58222 Binding generator v2 - Identities support
mdsal https://git.opendaylight.org/gerrit/58005 Binding v2 runtime
mdsal https://git.opendaylight.org/gerrit/58221 Binding generator v2 - Unions fix
mdsal https://git.opendaylight.org/gerrit/58220 Binding generator v2 - Notifications
yangtools https://git.opendaylight.org/gerrit/58358 YANGTOOLS-778: Add support for parsing restconf:yang-data extension
yangtools https://git.opendaylight.org/gerrit/58350 YANGTOOLS-781 - Empty description and reference of ModuleImport in some cases
yangtools https://git.opendaylight.org/gerrit/58263 YANGTOOLS-548: Change semantic-version to openconfig-version
yangtools https://git.opendaylight.org/gerrit/58229 YANGTOOLS-702 - Improve mapping of YANG extensions
genius https://git.opendaylight.org/gerrit/58129 @Immutable GroupEntity
genius https://git.opendaylight.org/gerrit/58102 Add missing @Override and serialVersionUID to genius.mdsalutil
groupbasedpolicy https://git.opendaylight.org/gerrit/58337 GBP-288 - quick fix for async transaction creation
groupbasedpolicy https://git.opendaylight.org/gerrit/58245 NETVIRT-697 - updating metadata endpoints
netconf https://git.opendaylight.org/gerrit/58312 BUG-8085: create missing parent augmentation node
netconf https://git.opendaylight.org/gerrit/58105 NETCONF-427 direct writes to ordered list fail
controller https://git.opendaylight.org/gerrit/58368 Fix RecoveryIntegrationSingleNodeTest failure
controller https://git.opendaylight.org/gerrit/58194 BUG-8494: do not attempt to reconnect ReconnectingClientConnection
controller https://git.opendaylight.org/gerrit/58274 BUG-8403: fix DONE state propagation
controller https://git.opendaylight.org/gerrit/57972 Replace LOGGER by LOG
sfc https://git.opendaylight.org/gerrit/57988 Remove redundant modifier
netvirt https://git.opendaylight.org/gerrit/58347 NETVIRT-709 - DNAT traffic from DC gateway to FIP fails
netvirt https://git.opendaylight.org/gerrit/58346 NETVIRT-703: Exception with invalid QoS Alert params
netvirt https://git.opendaylight.org/gerrit/58199 Fix checkstyle problems not detected by the current version
netvirt https://git.opendaylight.org/gerrit/58265 GENIUS-46 - guarding NPE
netvirt https://git.opendaylight.org/gerrit/58250 NETVIRT-702: DNAT failure with openstack/ocata
netvirt https://git.opendaylight.org/gerrit/58249 elanName is null
netvirt https://git.opendaylight.org/gerrit/58140 Replace LOGGER by LOG

Not sure if this helps. If there are a few patches we think could have
caused the regression, we could put up a revert for each (without merging),
take the resulting distribution from distribution-check, and run CSIT on it.
Or actually, just run the gate on the revert.

[a] https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-upstream-stateful-carbon/420/log.html.gz#s1-s6
[b] https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-upstream-stateful-carbon/421/log.html.gz#s1-s6
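
The triage step described in this comment (look at netvirt and genius first, then the rest) can be sketched as a small script. This is a minimal sketch, assuming the patch list is available as plain text in the same "project URL title" layout used above; the function names are hypothetical and only a handful of the 42 lines are reproduced as sample input.

```python
from collections import Counter

# A few lines copied from the patch list in this comment ("project URL title").
PATCH_LINES = """\
openflowplugin https://git.opendaylight.org/gerrit/58308 OPNFLWPLUG-830: Cleanup queue after switch disconnect
mdsal https://git.opendaylight.org/gerrit/58332 Binding2 runtime - Codecs impl #1
genius https://git.opendaylight.org/gerrit/58129 @Immutable GroupEntity
genius https://git.opendaylight.org/gerrit/58102 Add missing @Override and serialVersionUID to genius.mdsalutil
netvirt https://git.opendaylight.org/gerrit/58265 GENIUS-46 - guarding NPE
netvirt https://git.opendaylight.org/gerrit/58249 elanName is null
"""


def parse_patches(text):
    """Split each non-empty line into a (project, url, title) tuple."""
    patches = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Only the first two whitespace runs are separators; the title keeps its spaces.
        project, url, title = line.split(None, 2)
        patches.append((project, url, title))
    return patches


def count_by_project(patches):
    """Return a Counter mapping project name to number of merged patches."""
    return Counter(project for project, _, _ in patches)


def first_suspects(patches, projects=("netvirt", "genius")):
    """Return the patches from the projects to inspect first."""
    return [p for p in patches if p[0] in projects]


patches = parse_patches(PATCH_LINES)
counts = count_by_project(patches)
suspects = first_suspects(patches)

for project, url, title in suspects:
    print(f"{project:10s} {url}  {title}")
```

Run against the full list, the same filter should select the 9 netvirt and genius patches mentioned above, which are the candidates for a test revert.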

Comment by Vivekanandan Narasimhan [ 13/Jun/17 ]

VPNEngine is unable to convert the learnt migrated MIP IPAddress into a VRFEntry. As a result, no flow appears for the MIP IPAddress.

Peri is currently looking into this problem.

"
The other CSIT TC failure upstream is on MIP Migration. We see that the failure is due to the inability to create a VRFEntry for a migrated MIP, even though the information for that MIP is available in learnt-vpn-vip-to-port.

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-carbon/657/log.html.gz#s1-s6

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-carbon/657/odl1_karaf.log.gz

From the report above, you can see that "192.168.10.110" is available, but a VRFEntry for the same is not available, per the Karaf logs:
2017-06-08 10:24:06,029 | INFO | nPool-1-worker-1 | VpnInterfaceManager | 332 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.1.Carbon | VPN Interface update event - intfName 83ec3742-5730-4761-b31b-5a1b4b7dbe7c onto vpnName 4ae8cd92-48ca-49b5-94e1-b2921a261111 running config-driven
2017-06-08 10:24:06,029 | ERROR | nPool-1-worker-1 | VpnUtil | 332 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.1.Carbon | No rd available from VpnInstance to allocate for prefix 192.168.10.110/32
2017-06-08 10:24:06,029 | ERROR | nPool-1-worker-1 | VpnInterfaceManager | 332 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.1.Carbon | No rds to allocate extraroute 192.168.10.110/32

"

Comment by Gobinath Suganthan [ 20/Jun/17 ]

This has been fixed as part of this patch https://git.opendaylight.org/gerrit/#/c/59138/
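
For triaging similar failures, the Karaf log excerpt quoted in the 13/Jun/17 comment can be filtered mechanically. The following is a minimal sketch, assuming the pipe-delimited layout seen in that excerpt (timestamp, level, thread, logger, bundle, message); the helper names are hypothetical, and the sample lines are copied from the excerpt.

```python
# Sample lines copied from the Karaf log excerpt in the 13/Jun/17 comment.
LOG_LINES = [
    "2017-06-08 10:24:06,029 | INFO | nPool-1-worker-1 | VpnInterfaceManager | 332 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.1.Carbon | VPN Interface update event - intfName 83ec3742-5730-4761-b31b-5a1b4b7dbe7c onto vpnName 4ae8cd92-48ca-49b5-94e1-b2921a261111 running config-driven",
    "2017-06-08 10:24:06,029 | ERROR | nPool-1-worker-1 | VpnUtil | 332 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.1.Carbon | No rd available from VpnInstance to allocate for prefix 192.168.10.110/32",
    "2017-06-08 10:24:06,029 | ERROR | nPool-1-worker-1 | VpnInterfaceManager | 332 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.1.Carbon | No rds to allocate extraroute 192.168.10.110/32",
]

FIELDS = ("timestamp", "level", "thread", "logger", "bundle", "message")


def parse_karaf_line(line):
    """Parse one pipe-delimited log line into a dict, or return None if it does not fit."""
    # Split at most 5 times so any '|' inside the message is preserved.
    parts = [part.strip() for part in line.split("|", 5)]
    if len(parts) != len(FIELDS):
        return None  # e.g. a stack-trace continuation line
    return dict(zip(FIELDS, parts))


def errors_mentioning(lines, needle):
    """Return parsed ERROR entries whose message contains the given text."""
    hits = []
    for line in lines:
        entry = parse_karaf_line(line)
        if entry and entry["level"] == "ERROR" and needle in entry["message"]:
            hits.append(entry)
    return hits


errors = errors_mentioning(LOG_LINES, "192.168.10.110/32")
for entry in errors:
    print(entry["timestamp"], entry["logger"], entry["message"])
```

Filtering on the MIP prefix isolates the two "No rd"/"No rds" errors from VpnUtil and VpnInterfaceManager, which are the lines that explain the missing VRFEntry.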
