[NETVIRT-1372] Fail: vpn basic ipv6.Delete Router And Router Interfaces With L3VPN: Verify GWMAC Flow Entry Removed From Flow Table' failed Created: 16/Jul/18 Updated: 16/Oct/18 Resolved: 16/Oct/18 |
|
| Status: | Resolved |
| Project: | netvirt |
| Component/s: | ipv6service |
| Affects Version/s: | None |
| Fix Version/s: | Fluorine-SR1 |
| Type: | Bug | Priority: | High |
| Reporter: | Sam Hague | Assignee: | Somashekar B |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | csit:exception | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Description |
|
The test fails when ti checks for the gw mac's in the table 19 flows. fa:16:3e:1e:c9:e6 in this test run is the one that failed to be removed. This test fails intermittently every other day or so. cookie=0x8220015, duration=7053.025s, table=19, n_packets=62, n_bytes=2604, priority=100,arp,arp_op=1 actions=resubmit(,17) cookie=0x8220016, duration=7053.025s, table=19, n_packets=28, n_bytes=1176, priority=100,arp,arp_op=2 actions=resubmit(,17) cookie=0x8000009, duration=49.935s, table=19, n_packets=0, n_bytes=0, priority=20,metadata=0x30e0c/0xfffffe,dl_dst=fa:16:3e:1e:c9:e6 actions=goto_table:21 cookie=0x1080000, duration=7053.025s, table=19, n_packets=3513, n_bytes=468116, priority=0 actions=resubmit(,17) cookie=0x88880000, duration=7052.683s, table=195, n_packets=193, n_bytes=8274, priority=0,arp actions=CONTROLLER:65535,learn(table=195,hard_timeout=5,priority=10,cookie=0x88880000,eth_type=0x806,NXM_OF_ARP_SPA[],NXM_OF_ARP_TPA[],OXM_OF_METADATA[24..39],load:0x1->NXM_NX_REG4[0..7]),learn(table=196,hard_timeout=5,priority=10,cookie=0x88880000,eth_type=0x806,NXM_OF_ARP_TPA[]=NXM_OF_ARP_SPA[],NXM_OF_ARP_SPA[]=NXM_OF_ARP_TPA[],OXM_OF_METADATA[24..39],load:0x1->NXM_NX_REG4[8..15]) |
| Comments |
| Comment by Jamo Luhrsen [ 16/Jul/18 ] |
|
shague, can you put a link to the robot log with this failure? |
| Comment by Somashekar B [ 03/Aug/18 ] |
|
This issue is not observed on the recent builds. |
| Comment by Waldemar Znoinski [ 15/Aug/18 ] |
|
I see this issue in RH OSP CI recently, DEBUG karaf logs + robot log attached to: https://bugzilla.redhat.com/show_bug.cgi?id=1616288 |
| Comment by Vishal Thapar [ 03/Sep/18 ] |
|
Saw this issue in upstream Netvirt CSIT again: |
| Comment by Jamo Luhrsen [ 04/Sep/18 ] |
|
it's still happening, re-opening |
| Comment by Somashekar B [ 10/Sep/18 ] |
|
I am not able to reproduce this issue. Observation: When subnets are added to router, for every router interface (dl_dst=MAC), table 19 flow is programmed with router metadata. In case, if we dissociate router from L3VPN and later delete router interfaces, we may not hit this stale issue. Logs with TRACE enabled might help a bit here. |
| Comment by Jamo Luhrsen [ 10/Sep/18 ] |
|
the last time this failed in 1node was over two weeks ago, but maybe it's just more frequent in the 3node jobs, which is where what modules would we need TRACE enabled for? One thing we can try is running this suite many times in a row |
| Comment by Somashekar B [ 11/Sep/18 ] |
|
Please enable TRACE for below modules: org.opendaylight.netvirt.neutronvpn |
| Comment by Somashekar B [ 09/Oct/18 ] |
|
jluhrsen, did you get a chance to get the TRACE logs for this issue? |
| Comment by Vishal Thapar [ 09/Oct/18 ] |
|
I've started a run with just this test suite and trace logs: Will share the results once available and rerun the suite if don't hit the issue. Could you explain the workaround you mentioned? Do you think changing order can 'avoid' this issue? If yes, I'd like to try that too as backup plan. |
| Comment by Vishal Thapar [ 09/Oct/18 ] |
|
First run didn't see the issue and I messed up the debugmap. Started a new run with correct debugmap. Update: Second run passed but no logs. I think TRACE generates way too many log entries than our jenkins can manage. Can we get more directed traces to reduce amount of logs? |
| Comment by Somashekar B [ 09/Oct/18 ] |
|
Hi Vishal, Try setting "org.opendaylight.netvirt.neutronvpn.NeutronHostConfigChangeListener" to INFO level. |
| Comment by Somashekar B [ 09/Oct/18 ] |
|
Hi Vishal, Thanks for running the test suite. From CSIT logs, issue has happened when router is associated with L3VPN. [1] Before router assocation with BGPVPN: cookie=0x8220015, duration=1035.396s, table=19, n_packets=45, n_bytes=1890, priority=100,arp,arp_op=1 actions=resubmit(,17) [2] After router assocation with BGPVPN: cookie=0x8220015, duration=1109.108s, table=19, n_packets=45, n_bytes=1890, priority=100,arp,arp_op=1 actions=resubmit(,17) |
| Comment by Vishal Thapar [ 09/Oct/18 ] |
|
Hi Som, Could you clarify this? Set all other TRACE and this to INFO? |
| Comment by Somashekar B [ 10/Oct/18 ] |
|
Yes, Vishal. |
| Comment by Jamo Luhrsen [ 10/Oct/18 ] |
| Comment by Vishal Thapar [ 12/Oct/18 ] |
|
And we're not hitting the issue with TRACE enabled |
| Comment by Somashekar B [ 12/Oct/18 ] |
|
Should we close this bug for now and re-open again if it's reproducible later? |
| Comment by Sam Hague [ 16/Oct/18 ] |
|
Yes, let's go ahead and close it for now. |