[NETVIRT-528] CSIT Sporadic failures - floating IP unreachable Created: 13/Mar/17 Updated: 04/Apr/17 Resolved: 04/Apr/17 |
|
| Status: | Resolved |
| Project: | netvirt |
| Component/s: | General |
| Affects Version/s: | Carbon |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Jamo Luhrsen | Assignee: | Alon Kochba |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Issue Links: |
|
||||||||
| External issue ID: | 7968 | ||||||||
| Description |
|
Two instances are assigned a floating IP and connectivity is tested with a possibly digging into the flow dumps will point to something wrong? |
| Comments |
| Comment by Alon Kochba [ 14/Mar/17 ] |
|
Note that v1 with neutron |
| Comment by Alon Kochba [ 14/Mar/17 ] |
|
Note that networking-odl-v1 had a known issue where these failures would happen from time to time. Please update here if this also happens in v2. |
| Comment by Koby Aizer [ 15/Mar/17 ] |
|
I looked into the dumps as well. Alon is right, and there is indeed a sporadic bug in v1 with floating IPs. However, that is not the case here. The failure in the report Jamo attached is caused by a missing table=21 rule for the VMs' private IPs. This looks really similar to |
| Comment by Jamo Luhrsen [ 23/Mar/17 ] |
| Comment by Koby Aizer [ 27/Mar/17 ] |
|
Root cause for this bug is |
| Comment by Sam Hague [ 03/Apr/17 ] |
| Comment by Vivekanandan Narasimhan [ 04/Apr/17 ] |
|
I am duping this bug to 8082, as both 7968 and 8082 are root-caused to missing flows in Table 19 for internal router interface MAC addresses. From my analysis of [302] below (equated to [0] from Jamo), I could see that the root cause for both 7968 and the sub-issue with 7939 is improper flows in L3-GW-MAC-TABLE (i.e., Table 19), which leaves it unable to send packets down to Table 21. In both failed scenarios, (a) 7968, ping from VMInstance to Floating-IP (see [301]), and (b) 7939, here referring to the Add Multiple extra-route TC failure, Table 81 is always correctly populated with the right MAC to respond to on ARP (and is cleaned up too), but Table 19 never has the matching router-internal-interface MAC flows, which results in traffic failures. If you are able to squeeze in some time, please see why L3-GW-MAC-Table is not able to program the router-internal-intf flows and sometimes does not remove old flows. |
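
The checks described in the comments above (a missing table=21 rule for a VM's private IP, and missing router-interface MAC flows in Table 19) could be scripted against a saved flow dump. The following is only a minimal sketch, not part of the ticket or of CSIT. It assumes the dump is text in the usual `ovs-ofctl dump-flows` layout, that Table 19 flows match on `dl_dst` and Table 21 flows match on `nw_dst`, and that the expected MACs and IPs are supplied by the caller. The function names and the sample flows are hypothetical.

```python
import re

# Table numbers as cited in the ticket comments.
L3_GW_MAC_TABLE = 19
L3_FIB_TABLE = 21


def values_in_table(dump_text, table, field):
    """Collect the values of one match field (e.g. dl_dst) seen in a table."""
    field_re = re.compile(r"\b%s=([^\s,]+)" % re.escape(field))
    found = set()
    for line in dump_text.splitlines():
        table_match = re.search(r"\btable=(\d+)", line)
        if table_match is None or int(table_match.group(1)) != table:
            continue
        # Look only at the match part of the flow, not at its actions.
        match_part = line.split(" actions=", 1)[0]
        found.update(value.lower() for value in field_re.findall(match_part))
    return found


def missing_flows(dump_text, table, field, expected):
    """Return the expected values that have no flow in the given table."""
    present = values_in_table(dump_text, table, field)
    return sorted(v.lower() for v in expected if v.lower() not in present)


if __name__ == "__main__":
    # Illustrative flows only; real dumps come from the CSIT job artifacts.
    sample = (
        " cookie=0x1, duration=10.0s, table=19, n_packets=0, n_bytes=0, "
        "priority=20,metadata=0x30d40/0xfffffe,dl_dst=fa:16:3e:00:00:01 "
        "actions=goto_table:21\n"
        " cookie=0x2, duration=10.0s, table=21, n_packets=0, n_bytes=0, "
        "priority=42,ip,metadata=0x30d40/0xfffffe,nw_dst=10.0.0.5 "
        "actions=group:150001\n"
    )
    router_macs = ["fa:16:3e:00:00:01", "fa:16:3e:00:00:02"]
    print(missing_flows(sample, L3_GW_MAC_TABLE, "dl_dst", router_macs))
    print(missing_flows(sample, L3_FIB_TABLE, "nw_dst", ["10.0.0.5"]))
```

A non-empty result for Table 19 would correspond to the symptom described in the last comment: the router-internal-interface MAC has no flow, so packets never reach Table 21.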