[NETVIRT-785] CSIT Sporadic failures - SNAT VM without connectivity to external gateway Created: 12/Jul/17 Updated: 15/Dec/17 Resolved: 27/Sep/17 |
|
| Status: | Resolved |
| Project: | netvirt |
| Component/s: | General |
| Affects Version/s: | Carbon |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Jamo Luhrsen | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| External issue ID: | 8850 |
| Description |
|
before we had bugs that were failing just the TCP and/or UDP cases, but |
| Comments |
| Comment by Chetan Arakere Gowdru [ 13/Jul/17 ] |
|
Hi All,
Based on the preliminary investigation, I see that the remote flow for reaching back to 90.0.0.6 (which is on computeNode-2) is missing on the Control-Node (NAPT switch). As a result, the packet is lost after SNAT reverse translation.
Compute2 flows (the flows to reach the NAPT switch have been installed properly):
cookie=0x8000004, duration=131.778s, table=21, n_packets=9, n_bytes=810, priority=10,ip,metadata=0x30d4e/0xfffffe actions=goto_table:26
group_id=210003,type=all,bucket=actions=resubmit(,55),set_field:0xf->tun_id,bucket=actions=resubmit(,55),set_field:0x3->tun_id
Control-Node (SNAT translation happened, but because the remote flows to reach 90.0.0.6 are missing, packets are not sent back to the non-NAPT switch):
cookie=0x8000005, duration=128.923s, table=36, n_packets=14, n_bytes=971, priority=10,ip,tun_id=0x11170 actions=write_metadata:0x30d4e/0xfffffe,goto_table:46
cookie=0x81296a7, duration=9.444s, table=46, n_packets=3, n_bytes=222, idle_timeout=300, send_flow_rem priority=10,tcp,metadata=0x30d4e/0xfffffe,nw_src=90.0.0.6,tp_src=54966 actions=set_field:10.10.10.9->ip_src,set_field:49154->tcp_src,set_field:fa:16:3e:d7:82:77->eth_src,write_metadata:0x30d48/0xffffff,goto_table:47
cookie=0x81296a7, duration=9.482s, table=44, n_packets=5, n_bytes=370, send_flow_rem priority=10,tcp,nw_dst=10.10.10.9,tp_dst=49154 actions=set_field:90.0.0.6->ip_dst,set_field:54966->tcp_dst,write_metadata:0x30d4e/0xfffffe,goto_table:47
Since the UDP traffic is initiated from the same VM, it fails as well.
The VPN does exist on the Control-Node (flows for VMs belonging to the same network are installed):
cookie=0x8000003, duration=159.424s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.8 actions=set_field:0x60->tun_id,set_field:fa:16:3e:db:5d:c7->eth_dst,load:0x1100->NXM_NX_REG6[],resubmit(,220)
cookie=0x8000003, duration=159.424s, table=21, n_packets=7, n_bytes=480, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.12 actions=set_field:0x60->tun_id,set_field:fa:16:3e:d5:92:12->eth_dst,load:0x1100->NXM_NX_REG6[],resubmit(,220)
cookie=0x8000003, duration=159.011s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.2 actions=group:150006
Thanks, |
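The check described in the comment above (is there a table 21 flow toward a given VM IP for this VPN's metadata?) can be sketched as a small script. This is only an illustration, assuming the flow dump is available as text in the format quoted in this ticket; the function names `parse_flow` and `has_fib_route` are hypothetical and not part of netvirt or Open vSwitch.

```python
import re

def parse_flow(line):
    """Split one flow-dump line into a dict of match fields and its actions string."""
    match_part, _, actions = line.partition(" actions=")
    fields = {}
    for token in re.split(r"[,\s]+", match_part.strip()):
        if not token:
            continue
        key, sep, value = token.partition("=")
        # Bare tokens such as "ip" or "tcp" carry no value; store them as flags.
        fields[key] = value if sep else True
    return fields, actions

def has_fib_route(flow_lines, vpn_metadata, dst_ip, fib_table="21"):
    """Return True if a flow in fib_table matches both the VPN metadata and dst_ip."""
    for line in flow_lines:
        fields, _ = parse_flow(line)
        if (fields.get("table") == fib_table
                and fields.get("metadata") == vpn_metadata
                and fields.get("nw_dst") == dst_ip):
            return True
    return False

# Table 21 flows quoted from the Control-Node in this ticket.
CONTROL_NODE_FLOWS = [
    "cookie=0x8000003, duration=159.424s, table=21, n_packets=0, n_bytes=0, "
    "priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.8 "
    "actions=set_field:0x60->tun_id,set_field:fa:16:3e:db:5d:c7->eth_dst,"
    "load:0x1100->NXM_NX_REG6[],resubmit(,220)",
    "cookie=0x8000003, duration=159.424s, table=21, n_packets=7, n_bytes=480, "
    "priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.12 "
    "actions=set_field:0x60->tun_id,set_field:fa:16:3e:d5:92:12->eth_dst,"
    "load:0x1100->NXM_NX_REG6[],resubmit(,220)",
    "cookie=0x8000003, duration=159.011s, table=21, n_packets=0, n_bytes=0, "
    "priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.2 "
    "actions=group:150006",
]

if __name__ == "__main__":
    for ip in ("90.0.0.12", "90.0.0.6"):
        found = has_fib_route(CONTROL_NODE_FLOWS, "0x30d4e/0xfffffe", ip)
        print(ip, "present" if found else "MISSING")
```

Against the flows quoted here, the route for 90.0.0.12 is found while the one for 90.0.0.6 is not, which matches the observation in the comment. The comparison is an exact string match on `metadata` and `nw_dst`, so prefixes such as `nw_dst=90.0.0.0/24` would need extra handling.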
| Comment by Chetan Arakere Gowdru [ 13/Jul/17 ] |
|
Issue observed once in the last 30 CSIT runs |
| Comment by Jamo Luhrsen [ 18/Jul/17 ] |
|
another failure: |
| Comment by Jamo Luhrsen [ 25/Jul/17 ] |
| Comment by Chetan Arakere Gowdru [ 26/Jul/17 ] |
|
For controller-based SNAT, this use case fails because the remote FIB entries needed to reach the VM's internal IP after reverse SNAT translation are missing.
(flow missing to reach 90.0.0.6 on the Control-Node (NAPT switch))
(flow missing for 90.0.0.11 on Compute-2 (NAPT switch) to reach this VM, which is on Compute-1)
@Aswin, with conntrack-based SNAT, I don't see this missing-flow issue.
Compute-1
cookie=0x8000006, duration=160.196s, table=46, n_packets=3, n_bytes=198, priority=6,ct_state=+snat,ip,metadata=0x30d54/0xfffffe actions=set_field:0x30d48->metadata,set_field:fa:16:3e:71:81:b6->eth_src,resubmit(,47)
cookie=0x8000006, duration=160.196s, table=46, n_packets=2, n_bytes=133, priority=5,ct_state=+new+trk,ip,metadata=0x30d54/0xfffffe actions=set_field:0x30d48->metadata,set_field:fa:16:3e:71:81:b6->eth_src,ct(commit,table=47,zone=5003,nat(src=10.10.10.4))
cookie=0x8000006, duration=160.196s, table=47, n_packets=5, n_bytes=331, priority=6,ip,metadata=0x30d48/0xfffffe actions=load:0->NXM_OF_IN_PORT[],resubmit(,21)
cookie=0x8000006, duration=160.191s, table=47, n_packets=5, n_bytes=348, priority=5,ct_state=+dnat,ip actions=resubmit(,21)
Even though INBOUND_NAPT_TABLE is hit, the L3_FIB table flow below is not hit in this case.
cookie=0x8000003, duration=168.902s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d54/0xfffffe,nw_dst=90.0.0.5 actions=set_field:0x59->tun_id,set_field:fa:16:3e:ab:74:7f->eth_dst,load:0xe00->NXM_NX_REG6[],resubmit(,220) |
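The "table is not hit" observation above rests on the `n_packets=0` counter of the table 21 flow. A minimal sketch of that check follows, assuming the flow dump is available as text in the format quoted in this ticket; `flow_fields` and `unhit_destinations` are hypothetical helper names, not netvirt or Open vSwitch APIs.

```python
import re

def flow_fields(line):
    """Return the match-side fields of one flow-dump line as a dict."""
    match_part = line.partition(" actions=")[0]
    fields = {}
    for token in re.split(r"[,\s]+", match_part.strip()):
        if token:
            key, sep, value = token.partition("=")
            # Bare tokens such as "ip" carry no value; store them as flags.
            fields[key] = value if sep else True
    return fields

def unhit_destinations(flow_lines, table="21"):
    """List nw_dst values of flows in `table` whose packet counter is still zero."""
    result = []
    for line in flow_lines:
        fields = flow_fields(line)
        if (fields.get("table") == table
                and fields.get("n_packets") == "0"
                and "nw_dst" in fields):
            result.append(fields["nw_dst"])
    return result

# Two of the Compute-1 flows quoted in this ticket.
COMPUTE1_FLOWS = [
    "cookie=0x8000006, duration=160.196s, table=47, n_packets=5, n_bytes=331, "
    "priority=6,ip,metadata=0x30d48/0xfffffe "
    "actions=load:0->NXM_OF_IN_PORT[],resubmit(,21)",
    "cookie=0x8000003, duration=168.902s, table=21, n_packets=0, n_bytes=0, "
    "priority=42,ip,metadata=0x30d54/0xfffffe,nw_dst=90.0.0.5 "
    "actions=set_field:0x59->tun_id,set_field:fa:16:3e:ab:74:7f->eth_dst,"
    "load:0xe00->NXM_NX_REG6[],resubmit(,220)",
]

if __name__ == "__main__":
    print(unhit_destinations(COMPUTE1_FLOWS))
```

A zero counter only shows that no packet has matched the flow so far; it does not by itself say why, so it is a starting point for tracing rather than a diagnosis.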
| Comment by Jamo Luhrsen [ 27/Sep/17 ] |
|
no longer seen in CSIT |