[NETVIRT-785] CSIT Sporadic failures - SNAT VM without connectivity to external gateway Created: 12/Jul/17  Updated: 15/Dec/17  Resolved: 27/Sep/17

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Carbon
Fix Version/s: None

Type: Bug
Reporter: Jamo Luhrsen Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 8850

 Description   

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-carbon/813/log.html.gz#s1-s1-s3-t19

Previously we had bugs that failed only the TCP and/or UDP cases, but
this one seems related to a single VM without connectivity.



 Comments   
Comment by Chetan Arakere Gowdru [ 13/Jul/17 ]

Hi All,

Based on the preliminary investigation, the remote flow for reaching back to 90.0.0.6 (which is on computeNode-2) is missing on the Control-Node (NAPT switch). As a result, the packet is lost after SNAT reverse translation.

Compute2 flows (the flows to reach the NAPT switch are installed properly)

cookie=0x8000004, duration=131.778s, table=21, n_packets=9, n_bytes=810, priority=10,ip,metadata=0x30d4e/0xfffffe actions=goto_table:26
cookie=0x8000006, duration=131.658s, table=26, n_packets=3, n_bytes=222, priority=5,ip,metadata=0x30d4e/0xfffffe actions=set_field:0x11170->tun_id,group:225003

group_id=210003,type=all,bucket=actions=resubmit(,55),set_field:0xf->tun_id,bucket=actions=resubmit(,55),set_field:0x3->tun_id

Control-Node (SNAT translation happened, but because the remote flows to reach 90.0.0.6 are missing, packets are not sent back to the non-NAPT switch)

cookie=0x8000005, duration=128.923s, table=36, n_packets=14, n_bytes=971, priority=10,ip,tun_id=0x11170 actions=write_metadata:0x30d4e/0xfffffe,goto_table:46

cookie=0x81296a7, duration=9.444s, table=46, n_packets=3, n_bytes=222, idle_timeout=300, send_flow_rem priority=10,tcp,metadata=0x30d4e/0xfffffe,nw_src=90.0.0.6,tp_src=54966 actions=set_field:10.10.10.9->ip_src,set_field:49154->tcp_src,set_field:fa:16:3e:d7:82:77->eth_src,write_metadata:0x30d48/0xffffff,goto_table:47

cookie=0x81296a7, duration=9.482s, table=44, n_packets=5, n_bytes=370, send_flow_rem priority=10,tcp,nw_dst=10.10.10.9,tp_dst=49154 actions=set_field:90.0.0.6->ip_dst,set_field:54966->tcp_dst,write_metadata:0x30d4e/0xfffffe,goto_table:47

Since the UDP traffic is initiated from the same VM, it also fails.
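
To make the failure path concrete, here is a minimal sketch in Python of the return-traffic handling described above: reverse NAPT translation first, then a FIB lookup on the translated destination. The function name `deliver_return_packet` and the set-based FIB model are hypothetical simplifications, not NetVirt code; the addresses and ports come from the table=46/44 flows above.

```python
# Reverse NAPT session from the table=44 flow above:
# (external ip, external port) -> (internal ip, internal port)
REVERSE_NAPT = {("10.10.10.9", 49154): ("90.0.0.6", 54966)}

# Hypothetical model of the remote FIB destinations present on the NAPT switch.
# 90.0.0.6 is deliberately absent, which is the reported failure condition.
FIB = {"90.0.0.8", "90.0.0.12", "90.0.0.2"}


def deliver_return_packet(dst_ip, dst_port, reverse_napt, fib):
    """Model a return packet: reverse-translate, then look up the FIB."""
    translated = reverse_napt.get((dst_ip, dst_port))
    if translated is None:
        return "dropped: no NAPT session"
    internal_ip, internal_port = translated
    if internal_ip not in fib:
        # Translation succeeded, but there is no flow to forward the packet.
        return f"dropped: no FIB entry for {internal_ip}"
    return f"forwarded to {internal_ip}:{internal_port}"


if __name__ == "__main__":
    print(deliver_return_packet("10.10.10.9", 49154, REVERSE_NAPT, FIB))
```

Because TCP and UDP sessions from the same VM both translate back to 90.0.0.6, both would hit the same missing entry in this model.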

The VPN does exist on the Control-node (flows for other VMs belonging to the same network are installed):

cookie=0x8000003, duration=159.424s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.8 actions=set_field:0x60->tun_id,set_field:fa:16:3e:db:5d:c7->eth_dst,load:0x1100->NXM_NX_REG6[],resubmit(,220)
cookie=0x8000003, duration=159.424s, table=21, n_packets=7, n_bytes=480, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.12 actions=set_field:0x60->tun_id,set_field:fa:16:3e:d5:92:12->eth_dst,load:0x1100->NXM_NX_REG6[],resubmit(,220)
cookie=0x8000003, duration=159.011s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.2 actions=group:150006
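
A check like this could be scripted against a saved flow dump. The following is a minimal sketch in Python, assuming the dump is plain text in the same format as the flows pasted above; the helper names `parse_flow` and `has_fib_entry` are hypothetical, and the match is an exact string comparison on `nw_dst` (no prefix handling).

```python
import re

# The three table=21 flows from the Control-node, copied from this comment.
SAMPLE_DUMP = """
cookie=0x8000003, duration=159.424s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.8 actions=set_field:0x60->tun_id,set_field:fa:16:3e:db:5d:c7->eth_dst,load:0x1100->NXM_NX_REG6[],resubmit(,220)
cookie=0x8000003, duration=159.424s, table=21, n_packets=7, n_bytes=480, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.12 actions=set_field:0x60->tun_id,set_field:fa:16:3e:d5:92:12->eth_dst,load:0x1100->NXM_NX_REG6[],resubmit(,220)
cookie=0x8000003, duration=159.011s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d4e/0xfffffe,nw_dst=90.0.0.2 actions=group:150006
"""


def parse_flow(line):
    """Return the key=value fields that appear before ' actions=' in one flow line."""
    match_part = line.split(" actions=", 1)[0]
    fields = {}
    for token in re.split(r"[,\s]+", match_part.strip()):
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


def has_fib_entry(dump, dst_ip, table="21"):
    """True if any flow in the given table matches nw_dst == dst_ip exactly."""
    for line in dump.splitlines():
        if not line.strip():
            continue
        fields = parse_flow(line)
        if fields.get("table") == table and fields.get("nw_dst") == dst_ip:
            return True
    return False


if __name__ == "__main__":
    for ip in ("90.0.0.8", "90.0.0.6"):
        print(ip, "present" if has_fib_entry(SAMPLE_DUMP, ip) else "MISSING")
```

Run against the three flows above, this reports 90.0.0.8 as present and 90.0.0.6 as missing, which matches the observation in this comment.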

Thanks,
Chetan

Comment by Chetan Arakere Gowdru [ 13/Jul/17 ]

Issue observed once in the last 30 CSIT runs.

Comment by Jamo Luhrsen [ 18/Jul/17 ]

another failure:

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-ocata-upstream-learn-carbon/68/log.html.gz#s1-s1-s3

Comment by Jamo Luhrsen [ 25/Jul/17 ]

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-upstream-stateful-snat-conntrack-carbon/121/log.html.gz#s1-s1-s3

Comment by Chetan Arakere Gowdru [ 26/Jul/17 ]

For controller-based SNAT, this use case fails because the remote FIB entries needed to reach the VM's internal IP after reverse SNAT translation are missing.

(The flow to reach 90.0.0.6 is missing on the Control-node (NAPT switch).)
https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-newton-nodl-v2-upstream-stateful-carbon/813/log.html.gz#s1-s1-s3-t19

(The flow for 90.0.0.11 is missing on Compute-2 (NAPT switch); it is needed to reach this VM, which is on Compute-1.)
https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-1node-openstack-ocata-upstream-learn-carbon/68/log.html.gz#s1-s1-s3

@Aswin,

With conntrack-based SNAT, I don't see this missing-flow issue.

Compute-1
cookie=0x8000006, duration=160.191s, table=44, n_packets=5, n_bytes=348, priority=10,ip,metadata=0x30d48/0xfffffe,nw_dst=10.10.10.4 actions=set_field:0x30d54->metadata,ct(table=47,zone=5003,nat)

cookie=0x8000006, duration=160.196s, table=46, n_packets=3, n_bytes=198, priority=6,ct_state=+snat,ip,metadata=0x30d54/0xfffffe actions=set_field:0x30d48->metadata,set_field:fa:16:3e:71:81:b6->eth_src,resubmit(,47)

cookie=0x8000006, duration=160.196s, table=46, n_packets=2, n_bytes=133, priority=5,ct_state=+new+trk,ip,metadata=0x30d54/0xfffffe actions=set_field:0x30d48->metadata,set_field:fa:16:3e:71:81:b6->eth_src,ct(commit,table=47,zone=5003,nat(src=10.10.10.4))

cookie=0x8000006, duration=160.196s, table=47, n_packets=5, n_bytes=331, priority=6,ip,metadata=0x30d48/0xfffffe actions=load:0->NXM_OF_IN_PORT[],resubmit(,21)

cookie=0x8000006, duration=160.191s, table=47, n_packets=5, n_bytes=348, priority=5,ct_state=+dnat,ip actions=resubmit(,21)

Even though INBOUND_NAPT_TABLE is hit, the L3_FIB table flow below is not hit in this case.

cookie=0x8000003, duration=168.902s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d54/0xfffffe,nw_dst=90.0.0.5 actions=set_field:0x59->tun_id,set_field:fa:16:3e:ab:74:7f->eth_dst,load:0xe00->NXM_NX_REG6[],resubmit(,220)
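
Flows that are installed but never hit can be picked out of a dump by their packet counters. A minimal sketch in Python, assuming the `table=N, n_packets=M` layout of the flows pasted above; `unhit_flows` is a hypothetical helper name.

```python
import re

# The two table=47 flows and the table=21 flow from Compute-1, copied from this comment.
SAMPLE_DUMP = """
cookie=0x8000006, duration=160.196s, table=47, n_packets=5, n_bytes=331, priority=6,ip,metadata=0x30d48/0xfffffe actions=load:0->NXM_OF_IN_PORT[],resubmit(,21)
cookie=0x8000006, duration=160.191s, table=47, n_packets=5, n_bytes=348, priority=5,ct_state=+dnat,ip actions=resubmit(,21)
cookie=0x8000003, duration=168.902s, table=21, n_packets=0, n_bytes=0, priority=42,ip,metadata=0x30d54/0xfffffe,nw_dst=90.0.0.5 actions=set_field:0x59->tun_id,set_field:fa:16:3e:ab:74:7f->eth_dst,load:0xe00->NXM_NX_REG6[],resubmit(,220)
"""

# Assumes the table and packet counter appear adjacently, as in the dumps above.
FLOW_RE = re.compile(r"table=(\d+), n_packets=(\d+)")


def unhit_flows(dump, table):
    """Return the flow lines in the given table whose n_packets counter is 0."""
    result = []
    for line in dump.splitlines():
        m = FLOW_RE.search(line)
        if m and int(m.group(1)) == table and int(m.group(2)) == 0:
            result.append(line.strip())
    return result


if __name__ == "__main__":
    for flow in unhit_flows(SAMPLE_DUMP, 21):
        print("never hit:", flow)
```

On the flows above, table 47 has no zero-counter flows while the table 21 entry for 90.0.0.5 is reported as never hit.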

Comment by Jamo Luhrsen [ 27/Sep/17 ]

no longer seen in CSIT

Generated at Wed Feb 07 20:22:27 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.