[NETVIRT-659] TCP communication is working, even remove the TCP SG rule from the VM instance. Created: 08/May/17  Updated: 03/May/18  Resolved: 05/Apr/18

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Carbon
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Hari Prasidh Assignee: Vinoth B
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 8400

 Description   

Setup Used:

1. 3 node ODL cluster + HA proxy
2. new netvirt (carbon)
3. Packstack with ocata.

Steps followed for testing:

Scenario 1:
------------
1.Created Network.
2. Created 2 VM's
3.Created SG1 with TCP ALL rule.
4. Created SG2 without any rule.
5.Applied SG1 to both the VM's.
6.Observed TCP communication is working for Ingress and Egress as well.
7.Removed SG1 rule and Applied SG2 to both Vm's during TCP communication.

Observations:
---------------
1) Observed TCP communication still working in same session for Ingress and Egress as well.

2) Observed TCP communication is does not working with a new session for Ingress and Egress as well.

Scenario 2:
-----------
1.Created Network.
2. Created 2 VM's
3.Created SG1 with TCP ALL rule.
4.Applied SG1 to both the VM's.
5.Observed TCP communication is working for Ingress and Egress as well.
6.Removed TCP ALL rule from both the VM's during TCP communication.

Observations:
---------------
1) Observed TCP communication still working in same session for Ingress and Egress as well.

2) Observed TCP communication is does not working with a new session for Ingress and Egress as well.



 Comments   
Comment by Arthi Bhattacharjee [ 26/May/17 ]

While evaluating the bug, we have found an another issue.
Ref: https://bugs.opendaylight.org/show_bug.cgi?id=8553

Comment by Hari Prasidh [ 30/May/17 ]

Note:

I've tested the same scenario with Any SG rule (TCP/ICMP/UDP) and observed the issue.

Comment by Aswin Suryanarayanan [ 30/May/17 ]

This is since conntrack entries are not deleted when a SG rule is deleted. We need to find a way to prevent the traffic corresponding the deleted rules not hitting +est. One way to workaround this is add a higher priority drop rule to match the traffic corresponding to deleted rule with a timeout.

Comment by Vinoth B [ 15/Jun/17 ]

Observation - 1:

Created 2 VMs with ICMP ALL rule.. When ping operation begin between 2 VMs , the below conntrack entry has been created,

[1] [root@ocata-compute1 ~]# conntrack -L | grep icmp
conntrack v1.4.3 (conntrack-tools): 38 flow entries have been shown.
icmp 1 19 src=20.0.0.5 dst=20.0.0.12 type=8 code=0 id=31489 src=20.0.0.12 dst=20.0.0.5 type=0 code=0 id=31489 mark=0 zone=6002 use=1

This entry will be available until we stopped the ping communication.

When we remove the ICMP rule from the SG, Ongoing communication is not getting disturbed because the respective conntrack entry [1] is still exists and make the communication to be successful.

Observation - 2:

Executed the same test case in pure openstack and observered that ongoing communication has been stopped and respective conntrack rule also removed, when we remove the rule.

Comment by Aswin Suryanarayanan [ 15/Jun/17 ]

As mentioned earlier we could add a higher priority rule to drop the traffic. The delete action will overwrite the commit flow with a drop flow of higher priority with a timeout. The value for timeout for each protocol can be selected from [1].

[1]https://github.com/opendaylight/netvirt/blob/master/vpnservice/aclservice/impl/src/main/yang/aclservice-config.yang

Comment by Vinoth B [ 19/Jun/17 ]

Observation - 3:

Setup : pure openstack with OVS firewall,

1. Created 1 VM with ICMP ALL ingress rule and it installs the below specified flows in the flow table.

[1] cookie=0xa99821b3245d1aa6, duration=95.226s, table=82, n_packets=23, n_bytes=2254, reset_counts priority=70,ct_state=+est-rel-rpl,icmp,reg5=0x4,dl_dst=fa:16:3e:7c:07:75 actions=pop_vlan,output:4

[2] cookie=0xa99821b3245d1aa6, duration=95.225s, table=82, n_packets=1, n_bytes=98, reset_counts priority=70,ct_state=+new-est,icmp,reg5=0x4,dl_dst=fa:16:3e:7c:07:75 actions=ct(commit,zone=NXM_NX_REG6[0..15]),pop_vlan,output:4

[1] Kernal conntrack entry flow
[2] SG flow

2. Pinged from DHCP to VM and observed that the first packet is going through the SG flow and further packets are hitting the kernal conntrack entry flow.

3. When we delete the ICMP ingress rule, the on going communication has been stopped because the respective conntrack kernal entry flow is also getting deleted along with the SG flow.

But in ODL, since we have only one global Kernal conntrack entry flow for all the VMs, The ongoing communication is still succeeded even though we removed the SG rule from VM.

Comment by Vinoth B [ 20/Jun/17 ]

(In reply to Vinoth B from comment #6)
> Observation - 3:
>
> Setup : pure openstack with OVS firewall,
>
> 1. Created 1 VM with ICMP ALL ingress rule and it installs the below
> specified flows in the flow table.
>
> [1] cookie=0xa99821b3245d1aa6, duration=95.226s, table=82,
> n_packets=23, n_bytes=2254, reset_counts
> priority=70,ct_state=+est-rel-rpl,icmp,reg5=0x4,dl_dst=fa:16:3e:7c:07:75
> actions=pop_vlan,output:4
>
> [2] cookie=0xa99821b3245d1aa6, duration=95.225s, table=82,
> n_packets=1, n_bytes=98, reset_counts
> priority=70,ct_state=+new-est,icmp,reg5=0x4,dl_dst=fa:16:3e:7c:07:75
> actions=ct(commit,zone=NXM_NX_REG6[0..15]),pop_vlan,output:4
>
> [1] Kernal conntrack entry flow
> [2] SG flow
>
> 2. Pinged from DHCP to VM and observed that the first packet is going
> through the SG flow and further packets are hitting the kernal conntrack
> entry flow.
>
> 3. When we delete the ICMP ingress rule, the on going communication has been
> stopped because the respective conntrack kernal entry flow is also getting
> deleted along with the SG flow.
>
> But in ODL, since we have only one global Kernal conntrack entry flow for
> all the VMs, The ongoing communication is still succeeded even though we
> removed the SG rule from VM.

Shall i proceed the way how OVS firewall handles this issue? I mean adding the conntrack-Kernel entry per rule instead of common for all the VMs.

Comment by Aswin Suryanarayanan [ 20/Jun/17 ]

Adding a +est rule per vm defeat the purpose of using connection tracking in my opinion. An established packet need to traverse through all the rules which is added per vm before we have a hit.

I would suggest adding a drop rule on deletion would be better solution. A similar proposal implemented in OVN[1],[2].

[1]https://github.com/openvswitch/ovs/commit/cc58e1f2cc414b844c36d21a9a39e4f3383e75e4
[2]https://mail.openvswitch.org/pipermail/ovs-dev/2016-February/065716.html

Comment by Vinoth B [ 23/Jun/17 ]

Aswin,

The DROP flow fix is fine.

Fix Explanation :

When we delete a specific SG rule from the VM, netvirt should add the high priority drop rule with the exact match condition with idle timeout. This drop flow will stops the existing communication.

Similarly, When we add a same rule again, netvirt should remove the added drop flow to make the existing stopped communication to be resumed.

We have to look on below problem before starting the implementation.
Consider we are associating two ALL ICMP Ingress rule to one VM. So, we could see two ICMP ingress flow with different priority for the same VM. When we delete one ICMP ingress rule from the VM, then we will add one drop rule as per the fix above, which make the existing communication to break irrespective of exact ICMP ingress flow still exists in the flow table.
So before adding the drop rule, we have to check whether the VM associated with similar rule or not.

Apart from this, this approach is fine. Shall | start the implementation for this fix?

Comment by Vinoth B [ 10/Jul/17 ]

I have pushed a draft patch to fix this issue.

Netvirt:
https://git.opendaylight.org/gerrit/#/c/60077/

Fix provided:

Added a DROP rule with idle_timeout(5 secs) when we delete or remove the SG from the VM. If we add the same rule again to the VM, Netvirt will delete the added DROP rule and make the existing communication to be resume.

Limitations :

1) When we add the two similar rules to the VMs (SG1 with ALL ICMP ingress, SG2 with ALL ICMP ingress), there are two ALL ICMP ingress flow will be exists in the flow table with different priority. while removing SG1 from VM, as per the above fix, the DROP rule will be added which stops the existing ICMP communication irrespective of there is another similar rule (SG2) is still exist in the VM.

2) After establishing SSH communication between VMs, TCP doesn't send packets continuously. So the added DROP flow is getting removed from the flow table after reaching the idle timeout 5 secs. Observed the same behavior also for SCP communication

Comment by Aswin Suryanarayanan [ 10/Jul/17 ]

The draft is not accessible. Please publish the patch or add reviewers.

Comment by Vinoth B [ 09/Aug/17 ]

Since the fix having some limitation, I am working on the new approach which will resolve all the limitation.

The new approach involves pipeline changes, So I have pushed the spec patch.

https://git.opendaylight.org/gerrit/#/c/60580/

Comment by Sam Hague [ 05/Apr/18 ]

https://git.opendaylight.org/gerrit/66788

Generated at Wed Feb 07 20:22:08 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.