[NETVIRT-1311] Bundle based resync fails after the OVS restart in L2 network Created: 13/Jun/18  Updated: 22/Jan/20  Resolved: 22/Jan/20

Status: Resolved
Project: netvirt
Component/s: natservice
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Fathima Thasneem Assignee: Karthikeyan Krishnan
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File BundleCommitError.PNG     Text File Config Dump.txt    

 Description   

Issue: Bundle based resync fails after the OVS restart in L2 network 

Noticed in: A run of the entire Netvirt CSIT suite with the bundle resync flag enabled

 

The bundle commit failed with the OFPT_ERROR message below.

Type: OFPET_BAD_ACTION (2)


Code: OFPBAC_BAD_OUT_GROUP (9)
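
For readers who want to cross-check the type/code pair against a raw capture, here is a minimal sketch that decodes the fixed 12-byte prefix of an OFPT_ERROR message. It assumes OpenFlow 1.3 numbering (error type 2 is OFPET_BAD_ACTION, and the codes follow the ofp_bad_action_code enum). The function name decode_bad_action_error and the sample bytes are hypothetical and are not taken from the attached capture.

```python
import struct
from enum import IntEnum

OFPT_ERROR = 1        # OpenFlow message type for error messages
OFPET_BAD_ACTION = 2  # error type quoted in this report


class BadActionCode(IntEnum):
    """ofp_bad_action_code values, assuming OpenFlow 1.3 numbering."""
    OFPBAC_BAD_TYPE = 0
    OFPBAC_BAD_LEN = 1
    OFPBAC_BAD_EXPERIMENTER = 2
    OFPBAC_BAD_EXP_TYPE = 3
    OFPBAC_BAD_OUT_PORT = 4
    OFPBAC_BAD_ARGUMENT = 5
    OFPBAC_EPERM = 6
    OFPBAC_TOO_MANY = 7
    OFPBAC_BAD_QUEUE = 8
    OFPBAC_BAD_OUT_GROUP = 9
    OFPBAC_MATCH_INCONSISTENT = 10
    OFPBAC_UNSUPPORTED_ORDER = 11
    OFPBAC_BAD_TAG = 12
    OFPBAC_BAD_SET_TYPE = 13
    OFPBAC_BAD_SET_LEN = 14
    OFPBAC_BAD_SET_ARGUMENT = 15


def decode_bad_action_error(message: bytes) -> str:
    """Return the symbolic code name of an OFPET_BAD_ACTION error message."""
    # Header: version(1) type(1) length(2) xid(4), then error type(2) and code(2).
    _version, msg_type, _length, _xid, err_type, err_code = struct.unpack_from(
        "!BBHIHH", message
    )
    if msg_type != OFPT_ERROR:
        raise ValueError("not an OFPT_ERROR message")
    if err_type != OFPET_BAD_ACTION:
        raise ValueError(f"unexpected error type {err_type}")
    return BadActionCode(err_code).name


# Hypothetical sample: OpenFlow 1.3 header followed by type 2, code 9.
sample = struct.pack("!BBHIHH", 0x04, OFPT_ERROR, 12, 7, 2, 9)
print(decode_bad_action_error(sample))
```

With the type and code reported here, the decoder yields OFPBAC_BAD_OUT_GROUP, which matches an action pointing at a group the switch does not know.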

 

Failed for the reason:

"Flows present in the config inventory point to groups that don't exist"

We narrowed this down to the SNAT flow pasted below, where "group-id": 225005 does not exist at all.

The config dump and the Wireshark capture of the OFPT_ERROR message are attached.
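
A check like the one described above (flows in the config inventory that point to groups which do not exist) could be sketched as follows. This is only an illustration: it assumes each inventory node in the config dump is a JSON object with "flow-node-inventory:group" and "flow-node-inventory:table" lists, and that group references appear under a "group-action" key. The helper names and the sample node are hypothetical and are not taken from the attached dump.

```python
def referenced_group_ids(obj):
    """Yield every group-id found under a "group-action" key, at any depth."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == "group-action" and isinstance(value, dict) and "group-id" in value:
                yield int(value["group-id"])
            else:
                yield from referenced_group_ids(value)
    elif isinstance(obj, list):
        for item in obj:
            yield from referenced_group_ids(item)


def dangling_group_refs(node):
    """Return {flow id: [missing group ids]} for one inventory node."""
    # Assumed key names; adjust them to the actual layout of the dump.
    defined = {int(g["group-id"]) for g in node.get("flow-node-inventory:group", [])}
    dangling = {}
    for table in node.get("flow-node-inventory:table", []):
        for flow in table.get("flow", []):
            missing = sorted(set(referenced_group_ids(flow)) - defined)
            if missing:
                dangling[flow.get("id")] = missing
    return dangling


# Hypothetical node: one table 21 flow pointing at group 225005, which is not defined.
SAMPLE_NODE = {
    "id": "openflow:1",
    "flow-node-inventory:group": [{"group-id": 150000}],
    "flow-node-inventory:table": [
        {
            "id": 21,
            "flow": [
                {
                    "id": "snat-flow",
                    "instructions": {
                        "instruction": [
                            {"apply-actions": {"action": [
                                {"order": 0, "group-action": {"group-id": 225005}}
                            ]}}
                        ]
                    },
                }
            ],
        }
    ],
}

print(dangling_group_refs(SAMPLE_NODE))
```

Running such a scan over every node before the bundle is committed would list the flows likely to trigger the bad out-group error.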



 Comments   
Comment by daya kamath [ 13/Jun/18 ]

Chetan has indicated that this code path is observed in the VLAN provider network based external network access use case, and not in the BGPVPN based external network access use case.

Comment by Chetan Arakere Gowdru [ 14/Jun/18 ]

Ref CSIT - https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-queens-upstream-stateful-oxygen/533/robot-plugin/log_full.html.gz

Comment by Chetan Arakere Gowdru [ 14/Jun/18 ]

As per the analysis, when Tempest is run, a vpn-instance-vpn-id entry is created for each of the external subnets. This triggers the NAT ExternalSubnetVpnInstanceListener.add(), which installs the 21->SNATGroup(225005) flow on each DPN where this VPN is present.

After the Tempest clean-up, I see a stale vpn-instance-vpn-id entry for this external subnet. As a result, ExternalSubnetVpnInstanceListener.remove() is not triggered, which leaves these stale flows behind.
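
The sequence above can be illustrated with a toy model. This is not the netvirt implementation: the Dpn class and the on_add, on_remove and tempest_cleanup functions are hypothetical stand-ins for the listener callbacks, and the model assumes the SNAT group itself is removed by a separate clean-up path.

```python
SNAT_GROUP = 225005  # group id quoted in this report


class Dpn:
    """Toy switch state: known group ids and table 21 flows."""

    def __init__(self):
        self.groups = {SNAT_GROUP}  # assumption: the group exists before add()
        self.flows = {}             # flow name -> group id the flow points to


def flow_name(subnet):
    return f"21->snat-group:{subnet}"


def on_add(dpns, subnet):
    """Stand-in for the listener's add(): install the table 21 flow on each DPN."""
    for dpn in dpns:
        dpn.flows[flow_name(subnet)] = SNAT_GROUP


def on_remove(dpns, subnet):
    """Stand-in for the listener's remove(): delete the table 21 flow."""
    for dpn in dpns:
        dpn.flows.pop(flow_name(subnet), None)


def tempest_cleanup(dpns, subnet, entry_deleted):
    """Clean-up; remove() fires only if the vpn-instance-vpn-id entry is deleted."""
    for dpn in dpns:
        dpn.groups.discard(SNAT_GROUP)  # assumption: group removed by another path
    if entry_deleted:
        on_remove(dpns, subnet)


def stale_flows(dpn):
    """Flows whose target group no longer exists on this DPN."""
    return {name for name, gid in dpn.flows.items() if gid not in dpn.groups}


dpns = [Dpn(), Dpn()]
on_add(dpns, "ext-subnet-1")
tempest_cleanup(dpns, "ext-subnet-1", entry_deleted=False)  # stale entry case
print([sorted(stale_flows(d)) for d in dpns])
```

In this model, a clean-up that leaves the entry behind keeps one stale flow per DPN, while a clean-up that deletes the entry leaves none.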

Comment by Chetan Arakere Gowdru [ 17/Jun/18 ]

@Fathima,

I have raised a WIP review to address these stale entries (21->snat-group with the corresponding entry missing). Such stale entries are currently not seen with the CSIT run below.

https://git.opendaylight.org/gerrit/#/c/73061/

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-queens-gate-stateful-fluorine/629/robot-plugin/log_full.html.gz#s1-s9-t3-k14-k2-k1-k2-k38

Would it be possible to validate the bundle based resync use case with this patch set?

 

Comment by Fathima Thasneem [ 19/Jun/18 ]

Hi Chetan,

I ran the suite with your Gerrit patch picked and observed the issue.

https://jenkins.opendaylight.org/sandbox/job/fathimanetvirt-csit-1node-openstack-pike-upstream-stateful-fluorine/1/

Thanks,

A.Fathima Thasneem

Comment by Chetan Arakere Gowdru [ 19/Jun/18 ]

Hi Fathima, 

As discussed, I don't see that the patch changes were picked up in this CSIT run. Please re-verify.

Thanks,

Chetan

Comment by Chetan Arakere Gowdru [ 23/Jun/18 ]

WIP : https://git.opendaylight.org/gerrit/#/c/73061/

Comment by Chetan Arakere Gowdru [ 22/Jan/20 ]

The issue has been identified and the required fix has been merged.

https://git.opendaylight.org/gerrit/#/c/netvirt/+/73061/

Please verify whether this is still an open issue.

Generated at Wed Feb 07 20:23:45 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.