[OVSDB-146] race condition between northbound and southbound events Created: 23/Apr/15  Updated: 19/Oct/17  Resolved: 25/Jan/16

Status: Resolved
Project: ovsdb
Component/s: openstack.net-virt
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Cédric Ollivier Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Linux
Platform: PC


External issue ID: 3052

 Description   

When an interface is deleted, processRowUpdate (SouthboundHandler.java) tries to look up the tenant network corresponding to that interface. In the current implementation, however, the network is null if the corresponding Neutron port has already been deleted.

It appears to work when the port owner is compute:nova or network:dhcp (the reference setup): my logs show that processRowUpdate is always called before doNeutronPortDeleted (PortHandler.java) in those cases.

It does not work for router_interface ports owned by the l3-agent: my logs show that doNeutronPortDeleted is always called first. The network is then null in processRowUpdate, so handleInterfaceDelete does not remove the flows that should be deleted.

Judging from the ml2 plugin.py, we cannot assume that the l3-agent removes the interface before delete_port_postcommit (mechanism_odl.py) is called. More generally, we cannot assume that the interface is deleted before the HTTP DELETE request arrives.

One solution would be to keep the network data (e.g. the virtual tenant identifier) available to southbound delete events so that the interface delete can still be handled. More disruptive solutions are possible too.
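As a rough illustration of that idea, the sketch below keeps the network data for each port until the southbound delete event has consumed it, so the order of the two delete events no longer matters. It is not taken from the ovsdb code base: TenantNetworkCache, NetworkInfo and all method names are hypothetical, and it assumes the southbound handler can obtain the Neutron port ID from the deleted interface row (for example from external_ids:iface-id).

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: none of these names exist in the ovsdb project.
final class TenantNetworkCache {

    // Minimal stand-in for the network data needed to remove flows.
    static final class NetworkInfo {
        final String networkId;
        final String networkType;
        final String segmentationId; // the "virtual tenant identifier"

        NetworkInfo(String networkId, String networkType, String segmentationId) {
            this.networkId = networkId;
            this.networkType = networkType;
            this.segmentationId = segmentationId;
        }
    }

    private final Map<String, NetworkInfo> byPortId = new ConcurrentHashMap<>();
    private final Set<String> deletedNorthbound = ConcurrentHashMap.newKeySet();

    // Northbound: record the network data while the Neutron port still exists.
    void remember(String portId, NetworkInfo info) {
        byPortId.put(portId, info);
    }

    // Northbound delete: keep the cached data, only note that Neutron has
    // dropped the port. (Check-then-add is not atomic; acceptable for a sketch.)
    void onNorthboundDelete(String portId) {
        if (byPortId.containsKey(portId)) {
            deletedNorthbound.add(portId);
        }
    }

    // Southbound delete: hand back the data needed for flow removal and evict it.
    Optional<NetworkInfo> onSouthboundDelete(String portId) {
        deletedNorthbound.remove(portId);
        return Optional.ofNullable(byPortId.remove(portId));
    }

    // True while Neutron has deleted the port but the interface delete is pending.
    boolean awaitingSouthboundDelete(String portId) {
        return deletedNorthbound.contains(portId);
    }
}
```

With such a cache, the interface delete path would call onSouthboundDelete instead of looking the network up through the (possibly already deleted) Neutron port. Entries for ports whose interface is never removed would stay cached indefinitely, so a real implementation would also need some form of expiry.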



 Comments   
Comment by Flavio Fernandes [ 23/Apr/15 ]

[16:47:48] <ollivier> I think this is really serious and forbid at least the use of l3-agent
[16:47:57] <flaviof> ack
[16:48:53] <ollivier> As it's a race condition it could happen by any port removal
[16:49:29] <ollivier> the exchanges between nova and neutron save us
[16:49:50] <ollivier> I don't analyse deeply the dhcp case..
[16:49:51] <flaviof> l3-agent is not part of odl... not sure I understand how that is affected
[16:50:35] <flaviof> mind you, the ovs port add/remove is not done by ovsdb in odl, it is done by nova.
[16:50:42] <ollivier> It's a race condition raised by ports managed by l3-agent... But It could happen with dhcp port or VMS
[16:51:20] <flaviof> ack
[16:51:37] <ollivier> agree. But if neutron sends to ODL port deleted before nova calls ovs-vsctl del-port
[16:52:46] <ollivier> it will fail
[16:53:42] <ollivier> In case of l3 agent, ML2 (ODL driver) sends the HTTP delete request before ovs-vsctl del-port
[16:53:50] <flaviof> make sense.
[16:54:03] <flaviof> it definitely should be order agnostic.
[16:56:31] <ollivier> even if ODL can manage L3 forwarding and avoid l3-agent now, any port can raise this race condition...

Comment by Cédric Ollivier [ 03/May/15 ]

It also fails for network:dhcp ports when the subnet is deleted. As a workaround, modify the subnet (enable_dhcp=False) before deleting it.

Comment by Peter Bandzi [ 03/Jun/15 ]

Tested in the OPNFV project:
https://build.opnfv.org/ci/view/functest/job/functest-opnfv-jump-2

Used the tests from the ODL integration suite:
https://github.com/opendaylight/integration/tree/master/test/csit/suites/openstack/neutron

I also added a similar test that deletes the created network by ID.

The network, subnet, and port still remain in ODL even after they have been deleted from Neutron.

To reproduce:
git clone https://gerrit.opnfv.org/gerrit/functest
cd functest/testcases/Controllers/ODL/CI
./create_env.sh
ODL_IP=<odl_ip> ODL_PORT=<odl_port> PASS=<ostack_admin_pass> NEUTRON_IP=<neutron_ip> ./start_tests.sh

Generated at Wed Feb 07 20:35:37 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.