[NEUTRON-158] Dependency checking for Northbound API Created: 04/Apr/18  Updated: 24/Aug/18

Status: In Progress
Project: neutron
Component/s: northbound-api
Affects Version/s: None
Fix Version/s: Neon

Type: Improvement Priority: Medium
Reporter: Michael Vorburger Assignee: Josh Hershberg
Resolution: Unresolved Votes: 0
Labels: csit:3node
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified

Issue Links:
Blocks
blocks NEUTRON-149 Security group and security rule even... Open
Sub-Tasks:
Key
Summary
Type
Status
Assignee
NEUTRON-165 bgpvpn dependency checking (NeutronBg... Sub-task Open Deepthi V V  
NEUTRON-166 firewall dependency checking (Neutron... Sub-task Open Josh Hershberg  
NEUTRON-167 FirewallPolicy dependency checking (N... Sub-task Open Josh Hershberg  
NEUTRON-168 FirewallRule dependency checking (Neu... Sub-task Open Josh Hershberg  
NEUTRON-169 Floatingip dependency checking (Neutr... Sub-task Open Chetan Arakere Gowdru  
NEUTRON-170 L2gatewayConnection dependency checki... Sub-task Open suneelu varma  
NEUTRON-171 L2gateway dependency checking (Neutro... Sub-task Open suneelu varma  
NEUTRON-172 Healthmonitor dependency checking (Ne... Sub-task Open Josh Hershberg  
NEUTRON-173 Loadbalancer dependency checking (Neu... Sub-task Open Josh Hershberg  
NEUTRON-174 lbaas (?) dependency checking (Neutro... Sub-task Open Josh Hershberg  
NEUTRON-175 lbaas pool (?) dependency checking (N... Sub-task Open Vishal Thapar  
NEUTRON-176 metering label (?) dependency checkin... Sub-task Open Vishal Thapar  
NEUTRON-177 metering label rule (?) dependency ch... Sub-task Open Vishal Thapar  
NEUTRON-178 network dependency checking (NeutronN... Sub-task Resolved Michael Vorburger  
NEUTRON-179 port dependency checking (NeutronPort... Sub-task In Review Vishal Thapar  
NEUTRON-180 QOS policy dependency checking (Neutr... Sub-task Open Vishal Thapar  
NEUTRON-181 security group dependency checking (N... Sub-task Open Vishal Thapar  
NEUTRON-182 security rule dependency checking (Ne... Sub-task Open Vishal Thapar  
NEUTRON-183 SfcFlowClassifier dependency checking... Sub-task Open Vishal Thapar  
NEUTRON-184 PortChain dependency checking (Neutro... Sub-task In Review Vishal Thapar  
NEUTRON-185 SFC PortPairGroup dependency checking... Sub-task Open  
NEUTRON-186 SFC PortPair dependency checking (Neu... Sub-task Open  
NEUTRON-187 subnet dependency checking (NeutronSu... Sub-task Open yenuganti vasudha  
NEUTRON-188 TapFlow dependency checking (NeutronT... Sub-task Open  
NEUTRON-189 TapService dependency checking (Neutr... Sub-task Open  
NEUTRON-190 trunk dependency checking (NeutronTru... Sub-task Open Shashidhar R  
NEUTRON-191 VPN Ike policy dependency checking (N... Sub-task Open  
NEUTRON-192 VPN ipsec policy dependency checking ... Sub-task Open  
NEUTRON-193 VPN ipsec site connection dependency ... Sub-task Open  
NEUTRON-194 VPN service dependency checking (Neut... Sub-task Open yenuganti vasudha  
NEUTRON-195 Framework for delete dependency checking Sub-task Open  
Epic Link: Clustering Stability

 Description   

The Neutron Northbound API currently (almost) always just returns an "OK" (HTTP 200) reply. Any errors that occur when the requested resource creation/update did not succeed, including missing dependency (but also other internal problems), is "lost" and not communicated to the caller (and only appears in the log).

The technical explanation for this is ODL general architecture based on decoupled asynchronous listening on a shared data model, and the lack of any "feedback loop" between Neutron and e.g. Netvirt (or, in theory, other listeners to the Neutron model).

The goal of this issue is to provide feedback to callers for a first set of missing dependencies. It will not include work to address the more general problem of error handling and feedback for other internal problems than some dependencies.

Events that are originally ordered in the OpenStack Journal can get out of order because (a) the clustered driver reading from the DB could read out of order; (b) the driver sending to clustered load balanced ODL could cause events to be stored out of order in the ODL data store. This will throw off applications such as netvirt which do not expect this.

In order to compensate for this, we will make the ODL Neutron project return an HTTP error other than 2xx range code if its dependencies are not, yet, available in the data store. For example, in the case of NEUTRON-149 a security rule which is written before a security group should be rejected. The validation happens, only, at the level of the Neutron model object data objects, not the internal netvirt or other models which eventually have created by the listeners in netvirt or other. The fact that e.g. a security rule requires a group with a certain ID (and all others like it), will be hard-coded in neutron.

The initial implementation will focus on doing above for Create events. If at all and how to apply this to Update and Delete will be considered later under future issues with refined requirements.

This work in ODL requires no changes to t the existing networking ODL driver code. Therefore we are not anticipating to make this a configurable optional feature via a switch, but just have this new behaviour active as the related changes will be merged.

https://bugzilla.redhat.com/show_bug.cgi?id=1550699



 Comments   
Comment by Michael Vorburger [ 28/May/18 ]

We should pick a specific HTTP 4xx status code to return in case expected dependant objects are not (yet) present.

Looking e.g. at the Wikipedia page about HTTP status codes, the following all seem vaguely appropriate:

  • 409 Conflict: Indicates that the request could not be processed because of conflict in the request, such as an edit conflict between multiple simultaneous update.
  • 412 Precondition Failed (RFC 7232) : The server does not meet one of the preconditions that the requester put on the request. Sounds neat, but strictly speaking we would not reply this because of a RFC 7232 compliant precondition header that the requester put on the request, of course.
  • 417 Expectation Failed: The server cannot meet the requirements of the Expect request-header field. Again, strictly speaking wrong, as of course we do actually use the Expect request-header field.
  • 424 Failed Dependency (WebDAV; RFC 4918): The request failed because it depended on another request and that request failed (e.g., a PROPPATCH). Neat fit, but again - we of course don't do RFC 4918 / WebDAV.
  • 442: Totally arbitrary new one, I just made up! May be best and clearer than using an existing one?

I will start with 442, and will let anyone who thinks that there is a more appropriate one please comment here.

Comment by Josh Hershberg [ 29/May/18 ]

Alternatively, we could just use 400 and give a textual explanation.

Or we could try "402 Payment Required" and see if we could make a little money  

Comment by Michael Vorburger [ 29/May/18 ]

Removing GENIUS-122 as an "is blocked by link" - the current design won't need the DataTreeEventCallbackRegistrar.

Comment by Michael Vorburger [ 20/Jun/18 ]

> initial implementation will focus on doing above for Create events

with the just merged c/72372, this is now on master. 

First round covers it for SG / SR; others should replicate this to other Neutron entities.

> If at all and how to apply this to Update

this is now proposed c/73248, but may need some disucssion... will the delta sent in a PUT to modify alway contain all fields? For example, does a PUT to modify an existing SR have to have that security_group_id field, like a POST to create must, or is that optional? Interested in your thoughts jhershbe, MikeKolesnik, thapar...

> and Delete will be considered later under future issues with refined requirements.

Delete probably does not need Dependency Checking - forget that.

Comment by Michael Vorburger [ 20/Jun/18 ]

shague and I have agree to include this work under the umbrella of "work required for better clustering" (see label and Epic Link he just added), based on a private email exchange where thapar stated the following which jhershbe confirmed with "it makes sense to me to add it here as well." & "Yes, that is my thinking as well." in reply to:

I thought it was more due to Openstack side HA issues. ODL Clustering wise, it will help as netvirt will be much less likely to get notifications out of order. A DS Read on dependency check kinda of ensures that services have already got notifications for earlier resource and will be ready to handle this one. It will help reduce lots of race and out of order issues and probably should help reduce some cross shard read and writes.

Comment by Michael Vorburger [ 25/Jun/18 ]

> First round covers it for SG / SR; others should replicate this to other Neutron entities.

the best way to organize this is probably with new sub-tasks for specific Neutron entities (TBC).

Comment by Michael Vorburger [ 25/Jul/18 ]

> the best way to organize this is probably with new sub-tasks for specific Neutron entities (TBC)

I would love someone else to take this over at this point. What's missing is basically just implementing more areAllDependenciesAvailable() methods, by replicating https://git.opendaylight.org/gerrit/#/c/72372/10/transcriber/src/main/java/org/opendaylight/neutron/transcriber/NeutronSecurityRuleInterface.java to all other Neutron entities.

It's not particularly hard to do anymore (now that I've completing creating all the infrastructure that was required for this underneath), but I'm actually clueless about the Neutron entities... 

Comment by Michael Vorburger [ 02/Aug/18 ]

> take this over at this point

see https://github.com/openstack/networking-odl/blob/master/networking_odl/journal/dependency_validations.py

Generated at Wed Feb 07 20:25:42 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.