[OVSDB-143] Need to define OVSDB Clustering/Data Persistence Behavior Created: 31/Mar/15 Updated: 19/Oct/17 Resolved: 12/Jan/16 |
|
| Status: | Resolved |
| Project: | ovsdb |
| Component/s: | Southbound.Open_vSwitch |
| Affects Version/s: | unspecified |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Reinaldo Penno | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| External issue ID: | 2923 |
| Description |
|
I’m facing some problems with clustering in SFC and stumbled on an issue with OVSDB.
The access pattern is the following:
OpenvSwitch connects to ODL.
If ODL reboots, OVSDB and SFC will have data for a switch that does not exist (yet) until it connects.
Will OVSDB keep this stale data around? How long will it wait until the switch reconnects?
All this is unclear to me, has a profound impact on how SFC works, and needs to be defined if clustering is going to be enabled. |
| Comments |
| Comment by Reinaldo Penno [ 31/Mar/15 ] |
|
BTW, https://git.opendaylight.org/gerrit/#/c/17425/
"* OVS Bridge is successfully written into OVSDB Config store, but it is not instantiated on corresponding OpenVSwitch instance yet"

From: Reinaldo Penno <rapenno@gmail.com>

Summarizing what we just discussed on the phone…

In steady state, every switch in OVSDB’s operational store has an associated SFC abstraction, called an SFF, in SFC’s configuration data store. This SFF device is created by listening to OVSDB operational tree changes. There is a lot of code to listen for changes on bridges, tp-ids, and node-ids, and to map them to SFC abstractions. The user can change this SFF if needed by creating data plane locators, using it in service paths, etc. These changes are propagated to the OVS switch by writing to the OVSDB config store. This is still the design we discussed originally for the new OVSDB MD-SAL<->SFC.

Now,
1 – ODL reboots while OVS switch was still connected

The main points:
If the switch reconnects, you need to reapply all data to the operational store, especially since the switch configuration could have changed across ODL reboots (stuff deleted or added). So a complete overwrite would be needed.
Now, let’s suppose the switch never reconnects. How will SFC ever be notified that the switch is gone? You cannot delete something that is already gone. The listener will not get triggered.

Thanks, |
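One way to picture the "switch never reconnects" case is a grace period after restart: nodes restored from the persistent store are held as pending, and any node that has not reconnected when the period ends is purged, which is what would give a listener a delete event to react to. The sketch below is only an illustration of that idea in Python. The class name `StaleNodeTracker`, the `grace_seconds` parameter, and the node IDs are hypothetical, and nothing in this thread says OVSDB behaves this way.

```python
from dataclasses import dataclass, field


@dataclass
class StaleNodeTracker:
    """Tracks persisted switch nodes that have not reconnected since a controller restart.

    Hypothetical helper for illustration; not part of the OVSDB plugin.
    """
    grace_seconds: float
    restart_time: float
    # Node IDs restored from the persistent store and still awaiting reconnection.
    pending: set = field(default_factory=set)

    def on_reconnect(self, node_id: str) -> None:
        # The switch is back, so it is no longer a purge candidate.
        # Rewriting its operational data in full is left to the caller.
        self.pending.discard(node_id)

    def expire(self, now: float) -> list:
        """Return the node IDs to purge once the grace period has elapsed."""
        if now - self.restart_time < self.grace_seconds:
            return []
        stale = sorted(self.pending)
        self.pending.clear()
        return stale
```

In use, the controller would seed `pending` from the data store at startup, call `on_reconnect` for each switch that connects, and call `expire` periodically; deleting the returned nodes from the operational store is the step that would trigger the SFC listener. The length of the grace period is exactly the undefined behavior this issue asks about.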
| Comment by Jan Medved [ 01/Apr/15 ] |
|
I don't understand the following: > 1 – ODL reboots while OVS switch was still connected Do you mean the SFC in the OVS switches is still set up and passing traffic? I assume the SFC app in ODL restarts along with ODL, so its state is wiped out too. ODL (or Control Plane in general) restarts mandate state reconciliation - there is no way around it. State in the SFC app (the SFFs?) must be reconciled with the configured values in the OVS switches. Also, state in the SFC app must be reconciled with the state in the OVSDB app and the OF Plugin and the NC Connector (there is no way to tell whether some transactions were in flight, unless you do 3-phase commit across multiple apps; reconciliation is easier) There are two choices: either you recreate the SFFs from OVS config, or you impose the after-reboot SFF state (perhaps re-read from persistent store) onto OVSes. Since this is SDN, the 2nd option is preferred (We impose the controller's view on the network elements) |
| Comment by Reinaldo Penno [ 01/Apr/15 ] |
|
> Do you mean the SFC in the OVS switches is still set up and passing traffic?
[RP] There is no SFC app in OVS. If ODL reboots, OpenvSwitch continues to work, passing traffic. This is the data plane.
> I assume the SFC app in ODL restarts along with ODL, so its state is wiped out too.
[RP] Configuration state persists if clustering is turned on.
> ODL (or Control Plane in general) restarts mandate state reconciliation - there is no way around it. State in the SFC app (the SFFs?) must be reconciled with the configured values in the OVS switches. Also, state in the SFC app must be reconciled with the state in the OVSDB app and the OF Plugin and the NC Connector (there is no way to tell whether some transactions were in flight, unless you do 3-phase commit across multiple apps; reconciliation is easier)
[RP] There is no transaction in flight. This is not the case here. As far as reconciliation goes, there is nothing SFC can do since it is a user of these southbound protocols.
> There are two choices: either you recreate the SFFs from OVS config
[RP] SFC does not communicate with OVS directly and has limited OVS knowledge. It seems to me the whole purpose of the southbound is to have dedicated software that knows everything about that specific protocol.
> , or you impose the after-reboot SFF state (perhaps re-read from persistent store) onto OVSes. Since this is SDN, the 2nd option is preferred (We impose the controller's view on the network elements)
[RP] See above. |
| Comment by Jan Medved [ 01/Apr/15 ] |
|
> Do you mean the SFC in the OVS switches is still set up and passing traffic?
I understand that.
> I assume the SFC app in ODL restarts along with ODL, so its state is wiped
But the SFC app should know that it restarted. The state reconciliation will be between what it has in the persistent store and what's configured in the OVS switches.
> ODL (or Control Plane in general) restarts mandate state reconciliation - there
> [RP] There is no transaction in flight. This is not the case here. AS far a
The SFC app must reconcile the state in its own persistent store with what it configured in its previous life in the OVS switches. And you're right - reconciliation should go through the SB plugin. Now, the state of the SB plugin may play a role in the reconciliation too - if some previously present OVSes are missing or new OVSes are present - it's a tough problem...
> There are two choices: either you recreate the SFFs from OVS config
> [RP] SFC does not communicate with OVS directly and has limited OVS knowledge.
Yes - but you use the plugin to inject state into the OVS in the first place. But it does not matter how you inject the state into an OVS - what matters is who is the originator of the state. After a reboot, the originator has to make sure that if it has some state that is not present in the OVS switches, the state is restored; or if there is some state present in an OVS switch that is not present in the originator, the state is cleared from the OVS. It is not a simple problem... |
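The originator-driven reconciliation described in the comment above amounts to a two-way diff between what the originator holds and what the switch reports. Here is a minimal Python sketch of that diff, assuming both sides can be flattened into dictionaries keyed by some identifier such as a port name. The function name `reconcile`, the dictionary shapes, and the extra "update" case (same key, different value) are assumptions added for illustration and are not taken from the thread or from any ODL API.

```python
def reconcile(desired: dict, actual: dict) -> dict:
    """Compare originator-side state against what a switch reports.

    desired: state held by the originator (for example, read from its persistent store).
    actual:  state currently present on the switch, as reported via the southbound plugin.
    Both map a hypothetical key (such as a port name) to its configuration.
    """
    # Present in the originator but missing on the switch: must be restored.
    to_restore = {k: v for k, v in desired.items() if k not in actual}
    # Present on the switch but unknown to the originator: must be cleared.
    to_clear = sorted(k for k in actual if k not in desired)
    # Present on both sides with different values: an extra case, not named in the thread.
    to_update = {k: v for k, v in desired.items() if k in actual and actual[k] != v}
    return {"restore": to_restore, "clear": to_clear, "update": to_update}
```

For example, `reconcile({"p1": 1, "p2": 2}, {"p2": 3, "p3": 4})` would mark `p1` for restore, `p3` for clearing, and `p2` for update. The sketch only computes the diff; it does not cover switches that never reconnect, since there is no `actual` state to compare against in that case.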
| Comment by Reinaldo Penno [ 01/Apr/15 ] |
|
> Yes - but you use the plugin to inject state into the OVS in the first place.
[RP] The problem happens even if SFC is just a user of OVSDB without having injected state at all. SFC will not be notified if OpenvSwitch never reconnects.
> But it does not matter how you inject the state into an OVS - what matters is who is the originator of the state. After a reboot, the originator has to make sure that if it has some state that is not present in the OVS switches, the state is restored; or if there is some state present in an OVS switch that is not present in the originator, the state is cleared from the OVS.
[RP] A few things:
1 - First of all, I think the problem is misunderstood. The issue is switch reconciliation. SFC cannot be responsible for the OVSDB protocol. If a switch never reconnects, SFC will never be notified. This is what I mentioned above and is independent of SFC having installed state or not.
2 - The design that the OVSDB folks proposed is for SFC to write into OVSDB's config data store when it wants to communicate with OpenvSwitch. After that state is written, it is part of the regular OVSDB protocol and handled by the OVSDB plug-in.
3 - If SFC should be responsible for whatever piece of state it includes in OVSDB's datastore, even basic configuration, then maybe we should change the design, since this split ownership is not really good. Had we known that beforehand, we might have been better off implementing the OVS protocol inside SFC. |
| Comment by Jan Medved [ 01/Apr/15 ] |
|
> [RP] The problem happens even if SFC is just a user of OVSDB without having
Is SFC notified when a switch disappears? In other words, does OVSDB have operational state for each connected switch, and does it purge the state when the switch is disconnected?
> [RP] A few things:
This is a "regular" pattern. But since SFC wrote the data into the config data store (doesn't matter whose), it owns the data and it should be responsible for cleaning it up. Note that OVSDB should not be responsible for what OVSDB does with the data.
> 3 - If SFC should be responsible for whatever piece of state it includes in
Can we go over the details of what exactly SFC has to write into OVSDB's data store? From your first post I gather that OVSDB writes some data into its own config data store when it discovers a switch. The point when the data should be purged after restart needs to be defined, as you point out.
But I'd like to point out that the behavior with IMDS is also not 100% correct. Assume that OVSDB forgets everything after restart. The SFC app too. The set of OVSes before the restart may or may not be the same after the restart. You never get the opportunity to clean up SFC state in the OVSes that do not reconnect after the restart, and you may or may not be able to clean up or reconcile SFC state in OVSes that do reconnect; to make sure everything is in sync, you have to wipe them clean and re-configure from scratch. |
| Comment by Reinaldo Penno [ 01/Apr/15 ] |
|
(In reply to Jan Medved from comment #6)
[RP] Yes. But the issue is the pattern I mentioned. After a reboot, the switch (that is present in the data store) never reconnects.
> [RP] Sure, if SFC creates a port then it should be responsible for it. But OVSDB is responsible for the switch. So now we have split ownership, where SFC is responsible for a small piece of configuration and OVSDB is responsible for the entire switch, which from a Yang model perspective actually contains the port. The issues that can arise from this make my head spin.
> [RP] Anything from nothing (OpenvSwitch is properly configured) to everything.
> [RP] With IMDS everything is wiped clean, both in OVSDB and SFC. Then it is guaranteed that both will be in sync as switches reconnect (or not). |
| Comment by Reinaldo Penno [ 13/May/15 ] |
|
Steps to reproduce.
Manager "tcp:192.168.1.14:6640" |