Bug
Resolution: Done
High
Carbon-SR3
Hello Anil,
Looks like we hit the same issue in our local testing.
With ODL Carbon (plus OpenStack Pike and OVS 2.7), we observed a race condition in ODL/OVSDB during a reboot scenario.
Can you please let us know whether this is a known issue or has already been addressed in OVSDB?
Steps to Reproduce (in a working setup with a controller and two compute nodes):
1. Reboot the compute node and wait for it to come back up.
2. Launch an instance on the compute node.
3. Observe that the instance initially stays in the "spawning" state and then transitions to the "error" state.
4. Restart openvswitch on the compute node.
5. Launch a new instance; it boots successfully.
Basically, when we reboot the compute node, ODL detects that the node is idle and triggers the disconnection chain.
While this is still in progress, the compute node comes back up, and we see a race between the cleanup events and the events related to node reconciliation.
As a result, the compute node is eventually deleted from the operational store [#] even though it is connected to the controller.
Because the node info is deleted from the datastore, port binding fails and we cannot spawn new VMs until we restart OVS on the compute node.
The karaf log excerpt at [@] shows this sequence.
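One way to confirm that the node really is missing from the operational store is to query the OVSDB topology over RESTCONF and check for the node-id. The following is only a sketch: the port (8181), the `/restconf/operational/...` path, the `ovsdb:1` topology id, and the helper names are assumptions about a typical Carbon deployment, not something taken from this report.

```python
import base64
import json
import urllib.request


def fetch_operational_topology(controller, user, password, port=8181):
    """Fetch the operational OVSDB topology (URL layout is an assumption)."""
    url = (f"http://{controller}:{port}/restconf/operational/"
           "network-topology:network-topology/topology/ovsdb:1")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={
        "Authorization": f"Basic {token}",
        "Accept": "application/json",
    })
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def ovsdb_node_ids(topology_json):
    """Return the set of node-ids found in a 'topology' payload."""
    node_ids = set()
    for topo in topology_json.get("topology", []):
        for node in topo.get("node", []):
            node_ids.add(node.get("node-id"))
    return node_ids


def node_is_present(topology_json, node_id):
    """True if the given OVSDB node-id is in the operational topology."""
    return node_id in ovsdb_node_ids(topology_json)
```

If `node_is_present(...)` returns False for the compute node's `ovsdb://uuid/...` id while the switch still shows an active connection to the controller, that would match the state described above.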
Additional notes:
If the compute node comes up after some delay (i.e., after the cleanup has completed in ODL), this issue (i.e., step 3 above) is not seen.
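The ordering dependence can be illustrated with a toy model. This is not ODL code: the event names and the `replay` helper are hypothetical, and the model deliberately reduces the cleanup to "delete the node without re-checking whether it has reconnected".

```python
def replay(events):
    """Replay (kind, node) events against a toy operational store.

    Returns (connected, oper_store) as two sets of node names.
    """
    connected = set()
    oper_store = set()
    for kind, node in events:
        if kind == "connect":
            connected.add(node)
            oper_store.add(node)
        elif kind == "disconnect":
            connected.discard(node)
        elif kind == "cleanup":
            # Simplification: cleanup does not check the connection state.
            oper_store.discard(node)
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return connected, oper_store


# Node reconnects before the pending cleanup runs (the failing case).
racy = [("connect", "compute-1"), ("disconnect", "compute-1"),
        ("connect", "compute-1"), ("cleanup", "compute-1")]

# Node reconnects only after the cleanup has finished (the working case).
delayed = [("connect", "compute-1"), ("disconnect", "compute-1"),
           ("cleanup", "compute-1"), ("connect", "compute-1")]
```

With `racy`, the node ends up connected but absent from the store; with `delayed`, it is both connected and present, which is consistent with the note above.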
[#] 2017-08-01 07:48:16,660 | INFO | lt-dispatcher-49 | OvsdbConnectionManager | 289 - org.opendaylight.ovsdb.southbound-impl - 1.4.1.Carbon-redhat-1 | Entity{type='ovsdb', id=/(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[
]/node/node[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=ovsdb://uuid/e9806896-8dc2-4f17-83ea-c1c957608915}]} has no owner, cleaning up the operational data store
[@] https://gist.github.com/sridhargaddam/3761ef080e11f2dd2429c8d7016ae6d0
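To find other occurrences of this cleanup in a karaf log, one could scan for the message text quoted in [#] and pull out the timestamp and node-id. This is a rough sketch: the function names are made up for illustration, and it assumes each log record starts with a `YYYY-MM-DD HH:MM:SS,mmm |` timestamp and may wrap over several lines, as the excerpt above does.

```python
import re

CLEANUP_MARKER = "has no owner, cleaning up the operational data store"
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \|")
NODE_ID_RE = re.compile(r"node-id=(ovsdb://uuid/[0-9a-fA-F-]+)")


def iter_records(lines):
    """Group physical lines into log records (a timestamp starts a record)."""
    record = []
    for line in lines:
        if TS_RE.match(line) and record:
            yield "".join(record)
            record = []
        record.append(line.rstrip("\n"))
    if record:
        yield "".join(record)


def find_cleanups(lines):
    """Return (timestamp, node_id) for every ownership-cleanup record."""
    hits = []
    for record in iter_records(lines):
        if CLEANUP_MARKER not in record:
            continue
        ts = TS_RE.match(record)
        node = NODE_ID_RE.search(record)
        hits.append((ts.group(1) if ts else None,
                     node.group(1) if node else None))
    return hits
```

For example, `find_cleanups(open("karaf.log"))` would list each cleanup with the affected node, which can then be compared against the times at which the compute node reconnected.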
relates to:
- OVSDB-443 Write CSIT test to capture this issue. (Open)
- OVSDB-462 Bridge randomly goes missing in topology/oper DS on delete and add (Verified)
- OVSDB-433 OVSDB entity... node-id=ovsdb://uuid/c6ab2b03-b101-4778-82f2-e1863ba42a47}]} was already registered for ownership (Resolved)
- OVSDB-438 operational node goes missing upon ovsdb node connection flap when haproxy is enabled (Resolved)
1. Write CSIT test to capture this issue. | Open | Unassigned