[OPNFLWPLUG-720] OVS remains in disconnected state when reconnecting device to cluster Created: 30/Jun/16  Updated: 27/Sep/21  Resolved: 24/Aug/16

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Juraj Sebin Assignee: Jozef Bacigal
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Linux
Platform: PC


Issue Links:
Blocks
is blocked by OPNFLWPLUG-659 Milestone: integrate with Singleton a... Resolved
Duplicate
is duplicated by OPNFLWPLUG-733 Switch Disconnect+Connect to Ownershi... Resolved
is duplicated by OPNFLWPLUG-735 The openflow status change to disconn... Resolved
External issue ID: 6133

 Description   

When disconnecting and then reconnecting OVS, it remain in disconnected state

Environment is 3 node cluster, using distribution https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/integration/distribution-karaf/0.5.0-SNAPSHOT/distribution-karaf-0.5.0-20160629.123423-3194.zip

to reproduce issue:
list ovs controller connection and then disconnect and reconnect switch to one node by chaning port to invalid and then back to valid number

1. sudo ovs-vsctl list Controller
2. sudo ovs-vsctl set Controller 1e173873-5a74-4f57-b27e-55ebf88195d5 target=\"tcp:10.25.2.14:6654\"
3. sudo ovs-vsctl set Controller 1e173873-5a74-4f57-b27e-55ebf88195d5 target=\"tcp:10.25.2.14:6633\"

1e173873-5a74-4f57-b27e-55ebf88195d5 being uuid listed by ovs-vsctl list Controller command. The ip in target=\"tcp:10.25.2.14:6633\" should correspond with relevant entry from same command



 Comments   
Comment by Tomas Slusny [ 22/Jul/16 ]

If found cause of this bug, and it is that RoleContext is not unregistered for slave when it have still main entity registered. Here is log output that shows this issue:

2016-07-21 16:36:16,549 | INFO | entLoopGroup-7-2 | RoleManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Role BECOMESLAVE successfully set on device Uri [_value=openflow:3]
2016-07-21 16:36:23,597 | DEBUG | on-dispatcher-34 | FlowNodeReconciliationImpl | 292 - org.opendaylight.openflowplugin.applications.forwardingrules-manager - 0.3.0.SNAPSHOT | Node added: openflow:3
2016-07-21 16:36:36,604 | INFO | entLoopGroup-7-2 | SystemNotificationsListenerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionEvent: Connection closed by device, Device:/10.12.0.74:58525, NodeId:Uri [_value=openflow:3]
2016-07-21 16:36:36,604 | DEBUG | entLoopGroup-7-2 | ConnectionContextImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Trying unregister outbound queue handler registration for node Uri [_value=openflow:3]
2016-07-21 16:36:36,605 | DEBUG | entLoopGroup-7-2 | ConnectionContextImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Trying closing handshake context for node Uri [_value=openflow:3]
2016-07-21 16:36:36,605 | DEBUG | entLoopGroup-7-2 | TransactionChainManager | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | TxManager is going SHUTTING_DOWN for node KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:3]]]]}

2016-07-21 16:36:36,605 | DEBUG | entLoopGroup-7-2 | DeviceManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | TxChainManager for device Uri [_value=openflow:3] is closed successful.
2016-07-21 16:36:36,605 | DEBUG | entLoopGroup-7-2 | RoleManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Found roleContext associated to deviceContext: Uri [_value=openflow:3], now trying close the roleContext
2016-07-21 16:36:36,605 | DEBUG | entLoopGroup-7-2 | RoleContextImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unregister candidate for entity Entity{type='openflow', id=/(urn:opendaylight:params:xml:ns:yang:controller:md:sal:core:general-entity?revision=2015-08-20)entity/entity[

{(urn:opendaylight:params:xml:ns:yang:controller:md:sal:core:general-entity?revision=2015-08-20)name=openflow:3}

]}
2016-07-21 16:36:36,607 | DEBUG | entLoopGroup-7-2 | DeviceManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | onDeviceContextClosed for Node Uri [_value=openflow:3]
2016-07-21 16:36:37,533 | DEBUG | entLoopGroup-7-1 | HandshakeListenerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | succeeded by getting sweep barrier after posthandshake for device Uri [_value=openflow:3]
2016-07-21 16:36:37,533 | INFO | entLoopGroup-7-1 | DeviceManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionEvent: Device connected to controller, Device:/10.12.0.74:58547, NodeId:Uri [_value=openflow:3]
2016-07-21 16:36:37,533 | DEBUG | entLoopGroup-7-1 | TransactionChainManager | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | created txChainManager for KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:3]]]]}

2016-07-21 16:36:37,536 | ERROR | entLoopGroup-7-1 | HandshakeListenerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionContext initial processing failed: Role context for master Node Uri [_value=openflow:3] is still not closed.
2016-07-21 16:36:37,543 | DEBUG | pool-42-thread-1 | ConnectionContextImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Trying unregister outbound queue handler registration for node Uri [_value=openflow:3]
2016-07-21 16:36:37,545 | DEBUG | entLoopGroup-7-1 | ConnectionContextImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Trying closing handshake context for node Uri [_value=openflow:3]
2016-07-21 16:36:37,545 | DEBUG | entLoopGroup-7-1 | ConnectionContextImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Propagating device disconnect for node Uri [_value=openflow:3]
2016-07-21 16:36:37,545 | DEBUG | entLoopGroup-7-1 | TransactionChainManager | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | TxManager is going SHUTTING_DOWN for node KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:3]]]]}

2016-07-21 16:36:37,545 | DEBUG | entLoopGroup-7-1 | DeviceManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | TxChainManager for device Uri [_value=openflow:3] is closed successful.
2016-07-21 16:36:37,546 | DEBUG | entLoopGroup-7-1 | RoleManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Found roleContext associated to deviceContext: Uri [_value=openflow:3], now trying close the roleContext
2016-07-21 16:36:37,546 | INFO | entLoopGroup-7-1 | RoleContextImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Role context closed, unregistering all candidates for ownership for node Uri [_value=openflow:3]
2016-07-21 16:36:37,546 | DEBUG | entLoopGroup-7-1 | DeviceManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | onDeviceContextClosed for Node Uri [_value=openflow:3]
2016-07-21 16:36:39,533 | DEBUG | entLoopGroup-7-2 | HandshakeListenerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | succeeded by getting sweep barrier after posthandshake for device Uri [_value=openflow:3]
2016-07-21 16:36:39,533 | INFO | entLoopGroup-7-2 | DeviceManagerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionEvent: Device connected to controller, Device:/10.12.0.74:58548, NodeId:Uri [_value=openflow:3]
2016-07-21 16:36:39,533 | DEBUG | entLoopGroup-7-2 | TransactionChainManager | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | created txChainManager for KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:3]]]]}

2016-07-21 16:36:39,534 | ERROR | entLoopGroup-7-2 | HandshakeListenerImpl | 284 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionContext initial processing failed: Role context for master Node Uri [_value=openflow:3] is still not closed.
2

Comment by Jozef Bacigal [ 22/Jul/16 ]

https://git.opendaylight.org/gerrit/#/c/42289/

Comment by Patricio Latini [ 02/Aug/16 ]

After applying the patch in gerrit a different error appear, however still OVS cannot connect to cluster.

2016-08-02 14:30:14,170 | INFO | entLoopGroup-9-1 | ConnectionAdapterImpl | 292 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.8.0.SNAPSHOT | Hello received / branch
2016-08-02 14:30:14,200 | INFO | entLoopGroup-9-1 | DeviceManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionEvent: Device connected to controller, Device:/10.0.0.12:50598, NodeId:Uri [_value=openflow:141209680562936]
2016-08-02 14:30:14,215 | INFO | entLoopGroup-9-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | SetRole called with input:SetRoleInput [_controllerRole=BECOMESLAVE, _node=NodeRef [_value=KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:141209680562936]]]]}

], augmentation=[]]
2016-08-02 14:30:14,216 | INFO | entLoopGroup-9-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Requesting state change to BECOMESLAVE
2016-08-02 14:30:14,216 | INFO | entLoopGroup-9-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | RoleChangeTask called on device:openflow:141209680562936 OFPRole:BECOMESLAVE
2016-08-02 14:30:14,216 | INFO | entLoopGroup-9-1 | RoleService | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | getGenerationIdFromDevice called for device:openflow:141209680562936
2016-08-02 14:30:14,216 | INFO | entLoopGroup-9-1 | RoleContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Initialization main candidate for node Uri [_value=openflow:141209680562936]
2016-08-02 14:30:14,216 | WARN | entLoopGroup-9-1 | RoleContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Candidate for entity openflow is already registered.
2016-08-02 14:30:14,216 | WARN | entLoopGroup-9-1 | LifecycleConductorImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Initialization phase for node openflow:141209680562936 in role context was NOT successful, closing connection.
2016-08-02 14:30:14,227 | INFO | entLoopGroup-9-1 | ConnectionContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unregister outbound queue successful.
2016-08-02 14:30:14,243 | INFO | entLoopGroup-9-1 | RpcManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unregister RPCs services for device context closure
2016-08-02 14:30:14,243 | INFO | entLoopGroup-9-1 | RoleContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Role context closed, unregistering all candidates for ownership for node Uri [_value=openflow:141209680562936]
2016-08-02 14:30:14,244 | ERROR | entLoopGroup-9-1 | HandshakeListenerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionContext initial processing failed: null
2016-08-02 14:30:14,244 | INFO | entLoopGroup-9-1 | ConnectionContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unregister outbound queue successful.
2016-08-02 14:30:14,244 | INFO | entLoopGroup-9-1 | DeviceManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | DeviceContext for Node Uri [_value=openflow:141209680562936] was not found. Connection is terminated without OFP context suite.
2016-08-02 14:30:16,212 | WARN | pool-33-thread-1 | RoleManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | New role BECOMESLAVE was not propagated to device Uri [_value=openflow:141209680562936] during 10 sec
2016-08-02 14:30:16,212 | ERROR | pool-33-thread-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | SetRoleService set Role BECOMESLAVE for Node: KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:141209680562936]]]]}

fail . Reason java.util.concurrent.CancellationException: Task was cancelled.
2016-08-02 14:30:16,213 | WARN | pool-33-thread-1 | RoleManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unable to set role BECOMESLAVE on device Uri [_value=openflow:141209680562936]

Comment by Patricio Latini [ 08/Aug/16 ]

Updated log after new patch with more debug info

2016-08-08 12:11:48,107 | WARN | pool-33-thread-1 | RoleManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | New role BECOMESLAVE was not propagated to device Uri [_value=openflow:75312462753428] during 10 sec
2016-08-08 12:11:48,108 | ERROR | pool-33-thread-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | SetRoleService set Role BECOMESLAVE for Node: KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:75312462753428]]]]}

fail . Reason java.util.concurrent.CancellationException: Task was cancelled.
2016-08-08 12:11:48,108 | WARN | pool-33-thread-1 | RoleManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unable to set role BECOMESLAVE on device Uri [_value=openflow:75312462753428]
2016-08-08 12:11:54,088 | INFO | entLoopGroup-9-1 | ConnectionAdapterImpl | 292 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.8.0.SNAPSHOT | Hello received / branch
2016-08-08 12:11:54,092 | INFO | entLoopGroup-9-1 | DeviceManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionEvent: Device connected to controller, Device:/10.0.0.12:47388, NodeId:Uri [_value=openflow:75312462753428]
2016-08-08 12:11:54,093 | INFO | entLoopGroup-9-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | SetRole called with input:SetRoleInput [_controllerRole=BECOMESLAVE, _node=NodeRef [_value=KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:75312462753428]]]]}

], augmentation=[]]
2016-08-08 12:11:54,094 | INFO | entLoopGroup-9-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Requesting state change to BECOMESLAVE
2016-08-08 12:11:54,094 | INFO | entLoopGroup-9-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | RoleChangeTask called on device:openflow:75312462753428 OFPRole:BECOMESLAVE
2016-08-08 12:11:54,095 | INFO | entLoopGroup-9-1 | RoleService | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | getGenerationIdFromDevice called for device:openflow:75312462753428
2016-08-08 12:11:54,096 | INFO | entLoopGroup-9-1 | RoleContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Initialization main candidate for node Uri [_value=openflow:75312462753428]
2016-08-08 12:11:54,096 | ERROR | entLoopGroup-9-1 | HandshakeListenerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | ConnectionContext initial processing failed: The Blueprint container is being or has been destroyed: (&(|(type=default)(!(type=*)))(objectClass=org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipService))
2016-08-08 12:11:54,099 | INFO | entLoopGroup-9-1 | ConnectionContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unregister outbound queue successful.
2016-08-08 12:11:54,099 | INFO | entLoopGroup-9-1 | RpcManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unregister RPCs services for device context closure
2016-08-08 12:11:54,100 | INFO | entLoopGroup-9-1 | RoleContextImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Role context closed, unregistering all candidates for ownership for node Uri [_value=openflow:75312462753428]
2016-08-08 12:11:56,107 | WARN | pool-33-thread-1 | RoleManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | New role BECOMESLAVE was not propagated to device Uri [_value=openflow:75312462753428] during 10 sec
2016-08-08 12:11:56,108 | ERROR | pool-33-thread-1 | SalRoleServiceImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | SetRoleService set Role BECOMESLAVE for Node: KeyedInstanceIdentifier

{targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:75312462753428]]]]}

fail . Reason java.util.concurrent.CancellationException: Task was cancelled.
2016-08-08 12:11:56,108 | WARN | pool-33-thread-1 | RoleManagerImpl | 297 - org.opendaylight.openflowplugin.impl - 0.3.0.SNAPSHOT | Unable to set role BECOMESLAVE on device Uri [_value=openflow:75312462753428]

Generated at Wed Feb 07 20:33:12 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.