Uploaded image for project: 'OpenFlowPlugin'
  1. OpenFlowPlugin
  2. OPNFLWPLUG-845

Switch hangs in controller if a reconnect happens just after switch is set in IDLE state

    XMLWordPrintable

Details

    • Bug
    • Status: Resolved
    • Resolution: Done
    • None
    • None
    • General
    • None
    • Operating System: All
      Platform: All

    • 7689

    Description

      To reproduce:

      1) Start mininet to controller:

      sudo mn --controller 'remote,ip=192.168.0.2,port=6633' --topo tree,1

      2) On the switch block the OF connection using iptables. This will simulate some OF agents behavior that do not close the OF connection when they are stopped:

      sudo iptables -A OUTPUT -d 192.168.0.2 -j DROP

      3) Wait until controller sends ECHO request and switch is declared IDLE. Controller will close the OF session in 2 secs after this message:

      2017-01-27 04:50:58,890 | INFO | ofppool-0 | SystemNotificationsListenerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Switch Idle state occurred, node=/192.168.0.24:34288|auxId=0

      4) Before the 2 seconds (just right after the above message) restore the OF connection:

      sudo iptables -D OUTPUT -d 192.168.0.2 -j DROP

      5) This what is seen in karaf log, the switch hangs in the controller and only workaround is to restart the controller:

      2017-01-27 04:50:59,887 | INFO | ntLoopGroup-11-5 | SystemNotificationsListenerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | ConnectionEvent: Connection closed by device, Device:/192.168.0.24:34288, NodeId:openflow:1
      2017-01-27 04:50:59,894 | WARN | ntLoopGroup-11-5 | StatisticsGatheringUtils | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Can't write to transaction, transaction chain probably closed.
      2017-01-27 04:50:59,894 | WARN | ntLoopGroup-11-5 | StatisticsManagerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Statistics gathering for single node openflow:1 was not successful: Task was cancelled.
      2017-01-27 04:51:00,704 | INFO | ntLoopGroup-11-6 | ConnectionAdapterImpl | 277 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.8.1.Boron-SR1 | Hello received / branch
      2017-01-27 04:51:00,706 | INFO | ntLoopGroup-11-6 | DeviceManagerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | ConnectionEvent: Device connected to controller, Device:/192.168.0.24:34291, NodeId:Uri [_value=openflow:1]
      2017-01-27 04:51:00,707 | INFO | ntLoopGroup-11-6 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | SetRole called with input:SetRoleInput [_controllerRole=BECOMESLAVE, _node=NodeRef [_value=KeyedInstanceIdentifier

      {targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:1]]]]}

      ], augmentation=[]]
      2017-01-27 04:51:00,707 | INFO | ntLoopGroup-11-6 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Requesting state change to BECOMESLAVE
      2017-01-27 04:51:00,707 | INFO | ntLoopGroup-11-6 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | RoleChangeTask called on device:openflow:1 OFPRole:BECOMESLAVE
      2017-01-27 04:51:00,707 | INFO | ntLoopGroup-11-6 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | getGenerationIdFromDevice called for device: openflow:1
      2017-01-27 04:51:00,707 | INFO | ntLoopGroup-11-6 | LifecycleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Registered clustering MASTER services for node openflow:1
      2017-01-27 04:51:00,709 | INFO | ntLoopGroup-11-6 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | submitRoleChange called for device:Uri [_value=openflow:1], role:BECOMESLAVE
      2017-01-27 04:51:00,710 | INFO | ntLoopGroup-11-6 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | submitRoleChange onSuccess for device:Uri [_value=openflow:1], role:BECOMESLAVE
      2017-01-27 04:51:00,712 | INFO | ult-dispatcher-2 | LifecycleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Starting clustering MASTER services for node openflow:1
      2017-01-27 04:51:00,712 | INFO | ult-dispatcher-2 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Starting device context cluster services for node openflow:1
      2017-01-27 04:51:00,713 | INFO | ntLoopGroup-11-6 | DeviceInitializationUtils | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | IP address of switch is: /192.168.0.24:34291
      2017-01-27 04:51:00,716 | INFO | ntLoopGroup-11-6 | DeviceInitializationUtils | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Static node Uri [_value=openflow:1] info: OFPMPMETERFEATURES collected
      2017-01-27 04:51:00,717 | INFO | ntLoopGroup-11-6 | DeviceInitializationUtils | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Static node Uri [_value=openflow:1] info: OFPMPGROUPFEATURES collected
      2017-01-27 04:51:00,717 | INFO | ntLoopGroup-11-6 | DeviceInitializationUtils | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Static node Uri [_value=openflow:1] info: OFPMPPORTDESC collected
      2017-01-27 04:51:00,718 | INFO | ult-dispatcher-2 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | SetRole called with input:SetRoleInput [_controllerRole=BECOMEMASTER, _node=NodeRef [_value=KeyedInstanceIdentifier

      {targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:1]]]]}

      ], augmentation=[]]
      2017-01-27 04:51:00,718 | INFO | ult-dispatcher-2 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Requesting state change to BECOMEMASTER
      2017-01-27 04:51:00,718 | INFO | ult-dispatcher-2 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | RoleChangeTask called on device:openflow:1 OFPRole:BECOMEMASTER
      2017-01-27 04:51:00,718 | INFO | ult-dispatcher-2 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | getGenerationIdFromDevice called for device: openflow:1
      2017-01-27 04:51:00,718 | INFO | ult-dispatcher-2 | StatisticsContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Starting statistics context cluster services for node openflow:1
      2017-01-27 04:51:00,718 | INFO | ult-dispatcher-2 | StatisticsManagerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Scheduling statistics poll for device: Uri [_value=openflow:1]
      2017-01-27 04:51:00,719 | INFO | ntLoopGroup-11-6 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | submitRoleChange called for device:Uri [_value=openflow:1], role:BECOMEMASTER
      2017-01-27 04:51:00,722 | INFO | ntLoopGroup-11-6 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | submitRoleChange onSuccess for device:Uri [_value=openflow:1], role:BECOMEMASTER
      2017-01-27 04:51:00,891 | WARN | ofppool-0 | SystemNotificationsListenerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Exception while waiting for echoReply from [/192.168.0.24:34288] in TIMEOUTING state: Timeout waiting for task.
      2017-01-27 04:51:00,891 | INFO | ofppool-0 | SystemNotificationsListenerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | ConnectionEvent:Closing connection as device is idle. Echo sent at Fri Jan 27 04:50:58 UTC 2017. Device:/192.168.0.24:34288, NodeId:openflow:1
      2017-01-27 04:51:00,901 | INFO | ofppool-0 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | SetRole called with input:SetRoleInput [_controllerRole=BECOMESLAVE, _node=NodeRef [_value=KeyedInstanceIdentifier

      {targetType=interface org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node, path=[org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes, org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node[key=NodeKey [_id=Uri [_value=openflow:1]]]]}

      ], augmentation=[]]
      2017-01-27 04:51:00,901 | INFO | ofppool-0 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Requesting state change to BECOMESLAVE
      2017-01-27 04:51:00,901 | INFO | ofppool-0 | SalRoleServiceImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | RoleChangeTask called on device:openflow:1 OFPRole:BECOMESLAVE
      2017-01-27 04:51:00,901 | INFO | ofppool-0 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | getGenerationIdFromDevice called for device: openflow:1
      2017-01-27 04:51:00,902 | INFO | ntLoopGroup-11-6 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | submitRoleChange called for device:Uri [_value=openflow:1], role:BECOMESLAVE
      2017-01-27 04:51:00,902 | INFO | ntLoopGroup-11-6 | RoleService | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | submitRoleChange onSuccess for device:Uri [_value=openflow:1], role:BECOMESLAVE
      2017-01-27 04:51:08,823 | WARN | lt-dispatcher-35 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:08,824 | WARN | lt-dispatcher-35 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:13,820 | WARN | ult-dispatcher-4 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:13,821 | WARN | ult-dispatcher-4 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:16,098 | WARN | ntLoopGroup-11-6 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for port 1 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,130 | WARN | ntLoopGroup-11-6 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for port 2 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,187 | INFO | ntLoopGroup-11-6 | SystemNotificationsListenerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | ConnectionEvent: Connection closed by device, Device:/192.168.0.24:34291, NodeId:openflow:1
      2017-01-27 04:51:16,187 | INFO | ntLoopGroup-11-6 | DeviceManagerImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | DeviceContext for Node openflow:1 was not found. Connection is terminated without OFP context suite.
      2017-01-27 04:51:16,201 | WARN | Thread-104 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for 0 on port 1 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,231 | WARN | Thread-105 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for 0 on port 2 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,303 | WARN | Thread-104 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for 1 on port 1 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,332 | WARN | Thread-105 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for 1 on port 2 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,404 | WARN | Thread-104 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for 2 on port 1 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,404 | WARN | Thread-104 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Failed to update port status for port 1 on device Uri [_value=openflow:1] even after 3 retries
      2017-01-27 04:51:16,433 | WARN | Thread-105 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Error processing port status message for 2 on port 2 on device Uri [_value=openflow:1] : org.opendaylight.controller.md.sal.common.api.data.TransactionChainClosedException: Cannot write into transaction.
      2017-01-27 04:51:16,433 | WARN | Thread-105 | DeviceContextImpl | 286 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Failed to update port status for port 2 on device Uri [_value=openflow:1] even after 3 retries
      2017-01-27 04:51:18,821 | WARN | ult-dispatcher-4 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:18,822 | WARN | lt-dispatcher-26 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:23,821 | WARN | lt-dispatcher-35 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:23,821 | WARN | lt-dispatcher-35 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:28,821 | WARN | lt-dispatcher-35 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierImpl{context=null, type=(urn:opendaylight:packet:service?revision=2013-07-09)transmit-packet, route=/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[

      {(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:1}

      ]}
      2017-01-27 04:51:28,821 | WARN | lt-dispatcher-35 | RpcRegistry | 189 - org.opendaylight.controller.sal-remoterpc-connector - 1.4.1.Boron-SR1 | Timed out finding routers for RouteIdentifierI

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            ecelgp Luis Gomez
            ecelgp Luis Gomez
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: