[OPNFLWPLUG-288] OF Plugin changes the order of NodeConnector update events. Created: 19/Sep/14  Updated: 27/Sep/21  Resolved: 22/Sep/14

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Hideyuki Tai Assignee: Michal Rehak
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
blocks VTN-41 Failed to forward packets on OF 1.0 s... Resolved
External issue ID: 1988

 Description   

Due to this issue, VTN Manager system test fails.
https://jenkins.opendaylight.org/integration/job/integration-master-csit-karaf-vtn-only/
https://lists.opendaylight.org/pipermail/integration-dev/2014-September/001687.html

The MD-SAL OF plugin changed the order of NodeConnector update events from OenFlow 1.0 switches.

For example, when I run a controller and a Mininet, and start OpenFlow 1.0 switches on it, a switch sent FEATURES_REPLY message and PORT_STATUS message to the controller.
When the switch sent the FEATURES_REPLY, a port "s1-eth1" was LINK_DOWN, so The FEATURES_REPLY contained information which indicated the status of the port "s1-eth1" is "LINK_DOWN".
After that, the port "s1-eth1" became LINK_UP, and the switch sent the PORT_STATUS message which indicated "s1-eth1" was "LINK_UP".

However, the MD-SAL OF plugin changed the order of these notification.

Therefore, VTN Manager (an application of SAL) received "LINK_UP" notification at first, and received "LINK_DOWN" notification at second for the port.
VTN Manager thought that the status of the port changed from "LINK_UP" to "LINK_DOWN", it stopped to send PACKET_OUT for the port, and stopped to install flow entry which used the port.



 Comments   
Comment by Hideyuki Tai [ 19/Sep/14 ]

I guess the AD-SAL OF plugin (the old OF plugin) processes events from a switch in a single thread, this kind of problem does not occur with the AD-SAL OF plugin.

Comment by Hideyuki Tai [ 19/Sep/14 ]

Here, I'm writing my observation in detail.

When the MD-SAL OF plugin detects an OpenFlow 1.0 switch, the MD-SAL OF plugin sends FEATURES_REQUEST message to the switch.
Next, The OpenFlow 1.0 switch sends FEATURES_REPLY message which contains port information.

In my setup, the FEATURES_REPLY message indicated that the state of port "s1-eth1" was LINK_DOWN.
After the FEATURES_REPLY, the switch sent PORT_STATUS message to the controller.
The PORT_STATUS message indicated that the state of port "s1-eth1" became LINK_UP.

However, the MD-SAL OF plugin notified MD-SAL of the PORT STATUS information first, and the OF plugin notified MD-SAL of the FEATURES_REPLY information which indicated LINK_DOWN.

I picked up some log messages here related to this issue in chronological order

(1)
2014-09-19 18:23:38,927 | DEBUG | OFHandshake-1-0 | HandshakeManagerImpl | \
127 - slf4j.api - 1.7.2 | sending feature request for version=1 and xid=65

The OF plugin sent a FEATURES_REQUEST.

(2)
2014-09-19 18:23:38,944 | DEBUG | OFHandshake-1-0 | HandshakeManagerImpl | \
127 - slf4j.api - 1.7.2 | obtained features: datapathId=1
2014-09-19 18:23:38,945 | DEBUG | OFHandshake-1-0 | HandshakeManagerImpl | \
127 - slf4j.api - 1.7.2 | obtained features: auxiliaryId=null

The OF plugin received a FEATURES_REPLY.

(3)
2014-09-19 18:23:39,451 | DEBUG | OFmsgProcessor-2 | \
ToNodeConnectorUpdatedTranslator | 127 - slf4j.api - 1.7.2 | \
PortStatusMessage: version 1 dataPathId 1 portNo 1

It received PORT_STATUS which indicated that s1-eth1 is LINK UP.

(4)
2014-09-19 18:23:39,456 | TRACE | OFmsgProcessor-2 | \
TicketProcessorFactoryImpl | 127 - slf4j.api - 1.7.2 | \
translator: org.opendaylight.openflowplugin.openflow.md.core.\
translator.PortStatusMessageToNodeConnectorUpdatedTranslator@17d78745 \
elapsed time 4966942 ns
2014-09-19 18:23:39,456 | DEBUG | OFmsgProcessor-2 | \
TicketProcessorFactoryImpl | 127 - slf4j.api - 1.7.2 | \
message processing done (type: PortStatusMessage, ticket: 1238356292)

The OF plugin notified the PORT_STATUS to MD-SAL.

(5)
2014-09-19 18:23:39,466 | TRACE | g-notification-5 | Inventory | \
82 - org.opendaylight.controller.sal.implementation - \
0.4.2.Helium-RC1-2-v201409180238 | OF|1@OF|00:00:00:00:00:00:00:01 ADDED
2014-09-19 18:23:39,466 | DEBUG | g-notification-5 | SwitchManager | \
203 - org.opendaylight.controller.switchmanager.implementation - \
0.4.2.Helium-RC1-2-v201409180238 | \
updateNodeConnector: OF|1@OF|00:00:00:00:00:00:00:01 type ADDED \
props [State[1], Config[1], BandWidth[10Gbps], Name[s1-eth1]] \
for container default
2014-09-19 18:23:39,468 | INFO | g-notification-5 | VTNManagerImpl | \
232 - org.opendaylight.vtn.manager.implementation - \
0.2.0.Helium-RC1-2-v201409180238 | \
default: addNode: New node OF|00:00:00:00:00:00:00:01
2014-09-19 18:23:39,468 | INFO | g-notification-5 | VTNManagerImpl | \
232 - org.opendaylight.vtn.manager.implementation - \
0.2.0.Helium-RC1-2-v201409180238 | \
default: addPort: New port: port=OF|1@OF|00:00:00:00:00:00:00:01, \
prop=PortProperty[name=s1-eth1,cost=1000,enabled]

Based on the PORT_STATUS information, SAL notified VTN Manager of
a NodeConnector update.
At this time, VTN Manager thought the state of the port "s1-eth1" was up.

(6)
2014-09-19 18:23:40,824 | TRACE | OFmsgHarvester-0 | \
QueueProcessorLightImpl | 127 - slf4j.api - 1.7.2 | \
ticket scheduling: GetFeaturesOutput, ticket: 1459666234

FeaturesOutput???

(7)

2014-09-19 18:23:40,824 | DEBUG | OFmsgProcessor-2 | \
TicketProcessorFactoryImpl | 127 - slf4j.api - 1.7.2 | \
message received, type: GetFeaturesOutput
2014-09-19 18:23:40,824 | TRACE | OFmsgProcessor-2 | \
TicketProcessorFactoryImpl | 127 - slf4j.api - 1.7.2 | \
translating ticket: GetFeaturesOutput, ticket: 830691151
2014-09-19 18:23:40,824 | DEBUG | OFmsgProcessor-2 | \
TicketProcessorFactoryImpl | 127 - slf4j.api - 1.7.2 | \
translatorKey: 1 + org.opendaylight.yang.gen.v1.urn.opendaylight.\
openflow.protocol.rev130731.GetFeaturesOutput
2014-09-19 18:23:40,824 | TRACE | OFmsgProcessor-2 | \
TicketProcessorFactoryImpl | 127 - slf4j.api - 1.7.2 | \
translator: org.opendaylight.openflowplugin.openflow.md.core.translator.\
FeaturesV10ToNodeConnectorUpdatedTranslator@6afc375d elapsed time 307589 ns
2014-09-19 18:23:40,825 | DEBUG | OFmsgProcessor-2 | \
TicketProcessorFactoryImpl | 127 - slf4j.api - 1.7.2 | \
message processing done (type: GetFeaturesOutput, ticket: 830691151)

Ouch! The port information in the FEATURES_OUTPUT message was handled
as a NodeConnector update.

(8)

2014-09-19 18:23:40,827 | TRACE | -notification-17 | Inventory | \
82 - org.opendaylight.controller.sal.implementation - \
0.4.2.Helium-RC1-2-v201409180238 | OF|1@OF|00:00:00:00:00:00:00:01 CHANGED
2014-09-19 18:23:40,827 | DEBUG | OFHandshake-1-0 | HandshakeManagerImpl | \
127 - slf4j.api - 1.7.2 | postHandshake DONE
2014-09-19 18:23:40,829 | DEBUG | -notification-17 | SwitchManager | \
203 - org.opendaylight.controller.switchmanager.implementation - \
0.4.2.Helium-RC1-2-v201409180238 | \
updateNodeConnector: OF|1@OF|00:00:00:00:00:00:00:01 type CHANGED \
props [State[0], Config[1], BandWidth[10Gbps], Name[s1-eth1]] \
for container default
2014-09-19 18:23:40,830 | INFO | -notification-17 | VTNManagerImpl | \
232 - org.opendaylight.vtn.manager.implementation - \
0.2.0.Helium-RC1-2-v201409180238 | \
default: addPort: Property has been changed: \
port=OF|1@OF|00:00:00:00:00:00:00:01, \
prop=PortProperty[name=s1-eth1,cost=1000,enabled] -> \
PortProperty[name=s1-eth1,cost=1000,disabled]

Ouch! Based on the FEATURES_OUTPUT information, SAL notified VTN Manager of the NodeConnector update.
At this time, VTN Manager thought the state of the port "s1-eth1" became DOWN.

On the other hand, OpenFlow 1.3 switches does not contain port information in FEATURES_OUTPUT, and the MD-SAL OF plugin actively gets port information using MULTIPART_REQUEST.
The issue does not occur on OpenFlow 1.3 switches.

Comment by Hideyuki Tai [ 20/Sep/14 ]

I've written how to reproduce the issue in VTN-41.

https://bugs.opendaylight.org/show_bug.cgi?id=1993#c0

The above procedure really reproduces the issue.
But, I believe that there are more easy ways to reproduce the issue, although I've not tried other ways yet.

I guess that just installing MD-SAL and the MD-SAL OF plugin, and starting OF 1.0 switch on Mininet produces the issue.

Comment by Michal Rehak [ 21/Sep/14 ]

https://git.opendaylight.org/gerrit/#/c/11416/

Comment by Michal Rehak [ 21/Sep/14 ]

I failed to replicate the scenario but I found the code where features might be delivered later than port updates. Now no message will be processed until registration done and features sent to MD-SAL.

Comment by Abhijit Kumbhare [ 21/Sep/14 ]

It has been merged.

Comment by Hideyuki Tai [ 21/Sep/14 ]

(In reply to michal rehak from comment #4)
> https://git.opendaylight.org/gerrit/#/c/11416/

I've built a Karaf distribution using Integration Group Git repository, and tested the bug.
However, I've found out that the bug is not fixed yet.
It seems to me that the result does not change.

Comment by Hideyuki Tai [ 21/Sep/14 ]

(In reply to Hideyuki Tai from comment #7)
> (In reply to michal rehak from comment #4)
> > https://git.opendaylight.org/gerrit/#/c/11416/
>
> I've built a Karaf distribution using Integration Group Git repository, and
> tested the bug.
> However, I've found out that the bug is not fixed yet.
> It seems to me that the result does not change.

Could I revert the status of this bug report to "CONFIRMED" from "RESOLVED"?

I'm writing my observation in detail.

To check the behavior of openflowplugin, I changed the log level during the test.

log:set TRACE org.opendaylight.controller.sal.implementation
log:set TRACE org.opendaylight.openflowplugin.openflow

(1)

2014-09-21 17:16:59,679 | DEBUG | OFHandshake-1-0 | HandshakeManagerImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | sending feature request for version=1 and xid=1345

The OF plugin sent a FEATURES_REQUEST.

(2)
2014-09-21 17:16:59,695 | DEBUG | OFHandshake-1-0 | HandshakeManagerImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | obtained features: datapathId=1

2014-09-21 17:16:59,696 | DEBUG | OFHandshake-1-0 | HandshakeManagerImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | obtained features: auxiliaryId=null

The OF plugin received a FEATURES_REPLY.

(3)

2014-09-21 17:17:04,075 | DEBUG | OFmsgProcessor-2 | ToNodeConnectorUpdatedTranslator | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | PortStatusMessage: version 1 dataPathId 1 portNo 1

It received PORT_STATUS which indicated that s1-eth1 is LINK UP.

(4)

2014-09-21 17:17:04,075 | TRACE | OFmsgHarvester-0 | QueueProcessorLightImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | ticket scheduling: GetFeaturesOutput, ticket: 114653084

FeaturesOutput???
It seems to me that openflowplugin handles events in multiple thread.
And it seems to me that openflow plugin changes the order of the event.

(5)

2014-09-21 17:17:04,083 | TRACE | OFmsgProcessor-2 | TicketProcessorFactoryImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | translator: org.opendaylight.openflowplugin.openflow.md.c ore.translator.PortStatusMessageToNodeConnectorUpdatedTranslator@6066c81a elapsed time 8638785 ns

2014-09-21 17:17:04,083 | DEBUG | OFmsgProcessor-2 | TicketProcessorFactoryImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | message processing done (type: PortStatusMessage, ticket: 1795650542)

The OF plugin notified the PORT_STATUS to MD-SAL.

(6)

2014-09-21 17:17:04,083 | DEBUG | OFmsgProcessor-2 | TicketProcessorFactoryImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | message received, type: GetFeaturesOutput

2014-09-21 17:17:04,084 | TRACE | OFmsgProcessor-2 | TicketProcessorFactoryImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | translator: org.opendaylight.openflowplugin.openflow.md.c ore.translator.FeaturesV10ToNodeConnectorUpdatedTranslator@59226d94 elapsed time 251468 ns

2014-09-21 17:17:04,084 | DEBUG | OFmsgProcessor-2 | TicketProcessorFactoryImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | message processing done (type: GetFeaturesOutput, ticket: 626232100)

The port information in the FEATURES_OUTPUT message was handled
as a NodeConnector update.

(7)
2014-09-21 17:17:04,106 | TRACE | -notification-28 | Inventory | 82 - org.opendaylight.controller.sal.implementation - 0.4.2.SNAPSHOT | OF|1@OF|00:00:00:00:00:00:00:01 ADDED

2014-09-21 17:17:04,177 | INFO | -notification-28 | VTNManagerImpl | 232 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default: addPort: New port: port=OF|1@OF|00:00:00:00:00:00:00:01, prop=PortProperty[name=s1-eth1,cost=1000,enabled]

Based on the PORT_STATUS information, SAL notified VTN Manager of
a NodeConnector update.
At this time, VTN Manager thought the state of the port "s1-eth1" was up.

(8)

2014-09-21 17:17:04,223 | TRACE | g-notification-0 | Inventory | 82 - org.opendaylight.controller.sal.implementation - 0.4.2.SNAPSHOT | OF|1@OF|00:00:00:00:00:00:00:01 CHANGED

2014-09-21 17:17:04,231 | INFO | g-notification-0 | VTNManagerImpl | 232 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default: addPort: Property has been changed: port=OF|1@OF|00:00:00:00:00:00:00:01, prop=PortProperty[name=s1-eth1,cost=1000,enabled] -> PortProperty[name=s1-eth1,cost=1000,disabled]

2014-09-21 17:17:04,234 | INFO | Thread: default | VTNManagerImpl | 232 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default:vBridge-IF:vtn1.vbr1.if1: Virtual int erface changed: VInterface[name=if1,enabled,state=DOWN,entityState=DOWN]

Based on the FEATURES_OUTPUT information, SAL notified VTN Manager of the NodeConnector update.
At this time, VTN Manager thought the state of the port "s1-eth1" became DOWN.

Comment by Michal Rehak [ 21/Sep/14 ]

https://git.opendaylight.org/gerrit/#/c/11428/

Comment by Hideyuki Tai [ 21/Sep/14 ]

(In reply to michal rehak from comment #9)
> https://git.opendaylight.org/gerrit/#/c/11428/

I've pulled Michal's patch (11428), tested OPNFLWPLUG-288 and confirmed that the patch fixes the OPNFLWPLUG-288.

I observed that the order of events is not changed.

2014-09-21 18:39:20,691 | DEBUG | OFmsgProcessor-2 | TicketProcessorFactoryImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | message processing done (type: GetFeaturesOutput, ticket: 846515835)

2014-09-21 18:39:20,692 | TRACE | g-notification-1 | Inventory | 82 - org.opendaylight.controller.sal.implementation - 0.4.2.SNAPSHOT | OF|1@OF|00:00:00:00:00:00:00:01 ADDED

2014-09-21 18:39:20,702 | DEBUG | OFmsgProcessor-0 | TicketProcessorFactoryImpl | 327 - org.opendaylight.openflowplugin - 0.0.3.SNAPSHOT | message processing done (type: PortStatusMessage, ticket: 1131559627)

2014-09-21 18:39:20,782 | TRACE | -notification-16 | Inventory | 82 - org.opendaylight.controller.sal.implementation - 0.4.2.SNAPSHOT | OF|1@OF|00:00:00:00:00:00:00:01 CHANGED

2014-09-21 18:39:20,792 | INFO | -notification-16 | VTNManagerImpl | 232 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default: addPort: Property has been changed: port=OF|1@OF|00:00:00:00:00:00:00:01, prop=PortProperty[name=s1-eth1,cost=1000,disabled] -> PortProperty[name=s1-eth1,cost=1000,enabled]

2014-09-21 18:39:20,802 | INFO | Thread: default | VTNManagerImpl | 232 - org.opendaylight.vtn.manager.implementation - 0.2.0.SNAPSHOT | default:vBridge-IF:vtn1.vbr1.if1: Virtual int erface changed: VInterface[name=if1,enabled,state=UP,entityState=UP]

Comment by Michal Rehak [ 22/Sep/14 ]

merged

Generated at Wed Feb 07 20:32:04 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.