[CONTROLLER-1895] Links not removed from topology with tell-based Created: 20/May/19  Updated: 01/Oct/19

Status: Open
Project: controller
Component/s: clustering
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: High
Reporter: Luis Gomez Assignee: Luis Gomez
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

There is a test where we set 1 switch-switch link down in cluster deployment, and the switches belong to separate instances.

The test works with ask-protocol:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-gate-clustering-only-sodium/71/

But it fails with tell-based (links are not removed from topology):

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-gate-clustering-only-sodium/72/

After enabling some DEBUGs, I can see 7 TP notifications with ask-based protocol:

2019-05-20T20:14:25,096 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1725/789675131@26eac440 operation available, starting transaction
2019-05-20T20:14:25,098 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1726/1648087628@2057bbed
2019-05-20T20:14:25,098 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1725/789675131@71bc860a
2019-05-20T20:14:25,099 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1726/1648087628@6c667215
2019-05-20T20:14:25,106 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1725/789675131@1cde631d operation available, starting transaction
2019-05-20T20:14:25,107 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1726/1648087628@14b2bb8
2019-05-20T20:14:25,110 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1725/789675131@1452614d operation available, starting transaction
2019-05-20T20:14:25,112 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1726/1648087628@7c1cf238
2019-05-20T20:14:25,515 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1725/789675131@7bc3acff operation available, starting transaction
2019-05-20T20:14:25,516 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1726/1648087628@6ad1bf80
2019-05-20T20:14:25,522 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1725/789675131@5ee83cb8 operation available, starting transaction
2019-05-20T20:14:25,523 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1726/1648087628@45fe986f
2019-05-20T20:14:25,524 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1725/789675131@23bc607b operation available, starting transaction
2019-05-20T20:14:25,525 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1726/1648087628@44f6e633

And only 4 with tell-based protocol:

2019-05-20T20:19:51,259 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1808/2124882347@64c577c1 operation available, starting transaction
2019-05-20T20:19:51,260 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1814/1542219333@7095f28a
2019-05-20T20:19:51,261 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1808/2124882347@3fa1e826 operation available, starting transaction
2019-05-20T20:19:51,262 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1814/1542219333@4477848c
2019-05-20T20:19:51,264 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1808/2124882347@7427b4fc operation available, starting transaction
2019-05-20T20:19:51,265 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1814/1542219333@55aa94cf
2019-05-20T20:19:51,267 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | New org.opendaylight.openflowplugin.applications.topology.manager.DataTreeChangeListenerImpl$$Lambda$1808/2124882347@1ce34ad2 operation available, starting transaction
2019-05-20T20:19:51,268 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation org.opendaylight.openflowplugin.applications.topology.manager.TerminationPointChangeListenerImpl$$Lambda$1814/1542219333@26e81ce5
2019-05-20T20:19:51,268 | DEBUG | ofp-topo-expo-flow:1 | OperationProcessor               | 305 - org.opendaylight.openflowplugin.applications.topology-manager - 0.9.0.SNAPSHOT | Next operation null


 Comments   
Comment by Luis Gomez [ 21/May/19 ]

Avishnoi, can you shed some light here? what kind of notifications are these and where are in the OFP code?

Comment by Anil Vishnoi [ 31/May/19 ]

Following class is registering listener to get  notification on FlowCapableNodeConnector

 

https://github.com/opendaylight/openflowplugin/blob/master/applications/topology-manager/src/main/java/org/opendaylight/openflowplugin/applications/topology/manager/TerminationPointChangeListenerImpl.java#L45

 

It processes the DTCL and adds the task to the queue in OperationProcessor.

ecelgp We need to dump the DTCL output in this class for both the scenario. If there is a diff in the way DTCN is published by datastore, that will give some concrete information for robert to debug it further.

I pushed a patch to openflowplugin master branch 

 

https://git.opendaylight.org/gerrit/#/c/82374/

 

Can you run the job with this patch and with both the tell and ask protocol? In the log, grep for CONTROLLER-1895 to grep the data tree change notification.

 

Comment by Luis Gomez [ 05/Jun/19 ]

After running the test with the OFP patch I can see the notification # discrepancy is OK: sometimes controller gets single DS notification containing "link-down=true" & "configuration=PORT-DOWN" while other times it gets 2 notifications with one change on each.

So if the notification is not the problem, what is next DEBUG to figure out why port down does not remove link from topology?

Comment by Robert Varga [ 06/Jun/19 ]

Either case should be fine: https://github.com/opendaylight/openflowplugin/blob/master/applications/topology-manager/src/main/java/org/opendaylight/openflowplugin/applications/topology/manager/TerminationPointChangeListenerImpl.java#L131

I wonder whether we are not in SUBTREE_MODIFIED case which ends up landing here:

https://github.com/opendaylight/openflowplugin/blob/master/applications/topology-manager/src/main/java/org/opendaylight/openflowplugin/applications/topology/manager/TerminationPointChangeListenerImpl.java#L111

I  have updated the patch to warn when that happens.

Comment by Luis Gomez [ 10/Jun/19 ]

I re-run the test with new patchset:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-gate-clustering-only-sodium/82/

But I do not see the "Unhandled termination point change" in the logs.

Comment by Robert Varga [ 01/Oct/19 ]

Alright, we need to move this forward. ecelgp I remember you mentioning this happening even with ask-based on stable/sodium. I seem to remember timing being a role. Can you confirm?

If that is the case, we need a tracker in OFP, because it is likely they have an issue we are unmasking here.

Generated at Wed Feb 07 19:56:43 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.