[LACP-11] LAG entry creation doesn't happen when trying to scale the number of LAG's to 128 entries. Created: 02/Jun/15  Updated: 24/Jun/15  Resolved: 24/Jun/15

Status: Resolved
Project: lacp
Component/s: General
Affects Version/s: unspecified
Fix Version/s: None

Type: Bug
Reporter: Mahesh Manivasagam
Assignee: Rajesh B Sindagi
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 3541

 Description   

Mininet topology: 128 hosts and 128 switches, with 2 links connecting each host to its switch. Configure the hosts in LACP mode and check whether the LAG groups are formed on the switch side.
Open vSwitch version: 2.3.1
Build no: Integration build 2133

Steps to recreate:
1. Bring up the controller without installing the LACP module.
2. Set up a Mininet topology with 128 hosts and 128 switches and 2 links between each host and its switch. Generate LACP PDUs from all the hosts.
3. Install the LACP module and check for flow and group entries on the switch. <<<< LAG entries are not created. The LACP PDUs are not processed because the flow entries are missing.
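
The topology in step 2 can be sketched as follows. This is a minimal sketch, not the script used for this report: the function names and the h<N>/s<N> naming are assumptions, and the LACP bonding configuration on the hosts is not covered. The optional helper uses Mininet's Topo class (addHost, addSwitch, addLink) and needs Mininet installed.

```python
def build_lacp_scale_topology(pairs=128, links_per_pair=2):
    """Plan a topology of `pairs` host/switch pairs (names are illustrative).

    Returns (hosts, switches, links), where each host is connected to its
    own switch by `links_per_pair` parallel links.
    """
    hosts = ["h%d" % i for i in range(1, pairs + 1)]
    switches = ["s%d" % i for i in range(1, pairs + 1)]
    links = [
        (host, switch)
        for host, switch in zip(hosts, switches)
        for _ in range(links_per_pair)
    ]
    return hosts, switches, links


def to_mininet_topo(hosts, switches, links):
    """Turn the plan into a Mininet Topo object (requires Mininet)."""
    # Imported lazily so the plan above can be built without Mininet.
    from mininet.topo import Topo

    topo = Topo()
    for host in hosts:
        topo.addHost(host)
    for switch in switches:
        topo.addSwitch(switch)
    for host, switch in links:
        topo.addLink(host, switch)  # repeated pairs become parallel links
    return topo


hosts, switches, links = build_lacp_scale_topology()
print(len(hosts), len(switches), len(links))  # 128 128 256
```

Passing pairs=64 gives the smaller topology in which the issue was seen only sporadically.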

dump-flows and dump-groups output:
mininet> sh ovs-ofctl -O Openflow13 dump-flows s1
OFPST_FLOW reply (OF1.3) (xid=0x2): <<<< the flow entry isn't getting added
mininet> sh ovs-ofctl -O Openflow13 dump-groups s1
OFPST_GROUP_DESC reply (OF1.3) (xid=0x2): <<<< because of the above issue, the groups aren't added either
mininet>

Waited for an hour to check whether the entries would be added, but they were not. The dump-flows output for the switches still did not show the LACP flow entry.
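
Checking 128 switches by hand is tedious, so the empty-reply check could be scripted. The sketch below is an assumption about how one might do that, not part of the original test: count_entries is a hypothetical helper, and it assumes that ovs-ofctl prints one entry per line after the "OFPST_... reply" header, as in the output above.

```python
def count_entries(ofctl_output):
    """Count entries in `ovs-ofctl dump-flows` / `dump-groups` text output.

    Assumes one entry per line following the "OFPST_... reply" header line.
    """
    count = 0
    for line in ofctl_output.splitlines():
        stripped = line.strip()
        # Skip blank lines and the reply header; anything else is an entry.
        if not stripped or stripped.startswith("OFPST_"):
            continue
        count += 1
    return count


# The reply captured in this report, without the reporter's annotation.
empty_flows = "OFPST_FLOW reply (OF1.3) (xid=0x2):"
print(count_entries(empty_flows))  # 0
```

A count of 0 for both dump-flows and dump-groups on a switch corresponds to the failure described here.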

In this scenario, the flow and group entries aren't updated for any of the 128 switches. The same issue can be hit with 64 hosts and switches, but only sporadically (about once in 7-8 attempts). With 128, the issue is hit every time.

Querying the LACP and node-specific APIs did not return the entries either.



 Comments   
Comment by Rajesh B Sindagi [ 17/Jun/15 ]

Workaround for the defect: start the OpenDaylight controller with the LACP module installed, and only then connect the network to the controller.

Comment by Kalaiselvi [ 18/Jun/15 ]

In the given test scenario, the OpenFlow listener completes the handshake with the switches in the topology, but in some instances the inventory manager application does not update the node information in the node datastore. In these instances, the LACP feature is not aware of any of the nodes and does not program the flows/groups on the switches.
In other instances, the inventory manager learns the topology after the openflowplugin handshake and updates the node datastore, but the MD-SAL datastore notifier does not inform the LACP feature of all the attached nodes.
In the 128-node topology, around 102 nodes are notified to LACP. For every notified node, the flow is programmed and link aggregation is performed for its ports.

Comment by Rajesh B Sindagi [ 24/Jun/15 ]

The change below fixes this issue.
https://git.opendaylight.org/gerrit/23074

Comment by Rajesh B Sindagi [ 24/Jun/15 ]

Root cause:
FlowCapableInventoryProvider, part of the inventory manager application in OpenflowPlugin, starts only after the OpenFlow handshake with Mininet has completed.
As a result, the application misses the node update notifications that openflowplugin sends once the handshake completes, and the nodes are not added to the node datastore.
In these instances, the LACP feature is not aware of any of the nodes and does not program the flows/groups on the switches.
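
The ordering problem can be illustrated with a toy model. This is only a sketch of the race described above, not OpenDaylight code and not the content of the fix: NotificationBus, run and the "openflow:N" node ids are illustrative names, and the bus is assumed not to replay notifications to listeners that register late.

```python
class NotificationBus:
    """Toy publish/subscribe bus that does not replay past notifications."""

    def __init__(self):
        self._listeners = []

    def register(self, listener):
        self._listeners.append(listener)

    def publish(self, node_id):
        # Only listeners registered at this moment see the notification.
        for listener in self._listeners:
            listener(node_id)


def run(register_before_handshake, node_count=128):
    """Return how many nodes end up in the (toy) node datastore."""
    bus = NotificationBus()
    datastore = set()

    if register_before_handshake:
        bus.register(datastore.add)

    # Handshake completes: one node update notification per switch.
    for i in range(1, node_count + 1):
        bus.publish("openflow:%d" % i)

    if not register_before_handshake:
        # The provider starts late and has already missed every notification.
        bus.register(datastore.add)

    return len(datastore)


print(run(register_before_handshake=False))  # 0
print(run(register_before_handshake=True))   # 128
```

The second call matches the workaround of starting the controller with the LACP module before the network connects.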

The change below fixes this issue.
https://git.opendaylight.org/gerrit/23074
