[OPNFLWPLUG-607] [Clustering]: Unrecoverable cluster flow provisioning failure with 30 switches, 1000 flows/switch. (Tried He design only) Created: 27/Jan/16 Updated: 27/Sep/21 Resolved: 16/Feb/16 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Saibal Roy | Assignee: | Anil Vishnoi |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
|
| External issue ID: | 5114 |
| Priority: | Highest |
| Description |
|
Build used :
Unrecoverable flow provisioning failure in ODL cluster with 30 switches and 1000 flows per switch.
Test Type :
Test setup :
Objective of test :
Test Steps :
Note: We are provisioning flows via the binding-aware API of the OpenFlow inventory model.
Commands
Note: mytopo.py is attached for quick reference. Please modify mytopo.py accordingly.
The command below is what we use to check the number of flows provisioned per controller.
Test Results:
2. After the above condition, no further flow provisioning via the config DS works, and the system requires a complete cluster reboot to restore normalcy.
Attaching all the karaf logs of the 3 controller nodes.
Thanks & Regards, |
| Comments |
| Comment by Saibal Roy [ 27/Jan/16 ] |
|
Attachment logs.zip has been added with description: Unrecoverable flow provisioning failure in ODL cluster with 30 switches and 1000 flows per switch |
| Comment by Muthukumaran Kothandaraman [ 27/Jan/16 ] |
|
The binding-aware stub app follows the typical ODL stereotype of
Statistics was in the enabled state, as the evaluation is for the "as is" functionality. |
| Comment by Muthukumaran Kothandaraman [ 27/Jan/16 ] |
|
From the logs, it appears that the AskTimeoutException originates from extensive Oper DS updates from the plugin. Eventually, this can start having a ripple effect on the Config DS updates too. |
| Comment by Saibal Roy [ 01/Feb/16 ] |
|
Hi, I connected 30 switches (10 switches per controller) and then pushed 750 flows from follower c2. Note that my leader is c1. I then increased the flow count to 850 and provisioned from follower c2, and I am not able to see all the flows in my switches across the cluster. It gives an AskTimeoutException. Thanks & Regards, |
| Comment by Abhijit Kumbhare [ 01/Feb/16 ] |
|
Muthu will provide an update on this by Feb 3 or 4. |
| Comment by Saibal Roy [ 03/Feb/16 ] |
|
Hi,
In continuation of further troubleshooting, the following exercise was carried out.
Objective: I did 2 flavors of testing with 10 switches per controller (30 switches in total), pushing 1000 flows per switch. Below are my observations.
Test Case 1:
Test Case 2:
Observation: In the 1st test case, when we push 30K flows, it actually does 2x the transactions, whereas in the 2nd test case the first 30K flows are pushed to the config DS and then the switches are connected, at which point another 30K transactions are pushed. Is this the reason for the failure of test case 1?
Thanks & Regards, |
| Comment by Saibal Roy [ 03/Feb/16 ] |
|
Hi, Thanks & Regards, |
| Comment by Anil Vishnoi [ 11/Feb/16 ] |
|
Hi Saibal,
I tried the same test with the latest stable/beryllium and things look good to me. This patch, https://git.opendaylight.org/gerrit/#/c/34115/, gets rid of the routed RPC use and uses the local RPC registration instead. Although I didn't install an equal number of flows on each switch, the distribution is approximately even. This is what I did:
1) Started a 3-node cluster (used feature: odl-openflowplugin-flow-service-rest)
4) Flow installation was done through the inventory-config shard follower.
Total: 50000
Can you please test it again in your environment and update the bug. |
| Comment by Saibal Roy [ 11/Feb/16 ] |
|
Attachment logs.zip has been added with description: logs |
| Comment by Saibal Roy [ 11/Feb/16 ] |
|
Hi Anil,
1. c1, c2, c3 are the controllers, where c2 is the leader.
I could see 30000 flows in the config DS, but I could not see 1000 flows per switch on the switches.
root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# ./inventory_crawler.py --auth --host 10.183.181.42 --datastore config
Totals:
root@mininet-vm:/home/mininet/integration/test/tools/odl-mdsal-clustering-tests/clustering-performance-test# ./inventory_crawler.py --auth --host 10.183.181.42 --datastore operational
Totals:
mininet> dpctl dump-aggregate -O OpenFlow13
mininet> dpctl dump-aggregate -O OpenFlow13
mininet> dpctl dump-aggregate -O OpenFlow13
The above shows that the bug still exists, as we are not able to see 1000 flows per switch. There has been a definite improvement in the flow provisioning behaviour, as we are able to push ~29.5K flows to the switches.
Attaching the logs.
Thanks & Regards, |
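The per-switch check described in this comment (expecting 1000 flows on each switch) could be scripted roughly as follows. This is only a sketch: it assumes the inventory has already been fetched as JSON from the controller, and that the document is shaped like the opendaylight-inventory model, with a `nodes`/`node` list and `flow-node-inventory:table` entries holding a `flow` list. Those key names and both helper functions are assumptions for illustration, not part of inventory_crawler.py.

```python
def flows_per_switch(inventory):
    """Return {node_id: flow_count} for a parsed inventory document.

    Assumed shape (not confirmed by this thread):
    {"nodes": {"node": [{"id": ..., "flow-node-inventory:table": [{"flow": [...]}]}]}}
    """
    counts = {}
    for node in inventory.get("nodes", {}).get("node", []):
        total = 0
        for table in node.get("flow-node-inventory:table", []):
            # Tables without any flows may omit the "flow" key entirely.
            total += len(table.get("flow", []))
        counts[node.get("id")] = total
    return counts


def switches_below(counts, expected):
    """Return only the switches that hold fewer flows than expected."""
    return {node_id: n for node_id, n in counts.items() if n < expected}
```

Calling `switches_below(flows_per_switch(inventory), 1000)` on the config and operational documents in turn would list which switches are short in each datastore, rather than only the cluster-wide totals.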
| Comment by Anil Vishnoi [ 11/Feb/16 ] |
|
Hi Saibal, How long do you wait before dumping all the flows in the switches? Also, what's the size of the VM that's running the controller? I just want to make sure that the controller has enough computing power. Can you give some more details about how your application is writing flows? I tested just now with 50K flows and I see that things are working fine; all the flows are getting installed properly. |
| Comment by Saibal Roy [ 12/Feb/16 ] |
|
Hi Anil,
1. How long do you wait before dumping all the flows in the switches?
Once I connect the switches and check that all the switches are UP on their respective controllers, I wait for a minute or so and then push 30K flows. Once I have pushed 30K flows from the follower, I check how many flows got provisioned in the config DS, then check the switches, and then the Oper DS. Once all the flows are provisioned in the config DS (30K), I keep checking for the next 30-35 minutes so that all the flows get provisioned in the Oper DS as well, and I check the switches as well.
2. Also what's the size of your VM that's running controller?
Each VM is 20GB in size.
3. Can you give some more details about how your application is writing flow?
The application used for writing flows pushes the flows directly to the config DS instead of pushing through RESTCONF. |
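The manual "keep checking for 30-35 minutes" step in answer 1 could be expressed as a bounded polling loop. The following is a minimal sketch, not the procedure actually used in this test: `wait_for_convergence` and its parameters are hypothetical, and `get_count` stands in for whatever returns the current flow count (for example, a total from the operational datastore).

```python
import time


def wait_for_convergence(get_count, expected, timeout_s, interval_s=10.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_count() until it reaches expected or timeout_s elapses.

    Returns (converged, last_count). sleep and clock are injectable so the
    loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    last = get_count()
    while last < expected and clock() < deadline:
        sleep(interval_s)
        last = get_count()
    return last >= expected, last
```

For the scenario in this thread, a call might look like `wait_for_convergence(count_oper_flows, 30000, timeout_s=35 * 60)`, where `count_oper_flows` is a hypothetical function you would supply. Recording `last_count` on timeout gives a concrete shortfall figure to attach to the bug.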
| Comment by Muthukumaran Kothandaraman [ 12/Feb/16 ] |
|
Flows are written to the Config DS using a binding-aware stub application. It uses the standard DataBroker and WriteTransaction with put calls. |
| Comment by Saibal Roy [ 16/Feb/16 ] |
|
Hi,
But the issue of flow provisioning still exists. Hence we are closing this bug and raising another one:
https://bugs.opendaylight.org/show_bug.cgi?id=5364
Thanks & Regards, |