[OPNFLWPLUG-588] [Clustering]: Switch state resync is not happening after controller restart [Routed RPC issue] Created: 04/Jan/16 Updated: 27/Sep/21 Resolved: 11/Feb/16 |
|
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Anil Gujele | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| External issue ID: | 4866 | ||||||||
| Priority: | Highest | ||||||||
| Description |
|
Build used : Test Type : Objective of test : Test Steps : Note: The output shown for flows in the switch and in the Operational DS in Step 7 and Step 8 is not consistent. Across 4 runs, I saw 4, 0, 10 and 2 flows. Controllers (to cross-check logs): Enclosed Logs: |
| Comments |
| Comment by Anil Gujele [ 04/Jan/16 ] |
|
Attachment resyncFailed.rar has been added with description: attached snapshot and logs from the c1, c2 and c3 nodes. |
| Comment by Muthukumaran Kothandaraman [ 04/Jan/16 ] |
|
In the above description, "leader" indicates the shard-leader of the inventory-config shard and "follower" indicates the follower(s) of the inventory-config shard. |
| Comment by Tom Pantelis [ 07/Jan/16 ] |
|
Looks like this should be filed against the openflow or ovsdb project. |
| Comment by Anil Gujele [ 28/Jan/16 ] |
|
changed product from controller to openflowplugin |
| Comment by Anil Vishnoi [ 30/Jan/16 ] |
|
Tom/Moiz, This is another scenario of Routed RPC failure, which we are discussing through some other bugs. |
| Comment by Anil Vishnoi [ 05/Feb/16 ] |
|
Hi Muthu, I pushed the following patch to openflowplugin, which should solve or work around this issue. My patch basically avoids routed RPC by using clustering DCN + local RPC registration. https://git.opendaylight.org/gerrit/#/c/34115/ Can you please test with this patch and see if it works for you? |
| Comment by Muthukumaran Kothandaraman [ 05/Feb/16 ] |
|
Patch looks fine to me, Anil. We will pick this up and test it. So, we have fully eliminated the need for forcing reconciliation through routed-rpc. |
| Comment by Anil Vishnoi [ 05/Feb/16 ] |
|
Yes, and I think it's probably better in terms of performance. Hopefully you will see some improvement in flows/second in a clustered setup, given that we are avoiding remote RPC now, and assuming that ClusteredData(Change/Tree)Listener doesn't create much of a problem. Once we get rid of DataChangeListener and use TreeListener, things might improve further. |
| Comment by Tom Pantelis [ 05/Feb/16 ] |
|
Although Anil's patch removes the use of routed RPCs in OF, we should still fix the timing issue with RPCs, so I submitted https://git.opendaylight.org/gerrit/#/c/34175/ to add wait/retries in the RPC code. |
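Editorial illustration of the wait/retry idea mentioned above: a minimal, generic sketch in plain Java, assuming the failure is transient (for example, a route that has not been registered yet after a restart). It is not the content of the Gerrit change and uses no OpenDaylight API; the class name `RetrySketch`, the method `callWithRetries`, and its parameters are hypothetical names chosen for this example.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of OpenDaylight: retries an action a bounded
// number of times, waiting a fixed interval between failed attempts.
public class RetrySketch {

    public static <T> T callWithRetries(Callable<T> action, int maxAttempts, long waitMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Return as soon as one attempt succeeds.
                return action.call();
            } catch (Exception e) {
                last = e;
                // Wait before the next attempt, but not after the final one.
                if (attempt < maxAttempts) {
                    Thread.sleep(waitMillis);
                }
            }
        }
        // Every attempt failed: surface the most recent failure to the caller.
        throw last;
    }
}
```

A caller would wrap the invocation that can fail early, for example `RetrySketch.callWithRetries(() -> invokeRpc(), 5, 200L)`, where `invokeRpc` stands in for whatever call is being retried. A bounded attempt count keeps a permanently missing route from blocking the caller indefinitely.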
| Comment by Ryan Goulding [ 09/Feb/16 ] |
|
Is this "Waiting for Review" now? It looks like Tom Pantelis pushed a patch to fix this. Also, are we targeting stable/beryllium as well? |
| Comment by Ryan Goulding [ 09/Feb/16 ] |
|
Spawning a separate bug for: https://git.opendaylight.org/gerrit/#/c/34175/ This adds wait/retries in the RPC code. |
| Comment by Anil Gujele [ 11/Feb/16 ] |
|
I have verified with latest code build, |