[OPNFLWPLUG-651] New FRM RETRY mechanism for RPC call Created: 22/Mar/16  Updated: 27/Sep/21  Resolved: 21/Jun/16

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Jozef Slezák Assignee: Jozef Bacigal
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
is blocked by OPNFLWPLUG-649 New FRM client of Batch RPC Resolved
is blocked by OPNFLWPLUG-673 Add snapshot marks Resolved
External issue ID: 5577

 Description   

The new FRM calculates the delta between config modifications. It would not work correctly without retries in negative scenarios. We expect that we will need to improve FRM while testing and debugging https://bugs.opendaylight.org/show_bug.cgi?id=5576.



 Comments   
Comment by Anil Vishnoi [ 22/Mar/16 ]

Hi Jozef,

Where does this new FRM sit? Can you please explain what the new FRM does differently from the existing FRM, and why a new FRM is needed instead of modifying the existing one?

Comment by Jozef Slezák [ 07/Apr/16 ]

@Anil Vishnoi, there are several advantages of the new FRM, but the main ones are:
1/ when a switch connects to the Controller, Mark&Sweep is done. That means adding missing flows/groups/meters, then updating, and after that removing.
2/ barriers are placed automatically between dependent objects (flow -> group, group -> group...)
3/ state compression while writing to the switch (at per-switch granularity)
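
Editor's sketch of the Mark&Sweep step from point 1/, assuming configured and operational state can each be viewed as a mapping from object id to its definition. The function name `mark_and_sweep` and the sample ids are hypothetical; this is an illustration in Python, not the plugin's Java implementation.

```python
def mark_and_sweep(configured, operational):
    """Compare desired (configured) and actual (operational) objects.

    Returns three sorted id lists in the order the comment describes:
    objects to add, objects to update, objects to remove.
    """
    to_add = sorted(k for k in configured if k not in operational)
    to_update = sorted(
        k for k in configured
        if k in operational and configured[k] != operational[k]
    )
    to_remove = sorted(k for k in operational if k not in configured)
    return to_add, to_update, to_remove


# Hypothetical sample data: ids and fields are made up for illustration.
configured = {"flow-1": {"priority": 10}, "flow-2": {"priority": 20}}
operational = {"flow-2": {"priority": 5}, "flow-3": {"priority": 30}}

sweep_plan = mark_and_sweep(configured, operational)
print(sweep_plan)
```

Here flow-1 is missing on the switch, flow-2 differs from the config, and flow-3 exists only on the switch, so each lands in a different list.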

Comment by Shuva Jyoti Kar [ 11/May/16 ]

Won't adding barriers between dependent objects cause a performance hit when thousands of dependent objects are being pushed?

Comment by Andrej Leitner [ 12/May/16 ]

Hi Shuva,
since there must be barriers between dependent objects anyway, it is a trade-off between traffic and latency.

We can wait for an implicit barrier (or gathered statistics), which means lower traffic but higher latency. Or we can send an explicit barrier, which causes higher traffic but lower latency. The second approach seems more reasonable to me.

In addition, we would like to continue optimizing the new FRM so that it does not wait for barriers in requests where they are not necessary.

Comment by Andrej Leitner [ 06/Jun/16 ]

gerrit: https://git.opendaylight.org/gerrit/#/c/39309/

Comment by Anil Vishnoi [ 06/Jun/16 ]

Hi Andrej,

I can't access this gerrit for some reason.

Thanks
Anil

Comment by Anil Vishnoi [ 06/Jun/16 ]

As I understand it, this new implementation will clean up the switch and write all the configuration (flow/group/meter) back when the switch connects. How do you handle a situation where flows have a specific timeout and they expired while the switch was disconnected from the controller? With the current approach, whenever the switch connects back to the controller, the flow will be installed on the switch again, which is not correct behavior.

Comment by Andrej Leitner [ 13/Jun/16 ]

Hi Anil,
it will not clean up the switch by default. In fact, reconciliation with the actual config will be done (on device connected). See also OPNFLWPLUG-692.

Comment by Anil Vishnoi [ 14/Jun/16 ]

Hi Andrej,

I looked at OPNFLWPLUG-692, but it did not make clear what the new FRM will do in the scenario I mentioned in my comment above. Let me know if this scenario is not clear to you.

Comment by Muthukumaran Kothandaraman [ 14/Jun/16 ]

Hi Andrej,

You mentioned that the new FRM will use barriers between dependent objects. But how would the dependent objects be detected? By doing a "full scan of datastore"?

Comment by Andrej Leitner [ 16/Jun/16 ]

@Anil
The scenario is clear, but unfortunately it is not related to the retry mechanism. Retry should start when a config change has failed, and it should enforce reconciliation (syncup) of the recent config against operational data.
Regarding your scenario: AFAIK, it is not in the scope of the new or the current FRM to keep track of flow timeouts when a device disconnects. From my point of view, the user should look after these flows. If we tried to address every detail like this, we would end up with a clever but much more complicated app.
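
Editor's sketch of the retry idea described above: try the incremental change first, and if it fails, fall back to a full reconciliation of config against operational data. The function `apply_with_retry`, the `max_retries` limit, and the use of `RuntimeError` to signal a failed change are all assumptions made for illustration; the ticket does not specify them.

```python
def apply_with_retry(apply_delta, reconcile, delta, max_retries=3):
    """Apply a config delta; on failure, retry via full reconciliation.

    apply_delta and reconcile are caller-supplied callables that raise
    RuntimeError on failure (an assumption of this sketch).
    """
    try:
        apply_delta(delta)
        return "delta applied"
    except RuntimeError:
        pass  # the delta can no longer be trusted; fall back to syncup

    for attempt in range(1, max_retries + 1):
        try:
            reconcile()
            return f"reconciled after {attempt} attempt(s)"
        except RuntimeError:
            continue
    return "failed"


# Hypothetical stand-ins for the real operations.
calls = []

def failing_delta(delta):
    calls.append("delta")
    raise RuntimeError("switch rejected change")

def full_reconcile():
    calls.append("reconcile")

retry_result = apply_with_retry(failing_delta, full_reconcile, {"add": ["flow-1"]})
print(retry_result, calls)
```

The design point is that a failed delta is not retried as a delta: once a change has failed, the sketch assumes the only safe retry is a comparison of the whole config against operational state.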

@Muthukumaran
FR-sync now uses SalFlatBatchService from OFP, which works with a list of batches of changes of the same type (F/G/M - A/U/R). While assembling the batch plan, this service decides whether a barrier is needed before the next (actually after the previous) plan step. If so, it is marked in the plan step; then the batch chain is prepared and executed. Check SalFlatBatchServiceImpl, FlatBatchUtil, etc. in openflowplugin-impl.
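
Editor's sketch of marking barriers in a batch plan, as described in the comment above. Each step is a (type, action) pair, and a barrier is marked after a step when the following step depends on it. The `DEPENDS_ON` table and the name `mark_barriers` are hypothetical and chosen only to illustrate the idea; the real rules live in the classes named above and may differ.

```python
# Assumed dependency table: key = step, value = steps it must wait for.
DEPENDS_ON = {
    ("flow", "add"): {("group", "add"), ("meter", "add")},
    ("group", "add"): {("group", "add")},
    ("group", "remove"): {("flow", "remove")},
    ("meter", "remove"): {("flow", "remove")},
}


def mark_barriers(steps):
    """Return plan entries with a barrier flag set after each step
    that the immediately following step depends on."""
    plan = [{"step": s, "barrier_after": False} for s in steps]
    for prev, nxt in zip(plan, plan[1:]):
        if prev["step"] in DEPENDS_ON.get(nxt["step"], set()):
            prev["barrier_after"] = True
    return plan


steps = [("group", "add"), ("flow", "add"), ("flow", "remove"), ("meter", "remove")]
batch_plan = mark_barriers(steps)
barrier_flags = [entry["barrier_after"] for entry in batch_plan]
print(barrier_flags)
```

This sketch only looks at adjacent steps, so no datastore scan is involved: the decision is made from the types of consecutive batches in the plan.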

Generated at Wed Feb 07 20:33:01 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.