[CONTROLLER-820] Clustering : Recovery due to high snapshot count takes a long time. Created: 12/Sep/14  Updated: 19/Oct/17  Resolved: 03/Oct/14

Status: Resolved
Project: controller
Component/s: mdsal
Affects Version/s: Helium
Fix Version/s: None

Type: Bug
Reporter: Kamal Rameshan Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: Mac OS
Platform: PC


Attachments: File flow_config_perf.py    
External issue ID: 1832
Priority: Normal

 Description   

Currently the snapshot count is set to 100000.

This high value results in a high-volume journal, and recovering from it takes approximately 400 seconds.

This was found when running the attached performance Python script, using the following steps:

1. Run a Karaf distribution of the controller. The OpenFlowPlugin distribution should work fine.
2. Install the feature odl-mdsal-clustering
3. Add 50,000 flows and remove 50,000 flows using the attached script:
/usr/local/bin/python flow_config_perf.py --nflows 500 --nthreads=10 --ncycles 10
4. Stop the controller
5. Start the controller
6. Check the recovery times.
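
For reference, the 50,000 figure in step 3 matches the script arguments if each thread handles `--nflows` flows per cycle. That reading of the arguments is an assumption, since flow_config_perf.py is not reproduced here; `total_flows` is a hypothetical helper used only to show the arithmetic.

```python
def total_flows(nflows: int, nthreads: int, ncycles: int) -> int:
    """Total flows touched in one pass.

    Assumes (not confirmed by the script source) that every thread
    handles `nflows` flows in every cycle.
    """
    return nflows * nthreads * ncycles


# Values taken from the command line in step 3.
added = total_flows(nflows=500, nthreads=10, ncycles=10)
removed = added  # the same flows are removed again

print(f"flows added:   {added}")
print(f"flows removed: {removed}")
print(f"total operations: {added + removed}")
```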

We need to find the optimal snapshot count, one that balances faster recovery against the performance cost of snapshotting too frequently.
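
A rough way to reason about this trade-off is sketched below. It is a back-of-the-envelope model, not a measurement: it assumes recovery time grows linearly with the number of journal entries replayed, and it derives a per-entry cost from the 400 seconds reported above under the further assumption that all 100000 entries were replayed. The function name and the candidate values are illustrative only.

```python
def estimate(snapshot_count: int, total_writes: int, secs_per_entry: float):
    """Return (snapshots_taken, worst_case_recovery_secs).

    Crude model: a snapshot is taken every `snapshot_count` journal
    entries, and recovery replays at most `snapshot_count` entries
    written since the last snapshot.
    """
    if snapshot_count < 1:
        raise ValueError("snapshot_count must be at least 1")
    snapshots_taken = total_writes // snapshot_count
    worst_case_entries = min(snapshot_count, total_writes)
    return snapshots_taken, worst_case_entries * secs_per_entry


# Assumption: ~400 s to replay 100000 entries, scaled linearly.
SECS_PER_ENTRY = 400 / 100_000

for candidate in (100_000, 50_000, 20_000, 10_000):
    snapshots, recovery = estimate(candidate, 100_000, SECS_PER_ENTRY)
    print(f"snapshot count {candidate:>7}: "
          f"{snapshots:>2} snapshots, ~{recovery:.0f} s worst-case replay")
```

The model ignores the cost of writing and loading the snapshot itself, which is exactly the cost that makes very small values unattractive, so it only shows the direction of the trade-off.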



 Comments   
Comment by Kamal Rameshan [ 12/Sep/14 ]

Attachment flow_config_perf.py has been added with description: Flow_config_perf

Comment by Tom Pantelis [ 16/Sep/14 ]

Bug https://bugs.opendaylight.org/show_bug.cgi?id=1831 is adding functionality to batch the replicated log entries on recovery.

The snapshot count has been reduced to 20K via another bug.

So I think those fixes obviate the need for this bug.
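
As an illustration of the batching idea mentioned for bug 1831, the sketch below groups replayed log entries so that each group is applied in one call instead of one call per entry. It is a generic Python sketch, not the controller's actual implementation; `batched`, `recover`, and `apply_batch` are hypothetical names.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(entries: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for entry in entries:
        batch.append(entry)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def recover(entries: Iterable[T],
            apply_batch: Callable[[List[T]], None],
            batch_size: int) -> int:
    """Apply all entries in batches; return the number of batches applied."""
    applied = 0
    for batch in batched(entries, batch_size):
        apply_batch(batch)
        applied += 1
    return applied


# Usage: 10 journal entries applied in batches of 4 -> 3 apply calls.
seen: List[List[int]] = []
calls = recover(range(10), seen.append, batch_size=4)
print(calls, seen)
```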

Generated at Wed Feb 07 19:53:57 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.