[OPNFLWPLUG-281] Deleting of flows very slow with large number of flows in data store and controller connected to the network Created: 17/Sep/14  Updated: 27/Sep/21  Resolved: 09/Jun/15

Status: Resolved
Project: OpenFlowPlugin
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Jan Medved Assignee: Martin Bobak
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: Zip Archive Archive.zip     Zip Archive Call-tree--all-threads-together,-CPU-usage-estimation.zip     File flow_config_perf.py    
External issue ID: 1941

 Description   

When a large number of flows is inserted into the data store, individual flow delete operations become very slow (rate: 3-5 flows/sec). Flow commit statistics indicate average commit time of 200 ms, the longest commit time is about 17.5 seconds. No commits other than flow deletes are going to the data store. The CPU utilization is 10-13%.

This problem only occurs when the controller is connected to the mininet network. The size of the mininet network does not seem to matter, it occurs the same with 3 and 7 node network.

The reason this issue is raised against md-sal is that it seems to be a system issue (livelock?) that manifests itself in drastically prolonged IMDS commit times.

To reproduce:
1. Start the controller (karaf edition).Attach jconsole a

2. Connect mininet, e.g.:
> sudo mn --controller=remote,ip=192.168.162.1:6653 --topo tree,3 --switch ovsk,protocols=OpenFlow13

3. Use the attached script with the following parameters:
>python flow_config_perf.py --nflows 1000 --nthreads=5 --ncycles 10 --delay=60
This will insert 50k flows into the controller, wait for 60 seconds and then
start deleting the inserting flows.

4. While the controller is pausing for 60 second before starting the delete cycle,
clear the commit stats in jconsole and check with mininet that all flows have
been inserted into the network.

5. Observe that the delete goes very slowly. Check the commit stats (I observed
average commit rate of ~200 ms, which corresponds to 5 transactions per second.
I could see that flows were being deleted at a rate of about 5 flows per second)



 Comments   
Comment by Jan Medved [ 17/Sep/14 ]

Attachment flow_config_perf.py has been added with description: flow config script

Comment by Jan Medved [ 18/Sep/14 ]

Note that the commit times are "normal" when the flows are only injected into the data store, i.e. no flows are programmed into the network.

Comment by Jan Medved [ 18/Sep/14 ]

The delete rate is progressively slowing down:
Cycle 0:
Thread 2: Deleting 1000 flows on 7 nodes
Thread 4: Deleting 1000 flows on 7 nodes
Thread 1: Deleting 1000 flows on 7 nodes
Thread 0: Deleting 1000 flows on 7 nodes
Thread 3: Deleting 1000 flows on 7 nodes

Thread 4:
Delete time: 580.74,
Delete success rate: 1.72, Delete total rate: 1.72
Delete Results:

{200: 1000}

Thread 2:
Delete time: 583.39,
Delete success rate: 1.71, Delete total rate: 1.71
Delete Results:

{200: 1000}

Thread 3:
Delete time: 588.14,
Delete success rate: 1.70, Delete total rate: 1.70
Delete Results:

{200: 1000}

Thread 1:
Delete time: 589.78,
Delete success rate: 1.70, Delete total rate: 1.70
Delete Results:

{200: 1000}

Thread 0:
Delete time: 589.40,
Delete success rate: 1.70, Delete total rate: 1.70
Delete Results:

{200: 1000}

Overall success rate: 8.53, Overall rate: 8.53

Cycle 1:
Thread 0: Deleting 1000 flows on 7 nodes
Thread 3: Deleting 1000 flows on 7 nodes
Thread 1: Deleting 1000 flows on 7 nodes
Thread 2: Deleting 1000 flows on 7 nodes
Thread 4: Deleting 1000 flows on 7 nodes
Thread 2:
Delete time: 1145.62,
Delete success rate: 0.87, Delete total rate: 0.87
Delete Results:

{200: 1000}

Thread 0:
Delete time: 1147.30,
Delete success rate: 0.87, Delete total rate: 0.87
Delete Results:

{200: 1000}

Thread 4:
Delete time: 1148.48,
Delete success rate: 0.87, Delete total rate: 0.87
Delete Results:

{200: 1000}

Thread 3:
Delete time: 1149.67,
Delete success rate: 0.87, Delete total rate: 0.87
Delete Results:

{200: 1000}

Thread 1:
Delete time: 1149.85,
Delete success rate: 0.87, Delete total rate: 0.87
Delete Results:

{200: 1000}

Overall success rate: 4.35, Overall rate: 4.35

Cycle 2:
Thread 3: Deleting 1000 flows on 7 nodes
Thread 1: Deleting 1000 flows on 7 nodes
Thread 2: Deleting 1000 flows on 7 nodes
Thread 0: Deleting 1000 flows on 7 nodes
Thread 4: Deleting 1000 flows on 7 nodes

Thread 1:
Delete time: 1363.80,
Delete success rate: 0.73, Delete total rate: 0.73
Delete Results:

{200: 1000}

Thread 3:
Delete time: 1363.81,
Delete success rate: 0.73, Delete total rate: 0.73
Delete Results:

{200: 1000}

Thread 2:
Delete time: 1363.92,
Delete success rate: 0.73, Delete total rate: 0.73
Delete Results:

{200: 1000}

Thread 0:
Delete time: 1364.04,
Delete success rate: 0.73, Delete total rate: 0.73
Delete Results:

{200: 1000}

Thread 4:
Delete time: 1364.04,
Delete success rate: 0.73, Delete total rate: 0.73
Delete Results:

{200: 1000}

Overall success rate: 3.67, Overall rate: 3.67

Comment by Jan Medved [ 18/Sep/14 ]

Attachment Archive.zip has been added with description: log files

Comment by Jan Medved [ 18/Sep/14 ]

Attachment Call-tree--all-threads-together,-CPU-usage-estimation.zip has been added with description: Profiler output

Comment by Abhijit Kumbhare [ 01/Jun/15 ]

Fixed - need to retest.

Generated at Wed Feb 07 20:32:03 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.