[NETVIRT-424] ElanPacketInHandler locks on Mac+Elan which doesn't converge in scale scenarios Created: 11/Jan/17  Updated: 19/Oct/17  Resolved: 19/Jan/17

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Boron
Fix Version/s: None

Type: Bug
Reporter: Guy Sela Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 7530

 Description   

onPacketReceived performs three operations in a single job:
1) Updates a MacEntry data structure per interface name.
2) Updates a MacEntry data structure which is global.
3) Installs flows into a DPN.

The key used to lock this job is MAC + ELAN, which is too coarse for these operations. Operations 1 and 3 could be done under a finer-grained key that also includes the DPNID.
The suggested solution is to split this into two jobs: one for tasks 1 and 3, with a lock that includes the DPNID, and another for task 2, with the current lock.
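
The proposed split can be sketched roughly as below. This is a toy stand-in, not the netvirt code: KeyedJobRunner, ElanJobKeys, the key string formats and the Runnable parameters are all hypothetical names introduced for illustration. It only shows the idea that jobs sharing a key run one after another, while jobs with different keys (for example, different DPNIDs) may run concurrently.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;

/** Hypothetical key builders; the real key format is not given in this issue. */
final class ElanJobKeys {
    private ElanJobKeys() { }

    /** Coarse key (the current lock): shared by every DPN that sees this MAC in this ELAN. */
    static String macElanKey(String mac, String elan) {
        return "MAC-" + mac + " ELAN-" + elan;
    }

    /** Finer key (the suggested lock for tasks 1 and 3): also includes the DPNID. */
    static String macElanDpnKey(String mac, String elan, BigInteger dpnId) {
        return macElanKey(mac, elan) + " DPN-" + dpnId;
    }
}

/** Toy keyed job queue: jobs with the same key are chained, other keys are independent. */
final class KeyedJobRunner {
    // Tail of the chain per key. Entries are never evicted in this sketch.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();
    private final ExecutorService pool;

    KeyedJobRunner(ExecutorService pool) {
        this.pool = pool;
    }

    CompletableFuture<Void> enqueue(String key, Runnable job) {
        // Chain after the previous job for this key, even if that job failed.
        return tails.compute(key, (k, tail) ->
                tail == null
                        ? CompletableFuture.runAsync(job, pool)
                        : tail.handle((result, error) -> null).thenRunAsync(job, pool));
    }
}

/** Sketch of the split: two jobs with two different keys instead of one coarse job. */
final class PacketInSketch {
    private PacketInSketch() { }

    static void onPacketReceived(KeyedJobRunner runner, String mac, String elan, BigInteger dpnId,
                                 Runnable perInterfaceUpdateAndFlows, Runnable globalMacUpdate) {
        // Tasks 1 and 3: serialized only against the same MAC + ELAN + DPNID.
        runner.enqueue(ElanJobKeys.macElanDpnKey(mac, elan, dpnId), perInterfaceUpdateAndFlows);
        // Task 2: keeps the current MAC + ELAN lock.
        runner.enqueue(ElanJobKeys.macElanKey(mac, elan), globalMacUpdate);
    }
}
```

With this shape, packet-ins for the same MAC and ELAN arriving from different DPNs no longer queue behind each other for flow installation; only the global MacEntry update remains serialized on the coarse key.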

In a scale scenario with 180 compute nodes and a 16 GB heap, the pending jobs did not converge and caused Full GC, because the flows were installed on each OVS sequentially instead of concurrently.



 Comments   
Comment by Guy Sela [ 17/Jan/17 ]

Also, there is a bug in the transaction handling there.
The same InstanceIdentifier is used for:
delete tx1
submit tx1
put tx2
submit tx2

This sometimes causes an OptimisticLockException.
There is no need for the delete anyway, because the put overwrites the data, so the delete was removed.
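
One plausible way to picture the conflict is the toy model below. It is not the MD-SAL datastore: ToyStore, ToyTx, the "mac-entry" path and the version-stamp conflict check are all assumptions made for illustration, and IllegalStateException merely stands in for the optimistic lock failure. The sketch assumes the second transaction took its snapshot before the first one's commit landed, which would explain why the failure shows up only sometimes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy versioned store: a commit fails if a written path changed after the snapshot. */
final class ToyStore {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Integer> versions = new HashMap<>();

    synchronized ToyTx newTx() {
        // Each transaction remembers the versions it saw when it was created.
        return new ToyTx(this, new HashMap<>(versions));
    }

    synchronized String read(String path) {
        return data.get(path);
    }

    synchronized void commit(Map<String, Integer> snapshot, Map<String, String> writes) {
        for (String path : writes.keySet()) {
            int seen = snapshot.getOrDefault(path, 0);
            int current = versions.getOrDefault(path, 0);
            if (seen != current) {
                // Stand-in for the optimistic lock failure described in the comment above.
                throw new IllegalStateException("conflict on " + path);
            }
        }
        for (Map.Entry<String, String> write : writes.entrySet()) {
            if (write.getValue() == null) {
                data.remove(write.getKey());           // a null value marks a delete
            } else {
                data.put(write.getKey(), write.getValue());
            }
            versions.merge(write.getKey(), 1, Integer::sum);
        }
    }
}

final class ToyTx {
    private final ToyStore store;
    private final Map<String, Integer> snapshot;
    private final Map<String, String> writes = new HashMap<>();

    ToyTx(ToyStore store, Map<String, Integer> snapshot) {
        this.store = store;
        this.snapshot = snapshot;
    }

    ToyTx put(String path, String value) { writes.put(path, value); return this; }
    ToyTx delete(String path) { writes.put(path, null); return this; }
    void submit() { store.commit(snapshot, writes); }
}

final class TxSketch {
    private TxSketch() { }

    /** delete in tx1, put in tx2 on the same path, with tx2's snapshot taken too early. */
    static boolean deleteThenPutConflicts() {
        ToyStore store = new ToyStore();
        store.newTx().put("mac-entry", "old").submit();
        ToyTx tx1 = store.newTx();
        ToyTx tx2 = store.newTx();   // assumption: snapshot predates tx1's commit
        tx1.delete("mac-entry").submit();
        try {
            tx2.put("mac-entry", "new").submit();
            return false;
        } catch (IllegalStateException conflict) {
            return true;
        }
    }

    /** The fix described above: a single put overwrites the old value, no delete needed. */
    static String putOnly() {
        ToyStore store = new ToyStore();
        store.newTx().put("mac-entry", "old").submit();
        store.newTx().put("mac-entry", "new").submit();
        return store.read("mac-entry");
    }
}
```

In this model the single-put variant touches the path in only one transaction, so there is nothing for it to conflict with, and the stored value ends up replaced just as it would after a delete followed by a put.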

Comment by Koby Aizer [ 17/Jan/17 ]

Review: https://git.opendaylight.org/gerrit/#/c/50370/

Generated at Wed Feb 07 20:21:31 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.