[OPNFLWPLUG-1047] NodeConfiguratorImpl JobQueueHandler iterator() causing extensive object allocation Created: 13/Nov/18  Updated: 05/Dec/18  Resolved: 05/Dec/18

Status: Resolved
Project: OpenFlowPlugin
Component/s: None
Affects Version/s: Oxygen-SR3
Fix Version/s: Oxygen-SR4, Fluorine-SR2, Neon

Type: Bug Priority: Highest
Reporter: Michael Vorburger Assignee: Michael Vorburger
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

I'm looking at a Java Flight Recording obtained from (internal) scale lab testing, , based on Oxygen SR3 code, and am surprised to find that the biggest "TLAB allocation", with a huge factor of several magnitudes (4.97 TiB vs. next issue at 86/57/16/15 ... GiB!), is this:

Iterator java.util.concurrent.ConcurrentHashMap$EntrySetView.iterator()	127113
void org.opendaylight.openflowplugin.applications.frm.nodeconfigurator.NodeConfiguratorImpl$JobQueueHandler.run()	127113
void java.lang.Thread.run()	127113

I'm not clear how object creation churn, and thus required GC, for a simple iterator() could be reduced, but it seems worth having a closer look at this code.



 Comments   
Comment by Robert Varga [ 13/Nov/18 ]

That code needs to be refactored, because:

  1. it is a huge busy loop
  2. if (shutdown) does not actually shutdown

 

Comment by Robert Varga [ 13/Nov/18 ]

Actually, this should be reworked to just use org.opendaylight.yangtools.util.concurrent.QueuedNotificationManager, which deals with the same issue and is known to actually work.

Comment by Michael Vorburger [ 13/Nov/18 ]

rovarga Thank You very much for having jumped right on to contributing a fix for this problem!!

Comment by Robert Varga [ 13/Nov/18 ]

No problem, I just didn't want you to go into a rabiit hole. The patch still needs to be verified to behave as needed – I won't have time to do that, anybody feel free to pick it up.

Comment by Michael Vorburger [ 14/Nov/18 ]

Avishnoi, or any other active OFP commiters d.arunprakash@ericsson.com (Arunprakash ?), gobinath@ericsson.com (gobinath ?), hema.gopalkrishnan@ericsson.com (Hematg ?), pp.prasana@gmail.com, shuva.jyoti.kar.87@gmail.com (SHUVA JYOTI KAR ?), or perhaps jluhrsen, will you run a suitable CSIT on c/77732 and get it merged soon-ish? We'll also need a back-port. (tpantelis already did a code review of rovarga's contribution and +1 it.)

Comment by Michael Vorburger [ 22/Nov/18 ]

> we'll also need a back-port.

may be safer if only to Fluorine but not to Oxygen anymore.. just to be extra cautious.

Comment by Michael Vorburger [ 03/Dec/18 ]

> may be safer if only to Fluorine but not to Oxygen anymore.. just to be extra cautious.

Let's actually (propose to) get this into Oxygen as well; because skitt has seen this one again as the major allocator in a JFR (the one from https://bugzilla.redhat.com/show_bug.cgi?id=1651668).

Generated at Wed Feb 07 20:34:03 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.