[OVSDB-469] Very high String allocation pressure on GC due to org.opendaylight.ovsdb.lib.notation.Version.fromString() Created: 20/Sep/18 Updated: 12/Nov/18 Resolved: 02/Oct/18 |
|
| Status: | Resolved |
| Project: | ovsdb |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Oxygen-SR3, Fluorine |
| Type: | Bug | Priority: | Highest |
| Reporter: | Michael Vorburger | Assignee: | Michael Vorburger |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Priority: | Highest | ||||||||
| Description |
|
I was suprised to see the stack trace below as the top #3 memory allocator in the JFR from thapar or skitt or any other OVSDB maintainers, how about rewriting that simpler and without allocation? String java.lang.String.substring(int, int) 2558 CharSequence java.lang.String.subSequence(int, int) 1653 CharSequence java.util.regex.Matcher.getSubSequence(int, int) 1610 String java.util.regex.Matcher.group(int) 1610 Version org.opendaylight.ovsdb.lib.notation.Version.fromString(String) 1567 Version org.opendaylight.ovsdb.lib.schema.typed.TyperUtils.getColumnUntilVersion(Method) 857 void org.opendaylight.ovsdb.lib.schema.typed.TyperUtils.checkColumnSchemaVersion(DatabaseSchema, Method) 857 void org.opendaylight.ovsdb.lib.schema.typed.TyperUtils.access$100(DatabaseSchema, Method) 857 Object org.opendaylight.ovsdb.lib.schema.typed.TyperUtils$1.processGetColumn(Method) 768 Object org.opendaylight.ovsdb.lib.schema.typed.TyperUtils$1.invoke(Object, Method, Object[]) 768 Column com.sun.proxy.$Proxy463.getLocatorColumn() 243 RemoteUcastMacs org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.buildRemoteUcast(UcastMacsRemote) 243 Node org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.buildConnectionNode(Collection) 243 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.updateUcastMacsRemote(ReadWriteTransaction, Collection) 243 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.execute(ReadWriteTransaction) 243 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepOperationalCommandAggregator.execute(ReadWriteTransaction) 243 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.TransactionInvokerImpl.run() 243 void java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 243 void java.util.concurrent.ThreadPoolExecutor$Worker.run() 243 void java.lang.Thread.run() 243 |
| Comments |
| Comment by Michael Vorburger [ 20/Sep/18 ] |
|
and the top #6 overall memory allocation, with 538 MB of int[] is this, which will also disappear together with above: void java.util.regex.Matcher.<init>(Pattern, CharSequence) 4943 Matcher java.util.regex.Pattern.matcher(CharSequence) 4938 Version org.opendaylight.ovsdb.lib.notation.Version.fromString(String) 2960 Version org.opendaylight.ovsdb.lib.schema.typed.TyperUtils.getColumnUntilVersion(Method) 1428 void org.opendaylight.ovsdb.lib.schema.typed.TyperUtils.checkColumnSchemaVersion(DatabaseSchema, Method) 1428 void org.opendaylight.ovsdb.lib.schema.typed.TyperUtils.access$100(DatabaseSchema, Method) 1428 Object org.opendaylight.ovsdb.lib.schema.typed.TyperUtils$1.processGetColumn(Method) 1009 Object org.opendaylight.ovsdb.lib.schema.typed.TyperUtils$1.invoke(Object, Method, Object[]) 1009 Column com.sun.proxy.$Proxy463.getLogicalSwitchColumn() 385 RemoteUcastMacs org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.buildRemoteUcast(UcastMacsRemote) 385 Node org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.buildConnectionNode(Collection) 385 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.updateUcastMacsRemote(ReadWriteTransaction, Collection) 385 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepUcastMacsRemoteUpdateCommand.execute(ReadWriteTransaction) 385 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.HwvtepOperationalCommandAggregator.execute(ReadWriteTransaction) 385 void org.opendaylight.ovsdb.hwvtepsouthbound.transactions.md.TransactionInvokerImpl.run() 385 void java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 385 void java.util.concurrent.ThreadPoolExecutor$Worker.run() 385 void java.lang.Thread.run() 385 |
| Comment by Vishal Thapar [ 20/Sep/18 ] |
|
I believe these are from JFRs shared by E/// team? This is from HWVTEP. This column lists all macs that are available to this particular port. The way schema is designed and based on deployment, this will be every other mac in network*N where N is number of TOR devices. I am not sure if fromString is the culprit here, the string in ColumnVersion will be something like 1.7.12, it is literally the schema version string. It is the data in this column that can grow huge. Are we sure it is the fromVersion that is doing huge allocation and not contents of column itself? |
| Comment by Michael Vorburger [ 20/Sep/18 ] |
|
Yes, JFRs shared by E/// team ( |
| Comment by Vishal Thapar [ 20/Sep/18 ] |
|
Then something else is wrong. FromVersion can't and shouldn't be doing huge allocations coz version string will never be anything bigger than 'xxx.yyy.zzz'. How can a string like that cause huge allocations? For every table and column in OVSDB we define a from,to versions [e.g. 1.0.0, 2.0.0] to determine if a given switch supports a given column or not. This caode is just to check if column is supported or not. This simple version comparison to see if column falls within range or not. If issue were too many small allocations, I'd understand, but huge allocation is just not possible here. We're missing something significant. |
| Comment by Michael Vorburger [ 20/Sep/18 ] |
|
Nope, as a wrote, the issue is 1 GB but not of a huge String (I never said that, you interpreted) of many small, substring() String; specifically x3 for each part of the Version. What is the harm in getting c/76290 merged and seeing? It cannot hurt, at worst you are right and it doesn't help, at best I'm right and this overallocation disappears. Worth a try! |
| Comment by Vishal Thapar [ 20/Sep/18 ] |
|
Aha, makes sense. Sorry for confusion. Trading allocation for performance hit which definitely will be there? There is a reason pattern was used in first place. I'd prefer running tests with a distribution zip off the patch, getting results before we merge. Worst case is not that this doesn't help. Worst case is it makes everything too slow and scale numbers get even worse. EDIT: |
| Comment by Michael Vorburger [ 20/Sep/18 ] |
|
> performance hit which definitely will be there? There is a reason pattern was used in first place. What? What makes you say that? The change proposed in c/76290 will, if anything, make Version fromString() faster, not slower... FYI that parse() method is directly inspired by JDK's Integer.parseInt. skitt perhaps you would like to weight in here? > I'd prefer running tests with a distribution zip off the patch, getting results before we merge sure, always a good idea, can you CSIT it? Shout if anything broken (due to this), doubt it. |
| Comment by Stephen Kitt [ 21/Sep/18 ] |
|
I think Michael’s patch is definitely worth it, thanks. It’s a shame parseInt() can’t deal with substring parsing itself! NumberFormat can, but it’s too expensive. |
| Comment by Michael Vorburger [ 02/Oct/18 ] |
|
I've back-ported (manual cherry-pick and conflict resolution) https://git.opendaylight.org/gerrit/#/c/76290/ from Neon master to Fluorine (https://git.opendaylight.org/gerrit/#/c/76568) and Oxygen (https://git.opendaylight.org/gerrit/#/c/76569). |