[YANGTOOLS-694] Eliminate duplicate DescriptionEffectiveStatementImpl objects Created: 18/Oct/16 Updated: 10/Apr/22 Resolved: 20/Jan/21 |
|
| Status: | Resolved |
| Project: | yangtools |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.0.0 |
| Type: | Improvement | Priority: | Highest |
| Reporter: | Robert Varga | Assignee: | Miroslav Kovac |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Σ Remaining Estimate: | Not Specified | Remaining Estimate: | Not Specified |
| Σ Time Spent: | Not Specified | Time Spent: | Not Specified |
| Σ Original Estimate: | Not Specified | Original Estimate: | Not Specified |
| Environment: |
Operating System: All |
||
| Attachments: |
|
||||||||||||||||||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Sub-Tasks: |
|
||||||||||||||||||||||||||||||||||||
| Epic Link: | Parser Performance | ||||||||||||||||||||||||||||||||||||
| Description |
|
Analysis of a memory snapshot with vendor models shows we end up with a SchemaContext which weighs about 240MB. Further analysis of duplicate objects shows we have ~4000 instances of DescriptionEffectiveStatementImpl wasting ~400kB, which refer to a single description statement defined in a grouping. This hints at us not having the understanding that a statement has been copied and unmodified from a grouping. Analyze the exact cause for this and fix it either by not performing a copy, or squashing resulting duplicates. This theory is reinforced by the fact that the same dump shows ~64K WhenEffectiveStatementImpl objects (23x groups of 2769 instances each) coming from a construct like:
grouping foo {
leaf bar {
when "../x";
}
}
A similar situation happens with UnitsEffectiveStatementImpl – 15K instances wasting 450kB. |
| Comments |
| Comment by Robert Varga [ 19/Oct/16 ] |
|
I think the root cause for this is in SubstatementContext.copy {Declared,Effective}Statements and GroupingUtils.needToCopyByUses(). The check is a weird and undocumented one and does not take effect in situations like:
grouping foo {
leaf foo {
description "foo";
}
}
Note leaf is copied, but since description's getParentContext() points to a leaf, the second part check fails and the description ends up being copied. If this is true, the fix for this issue will also mean we will cut down temporary object use, too. |
| Comment by Robert Varga [ 19/Oct/16 ] |
|
A simple unit test for this can be created. With the following models: module foo {
grouping foo {
leaf foo {
type string;
units foo;
}
}
grouping export {
uses grouping foo;
}
}
module bar {
import foo;
container bar-export {
uses foo:export;
}
container bar-foo {
uses foo:foo;
}
}
module baz {
import foo;
container baz-export {
uses foo:export;
}
container baz-foo {
uses foo:foo;
}
}
All units visible from effective model should point to the same UnitsEffectiveStatementImpls bject. |
| Comment by Robert Varga [ 25/Oct/16 ] |
|
We have eliminated all obvious duplicates during assembly, with the exception of duplication of StatementContextBase, which is performed when statements are inlined via uses statement. This is caused by the fact we copy all declared/effective statements eagerly, no matter if they will be touched in future – just to update their parent, path and copy history. We need to create StatementContextBase equivalent proxy, which will perform the copy lazily as needed and add the changed CopyHistory and similar details to createEffective(), so it ends up reusing statement definitions if they are not touched by an augment, refine or similar. |
| Comment by Robert Varga [ 05/Jun/17 ] |
|
A memory snapshot taken due to an OOM during FULL_DECLARATION parsing phase shows 1.5GB memory usage, with 6.2M SubstatementContext instances retaining ~1.1GB. While this memory usage is temporary (BuildGlobalContext shows 800MB retained at this point), we obviously need to address this to keep our resource usage within reasonable bounds. |
| Comment by Robert Varga [ 05/Jun/17 ] |
|
I am not sure we can deliver this by Carbon SR1. Solving this issue requires re-wiring 'uses', 'augment', 'refine' and 'deviate' mechanics, which deals with namespace management and child nodes. Namespace management there is tricky and we essentially want a copy-on-write StatementContextBase. |
| Comment by Robert Varga [ 07/Jun/17 ] |
|
There are three sides to this problem: 1) the side we are investigating, which is reuse and lazily-populated StatementContextBase objects (which will need to be refactored). This involves dealing with CopyHistory and instantiation namespaces. Since this is traversed in a tree-like manner, this is a relatively simple problem, as all it boils down to is using forwarding objects in UsesStatementImpl's resolution phase. 2) the side of augment/refine/deviate making modifications to objects, which really expands forwarding objects with lazy instantiation to cope with modifications. 3) the side of actually triggering augment/deviate/uses phase actions, which is driven by reactor and SchemaNodeIdentifierBuildNamespace, which is populated by StmtContext.Mutable object coming and going. It is the combination of 1) and 3) which makes this hard, as there are competing requirements. If 'uses' stops instantiating StatementContextBases, it stops feeding the namespace, which means augment/deviate/uses triggers will no longer trigger when they should. |
| Comment by Robert Varga [ 07/Jun/17 ] |
|
Turning this into a milestone, as this will require multiple steps, which we do not want to do in one go simply because of the complexities involved. |
| Comment by Robert Varga [ 09/Mar/18 ] |
|
Part 3) would be solved with |
| Comment by Robert Varga [ 08/Aug/18 ] |
|
https://git.opendaylight.org/gerrit/73972 solves the triggering problem. All triggers now follow ChildSchemaNodeNamespace, hence lazy StatementContextBase equivalent class can be plugged in: it can implement namespaces and it can also act as a statement – hence 3) will not be a problem. Part 2) is being solved in |
| Comment by Andrii Mazurian [ 14/Aug/19 ] |
|
So, I've run unit test with next models:
module bar {
namespace bar-ns;
prefix bar;
import foo { prefix foo;}
container bar-export {
uses foo:export;
}
container bar-foo {
uses foo:foo;
}
}
module baz {
namespace baz-ns;
prefix baz;
import foo { prefix foo;}
container baz-export {
uses foo:export;
}
container baz-foo {
uses foo:foo;
}
}
module foo {
namespace foo-ns;
prefix foo;
grouping foo {
leaf foo {
type string;
units foo;
}
}
grouping export {
uses foo;
}
}
And memory usage less than 50 Mb. rovarga any ideas how to reproduce it?
|
| Comment by Robert Varga [ 14/Aug/19 ] |
|
Well, you need to analyze a heap dump to see if there are duplicate objects – using YourKit, for example. |
| Comment by Andrii Mazurian [ 15/Aug/19 ] |
|
Profile looks normal. There aren't thousands of duplicates. |
| Comment by Andrii Mazurian [ 15/Aug/19 ] |
|
heap.bin |
| Comment by Robert Varga [ 15/Aug/19 ] |
|
|
| Comment by Robert Varga [ 15/Aug/19 ] |
|
As you can see there are 6 identical objects, expanded from a single grouping. This may not look like much, but it certainly adds up when you work with something like https://github.com/Juniper/yang/tree/master/19.2/19.2R1/junos/conf .
|
| Comment by Robert Varga [ 15/Aug/19 ] |
|
To add some colour to the picture: a SchemaContext can easily weigh 375MiB, as noted (as a matter of course) in |
| Comment by Robert Varga [ 22/Sep/20 ] |
|
There are three remaining steps in this task:
|
| Comment by Robert Varga [ 03/Dec/20 ] |
|
So where are here:
the highlighted statements should really be squashed, as at a number of them are duplicates, as confirmed by examining declared instance references:
|
| Comment by Robert Varga [ 03/Dec/20 ] |
|
This issue should take care of at least these. |