[YANGTOOLS-660] SchemaContext - excessive memory consumption Created: 20/Sep/16  Updated: 10/Apr/22  Resolved: 19/Oct/16

Status: Resolved
Project: yangtools
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Highest
Reporter: Andrej Mak Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Issue Links:
Blocks
blocks NETCONF-256 excessive memory consumption when mou... Resolved
blocks YANGTOOLS-692 Parser: duplicate ModelDefinedStateme... Resolved
blocks YANGTOOLS-693 Parser: duplicate TypeStatementImpl o... Resolved
Epic Link: Parser Performance
External issue ID: 6757

 Description   

When I create a SchemaContext from 500 complex models (8 MB of YANG sources), the SchemaContext instance occupies 1.6 GB of memory. Profiling suggests that one cause could be a large number of duplicated objects, probably EnumMaps in class SubstatementContext.
Would it be possible to somehow optimize it?



 Comments   
Comment by Robert Varga [ 15/Oct/16 ]

Can you share a heap dump and the list of models?

Comment by Robert Varga [ 17/Oct/16 ]

Analysis of the provided memory dump shows quite clearly that we are retaining implementation-internal details and state in the end result.

We retain BuildSchemaContext, SourceSpecificContext, SubstatementContext and similar, which should certainly not be retained once EffectiveSchemaContext has been built.

It would seem that the problem is EffectiveStatementBase.unknownSubstatementsToBuild, introduced in https://git.opendaylight.org/gerrit/#/c/28300/.

Comment by Robert Varga [ 17/Oct/16 ]

The approach taken to fix the recursion issue needs to be revised so that the EffectiveStatementBase.unknownSubstatementsToBuild field does not exist at all.
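
To illustrate the retention pattern described in the analysis above, here is a minimal sketch. It is not the actual yangtools code: BuildContext, LeakyEffectiveStatement, EagerEffectiveStatement and buildEffective() are hypothetical stand-ins, and only the field name unknownSubstatementsToBuild is borrowed from the analysis. It also ignores the recursion problem that the original deferred-build approach was meant to solve.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a build-time context such as SubstatementContext.
final class BuildContext {
    final String name;
    // Represents parser-internal state that is only needed while building.
    final byte[] scratch = new byte[1024];
    final List<BuildContext> children = new ArrayList<>();

    BuildContext(String name) {
        this.name = name;
    }

    // Produces the build result for this context (a plain String in this sketch).
    String buildEffective() {
        return "effective:" + name;
    }
}

// Leaky shape: the end result stores build contexts in a field for later use.
// As long as this object is reachable, so is every BuildContext it references.
final class LeakyEffectiveStatement {
    final List<BuildContext> unknownSubstatementsToBuild;

    LeakyEffectiveStatement(BuildContext ctx) {
        this.unknownSubstatementsToBuild = new ArrayList<>(ctx.children);
    }
}

// Revised shape: everything is materialized in the constructor and only the
// results are kept, so the build contexts can be garbage collected afterwards.
final class EagerEffectiveStatement {
    final List<String> unknownSubstatements;

    EagerEffectiveStatement(BuildContext ctx) {
        List<String> built = new ArrayList<>(ctx.children.size());
        for (BuildContext child : ctx.children) {
            built.add(child.buildEffective());
        }
        this.unknownSubstatements = Collections.unmodifiableList(built);
    }
}
```

In the leaky variant, every build context (and anything it references in turn) stays on the heap for the lifetime of the result object. In the eager variant, nothing in the result points back at the builder.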

Comment by Robert Varga [ 17/Oct/16 ]

master: https://git.opendaylight.org/gerrit/#/c/47038/

Comment by Robert Varga [ 17/Oct/16 ]

boron: https://git.opendaylight.org/gerrit/47041

Comment by A H [ 18/Oct/16 ]

Reopening bug since it needs to be fixed in Beryllium SR4.

Comment by A H [ 18/Oct/16 ]

To better assess the impact of this bug and fix, could someone from your team please help us identify the following:
Severity: Could you elaborate on the severity of this bug? Is this a BLOCKER such that we cannot release Beryllium without it? Is there a workaround such that we can write a release note?
Testing: Could you also elaborate on the testing of this patch? How extensively has this patch been tested? Is it covered by any unit tests or system tests?
Impact: Does this fix impact any dependent projects?

Comment by Robert Varga [ 18/Oct/16 ]

Severity:
This issue was introduced while fixing BUG-4456. That change fixed the original issue, but it inadvertently caused temporary objects to be retained indefinitely instead of being garbage collected, resulting in excessive memory overhead.

The problem is triggered by the presence of an 'extension' statement containing another extension. The only workaround is not to load models containing such constructs, which is not practical.

The issue could be exploited by a malicious SB device to cause resource exhaustion on the controller. Given that SR4 is our last planned maintenance release, we should not release it without this issue being fixed.

Testing:
BUG-4456 is covered by a unit test. The patch attached to this issue reverts the original fix, temporarily introducing a functionality regression while fixing the memory leak. The patch attached to BUG-4456 reworks the original fix in a way that does not leak memory. This is confirmed by the unit test passing again as expected.

The combination of the two patches has been manually tested against the device that triggered the original bug report. It was confirmed to work (no regression on BUG-4456) and to eliminate the memory leak (reported memory usage was ~240MB).

We currently do not have system tests that would identify this sort of issue, but we are in the process of introducing such a test suite in Carbon.

Impact:
No ODL projects use models which trigger this behavior as far as we can tell, hence they should not be impacted by this issue.

Comment by Robert Varga [ 18/Oct/16 ]

beryllium: https://git.opendaylight.org/gerrit/47091

Comment by A H [ 19/Oct/16 ]

Has this bug been verified as fixed in the latest Beryllium SR4 Build 20161019?
