[YANGTOOLS-1093] yang-model-validator tool crashes with OOM Created: 31/Mar/20  Updated: 28/Oct/20  Resolved: 12/Jun/20

Status: Resolved
Project: yangtools
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Jamo Luhrsen Assignee: Unassigned
Resolution: Won't Do Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: JPEG File YANGTOOLS-1093.jpg    
Issue Links:
Issue split
split to YANGTOOLS-1112 Separate out ModelProcessingPhase.SOU... Confirmed

 Description   

This may be user error, or perhaps improperly written YANG, but when I run the model validator against some Juniper YANG models from the yang GitHub repo, the tool consumes all the memory it is given, spikes the CPU, and eventually crashes with an OOM.

Command to reproduce, run from the root of the yang repo:

java -Xmx12288m -jar ../yangtools/yang/yang-model-validator/target/yang-model-validator-5.0.0-SNAPSHOT-jar-with-dependencies.jar -p ./vendor/juniper/18.2/18.2R1/ -r -d vendor/juniper/18.2/18.2R1/junos/conf/junos-conf-root@2018-01-01.yang

Not very helpful, but here is one OOM crash trace:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at org.antlr.v4.runtime.misc.DoubleKeyMap.<init>(DoubleKeyMap.java:19)
	at org.antlr.v4.runtime.atn.ParserATNSimulator.computeReachSet(ParserATNSimulator.java:776)
	at org.antlr.v4.runtime.atn.ParserATNSimulator.execATNWithFullContext(ParserATNSimulator.java:664)
	at org.antlr.v4.runtime.atn.ParserATNSimulator.execATN(ParserATNSimulator.java:505)
	at org.antlr.v4.runtime.atn.ParserATNSimulator.adaptivePredict(ParserATNSimulator.java:393)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:276)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:229)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:229)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:229)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:229)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:229)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:229)
	at org.opendaylight.yangtools.yang.parser.antlr.YangStatementParser.statement(YangStatementParser.java:229)
	at org.opendaylight.yangtools.yang.parser.rfc7950.repo.YangStatementStreamSource.parseYangSource(YangStatementStreamSource.java:171)
	at org.opendaylight.yangtools.yang.parser.rfc7950.repo.YangStatementStreamSource.create(YangStatementStreamSource.java:98)
	at org.opendaylight.yangtools.yang.parser.impl.YangParserImpl.sourceToStatementStream(YangParserImpl.java:119)
	at org.opendaylight.yangtools.yang.parser.impl.YangParserImpl.addLibSource(YangParserImpl.java:73)
	at org.opendaylight.yangtools.yang.validator.SystemTestUtils.parseYangSources(SystemTestUtils.java:103)
	at org.opendaylight.yangtools.yang.validator.SystemTestUtils.parseYangSources(SystemTestUtils.java:87)
	at org.opendaylight.yangtools.yang.validator.Main.runSystemTest(Main.java:177)
	at org.opendaylight.yangtools.yang.validator.Main.main(Main.java:136)

Note that the above command uses a 12G heap. Naturally, the OOM is hit much
faster with a smaller heap. I will link a heap dump from a 1G reproduction.



 Comments   
Comment by Jamo Luhrsen [ 31/Mar/20 ]

Attached is a screenshot from the YourKit profiler showing an expanded view of the objects taking 99% of allocated memory, in case
that helps. It was taken from this heap dump.

Some of the Juniper models are very large, and I noticed the live profiler indicated some kind of deadlock: it flagged some threads
as not moving and mentioned some recursion happening. Possibly, if we are reading and holding on to duplicates of the same large
models over and over during recursion, we could end up in this trouble?

Comment by Robert Varga [ 12/Jun/20 ]

The stack trace indicates 'addLibSource', i.e. when a source is being added to the library. We are creating an ASTSchemaSource to determine the real identifier of the source and are retaining it for future use (same as we do for normal sources).

For normal sources this makes sense, as they are required and guaranteed to be touched during assembly. For library sources this is not as clear-cut, as they may or may not be referenced.

We need to instantiate ASTSchemaSource lazily when it is actually needed. We also should not retain a strong reference to it, so as to allow it to be GC'd if it ends up being unneeded.
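
To illustrate the idea only (this is not the yangtools implementation), here is a minimal Java sketch of a holder that parses on first use and keeps just a soft reference to the result. The class names LazySoftSource and LazySoftSourceDemo are hypothetical, and the choice of SoftReference over WeakReference is an assumption; the comment above only asks that no strong reference be retained.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/**
 * Hypothetical holder (not a yangtools class): defers an expensive parse
 * until the result is first requested and keeps it only softly reachable,
 * so the GC may reclaim it under memory pressure.
 */
final class LazySoftSource<T> {
    private final Supplier<T> loader;
    // Starts empty: nothing is parsed until get() is first called.
    private SoftReference<T> ref = new SoftReference<>(null);

    LazySoftSource(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, re-running the loader if it was never loaded or was collected. */
    synchronized T get() {
        T value = ref.get();
        if (value == null) {
            value = loader.get();
            ref = new SoftReference<>(value);
        }
        return value;
    }
}

class LazySoftSourceDemo {
    public static void main(String[] args) {
        AtomicInteger parses = new AtomicInteger();
        // The supplier stands in for parsing a library source.
        LazySoftSource<String> source = new LazySoftSource<>(() -> {
            parses.incrementAndGet();
            return "parsed-ast";
        });

        System.out.println("parses before first use: " + parses.get());
        String first = source.get();
        String second = source.get();
        // While 'first' is strongly held, the soft reference cannot be cleared.
        System.out.println("same instance: " + (first == second));
        System.out.println("parses after two gets: " + parses.get());
    }
}
```

With a holder like this, a library source that is never referenced during assembly is never parsed, and one that is parsed but no longer needed can be collected. The cost is a possible re-parse if the value is reclaimed and requested again.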

Comment by Robert Varga [ 12/Jun/20 ]

This is going to take a bit more effort, as it requires reworking how the SOURCE_PRE_LINKAGE phase works – which is part of YANGTOOLS-1112. Without that refactor, any laziness in file parsing would be immediately negated by the indiscriminate loading of statements.

 

Comment by Robert Varga [ 12/Jun/20 ]

jluhrsen while this issue is a valid improvement, I think the command uses a vastly over-specified library, which pulls in a ton of unrelated models – hence we should be able to limit the impact.

Comment by Jamo Luhrsen [ 12/Jun/20 ]

Agreed. Eventually I found ways to limit unneeded models and not kill things. Something like this in a test tool is
low priority and likely not anything to push for a fix at this point. Let's close it as Won't Do for now, and if someone
else comes along with a need we can look again.

Generated at Wed Feb 07 20:55:10 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.