Experiments in the YANGTOOLS-1128 area show that using the ANTLR parse tree has a downside in the amount of memory we consume. This stems from three facts:
- ANTLR is completely transparent, hence each token carries a lot of metadata about where it comes from. We do not use most of that metadata.
- A lot of the tokens are simple separators. We do not use those tokens at all.
- Tokens are not really immutable: they are kept mutable to support use cases we do not have (see the sketch after this list).
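To illustrate the contrast, here is a minimal sketch: ANTLR's Token interface exposes type, text, position, channel, index and source back-references, while the parser only ever consumes two of those attributes. LeanToken is a hypothetical name for illustration, not an actual yangtools class.

```java
import org.antlr.v4.runtime.Token;

// ANTLR's Token exposes getType(), getText(), getLine(),
// getCharPositionInLine(), getChannel(), getTokenIndex(), getStartIndex(),
// getStopIndex(), getTokenSource() and getInputStream(), and CommonToken is
// additionally mutable through the WritableToken setters.
final class LeanToken {
    private final int type;
    private final String text;

    LeanToken(final Token token) {
        // Retain only type and text; drop positions, channel, indices and the
        // back-references which keep the token source and input stream reachable.
        this.type = token.getType();
        this.text = token.getText();
    }

    int type() {
        return type;
    }

    String text() {
        return text;
    }
}
```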
We also do not allocate strings entirely efficiently: there are plenty of models in the wild which are auto-generated and do not take advantage of YANG facilities, hence they contain a number of duplicate construct definitions, and those strings end up being duplicated in memory simply because we are geared towards sane models.
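As a rough illustration of the deduplication idea (not the actual yangtools code), a Guava weak interner can funnel every such string through a single canonical instance:

```java
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

// Hypothetical helper: identical strings parsed from different modules end up
// sharing one canonical instance, held only weakly by the interner.
final class DedupStrings {
    private static final Interner<String> STRINGS = Interners.newWeakInterner();

    static String dedup(final String str) {
        return STRINGS.intern(str);
    }

    public static void main(final String[] args) {
        // Force two distinct instances, as two independently-parsed models would.
        final String first = dedup(new String("duplicate-definition"));
        final String second = dedup(new String("duplicate-definition"));
        System.out.println(first == second); // true: only one copy is retained
    }
}
```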
Introduce a heavily-interned intermediate representation instead of relying on the ANTLR-generated tree. While interning expends a non-trivial amount of CPU cycles to deduplicate strings, for benchmark models it eliminates a large amount of duplication. This also has some bearing on the size of the effective model, which seems to benefit from the upfront work. Experimentation shows a greater than 90% reduction in memory footprint.
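A sketch of what a heavily-interned IR node could look like under these assumptions; the names are illustrative, not the actual yangtools IR types. Each statement holds interned keyword and argument strings and an immutable list of substatements, with none of the ANTLR metadata surviving past the parse:

```java
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;
import java.util.List;

// Illustrative IR node: immutable, interned, free of ANTLR metadata.
final class IrStatement {
    private static final Interner<String> STRINGS = Interners.newWeakInterner();

    private final String keyword;
    private final String argument; // null for argument-less statements
    private final ImmutableList<IrStatement> substatements;

    IrStatement(final String keyword, final String argument,
            final List<IrStatement> substatements) {
        // CPU spent here on interning is repaid through sharing across the
        // many duplicate definitions found in auto-generated models.
        this.keyword = STRINGS.intern(keyword);
        this.argument = argument == null ? null : STRINGS.intern(argument);
        this.substatements = ImmutableList.copyOf(substatements);
    }

    String keyword() {
        return keyword;
    }

    String argument() {
        return argument;
    }

    List<IrStatement> substatements() {
        return substatements;
    }
}
```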
Split from: YANGTOOLS-1128 Add a dedicated ANTLR token factory (Resolved)