[YANGTOOLS-1130] Add an explicit intermediate representation for YANG Created: 15/Aug/20  Updated: 07/Sep/20  Resolved: 07/Sep/20

Status: Resolved
Project: yangtools
Component/s: parser
Affects Version/s: None
Fix Version/s: 6.0.0, 5.0.6

Type: Improvement Priority: Medium
Reporter: Robert Varga Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Issue split
split from YANGTOOLS-1128 Add a dedicated ANTLR token factory Resolved

 Description   

Experiments in the YANGTOOLS-1128 area show that using the ANTLR parse tree has a downside in the amount of memory we consume. This comes from three facts:

  1. ANTLR is completely transparent, hence each token has a lot of metadata about where it comes from. We do not use most of that metadata.
  2. A lot of the tokens are simple separators. We do not use those tokens at all.
  3. Tokens are not really immutable, in order to support some use cases which we do not rely on.

We also do not allocate strings entirely efficiently: there are plenty of auto-generated models in the wild which do not take advantage of YANG facilities, hence they contain a number of duplicate construct definitions – and the corresponding strings end up being duplicated simply because we are geared towards sane models.

Introduce a heavily-interned intermediate representation instead of relying on the ANTLR-generated tree. While interning expends a non-trivial amount of CPU cycles to get strings de-duplicated, for benchmark models it eliminates a large amount of duplication. This also has some bearing on the size of the effective model, which seems to benefit from this upfront work. Experimentation shows >90% memory footprint reduction.
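
To illustrate the idea, here is a minimal sketch of what an interned, immutable statement node could look like. It is not the actual yangtools implementation: the names IrSketch, Statement, Interner and leaf() are hypothetical and exist only for this example. The node keeps just a keyword, an optional argument and its substatements (no separator tokens, no per-token source metadata), and every string passes through a shared pool so that equal strings are stored once.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; none of these names come from yangtools.
final class IrSketch {
    private IrSketch() {
        // Utility holder, not instantiable
    }

    /** Immutable statement node: keyword, optional argument, substatements. */
    public static final class Statement {
        private final String keyword;
        private final String argument; // null when the statement has no argument
        private final List<Statement> substatements;

        Statement(String keyword, String argument, List<Statement> substatements) {
            this.keyword = keyword;
            this.argument = argument;
            // Defensive, unmodifiable copy keeps the node immutable
            this.substatements = List.copyOf(substatements);
        }

        public String keyword() {
            return keyword;
        }

        public String argument() {
            return argument;
        }

        public List<Statement> substatements() {
            return substatements;
        }
    }

    /** Simple string pool: equal strings map to one canonical instance. */
    public static final class Interner {
        private final Map<String, String> pool = new ConcurrentHashMap<>();

        public String intern(String str) {
            if (str == null) {
                return null;
            }
            // putIfAbsent returns the previously pooled instance, if any
            final String existing = pool.putIfAbsent(str, str);
            return existing != null ? existing : str;
        }

        public int size() {
            return pool.size();
        }
    }

    /** Builds a statement without substatements, interning both strings. */
    public static Statement leaf(Interner interner, String keyword, String argument) {
        return new Statement(interner.intern(keyword), interner.intern(argument), List.of());
    }
}
```

With such a pool, two occurrences of the same statement text in different places of a model share the same String instances, at the cost of one hash lookup per string while the representation is being built.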

 



 Comments   
Comment by Robert Varga [ 27/Aug/20 ]

The integration with yang.repo.api is unfortunately incompatible and slightly wrong. We need to introduce IRSchemaSource to integrate properly. The ASTSchemaSource changes will need to be rolled back so that we maintain the previous API for backports.
