[YANGTOOLS-694] Eliminate duplicate DescriptionEffectiveStatementImpl objects Created: 18/Oct/16  Updated: 10/Apr/22  Resolved: 20/Jan/21

Status: Resolved
Project: yangtools
Component/s: None
Affects Version/s: None
Fix Version/s: 7.0.0

Type: Improvement Priority: Highest
Reporter: Robert Varga Assignee: Miroslav Kovac
Resolution: Done Votes: 0
Labels: None
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


Attachments: PNG File Screen Shot 2019-08-14 at 12.46.25.png     PNG File Screen Shot 2019-08-15 at 10.07.46.png     File heap.bin     PNG File image-2020-12-03-18-31-00-574.png     PNG File image-2020-12-03-18-56-27-486.png     PNG File yt694-dups.png    
Issue Links:
Blocks
blocks YANGTOOLS-1067 Use flyweight EffectiveStatement impl... Resolved
is blocked by YANGTOOLS-784 Defer statement initialization in Inf... Resolved
is blocked by YANGTOOLS-859 Augmenting a container with if-feature Resolved
Duplicate
is duplicated by YANGTOOLS-779 OOM expection while parsing 27.1 MB o... Resolved
Relates
relates to YANGTOOLS-1065 Minimize EffectiveStatement implement... Resolved
relates to YANGTOOLS-1066 SchemaPath identification of SchemaNo... Resolved
Sub-Tasks:
Key
Summary
Type
Status
Assignee
YANGTOOLS-724 Make AbstractEffectiveDataSchemaNode.... Sub-task Resolved Robert Varga  
YANGTOOLS-743 Eliminate use of StatementContextBase... Sub-task Resolved  
YANGTOOLS-1121 Cleanup AbstractStatementSupport impl... Sub-task Resolved Robert Varga  
YANGTOOLS-1163 Clarify StatementSupport.applyCopyPol... Sub-task Resolved Robert Varga  
YANGTOOLS-1164 Audit all StatementSupport classes fo... Sub-task Resolved Miroslav Kovac  
YANGTOOLS-1195 Shortcut InferredStatement.buildEffec... Sub-task Resolved Miroslav Kovac  
Epic Link: Parser Performance

 Description   

Analysis of a memory snapshot with vendor models shows we end up with a SchemaContext which weighs about 240MB. Further analysis of duplicate objects shows we have ~4000 instances of DescriptionEffectiveStatementImpl wasting ~400kB, which refer to a single description statement defined in a grouping.

This hints at us not having the understanding that a statement has been copied and unmodified from a grouping. Analyze the exact cause for this and fix it either by not performing a copy, or squashing resulting duplicates.

This theory is reinforced by the fact that the same dump shows ~64K WhenEffectiveStatementImpl objects (23x groups of 2769 instances each) coming from a construct like:

grouping foo {
  leaf bar {
    when "../x";
  }
}

A similar situation happens with UnitsEffectiveStatementImpl – 15K instances wasting 450kB.



 Comments   
Comment by Robert Varga [ 19/Oct/16 ]

I think the root cause for this is in SubstatementContext.copy

{Declared,Effective}

Statements and GroupingUtils.needToCopyByUses(). The check is a weird and undocumented one and does not take effect in situations like:

grouping foo {
  leaf foo {
    description "foo";
  }
}

Note leaf is copied, but since description's getParentContext() points to a leaf, the second part check fails and the description ends up being copied.

If this is true, the fix for this issue will also mean we will cut down temporary object use, too.

Comment by Robert Varga [ 19/Oct/16 ]

A simple unit test for this can be created. With the following models:

module foo {
  grouping foo {
    leaf foo {
      type string;
      units foo;
    }
  }

  grouping export {
    uses grouping foo;
  }
}

module bar {
  import foo;

  container bar-export {
    uses foo:export;
  }

  container bar-foo {
    uses foo:foo;
  }
}

module baz {
  import foo;

  container baz-export {
    uses foo:export;
  }

  container baz-foo {
    uses foo:foo;
  }
}

All units visible from effective model should point to the same UnitsEffectiveStatementImpls bject.

Comment by Robert Varga [ 25/Oct/16 ]

We have eliminated all obvious duplicates during assembly, with the exception of duplication of StatementContextBase, which is performed when statements are inlined via uses statement.

This is caused by the fact we copy all declared/effective statements eagerly, no matter if they will be touched in future – just to update their parent, path and copy history.

We need to create StatementContextBase equivalent proxy, which will perform the copy lazily as needed and add the changed CopyHistory and similar details to createEffective(), so it ends up reusing statement definitions if they are not touched by an augment, refine or similar.

Comment by Robert Varga [ 05/Jun/17 ]

YANGTOOLS-779 shows that this particular issue can be hit out in the field. ~28MB of models, which are presumably grouping/uses-heavy can cause us to run out 4GB of heap.

A memory snapshot taken due to an OOM during FULL_DECLARATION parsing phase shows 1.5GB memory usage, with 6.2M SubstatementContext instances retaining ~1.1GB.

While this memory usage is temporary (BuildGlobalContext shows 800MB retained at this point), we obviously need to address this to keep our resource usage within reasonable bounds.

Comment by Robert Varga [ 05/Jun/17 ]

I am not sure we can deliver this by Carbon SR1.

Solving this issue requires re-wiring 'uses', 'augment', 'refine' and 'deviate' mechanics, which deals with namespace management and child nodes. Namespace management there is tricky and we essentially want a copy-on-write StatementContextBase.

Comment by Robert Varga [ 07/Jun/17 ]

There are three sides to this problem:

1) the side we are investigating, which is reuse and lazily-populated StatementContextBase objects (which will need to be refactored). This involves dealing with CopyHistory and instantiation namespaces. Since this is traversed in a tree-like manner, this is a relatively simple problem, as all it boils down to is using forwarding objects in UsesStatementImpl's resolution phase.

2) the side of augment/refine/deviate making modifications to objects, which really expands forwarding objects with lazy instantiation to cope with modifications.

3) the side of actually triggering augment/deviate/uses phase actions, which is driven by reactor and SchemaNodeIdentifierBuildNamespace, which is populated by StmtContext.Mutable object coming and going.

It is the combination of 1) and 3) which makes this hard, as there are competing requirements. If 'uses' stops instantiating StatementContextBases, it stops feeding the namespace, which means augment/deviate/uses triggers will no longer trigger when they should.

Comment by Robert Varga [ 07/Jun/17 ]

Turning this into a milestone, as this will require multiple steps, which we do not want to do in one go simply because of the complexities involved.

Comment by Robert Varga [ 09/Mar/18 ]

Part 3) would be solved with YANGTOOLS-859, that issue needs to have a phase action trigger when an unsupported node is added – which can be also used to move a registration downwards until it hits the appropriate proxy, eventually either fully resolving the node, or knowing further resolution is impossible.

Comment by Robert Varga [ 08/Aug/18 ]

https://git.opendaylight.org/gerrit/73972 solves the triggering problem. All triggers now follow ChildSchemaNodeNamespace, hence lazy StatementContextBase equivalent class can be plugged in: it can implement namespaces and it can also act as a statement – hence 3) will not be a problem.

Part 2) is being solved in YANGTOOLS-724, and that is the next step we need to take.

Comment by Andrii Mazurian [ 14/Aug/19 ]

So, I've run unit test with next models:

module bar {
  namespace bar-ns;
  prefix bar;  
  
  import foo { prefix foo;}  
  
  container bar-export {
    uses foo:export;
  }  
  container bar-foo {
    uses foo:foo;
  }
}

module baz {
  namespace baz-ns;
  prefix baz;  
  
  import foo { prefix foo;}  
  
  container baz-export {
    uses foo:export;
  }  
  
  container baz-foo {
    uses foo:foo;
  }
}

module foo {
  namespace foo-ns;
  prefix foo;  
  
  grouping foo {
    leaf foo {
      type string;
      units foo;
    }
  }  
  grouping export {
    uses foo;
  }
} 

And memory usage less than 50 Mb.

rovarga any ideas how to reproduce it?

 

Comment by Robert Varga [ 14/Aug/19 ]

Well, you need to analyze a heap dump to see if there are duplicate objects – using YourKit, for example.

Comment by Andrii Mazurian [ 15/Aug/19 ]

Profile looks normal. There aren't thousands of duplicates.

Comment by Andrii Mazurian [ 15/Aug/19 ]

heap.bin here is heap memory dump

Comment by Robert Varga [ 15/Aug/19 ]

Comment by Robert Varga [ 15/Aug/19 ]

As you can see there are 6 identical objects, expanded from a single grouping. This may not look like much, but it certainly adds up when you work with something like https://github.com/Juniper/yang/tree/master/19.2/19.2R1/junos/conf .

 

Comment by Robert Varga [ 15/Aug/19 ]

To add some colour to the picture: a SchemaContext can easily weigh 375MiB, as noted (as a matter of course) in YANGTOOLS-1004 – and these things are per device-type, so they directly affect scalability to no small extent.

Comment by Robert Varga [ 22/Sep/20 ]

There are three remaining steps in this task:

  1. clarify StatementSupport.applyCopyPolicy() behavior
  2. audit all StatementSupport classes to check whether they are using correct CopyPolicy
  3. address the FIXME in CONTEXT_INDEPENDENT case of StatementContextBase.copyAsChildOf(), so that completely independent statements are not copied. Note the second FIXME should be taken care of by YANGTOOLS-784.
Comment by Robert Varga [ 03/Dec/20 ]

So where are here:

the highlighted statements should really be squashed, as at a number of them are duplicates, as confirmed by examining declared instance references:

Comment by Robert Varga [ 03/Dec/20 ]

This issue should take care of at least these.

Generated at Wed Feb 07 20:54:01 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.