[YANGTOOLS-1074] Define NormalizedBody and specializations Created: 22/Jan/20 Updated: 19/Sep/23 |
|
| Status: | Confirmed |
| Project: | yangtools |
| Component/s: | data-impl |
| Affects Version/s: | None |
| Fix Version/s: | 14.0.0 |
| Type: | Improvement | Priority: | High |
| Reporter: | Robert Varga | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Epic Link: | Redesign NormalizedNode |
| Description |
|
Analysis of a netvirt heap dump is showing that the data tree involved is around 640MiB, with 147MiB (~23%) retained by NormalizedNode implementations (6.5M objects). Each of these is costing typically 24 bytes, with 4 byte alignment shadow - typically containing the identifier and map of children. If we eliminate the idea that a NormalizedNode has an identifier, we would save ~49MiB (33% shallow, 13% overall) heap by making these cost typically 16 bytes. There are few places where identifier is required, which would have to be changed to carry Map.Entry<PathArgument, NormalizedNode> instead - a change cascading through quite a few interfaces. The most problematic will probably be NormalizedNodeStreamWriter. Aside from memory savings, this would solve the interesting problem of DataTree root node (i.e. corresponding to SchemaContext) needing an identifier, which is currently wedged to use SchemaContext.NAME - which can lead to problems with revisions in some cases. Consider dropping Identifiable from NormalizedNode, adjusting all users to cope with that. |
| Comments |
| Comment by Robert Varga [ 21/Feb/20 ] |
|
The idea is to have: public interface ContainerNode extends Map.Entry<NodeIdentifier, ContainerNodeState> [ extends SchemaNode state mess], Identifiable<NodeIdentifier>, ContainerNodeState { default NodeIdentifier getIdentifier() { return getKey(); } } public class DefaultContainerNode extends AbstractMap.ImmutableSimpleEntry<NodeIdentifier, ContainerNodeState> implements ContainerNode { } which would be a compat view. Except naming sucks – is much better to not lug around "State" suffix in most places. It also has performance implications on streaming API, where we really want to propagate these from a Map.entrySet(), otherwise we could end up hurting the GC. I do not believe we can deliver this in Aluminium.
|
| Comment by Robert Varga [ 13/Nov/20 ] |
|
The entry part of this proposal needs to be provided separately, as it would end up propagating type. The idea is that NormalizedNode is semantically equivalent to Entry<PathArgument, ? extends Object>, but we do not express that as a generic parameter by rather through an
NormalizedNode {
default Entry<? extends PathArgument, ? extends Object> toEntry() {
return Map.entry(getIdentifier(), body());
}
}
We then override this method to provide more precise approcimation, so that: ContainerNode extends NormalizedNode { @Override NodeIdentifier getIdentifier(); @Override ContainerBody body() @Override default Entry<NodeIdentifier, ContainerBody> toEntry() { return Map.entry(getIdentifier(), body()); } } |
| Comment by Robert Varga [ 13/Nov/20 ] |
|
We need the equivalent of ContainerBody for every NormalizedNode so that nodes can reconstructed easily. ForeignDataNodes are an exception: their body is a foreign representation, which we know nothing about (it can be a DOMSource). |
| Comment by Robert Varga [ 04/Nov/22 ] |
|
There is an interesting conflict between LeafNode and AnydataNode: how do we know an AnydataNode is not represented as a byte[], which is the same as a leaf for 'type binary'. If we scope out those two, they boil down to:
Looking at DataContainerNode, it is a DistinctNodeContainer (implying child addressability) and has three specializations:
DataContainerNode itself covers AugmentatioNode (which is going away) and ChoiceNode, ContainerNode, MapEntryNode, MountPointNode |