[YANGTOOLS-1308] Reactor failure with OpenROADM models Created: 21/Jul/21  Updated: 24/Jul/21  Resolved: 24/Jul/21

Status: Resolved
Project: yangtools
Component/s: parser
Affects Version/s: 7.0.0, 7.0.3
Fix Version/s: 7.0.4

Type: Bug Priority: Highest
Reporter: Robert Varga Assignee: Robert Varga
Resolution: Done Votes: 0
Labels: regression
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File log.txt.xz    
Issue Links:
Relates
relates to YANGTOOLS-1184 Release InferredStatementContext's ef... Resolved

 Description   

The reference counting machinery introduced in yang-parser-reactor seems to have a hole triggered by OpenROADM models.

This is evident in https://git.opendaylight.org/gerrit/c/transportpce/+/96939 , where we end up with the following splat:

[WARNING] Unexpected error processing source RevisionSourceIdentifier [name=org-openroadm-de-operations@2020-05-29]. Please file an issue with this model attached.
com.google.common.base.VerifyException: Attempted to access reference count of SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)type}, rawArgument=string}
    at com.google.common.base.Verify.verify (Verify.java:181)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.incRef (ReactorStmtCtx.java:686)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.ReplicaStatementContext.<init> (ReplicaStatementContext.java:42)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.StatementContextBase.replicaAsChildOf (StatementContextBase.java:825)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.lambda$reusePrototypeReplicas$5 (InferredStatementContext.java:312)
    at java.util.stream.ReferencePipeline$3$1.accept (ReferencePipeline.java:195)
    at java.util.stream.ReferencePipeline$2$1.accept (ReferencePipeline.java:177)
    at java.util.Iterator.forEachRemaining (Iterator.java:133)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining (Spliterators.java:1801)
    at java.util.stream.AbstractPipeline.copyInto (AbstractPipeline.java:484)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto (AbstractPipeline.java:474)
    at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining (StreamSpliterators.java:312)
    at com.google.common.collect.CollectSpliterators$FlatMapSpliterator.lambda$forEachRemaining$1 (CollectSpliterators.java:361)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining (Spliterators.java:948)
    at com.google.common.collect.CollectSpliterators$FlatMapSpliterator.forEachRemaining (CollectSpliterators.java:357)
    at java.util.stream.AbstractPipeline.copyInto (AbstractPipeline.java:484)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto (AbstractPipeline.java:474)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential (ReduceOps.java:913)
    at java.util.stream.AbstractPipeline.evaluate (AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect (ReferencePipeline.java:578)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.reusePrototypeReplicas (InferredStatementContext.java:316)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.reusePrototypeReplicas (InferredStatementContext.java:306)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.tryToReuseSubstatements (InferredStatementContext.java:287)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.tryToReusePrototype (InferredStatementContext.java:228)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.createEffective (InferredStatementContext.java:217)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.StatementContextBase.createEffective (StatementContextBase.java:412)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.loadEffective (ReactorStmtCtx.java:382)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.buildEffective (ReactorStmtCtx.java:372)
    at java.util.stream.ReferencePipeline$3$1.accept (ReferencePipeline.java:195)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining (ArrayList.java:1655)
    at java.util.stream.AbstractPipeline.copyInto (AbstractPipeline.java:484)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto (AbstractPipeline.java:474)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential (ReduceOps.java:913)
    at java.util.stream.AbstractPipeline.evaluate (AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect (ReferencePipeline.java:578)
    at org.opendaylight.yangtools.yang.parser.spi.meta.AbstractStatementSupport.buildEffectiveSubstatements (AbstractStatementSupport.java:141)
    at org.opendaylight.yangtools.yang.parser.spi.meta.AbstractStatementSupport.createEffective (AbstractStatementSupport.java:85)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.StatementContextBase.createEffective (StatementContextBase.java:426)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.StatementContextBase.createEffective (StatementContextBase.java:420)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.StatementContextBase.createEffective (StatementContextBase.java:412)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.loadEffective (ReactorStmtCtx.java:382)
    at org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.buildEffective (ReactorStmtCtx.java:372)

i.e. we are trying to increment reference count on a statement which has had its processing completed:

        verify(current >= REFCOUNT_NONE, "Attempted to access reference count of %s", this);

The error reproduces most of the time, but the reported model may differ. With some luck the assembly may even finish without an issue.



 Comments   
Comment by Robert Varga [ 21/Jul/21 ]

The pass through InferredStatementContext means we are going through an augment/uses edge towards a definition source and that source has already been processed by its parent while we were not holding a reference. It would seem we either are failing to take a reference in some edge case, or we are dropping a reference too early.

This will require a run with full tracing, I am afraid.

Comment by Robert Varga [ 22/Jul/21 ]

So the reproducer is in transportpce.git/ordmodels/device, mostly. I have run 11 runs, out of which:

  • 3 succeeded
  • 6 failed on org.openroadm-de-operations, with 1:2:3 split between revisions 2016-10-14, 2018-10-19, 2020-05-29
  • 2 failed on org-openroadm-resources, in one in each of 2016-10-14 and 2018-10-19

All of the failures are touching a statement which has already been cleaned up. Augmented logging, not present in current code base, reveals that just before the failure we are sweeping a 'uses' statement referencing a grouping from org-openroadm-resource-types.

The failures are of two kinds, not related to which module is being blamed. Twice we do this:

[INFO] Sweeping on decrement SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)uses}, rawArgument=org-openroadm-resource-types:device-id, refCount=0}
// ...
com.google.common.base.VerifyException: Attempted to access reference count of SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)description}, rawArgument=Node Id is a globally unique identifier for a device.
Same as leafref value in model, if applicable., refCount=SWEPT}

and 6 times we do this:

[INFO] Sweeping on decrement ReplicaStatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)description}, rawArgument=Node Id is a globally unique identifier for a device.
Same as leafref value in model, if applicable., refCount=0}
[INFO] Sweeping on decrement InferredStatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)leaf}, rawArgument=node-id, refCount=0}
[INFO] Sweeping on decrement SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)uses}, rawArgument=org-openroadm-resource-types:circuit-pack-name, refCount=0}
[INFO] Sweeping on decrement ReplicaStatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)type}, rawArgument=string, refCount=0}
[INFO] Sweeping on decrement ReplicaStatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)mandatory}, rawArgument=true, refCount=0}
[INFO] Sweeping on decrement ReplicaStatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)description}, rawArgument=Circuit-Pack identifier. Unique within the context of a device.
Same as leafref value in model, if applicable., refCount=0}
[INFO] Last released on SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)leaf}, rawArgument=circuit-pack-name, refCount=0}
[INFO] Sweeping on decrement SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)leaf}, rawArgument=circuit-pack-name, refCount=0}
[INFO] Sweeping on decrement InferredStatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)leaf}, rawArgument=circuit-pack-name, refCount=0}
[INFO] Sweeping on decrement SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)case}, rawArgument=circuit-pack, refCount=0}
[INFO] Sweeping on decrement SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)uses}, rawArgument=org-openroadm-resource-types:port-name, refCount=0}
// ...
com.google.common.base.VerifyException: Attempted to access reference count of SubstatementContext{definition=StatementDefinitionContext{statement=(urn:ietf:params:xml:ns:yang:yin:1)type}, rawArgument=string, refCount=SWEPT}

 

Comment by Robert Varga [ 22/Jul/21 ]

The second failure mode is probably more interesting, as it shows a consistent sweep cascade.
It seems both failures modes are rooted in the following block in org-openroadm-device:

  grouping external-link {
    leaf external-link-name {
      type string;
    }
    container source {
      uses org-openroadm-resource-types:device-id {
        refine "node-id" {
          mandatory true;
        }
      }
      uses org-openroadm-resource-types:port-name {
        refine "circuit-pack-name" {
          mandatory true;
        }
        refine "port-name" {
          mandatory true;
        }
      }
    }
    container destination {
      uses org-openroadm-resource-types:device-id {
        refine "node-id" {
          mandatory true;
        }
      }
      uses org-openroadm-resource-types:port-name {
        refine "circuit-pack-name" {
          mandatory true;
        }
        refine "port-name" {
          mandatory true;
        }
      }
    }
  }

There does not seem to be anything special about these, except one thing: they are the only 'uses' from an external module which is coupled with a 'refine' statement.

Comment by Robert Varga [ 22/Jul/21 ]

It suspect the problem is that we need to still retain a refcount to "port-name/circuit-pack-name" or "device-id/node-id", in the source tree until after the refine statement itself is completely built.

It needs some more investigation, but perhaps this is enough to build a minimal reproducer model.

Comment by Robert Varga [ 23/Jul/21 ]

The statement has been swept from this context:

org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweepDone(ReactorStmtCtx.java:913)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweepState(ReactorStmtCtx.java:923)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweepOnDecrement(ReactorStmtCtx.java:815)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.lastDecRef(ReactorStmtCtx.java:743)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.decRef(ReactorStmtCtx.java:725)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReplicaStatementContext.sweepSubstatements(ReplicaStatementContext.java:142)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweepState(ReactorStmtCtx.java:921)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweep(ReactorStmtCtx.java:784)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweep(ReactorStmtCtx.java:773)
org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.sweepSubstatements(InferredStatementContext.java:505)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweepState(ReactorStmtCtx.java:921)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.sweepOnDecrement(ReactorStmtCtx.java:815)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.lastDecRef(ReactorStmtCtx.java:743)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.decRef(ReactorStmtCtx.java:725)
org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.tryToReuseSubstatements(InferredStatementContext.java:288)
org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.tryToReusePrototype(InferredStatementContext.java:228)
org.opendaylight.yangtools.yang.parser.stmt.reactor.InferredStatementContext.createEffective(InferredStatementContext.java:217)
org.opendaylight.yangtools.yang.parser.stmt.reactor.StatementContextBase.createEffective(StatementContextBase.java:412)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.loadEffective(ReactorStmtCtx.java:385)
org.opendaylight.yangtools.yang.parser.stmt.reactor.ReactorStmtCtx.buildEffective(ReactorStmtCtx.java:375)
Comment by Robert Varga [ 23/Jul/21 ]

log.txt contains a run with some additional logging, which might help with finding out the cause behind this.

Comment by Robert Varga [ 24/Jul/21 ]

After a bit of analysis, it seems the problem is with the UsesStatementSupport.performRefine().  We are performing materialization of substatements, which is fine, but as part of that operation we end up calculating (and memoizing) parent reference counts. Since we are still working within inference, not all of refcount has been materialized, yet – and we therefore memoize that 'type string's parent 'leaf type' has no references. It later grows a reference, but by that time the damage has already been done – since 'type string' saw no parent references and it built its effective statement, it sweeps itself.

We later follow the reference we are holding and end up colliding with the invariant.

An indication of this is that the not all statements in the hierarchy (most notably module) have not reached ModelProcessingPhase.EFFECTIVE_MODEL. This means that any asks for parent refcount should not result in 'absent', as we just do not know.

Generated at Wed Feb 07 20:55:47 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.