[NETCONF-346] modules-state conflict kills rest-connector-default-impl
Created: 02/Feb/17  Updated: 15/Mar/19  Resolved: 24/Feb/17

Status: Resolved
Project: netconf
Component/s: restconf-nb
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Vratko Polak
Assignee: Ivan Hrasko
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 7728

 Description   

Not sure if this belongs to Restconf or Netconf.

This Bug is very similar to BGPCEP-596 except it happens much more frequently in 3node all jobs.
"Server is unhealthy" (which causes red dot in CSIT) is caused by a conflicting modification, which prevents config subsystem module instantiation.

Karaf.log segments from [1]:

2017-02-02 12:35:19,049 | WARN | lt-dispatcher-21 | ShardDataTree | 238 - org.opendaylight.controller.sal-distributed-datastore - 1.5.0.SNAPSHOT | member-2-shard-default-operational: Store Tx member-2-datastore-operational-fe-0-chn-1-txn-0-0: Conflicting modification for path /(urn:ietf:params:xml:ns:yang:ietf-yang-library?revision=2016-06-21)modules-state.

...

2017-02-02 12:35:19,117 | ERROR | config-pusher | ConfigPusherImpl | 153 - org.opendaylight.controller.config-persister-impl - 0.6.0.SNAPSHOT | Failed to apply configuration snapshot: 10-rest-connector.xml(odl-restconf,odl-restconf)
java.lang.IllegalStateException: Error - getInstance() failed for ModuleIdentifier{factoryName='rest-connector-impl', instanceName='rest-connector-default-impl'} in transaction TransactionIdentifier{name='ConfigTransaction-52-54'}

...

[0] https://bugs.opendaylight.org/show_bug.cgi?id=7102
[1] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-clustering-only-carbon/576/console.log.gz



 Comments   
Comment by Vratko Polak [ 02/Feb/17 ]

This sometimes also happens [2] in the Netconf-only job, and sometimes [3] the all job is unaffected.

[2] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-only-carbon/434/console.log.gz
[3] https://jenkins.opendaylight.org/releng/view/netconf/job/netconf-csit-3node-clustering-all-carbon/180/

Comment by Jamo Luhrsen [ 02/Feb/17 ]

seems more like a "major" (or higher) bug to me.

Comment by Vratko Polak [ 16/Feb/17 ]

After looking into the code, I believe that this Bug is caused by the design choice of storing yang library (and monitoring) data in the operational datastore [0], without expecting that two members might be processing schema updates at the same time, which causes OptimisticLockFailedException here [1].

Note that different members may have different sets of features installed, and therefore different schema contexts. But I guess correct operation in a heterogeneous cluster is for later releases to figure out.

As a Carbon workaround, recognizing OptimisticLockFailedException and interpreting it as another node having already put the same data (thus ignoring it instead of throwing RestconfDocumentedException) would suffice.

[0] https://git.opendaylight.org/gerrit/gitweb?p=netconf.git;a=blob;f=restconf/sal-rest-connector/src/main/java/org/opendaylight/restconf/handlers/SchemaContextHandler.java;h=b00d0cfabd1c589a478bc7885759a40a34c4583d;hb=refs/heads/master#l59
[1] https://git.opendaylight.org/gerrit/gitweb?p=netconf.git;a=blob;f=restconf/sal-rest-connector/src/main/java/org/opendaylight/restconf/handlers/SchemaContextHandler.java;h=b00d0cfabd1c589a478bc7885759a40a34c4583d;hb=refs/heads/master#l78
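
A minimal sketch of the proposed workaround, for illustration only. It is not the code from SchemaContextHandler or from any merged change. The nested exception classes are hypothetical stand-ins for the real OpenDaylight types so that the example compiles on its own, and the class name ModulesStateWriteSketch, the method name awaitModulesStateWrite and the error messages are assumptions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class ModulesStateWriteSketch {
    // Hypothetical stand-ins for the real OpenDaylight exception types.
    static class OptimisticLockFailedException extends Exception {
        OptimisticLockFailedException(String message) { super(message); }
    }

    static class RestconfDocumentedException extends RuntimeException {
        RestconfDocumentedException(String message, Throwable cause) { super(message, cause); }
    }

    /**
     * Waits for a submitted modules-state write.
     * Returns true if this member's write committed, false if it lost the race
     * to another member. Any other failure is still reported as an error.
     */
    static boolean awaitModulesStateWrite(Future<Void> submitFuture) {
        try {
            submitFuture.get();
            return true;
        } catch (ExecutionException e) {
            if (e.getCause() instanceof OptimisticLockFailedException) {
                // Assumption: the conflict means another member has already
                // written the same data, so it is ignored instead of rethrown.
                return false;
            }
            throw new RestconfDocumentedException("Failed to write modules-state", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RestconfDocumentedException("Interrupted while writing modules-state", e);
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> conflict = new CompletableFuture<>();
        conflict.completeExceptionally(
                new OptimisticLockFailedException("Conflicting modification for path modules-state"));
        System.out.println(awaitModulesStateWrite(conflict));                                // prints false
        System.out.println(awaitModulesStateWrite(CompletableFuture.completedFuture(null))); // prints true
    }
}
```

The sketch only covers the homogeneous case, where every member writes identical data. In a heterogeneous cluster, ignoring the conflict would silently drop one member's view of the schema.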

Comment by Ivan Hrasko [ 16/Feb/17 ]

Carbon workaround:
https://git.opendaylight.org/gerrit/#/c/51961/

Comment by Vratko Polak [ 22/Feb/17 ]

> https://git.opendaylight.org/gerrit/#/c/51961/

Merged, but it should also be cherry-picked to stable/boron.

Comment by Ivan Hrasko [ 22/Feb/17 ]

stable/boron cherry pick:
https://git.opendaylight.org/gerrit/#/c/52169/1

Comment by Vratko Polak [ 22/Feb/17 ]

The workaround is not working around the problem well enough.

See "Transaction chain has failed" in https://jenkins.opendaylight.org/sandbox/job/controller-csit-3node-clustering-only-carbon/13/console

Also: https://git.opendaylight.org/gerrit/#/c/51961/10/restconf/sal-rest-connector/src/main/java/org/opendaylight/restconf/handlers/SchemaContextHandler.java@87

Comment by Ivan Hrasko [ 24/Feb/17 ]

improved workaround with transaction chain resetting:
https://git.opendaylight.org/gerrit/#/c/52199/
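
A rough sketch of what resetting a transaction chain could look like, for illustration only. It does not reproduce the linked change. TxChain is a hypothetical stand-in interface rather than the OpenDaylight transaction chain API, and ChainResetSketch, ChainResettingWriter and DemoChain are assumed names. The sketch assumes that a chain cannot be reused once a transaction on it has failed, so it is closed and replaced with a fresh one.

```java
import java.util.function.Supplier;

class ChainResetSketch {
    // Hypothetical stand-in for a datastore transaction chain.
    interface TxChain extends AutoCloseable {
        void write(String path, String data) throws Exception;
        @Override void close();
    }

    static class ChainResettingWriter {
        private final Supplier<TxChain> chainFactory;
        private TxChain chain;

        ChainResettingWriter(Supplier<TxChain> chainFactory) {
            this.chainFactory = chainFactory;
            this.chain = chainFactory.get();
        }

        /** Returns true if the write succeeded; false if it failed and the chain was reset. */
        synchronized boolean write(String path, String data) {
            try {
                chain.write(path, data);
                return true;
            } catch (Exception e) {
                // Assumption: the failed chain is unusable, so close it and start a new one.
                chain.close();
                chain = chainFactory.get();
                return false;
            }
        }
    }

    // Demo chain that rejects every write when 'broken' is true.
    static final class DemoChain implements TxChain {
        private final boolean broken;
        DemoChain(boolean broken) { this.broken = broken; }
        @Override public void write(String path, String data) throws Exception {
            if (broken) throw new IllegalStateException("Transaction chain has failed");
        }
        @Override public void close() { }
    }

    public static void main(String[] args) {
        boolean[] first = {true};
        ChainResettingWriter writer = new ChainResettingWriter(() -> {
            boolean broken = first[0]; // only the first chain is broken
            first[0] = false;
            return new DemoChain(broken);
        });
        System.out.println(writer.write("modules-state", "data")); // prints false
        System.out.println(writer.write("modules-state", "data")); // prints true
    }
}
```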

Comment by Vratko Polak [ 24/Feb/17 ]

> improved workaround

Merged to carbon and stable/boron.
