[NETCONF-1103] Incorrect response for failed PATCH request Created: 27/Jul/23  Updated: 10/Nov/23  Resolved: 10/Nov/23

Status: Resolved
Project: netconf
Component/s: restconf-nb
Affects Version/s: 6.0.0, 5.0.7, 4.0.8
Fix Version/s: 7.0.0, 4.0.9, 6.0.5, 5.0.10

Type: Bug Priority: Medium
Reporter: Sangwook Ha Assignee: Peter Suna
Resolution: Done Votes: 0
Labels: pt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Issue split
split to NETCONF-1197 Translate the patch request into a si... Open
split to NETCONF-1176 Improve failed YANG patch error output Resolved

 Description   

The status code of a PATCH request is always 200 as long as the request passes schema validation even if a transaction of an edit fails.

For example, the following PATCH request update/create NETCONF node:

PATCH /rests/data/network-topology:network-topology/topology=topology-netconf
Accept: application/yang-data+json
Content-Type: application/yang-patch+json

{
    "ietf-yang-patch:yang-patch": {
        "edit": [
            {
                "edit-id": "edit1",
                "operation": "merge",
                "target": "/node=netconf-mdsal",
                "value": {
                    "network-topology:node": [
                        {
                            "node-id": "netconf-mdsal",
                            "netconf-node-topology:concurrent-rpc-limit": 0,
                            "netconf-node-topology:schema-cache-directory": "netconf-mdsal",
                            "netconf-node-topology:password": "admin",
                            "netconf-node-topology:username": "admin",
                            "netconf-node-topology:default-request-timeout-millis": 1800000,
                            "netconf-node-topology:port": 2830,
                            "netconf-node-topology:tcp-only": false,
                            "netconf-node-topology:host": "127.0.0.1",
                            "netconf-node-topology:keepalive-delay": 100
                        }
                    ]
                }
            },
            {
                "edit-id": "edit2",
                "operation": "create",
                "target": "/node=netconf-mdsal",
                "value": {
                    "network-topology:node": [
                        {
                            "node-id": "netconf-mdsal",
                            "netconf-node-topology:keepalive-delay": 200
                        }
                    ]
                }
            }
        ],
        "patch-id": "patch1"
    }
}

returns 200 along with the following yang-patch-status:

{
    "ietf-yang-patch:yang-patch-status": {
        "patch-id": "patch1",
        "edit-status": {
            "edit": [
                {
                    "edit-id": "edit1",
                    "ok": [
                        null
                    ]
                },
                {
                    "edit-id": "edit2",
                    "errors": {
                        "error": [
                            {
                                "error-type": "protocol",
                                "error-tag": "data-exists",
                                "error-path": "/(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)network-topology/topology/topology[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)topology-id=topology-netconf}]/node/node[{(urn:TBD:params:xml:ns:yang:network-topology?revision=2013-10-21)node-id=netconf-mdsal}]",
                                "error-message": "Data already exists"
                            }
                        ]
                    }
                }
            ]
        }
    }
}

The data-exists error is correctly captured for the second edit, but the status code does not reflect the error.

And the PATCH request for a NETCONF device has other issues.

For example, the following PATCH request to update/create/update configuration on a NETCONF node:

PATCH /rests/data/network-topology:network-topology/topology=topology-netconf/node=ncserver/yang-ext:mount HTTP/1.1
Accept: application/yang-data+json
Content-Type: application/yang-patch+json

{
    "ietf-yang-patch:yang-patch": {
        "patch-id": "patch1",
        "edit": [
            {
                "edit-id": "edit1",
                "operation": "merge",
                "target": "/foo:foo",
                "value": {
                    "foo:foo": {
                        "baz": "edit1"
                    }
                }
            },
            {
                "edit-id": "edit2",
                "operation": "create",
                "target": "/foo:foo",
                "value": {
                    "foo:foo": {
                        "baz": "edit2"
                    }
                }
            },
            {
                "edit-id": "edit3",
                "operation": "merge",
                "target": "/foo:foo",
                "value": {
                    "foo:foo": {
                        "baz": "edit3"
                    }
                }
            }
        ]
    }
}

returns status code of 200 with the following yang-patch-status:

{
    "ietf-yang-patch:yang-patch-status": {
        "patch-id": "patch1",
        "errors": {
            "error": [
                {
                    "error-type": "protocol",
                    "error-tag": "data-exists",
                    "error-message": "Data already exists"
                }
            ]
        },
        "edit-status": {
            "edit": [
                {
                    "edit-id": "edit1",
                    "ok": [
                        null
                    ]
                },
                {
                    "edit-id": "edit2",
                    "ok": [
                        null
                    ]
                },
                {
                    "edit-id": "edit3",
                    "ok": [
                        null
                    ]
                }
            ]
        }
    }
}

The error is because of the second edit - it was trying to create a data resource which already exists. But the error information is in the global errors, errors right under yang-patch-status, while edit-status has ok for all 3 edits.

RFC 8072 Section 3:

  case global-errors {
    uses rc:errors;
    description
      "This container will be present if global errors that
       are unrelated to a specific edit occurred.";
}

So this response is very misleading - the yang-patch-status message implies that an error occurred after successful processing of all 3 edits.

NOTE

  • For this PATCH request 3 <edit-config> RPC calls are sent to the device. Ideally the 3rd edit shouldn't take place since the second one failed although any changes made to the candidate datastore will be reverted with <discard-changes> before the transaction is closed. The following log shows the 3rd <edit-confg> and its response:
    21:33:49.040 TRACE [globalWorkerGroup-3-4] RemoteDeviceId[name=ncserver, address=/127.0.0.1:830]: Matched request: <rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-8505">
        <edit-config>
            <target>
                <candidate/>
            </target>
            <error-option>rollback-on-error</error-option>
            <config>
                <foo xmlns="urn:foo" xmlns:op="urn:ietf:params:xml:ns:netconf:base:1.0" op:operation="merge">
                    <baz>edit3</baz>
                </foo>
            </config>
        </edit-config>
    </rpc>
     to response: <rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-8505">
        <ok/>
    </rpc-reply>
    
  • When the device does not support the candidate datastore and only supports the running datastore, there is no easy way to restore the original configuration, and the following requirement on error handling is not guaranteed if a PATCH request has more than one edit.
      container yang-patch {
        description
          "Represents a conceptual sequence of datastore edits,
           called a patch.  Each patch is given a client-assigned
           patch identifier.  Each edit MUST be applied
           in ascending order, and all edits MUST be applied.
           If any errors occur, then the target datastore MUST NOT
           be changed by the YANG Patch operation.
    


 Comments   
Comment by Ivan Hrasko [ 16/Aug/23 ]

On master branch the current logic is that when some operation fails then others are not executed.

Comment by Sangwook Ha [ 16/Aug/23 ]

I think that's true for MD-SAL but not for NETCONF.
For example, after the second RPC for edit2 failed:

20:17:05.219 TRACE [globalWorkerGroup-3-1] RemoteDeviceId[name=ncserver, address=/127.0.0.1:830]: Matched request: <rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-63">
    <edit-config>
        <target>
            <candidate/>
        </target>
        <error-option>rollback-on-error</error-option>
        <config>
            <foo xmlns="urn:foo" xmlns:op="urn:ietf:params:xml:ns:netconf:base:1.0" op:operation="create">
                <baz>edit2</baz>
            </foo>
        </config>
    </edit-config>
</rpc>
 to response: <rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-63">
    <rpc-error>
        <error-type>application</error-type>
        <error-tag>data-exists</error-tag>
        <error-severity>error</error-severity>
        <error-message xml:lang="en">Node "baz" to be created already exists.</error-message>
    </rpc-error>
</rpc-reply>

it still sends another RPC for edit3:

20:17:05.244 TRACE [globalWorkerGroup-3-1] RemoteDeviceId[name=ncserver, address=/127.0.0.1:830]: Matched request: <rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-64">
    <edit-config>
        <target>
            <candidate/>
        </target>
        <error-option>rollback-on-error</error-option>
        <config>
            <foo xmlns="urn:foo" xmlns:op="urn:ietf:params:xml:ns:netconf:base:1.0" op:operation="merge">
                <baz>edit3</baz>
            </foo>
        </config>
    </edit-config>
</rpc>
 to response: <rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-64">
    <ok/>
</rpc-reply>

An exception caught while processing each edit in PatchDataTransactionUtil.patchData, prevents further processing of outstanding patch entities. But in the case of NETCONF all createDataWithinTransaction does is just to enqueue a NETCONF RPC operation, which does not fail. And when a NETCONF RPC actually returns an error this does not cancel the rest of the operations, hence all the RPCs are dispatched whether previous one is successful or not.

Comment by Ivan Hrasko [ 17/Aug/23 ]

rkashapov can you please check this? If true you can create new task that addresses that.

Comment by Ruslan Kashapov [ 23/Aug/23 ]

Patch request is processed based on RestconfStrategy which varies depending on referenced instance identifier – see RestconfDataServiceImpl 

If instance identifier is associated with mount point and having NetconfService associated then NetconfRestconfStrategy used which relies on NETCONF protocol to process the request. This approach is used to addresses external devices mounted via topology mechanism. If instance identifier points to generic data node then MdsalRestconfStrategy  is used which relies on a standard DomDataBroker instance to update the datastore.

Each strategy has own implementation of RestconfTransaction with commit() acting differently. NetconfRestconfTransaction delegates almost all transaction operations to an instance of NetconfService. In a context of topology it's a ProxyNetconfService which queues operations before commit() then executed asynchronously on commit(). It's the reason why

  • no error is detected on each operations processing
  • NETCONF requests are continued despite error(s) on some of operations (during commit). 

Resuming:

  • It seems reasonable to make operations processing consistent – synchronous processing of operations with abort on first error for both MdsalRestconfTransaction and NetconfRestconfTransaction i.e. ProxyNetconfService.commit(). Significant effort is required to implement this.
  • For both implementations commit() errors contain no reference (edit-id) to operation failed. The main reason is RestconfTransaction API has no operation ID input on per operation basis, so there will be no reference in output. This results the commit error is always shown as global error. Updating transaction API and implementations seems to require a HUGE effort. It seems more reasonable to pay more attention to error messages, making them informative. Anyway no error message is lost even now no matter we point the operation failed or not.

The case when global error is shown together with all operations listed having success status is an obvious (easy to fix) bug. Will address it.

Comment by Robert Varga [ 06/Sep/23 ]

Right, and this will be probably be further differentiated, as we really should be translating the patch request into a single edit-config, based on what the NETCONF device actually supports.

Comment by Ivan Hrasko [ 09/Nov/23 ]
  1. https://git.opendaylight.org/gerrit/c/netconf/+/108880 solves the incorrect response code.
  2. https://git.opendaylight.org/gerrit/c/netconf/+/107687 solves "So this response is very misleading - the yang-patch-status message implies that an error occurred after successful processing of all 3 edits.".
Comment by Ivan Hrasko [ 09/Nov/23 ]

This can be done in https://jira.opendaylight.org/browse/NETCONF-1197.

Generated at Wed Feb 07 20:16:41 UTC 2024 using Jira 8.20.10#820010-sha1:ace47f9899e9ee25d7157d59aa17ab06aee30d3d.