[NETCONF-224] Netconf session closes and reconnects on error from non-heart beat RPC
Created: 04/Jul/16  Updated: 15/Mar/19  Resolved: 19/Jul/16

Status: Resolved
Project: netconf
Component/s: netconf
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Brian Freeman
Assignee: Jakub Morvay
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 6153

 Description   

The NETCONF session closes when an RPC error is returned for a non-heartbeat transaction. The desired behavior is for the session to stay up and for the error to be returned to the application (in this case a RESTCONF-initiated PUT, which issues a lock before edit-config).

Here is the log entry for the lock that failed. The lock failed because of a misconfiguration on the device side affecting write transactions; get-config and other read-only transactions were working fine before this error.

2016-06-28 21:59:10,858 | TRACE | oupCloseable-7-1 | NetconfDeviceCommunicator | RemoteDevice{ncs5502}: Matched request: <rpc message-id="m-1" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<lock>
<target>
<candidate/>
</target>
</lock>
</rpc>
to response: <rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-1">
<rpc-error>
<error-type>application</error-type>
<error-tag>operation-failed</error-tag>
<error-severity>error</error-severity>
<error-message xml:lang="en">ME_BACKEND_ERROR_UNAUTHORIZED [non-QT backend request failed]</error-message>
</rpc-error>
<rpc-error>
<error-type>application</error-type>
<error-tag>backend-status-summary</error-tag>
<error-severity>warning</error-severity>
<error-message xml:lang="en">QT/SysDB backend passed, non-QT backend failed [read rest of response for details]</error-message>
</rpc-error>
</rpc-reply>

2016-06-28 21:59:10,860 | WARN | oupCloseable-7-1 | KeepaliveSalFacade | RemoteDevice{ncs5502}: Rpc failure detected. Reconnecting netconf session
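
For reference, the reply in the log above is a well-formed rpc-reply carrying two rpc-error elements, one with severity "error" and one with severity "warning". A minimal Python sketch that extracts them is shown here; the helper name rpc_errors and the dictionary keys are illustrative assumptions and are not part of the ODL code base.

```python
import xml.etree.ElementTree as ET

# NETCONF base namespace, as it appears in the logged reply.
NS = {"nc": "urn:ietf:params:xml:ns:netconf:base:1.0"}

# The rpc-reply copied from the TRACE log entry.
REPLY = """<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-1">
<rpc-error>
<error-type>application</error-type>
<error-tag>operation-failed</error-tag>
<error-severity>error</error-severity>
<error-message xml:lang="en">ME_BACKEND_ERROR_UNAUTHORIZED [non-QT backend request failed]</error-message>
</rpc-error>
<rpc-error>
<error-type>application</error-type>
<error-tag>backend-status-summary</error-tag>
<error-severity>warning</error-severity>
<error-message xml:lang="en">QT/SysDB backend passed, non-QT backend failed [read rest of response for details]</error-message>
</rpc-error>
</rpc-reply>"""


def rpc_errors(reply_xml):
    """Return one dict per <rpc-error> child of the reply (hypothetical helper)."""
    root = ET.fromstring(reply_xml)
    errors = []
    for err in root.findall("nc:rpc-error", NS):
        errors.append({
            "type": err.findtext("nc:error-type", namespaces=NS),
            "tag": err.findtext("nc:error-tag", namespaces=NS),
            "severity": err.findtext("nc:error-severity", namespaces=NS),
            "message": err.findtext("nc:error-message", namespaces=NS),
        })
    return errors


if __name__ == "__main__":
    for e in rpc_errors(REPLY):
        print(e["severity"], e["tag"], e["message"])
```

The point of the sketch is only that the device did answer the lock request, so the failure is an application-level error inside a delivered reply rather than a lost message.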



 Comments   
Comment by Jakub Morvay [ 12/Jul/16 ]

Hi Brian,

I tried to reproduce this bug but wasn't able to. I simulated the device with the netconf test tool, modified to return an error when processing a lock request.

But the keepalive mechanism didn't drop the session. Actually, looking through the implementation, a reconnect should be triggered only when an RPC fails to reach the device, a response is not received, and so on. Any RPC response, even one containing an rpc-error, should be fine.
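
The intended behavior described in this comment could be sketched roughly as follows. This is an illustration in Python, not the actual KeepaliveSalFacade code: the names Action and on_rpc_result are hypothetical, and a missing response (timeout or send failure) is modeled as None by assumption.

```python
import xml.etree.ElementTree as ET
from enum import Enum

NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"


class Action(Enum):
    RECONNECT = "reconnect"        # transport-level failure: drop and re-establish the session
    RETURN_OK = "return-ok"        # reply received without rpc-error: hand result to the caller
    RETURN_ERROR = "return-error"  # reply received with rpc-error: hand error to the caller


def on_rpc_result(reply_xml):
    """Decide what to do after an RPC completes (hypothetical helper).

    reply_xml is the rpc-reply text, or None when no reply arrived
    (an assumption used here to model timeouts and send failures).
    """
    if reply_xml is None:
        # Only the absence of a reply should trigger a reconnect.
        return Action.RECONNECT
    root = ET.fromstring(reply_xml)
    has_error = root.find(f"{{{NETCONF_NS}}}rpc-error") is not None
    # An rpc-error is still a delivered reply, so the session stays up.
    return Action.RETURN_ERROR if has_error else Action.RETURN_OK


if __name__ == "__main__":
    failed_lock = (
        f'<rpc-reply xmlns="{NETCONF_NS}" message-id="m-1">'
        "<rpc-error><error-tag>operation-failed</error-tag></rpc-error>"
        "</rpc-reply>"
    )
    print(on_rpc_result(failed_lock))
    print(on_rpc_result(None))
```

Under this model, the failed lock reported in the description would be returned to the application as an error and would not cause a reconnect.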

If possible, can you please send the karaf log? Alternatively, you can try configuring the device without the keepalive mechanism and sending some RPCs to the device after the unsuccessful lock.

Comment by Brian Freeman [ 19/Jul/16 ]

I just tried to recreate it as well on Beryllium SR2 and could not. I suspect the first iterations of the interface with the device were flaky. If this recurs I will re-open the report. Sorry for the false alarm.
