[NETCONF-850] Cannot reconnect on second time Created: 24/Jan/22 Updated: 11/Feb/22 Resolved: 11/Feb/22 |
|
| Status: | Resolved |
| Project: | netconf |
| Component/s: | netconf |
| Affects Version/s: | 2.0.11 |
| Fix Version/s: | 2.0.13, 1.13.8 |
| Type: | Bug | Priority: | High |
| Reporter: | Martin Sunal | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Issue Links: |
|
| Description |
|
The ODL NETCONF client cannot reconnect after the first reconnection.
Distribution: onap-karaf-0.15.1
Steps to reproduce:
1. start netconf-testtool with 10 devices:
java -Djava.security.egd=file:/dev/./urandom -jar netconf-testtool-2.0.11-executable.jar --ssh true --device-count 10
2. mount all devices (mount-all.txt) => all devices are in "connected" state and it is possible to fetch data like:
GET http://localhost:8181/rests/data/network-topology:network-topology/topology=topology-netconf/node=1/yang-ext:mount/ietf-netconf-monitoring:netconf-state?content=nonconfig
3. kill netconf-testtool => all devices are in "connecting" state and the fetch from step 2 returns:
{ "errors": { "error": [ { "error-tag": "resource-denied-transport", "error-message": "Mount point does not exist.", "error-type": "protocol" } ] } }
All of that is expected.
4. start testtool as in step 1 and verify that it works as in step 2
5. kill netconf-testtool => all devices remain in "connected" state (this is the PROBLEM) and the fetch from step 2 returns 500 Internal Server Error:
{ "errors": { "error": [ { "error-tag": "operation-failed", "error-info": "java.lang.IllegalArgumentException: Unable to read data: Optional[/(urn:ietf:params:xml:ns:yang:ietf-netconf-monitoring?revision=2010-10-04)netconf-state], errors: [RpcError [message=Channel closed, severity=ERROR, errorType=TRANSPORT, tag=operation-failed, applicationTag=null, info=null, cause=null]]", "error-message": "Transaction failed", "error-type": "application" } ] } }
Keepalives are still running, and the logs show:
10:53:47.212 WARN [globalWorkerGroup-3-9] RemoteDevice{1}: Keepalive RPC failed with error: [RpcError [message=Channel closed, severity=ERROR, errorType=TRANSPORT, tag=operation-failed, applicationTag=null, info=null, cause=null]]
6. start testtool as in step 1 => ODL does not reconnect the devices and stays stuck in the state from step 5.
The problem is in steps 5 and 6: ODL reports "connected" although there is no connection to the NETCONF testtool, and ODL does not reconnect automatically when the testtool is started again. |
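For anyone scripting the reproduction, the two failure replies above can be told apart mechanically. The following is only a minimal sketch in Python, not part of ODL: the function names and the state labels ("connected", "disconnected", "stale") are hypothetical, and the matching relies solely on the error tags and the "Channel closed" text quoted in this report.

```python
import json
import re

# Error tags taken from the two response bodies quoted in this report.
MOUNT_POINT_GONE = "resource-denied-transport"  # step 3: expected after a disconnect
OPERATION_FAILED = "operation-failed"           # step 5: mount point kept, channel dead

# Matches the keepalive warning quoted in the report (hypothetical helper pattern).
KEEPALIVE_FAILURE = re.compile(
    r"RemoteDevice\{(?P<node>[^}]+)\}: Keepalive RPC failed with error: "
    r".*?message=(?P<message>[^,\]]+)"
)


def classify_response(status_code, body):
    """Map one RESTCONF reply to 'connected', 'disconnected', 'stale' or 'unknown'."""
    if 200 <= status_code < 300:
        return "connected"
    try:
        errors = json.loads(body)["errors"]["error"]
    except (ValueError, KeyError, TypeError):
        return "unknown"
    if not isinstance(errors, list):
        return "unknown"
    for err in errors:
        if not isinstance(err, dict):
            continue
        tag = err.get("error-tag")
        if tag == MOUNT_POINT_GONE:
            # Mount point was removed: ODL noticed the disconnect (expected).
            return "disconnected"
        if tag == OPERATION_FAILED and "Channel closed" in err.get("error-info", ""):
            # Mount point still exists but the session is dead (the reported bug).
            return "stale"
    return "unknown"


def keepalive_failure(log_line):
    """Return (node id, error message) for a keepalive warning line, else None."""
    match = KEEPALIVE_FAILURE.search(log_line)
    if match is None:
        return None
    return match.group("node"), match.group("message")
```

A reproduction script could call classify_response on the reply from the GET in step 2 after each kill of the testtool: "disconnected" corresponds to the expected behaviour in step 3, "stale" to the problem state in step 5.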
| Comments |
| Comment by Peter Puškár [ 11/Feb/22 ] |
|
Confirmed: the issue is present on Phosphorus SR-1, but it no longer reproduces on master after a retest. It seems to have been fixed by this patch: |