[CONTROLLER-1332] Implement keepalive mechanism in netconf southbound plugin Created: 22/May/15  Updated: 05/Jun/15  Resolved: 05/Jun/15

Status: Resolved
Project: controller
Component/s: netconf
Affects Version/s: Post-Helium
Fix Version/s: None

Type: Improvement
Reporter: Maros Marsalek Assignee: Maros Marsalek
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All



 Description   

Some devices drop the netconf session with ODL but the SSH/TCP connection itself is still present. In such situations, ODL has no way of knowing that the session was dropped and does not perform a reconnect.

The netconf southbound connector in ODL could periodically invoke an RPC to check whether the netconf session is still active. If there is no response, or the request fails at the network level, ODL should disconnect the session and schedule a reconnect. This way ODL would detect that the session is no longer active even though the connection was not fully dropped.

A suitable RPC would be get-config with an empty filter element. According to the RFC, an empty filter selects nothing, so an empty response should be returned and the device is not burdened much by this keepalive:
http://tools.ietf.org/html/rfc6241#section-6.4.2
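
As a rough illustration of the request described above, the following sketch builds a get-config RPC with an empty subtree filter as a plain string. The class name KeepaliveRpc, the build method and the choice of the running datastore as source are assumptions made for this example; they are not taken from the ODL code base.

```java
/** Hypothetical helper, not part of ODL: builds the keepalive request as text. */
final class KeepaliveRpc {
    /** Base NETCONF namespace defined by RFC 6241. */
    static final String NETCONF_BASE_NS = "urn:ietf:params:xml:ns:netconf:base:1.0";

    private KeepaliveRpc() {
    }

    /**
     * Returns a get-config request whose subtree filter is empty.
     * An empty filter selects nothing, so the expected reply carries no data.
     * Reading from the running datastore is an assumption of this sketch.
     */
    static String build(String messageId) {
        return "<rpc message-id=\"" + messageId + "\" xmlns=\"" + NETCONF_BASE_NS + "\">"
                + "<get-config>"
                + "<source><running/></source>"
                + "<filter type=\"subtree\"/>"
                + "</get-config>"
                + "</rpc>";
    }
}
```

A real implementation would more likely build the message with an XML API than by string concatenation; the string form is used here only to keep the request visible.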

  • The keepalive frequency should be configurable, with a reasonable default value
  • Keepalive should be postponed while any other netconf RPCs are being invoked. It is pointless to invoke the keepalive RPC if an application or a user is already invoking other RPCs at the time
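
The two requirements above could be sketched roughly as follows. KeepalivePolicy, its method names and the 60 second default are hypothetical and chosen only for illustration. The current time is passed in explicitly so the postponement logic can be exercised without a real clock or scheduler.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: decides when a keepalive is due on one session. */
final class KeepalivePolicy {
    /** Placeholder default; the issue only asks for "a reasonable default". */
    static final long DEFAULT_INTERVAL_SECONDS = 60;

    private final long intervalMillis;
    private long lastActivityMillis;

    KeepalivePolicy(long intervalSeconds, long nowMillis) {
        this.intervalMillis = TimeUnit.SECONDS.toMillis(intervalSeconds);
        this.lastActivityMillis = nowMillis;
    }

    /** Call whenever any RPC is invoked on the session; this postpones the keepalive. */
    void onRpcActivity(long nowMillis) {
        lastActivityMillis = nowMillis;
    }

    /** True once a full interval has passed without any RPC activity. */
    boolean keepaliveDue(long nowMillis) {
        return nowMillis - lastActivityMillis >= intervalMillis;
    }

    /** Remaining wait before the next keepalive, never negative. */
    long millisUntilDue(long nowMillis) {
        return Math.max(0, lastActivityMillis + intervalMillis - nowMillis);
    }
}
```

In this sketch a caller would schedule the next check using millisUntilDue, and would also call onRpcActivity when the keepalive itself is sent so that the interval restarts.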


 Comments   
Comment by Andrew McLachlan [ 22/May/15 ]

Hi Maros

There is a thread on this in the NETCONF WG. We would need any solution to be tunable, since requirements will differ from provider to provider.

http://www.ietf.org/mail-archive/web/netconf/current/msg10012.html

A clearer digest is here - https://github.com/netconf-wg/server-model/issues/43

Kent is proposing:

[1] .../connection-type/persistent/keep-alives/interval-secs:

Set default to 5 minutes

[2] .../connection-type/periodic/linger-secs:

Remove this node.

[3] .../reconnect-strategy/interval-secs:

Remove this node.

Comment by Maros Marsalek [ 25/May/15 ]

Hi Andrew,

I have already proposed a first version of this:

https://git.opendaylight.org/gerrit/#/c/20988/

The keepalive delay is configurable (in seconds). Users can turn keepalive off completely by providing a negative value.

The reconnect is performed immediately after a keepalive fails. There is no second timeout between the failure and the reconnect.
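
The behaviour described in this comment might look roughly like the sketch below. It is not the code from the Gerrit change: KeepaliveCheck, Outcome, run and the responseTimeoutMillis parameter are hypothetical names, and the response timeout is an assumption added so that a missing reply can be detected at all.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical sketch of one keepalive round on a session. */
final class KeepaliveCheck {
    enum Outcome { DISABLED, ALIVE, RECONNECT }

    private KeepaliveCheck() {
    }

    static Outcome run(long keepaliveDelaySeconds,
                       Supplier<CompletableFuture<String>> sendKeepalive,
                       long responseTimeoutMillis,
                       Runnable reconnect) {
        // A negative delay turns the keepalive off completely.
        if (keepaliveDelaySeconds < 0) {
            return Outcome.DISABLED;
        }
        try {
            // Wait for the reply; the timeout value is an assumption of this sketch.
            sendKeepalive.get().get(responseTimeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.ALIVE;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            reconnect.run();
            return Outcome.RECONNECT;
        } catch (ExecutionException | TimeoutException e) {
            // Failed or unanswered keepalive: reconnect right away, with no extra delay.
            reconnect.run();
            return Outcome.RECONNECT;
        }
    }
}
```

The reconnect callback is invoked directly in the failure path, which mirrors the statement that no second timeout is applied before reconnecting.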
