[NETCONF-453] odl-netconf-topology creates two parallel connections for each configured device Created: 14/Aug/17  Updated: 15/Mar/19  Resolved: 27/Sep/17

Status: Resolved
Project: netconf
Component/s: netconf
Affects Version/s: None
Fix Version/s: None

Type: Bug
Reporter: Vratko Polak Assignee: Jakub Morvay
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Operating System: All
Platform: All


External issue ID: 8989

 Description   

This is happening on Nitrogen. Not sure for how long, as there were other issues hiding this.

On the robot side, this occasionally manifests [0] as observed data not matching the expected data.

Looking at the testtool log [1], the data do not match as two separate datastores are created:

12:57:03.218 [nioEventLoopGroup-2-1] DEBUG o.o.c.s.c.s.d.SnapshotBackedWriteTransaction - Write Tx: DOM-OPER-0 allocated with snapshot org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@14d698c6
12:57:03.218 [nioEventLoopGroup-2-1] DEBUG o.o.c.s.c.s.d.SnapshotBackedWriteTransaction - Write Tx: DOM-CFG-0 allocated with snapshot org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@4fea454c
12:57:03.219 [nioEventLoopGroup-2-4] DEBUG o.o.c.s.c.s.d.SnapshotBackedWriteTransaction - Write Tx: DOM-OPER-0 allocated with snapshot org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@2f6c84d4
12:57:03.222 [nioEventLoopGroup-2-4] DEBUG o.o.c.s.c.s.d.SnapshotBackedWriteTransaction - Write Tx: DOM-CFG-0 allocated with snapshot org.opendaylight.yangtools.yang.data.api.schema.tree.spi.Version@44e2b43d

And the reason for that is two clients connecting to the same simulated device (testtool does not expect two clients for the same device):

12:56:55.897 [main] INFO o.o.n.t.tool.NetconfDeviceSimulator - All simulated devices started successfully from port 17830 to 17830
12:56:56.943 [sshd-netconf-ssh-server-nio-group-thread-1] DEBUG o.a.sshd.common.io.nio2.Nio2Session - Creating IoSession on /10.29.13.225:17830 from /10.29.14.155:54522
...
12:57:02.727 [sshd-netconf-ssh-server-nio-group-thread-2] DEBUG o.a.sshd.common.io.nio2.Nio2Session - Creating IoSession on /10.29.13.225:17830 from /10.29.14.155:54524
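A quick way to confirm this from a testtool log is to count the "Creating IoSession" lines per listening endpoint. The following is a minimal sketch in Python, assuming the log line layout quoted above; the helper name `sessions_per_endpoint` is hypothetical and not part of any OpenDaylight tooling.

```python
import re
from collections import defaultdict

# Matches the sshd DEBUG lines quoted above, e.g.
# "... Creating IoSession on /10.29.13.225:17830 from /10.29.14.155:54522"
SESSION_RE = re.compile(
    r"Creating IoSession on /(?P<local>[\d.]+:\d+) from /(?P<remote>[\d.]+:\d+)"
)


def sessions_per_endpoint(lines):
    """Group client sockets by the simulated-device endpoint they connected to."""
    sessions = defaultdict(list)
    for line in lines:
        match = SESSION_RE.search(line)
        if match:
            sessions[match.group("local")].append(match.group("remote"))
    return dict(sessions)


# The two session lines from the testtool log excerpt in this report.
sample = [
    "12:56:56.943 [sshd-netconf-ssh-server-nio-group-thread-1] DEBUG o.a.sshd.common.io.nio2.Nio2Session - Creating IoSession on /10.29.13.225:17830 from /10.29.14.155:54522",
    "12:57:02.727 [sshd-netconf-ssh-server-nio-group-thread-2] DEBUG o.a.sshd.common.io.nio2.Nio2Session - Creating IoSession on /10.29.13.225:17830 from /10.29.14.155:54524",
]

for endpoint, clients in sessions_per_endpoint(sample).items():
    if len(clients) > 1:
        print(f"{endpoint}: {len(clients)} client sessions from {clients}")
```

Any endpoint reported with more than one client session is a simulated device that was connected to more than once.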

Looking at karaf.log [2], the first of the duplicated lines is:

2017-08-14 12:56:56,941 | INFO | n-dispatcher-105 | AbstractNetconfTopology | 126 - netconf-topology-config - 1.3.0.SNAPSHOT | Connecting RemoteDevice{Uri [_value=netconf-test-device]}, with config Node{getNodeId=Uri [_value=netconf-test-device], augmentations={interface org.opendaylight.yang.gen.v1.urn.opendaylight.netconf.node.topology.rev150114.NetconfNode=NetconfNode{getActorResponseWaitTime=5, getBetweenAttemptsTimeoutMillis=2000, getConcurrentRpcLimit=0, getConnectionTimeoutMillis=20000, getCredentials=LoginPassword{getPassword=topsecret, getUsername=admin, augmentations={}}, getDefaultRequestTimeoutMillis=60000, getHost=Host [_ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.225]]], getKeepaliveDelay=0, getMaxConnectionAttempts=0, getPort=PortNumber [_value=17830], getSchemaCacheDirectory=schema, getSleepFactor=1.5, isReconnectOnChangedSchema=false, isSchemaless=false, isTcpOnly=false}}}

followed by an equivalent line at 12:56:56,951 but from thread on-dispatcher-67.

[0] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-1node-userfeatures-only-nitrogen/110/log.html.gz#s1-s4-s1-t15-k2-k1-k2-k1
[1] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-1node-userfeatures-only-nitrogen/110/testtool--netconf-userfeatures-txt-CRUD-CRUD.log.gz
[2] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-1node-userfeatures-only-nitrogen/110/odl1_karaf.log.gz



 Comments   
Comment by Vratko Polak [ 14/Aug/17 ]

> 12:57:02.727

Not sure why the second session is opened so much later,
or why the Robot check [3] has not seen "the device" connected until the second connection has been established.

[3] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-1node-userfeatures-only-nitrogen/110/log.html.gz#s1-s4-s1-t5-k2-k1

Comment by Jakub Morvay [ 15/Aug/17 ]

This issue seems to be introduced by https://git.opendaylight.org/gerrit/#/c/52528/

The above-mentioned patch encrypts the netconf-node's password. The encryption happens while the specified netconf device is being mounted and works as follows:

1) A netconf node is created in netconf-topology. This triggers ODL to mount the specified netconf-node, and this is where the password encryption logic runs.
ODL tries to decrypt the netconf-node's password (with the help of AAAEncryptionService); if it cannot be decrypted, the password is assumed to be in plain text.

2) Assuming the password is in plain text, we try to encrypt it and write this new modified netconf-node back to netconf-topology. The problem is that this triggers the whole mounting process over again. Note that we continue mounting process for the first DS write operation, we are not ending it anyhow.

So we end up with two parallel connections to one netconf device. Actual netconf communication with the device should go only through the latter one, but I guess the netconf-testtool is not ready for such a scenario. Testtool creates a new persisted DS for each socket, so after a reconnect ODL can be connected to a different DS.
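The sequence described in this comment can be illustrated with a toy model. This is a minimal sketch in Python, not the netconf-topology code: the class `TopologySketch`, its methods, and the "enc:" prefix (standing in for a password that AAAEncryptionService can decrypt) are all hypothetical, and the real mount process is asynchronous rather than a nested call.

```python
class TopologySketch:
    """Toy model of a topology datastore with a single write listener."""

    def __init__(self):
        self.nodes = {}        # node id -> stored password
        self.connections = []  # one entry per connection opened to a device

    def write_node(self, node_id, password):
        # Every datastore write notifies the listener, which starts a mount.
        self.nodes[node_id] = password
        self._mount(node_id, password)

    def _mount(self, node_id, password):
        if not password.startswith("enc:"):
            # Plain-text password: store an encrypted copy. The write-back
            # notifies the listener again, which starts a second mount ...
            self.write_node(node_id, "enc:" + password)
        # ... and the current mount is not ended, so it connects as well.
        self.connections.append(node_id)


topology = TopologySketch()
topology.write_node("netconf-test-device", "topsecret")
print(len(topology.connections), "connections for one configured device")
```

A single configuration write with a plain-text password ends with two entries in `connections`, which corresponds to the two sessions seen in the testtool log.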

Comment by Vratko Polak [ 17/Aug/17 ]

> we continue mounting process for the first DS write operation, we are not ending it anyhow.

Ok, this looks like the thing to fix in this Bug.

> write this new modified netconf-node back to netconf-topology. The problem is that this triggers the whole mounting process over again.

That might be a design flaw which could be tracked in a separate Bugzilla item.

But that would not affect CSIT results directly, so only this Bug can be considered a blocker for Nitrogen.
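One way to read the fix suggested in this comment is to end the first mount as soon as the encrypted node has been written back, so that only the re-triggered mount opens a connection. The Python sketch below is a toy model under that assumption; it is not the change that was eventually merged, and `TopologySketch`, its methods, and the "enc:" prefix (standing in for an already encrypted password) are hypothetical.

```python
class TopologySketch:
    """Toy model: a datastore write triggers a mount via a listener."""

    def __init__(self):
        self.nodes = {}        # node id -> stored password
        self.connections = []  # one entry per connection opened to a device

    def write_node(self, node_id, password):
        # Every datastore write notifies the listener, which starts a mount.
        self.nodes[node_id] = password
        self._mount(node_id, password)

    def _mount(self, node_id, password):
        if not password.startswith("enc:"):
            # Plain-text password: write the encrypted copy back, which
            # re-triggers the mount, then end this first mount right here.
            self.write_node(node_id, "enc:" + password)
            return
        # Only a mount that sees an encrypted password opens a connection.
        self.connections.append(node_id)


topology = TopologySketch()
topology.write_node("netconf-test-device", "topsecret")
print(len(topology.connections), "connection for one configured device")
```

The write-back still re-triggers the mount, so the design concern raised above remains; the sketch only removes the duplicate connection.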

Comment by Jakub Morvay [ 22/Aug/17 ]

https://git.opendaylight.org/gerrit/#/c/61980/
