[NETVIRT-1015] NatEvpnUtil: getExtNwProvTypeFromRouterName : external network UUID is not available for router 53678e60-fadf-4eac-ac7c-23fad8c99dc2 Created: 20/Nov/17  Updated: 17/Sep/18  Resolved: 17/Sep/18

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: Oxygen, Fluorine
Fix Version/s: None

Type: Bug Priority: Lowest
Reporter: Sam Hague Assignee: Sam Hague
Resolution: Done Votes: 0
Labels: csit:3node
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to NETVIRT-1324 OptimisticLockFailedException.../flow... Resolved
Epic Link: Clustering Stability

 Description   

3-node tests where ODL1 is taken down and brought back into service. The errors below happen when ODL1 is brought back up.

https://logs.opendaylight.org/releng/jenkins092/netvirt-csit-3node-openstack-ocata-upstream-stateful-carbon/184/odl_1/odl1_err_warn_exception.log.gz

2017-11-20 03:43:35,420 | ERROR | eChangeHandler-0 | NeutronvpnManager                | 338 - org.opendaylight.netvirt.neutronvpn-impl - 0.4.3.SNAPSHOT | createSubnetmapNode: Subnetmap node for subnet ID 1b71e2fd-772e-4984-ac49-091ffcd5f8ec already exists, returning
2017-11-20 03:43:35,482 | WARN  | eChangeHandler-0 | CentralizedSwitchChangeListener  | 334 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.3.SNAPSHOT | No router data found for router id 846ad560-cd27-4e81-a81f-f97248ff8509
2017-11-20 03:43:35,501 | ERROR | eChangeHandler-0 | NeutronvpnManager                | 338 - org.opendaylight.netvirt.neutronvpn-impl - 0.4.3.SNAPSHOT | createSubnetmapNode: Subnetmap node for subnet ID 22a1844a-1435-4738-b9d6-0fed96d8c59e already exists, returning
2017-11-20 03:43:35,558 | ERROR | eChangeHandler-0 | AsyncDataTreeChangeListenerBase  | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Thread terminated due to uncaught exception: AsyncDataTreeChangeListenerBase-DataTreeChangeHandler-0
java.lang.NullPointerException
2017-11-20 03:43:45,484 | ERROR | eChangeHandler-0 | NatEvpnUtil                      | 343 - org.opendaylight.netvirt.natservice-impl - 0.4.3.SNAPSHOT | getExtNwProvTypeFromRouterName : external network UUID is not available for router 53678e60-fadf-4eac-ac7c-23fad8c99dc2
2017-11-20 03:43:45,491 | ERROR | eChangeHandler-0 | NatEvpnUtil                      | 343 - org.opendaylight.netvirt.natservice-impl - 0.4.3.SNAPSHOT | getExtNwProvTypeFromRouterName : external network UUID is not available for router 53678e60-fadf-4eac-ac7c-23fad8c99dc2
2017-11-20 03:43:45,492 | ERROR | eChangeHandler-0 | NatEvpnUtil                      | 343 - org.opendaylight.netvirt.natservice-impl - 0.4.3.SNAPSHOT | getExtNwProvTypeFromRouterName : external network UUID is not available for router 53678e60-fadf-4eac-ac7c-23fad8c99dc2
2017-11-20 03:43:45,492 | ERROR | eChangeHandler-0 | NatEvpnUtil                      | 343 - org.opendaylight.netvirt.natservice-impl - 0.4.3.SNAPSHOT | getExtNwProvTypeFromRouterName : external network UUID is not available for router 53678e60-fadf-4eac-ac7c-23fad8c99dc2
2017-11-20 03:43:45,493 | ERROR | eChangeHandler-0 | NatEvpnUtil                      | 343 - org.opendaylight.netvirt.natservice-impl - 0.4.3.SNAPSHOT | getExtNwProvTypeFromRouterName : external network UUID is not available for router 53678e60-fadf-4eac-ac7c-23fad8c99dc2
2017-11-20 03:43:45,493 | ERROR | eChangeHandler-0 | NatEvpnUtil                      | 343 - org.opendaylight.netvirt.natservice-impl - 0.4.3.SNAPSHOT | getExtNwProvTypeFromRouterName : external network UUID is not available for router 53678e60-fadf-4eac-ac7c-23fad8c99dc2
2017-11-20 03:43:45,559 | WARN  | nPool-1-worker-0 | NeutronPortChangeListener        | 338 - org.opendaylight.netvirt.neutronvpn-impl - 0.4.3.SNAPSHOT | Interface 58c61dda-b7fb-45b1-a02e-ebcfc133425a is already present
2017-11-20 03:43:45,592 | ERROR | eChangeHandler-0 | VpnSubnetRouteHandler            | 334 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.3.SNAPSHOT | SUBNETROUTE: onSubnetAddedToVpn: SubnetOpDataEntry for subnet 1b71e2fd-772e-4984-ac49-091ffcd5f8ec with ip 90.0.0.0/24 and vpn 53678e60-fadf-4eac-ac7c-23fad8c99dc2 already detected to be present
2017-11-20 03:43:45,692 | ERROR | eChangeHandler-0 | VpnSubnetRouteHandler            | 334 - org.opendaylight.netvirt.vpnmanager-impl - 0.4.3.SNAPSHOT | SUBNETROUTE: onSubnetAddedToVpn: SubnetOpDataEntry for subnet 22a1844a-1435-4738-b9d6-0fed96d8c59e with ip 100.0.0.0/24 and vpn 53678e60-fadf-4eac-ac7c-23fad8c99dc2 already detected to be present
2017-11-20 03:43:45,728 | WARN  | nPool-1-worker-2 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Job VPN-53678e60-fadf-4eac-ac7c-23fad8c99dc2 took 10175ms to complete
2017-11-20 03:43:45,732 | WARN  | nPool-1-worker-1 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Job VPN-ba01eac7-29b1-481a-807d-f67c14abc058 took 10173ms to complete
2017-11-20 03:43:45,844 | WARN  | nPool-1-worker-3 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Job 420d899d-379d-4a5e-8cba-9bdfe0ee0fd9 took 10287ms to complete
2017-11-20 03:43:45,899 | ERROR | nPool-1-worker-2 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='afec949a-c73a-4006-a10a-89fc7ee70721', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=afec949a-c73a-4006-a10a-89fc7ee70721], _zoneName=afec949a-c73a-4006-a10a-89fc7ee70721, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=247085905742467, _key=DPNTEPsInfoKey [_dPNID=247085905742467], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=247085905742467:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=afec949a-c73a-4006-a10a-89fc7ee70721], _zoneName=afec949a-c73a-4006-a10a-89fc7ee70721, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=afec949a-c73a-4006-a10a-89fc7ee70721], _zoneName=afec949a-c73a-4006-a10a-89fc7ee70721, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException
2017-11-20 03:43:45,942 | ERROR | nPool-1-worker-3 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='d7bb65a1-4ae7-4691-907b-8cdbf54a2d76', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=d7bb65a1-4ae7-4691-907b-8cdbf54a2d76], _zoneName=d7bb65a1-4ae7-4691-907b-8cdbf54a2d76, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=247085905742467, _key=DPNTEPsInfoKey [_dPNID=247085905742467], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=247085905742467:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=d7bb65a1-4ae7-4691-907b-8cdbf54a2d76], _zoneName=d7bb65a1-4ae7-4691-907b-8cdbf54a2d76, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=d7bb65a1-4ae7-4691-907b-8cdbf54a2d76], _zoneName=d7bb65a1-4ae7-4691-907b-8cdbf54a2d76, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException
2017-11-20 03:43:45,944 | ERROR | nPool-1-worker-3 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='53678e60-fadf-4eac-ac7c-23fad8c99dc2', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=53678e60-fadf-4eac-ac7c-23fad8c99dc2], _zoneName=53678e60-fadf-4eac-ac7c-23fad8c99dc2, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=247085905742467, _key=DPNTEPsInfoKey [_dPNID=247085905742467], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=247085905742467:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=53678e60-fadf-4eac-ac7c-23fad8c99dc2], _zoneName=53678e60-fadf-4eac-ac7c-23fad8c99dc2, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=53678e60-fadf-4eac-ac7c-23fad8c99dc2], _zoneName=53678e60-fadf-4eac-ac7c-23fad8c99dc2, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException
2017-11-20 03:43:45,947 | ERROR | nPool-1-worker-3 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='d8ef35ed-8282-42f9-ab00-5c424c01ddbf', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=d8ef35ed-8282-42f9-ab00-5c424c01ddbf], _zoneName=d8ef35ed-8282-42f9-ab00-5c424c01ddbf, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=247085905742467, _key=DPNTEPsInfoKey [_dPNID=247085905742467], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=247085905742467:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=d8ef35ed-8282-42f9-ab00-5c424c01ddbf], _zoneName=d8ef35ed-8282-42f9-ab00-5c424c01ddbf, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=d8ef35ed-8282-42f9-ab00-5c424c01ddbf], _zoneName=d8ef35ed-8282-42f9-ab00-5c424c01ddbf, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException
2017-11-20 03:43:45,954 | ERROR | nPool-1-worker-3 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='1c6a94be-45cd-4c3a-b86e-3b9217ab9205', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=1c6a94be-45cd-4c3a-b86e-3b9217ab9205], _zoneName=1c6a94be-45cd-4c3a-b86e-3b9217ab9205, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=247085905742467, _key=DPNTEPsInfoKey [_dPNID=247085905742467], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=247085905742467:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=1c6a94be-45cd-4c3a-b86e-3b9217ab9205], _zoneName=1c6a94be-45cd-4c3a-b86e-3b9217ab9205, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=1c6a94be-45cd-4c3a-b86e-3b9217ab9205], _zoneName=1c6a94be-45cd-4c3a-b86e-3b9217ab9205, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException
2017-11-20 03:43:45,986 | ERROR | nPool-1-worker-2 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='332fa0a2-c555-4219-bb74-85a0f54857fe', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=332fa0a2-c555-4219-bb74-85a0f54857fe], _zoneName=332fa0a2-c555-4219-bb74-85a0f54857fe, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=247085905742467, _key=DPNTEPsInfoKey [_dPNID=247085905742467], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=247085905742467:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=332fa0a2-c555-4219-bb74-85a0f54857fe], _zoneName=332fa0a2-c555-4219-bb74-85a0f54857fe, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=332fa0a2-c555-4219-bb74-85a0f54857fe], _zoneName=332fa0a2-c555-4219-bb74-85a0f54857fe, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException
2017-11-20 03:43:45,990 | ERROR | nPool-1-worker-2 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='27ac5507-66fa-4df8-9658-8649f377afbc', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=27ac5507-66fa-4df8-9658-8649f377afbc], _zoneName=27ac5507-66fa-4df8-9658-8649f377afbc, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=247085905742467, _key=DPNTEPsInfoKey [_dPNID=247085905742467], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=247085905742467:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.23]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=27ac5507-66fa-4df8-9658-8649f377afbc], _zoneName=27ac5507-66fa-4df8-9658-8649f377afbc, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=27ac5507-66fa-4df8-9658-8649f377afbc], _zoneName=27ac5507-66fa-4df8-9658-8649f377afbc, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException
2017-11-20 03:43:46,013 | ERROR | nPool-1-worker-2 | DataStoreJobCoordinator          | 293 - org.opendaylight.genius.mdsalutil-api - 0.2.3.SNAPSHOT | Exception when executing jobEntry: JobEntry{key='5a70e14d-58f3-4f46-ab00-722055aa5b78', mainWorker=ItmTepAddWorker  { Configured Dpn List : [DPNTEPsInfo [_dPNID=88256775084985, _key=DPNTEPsInfoKey [_dPNID=88256775084985], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=88256775084985:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.15.91]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=5a70e14d-58f3-4f46-ab00-722055aa5b78], _zoneName=5a70e14d-58f3-4f46-ab00-722055aa5b78, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]], DPNTEPsInfo [_dPNID=176858005830847, _key=DPNTEPsInfoKey [_dPNID=176858005830847], _tunnelEndPoints=[TunnelEndPoints [_gwIpAddress=IpAddress [_ipv4Address=Ipv4Address [_value=0.0.0.0]], _interfaceName=176858005830847:tunnel_port:0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _key=TunnelEndPointsKey [_portname=tunnel_port, _vLANID=0, _ipAddress=IpAddress [_ipv4Address=Ipv4Address [_value=10.29.13.247]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan], _optionTunnelTos=0, _portname=tunnel_port, _subnetMask=IpPrefix [_ipv4Prefix=Ipv4Prefix [_value=0.0.0.0/0]], _tunnelType=class org.opendaylight.yang.gen.v1.urn.opendaylight.genius.interfacemanager.rev160406.TunnelTypeVxlan, _tzMembership=[TzMembership [_key=TzMembershipKey [_zoneName=5a70e14d-58f3-4f46-ab00-722055aa5b78], _zoneName=5a70e14d-58f3-4f46-ab00-722055aa5b78, augmentation=[]]], _vLANID=0, _optionOfTunnel=false, augmentation=[]]], augmentation=[]]] }, rollbackWorker=null, retryCount=0, futures=null}
java.util.ConcurrentModificationException


 Comments   
Comment by Sam Hague [ 06/Apr/18 ]

still present: https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-openstack-queens-upstream-stateful-oxygen/233/odl_1/odl1_karaf.log.gz

Comment by Sam Hague [ 19/Jun/18 ]

Still seen: https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-openstack-queens-upstream-stateful-fluorine/119/

Comment by Michael Vorburger [ 02/Jul/18 ]

I've picked up on the DataStoreJobCoordinator ConcurrentModificationException shown above because it looked curious, but grepping for "ConcurrentModificationException" in odl[1-3]_karaf.log.gz in this job I don't see it anymore, so that seems to have been fixed, and this issue is now only about the other ERROR from NatEvpnUtil, "external network UUID is not available". Perhaps edit the Description?

Comment by Sam Hague [ 07/Aug/18 ]

This error looks to be coming because the neutron northbound ds is down. The ODL nodes do not have a leader and are quarantined so the ds operations are failing.

https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-fluorine/23/odl_1/odl1_err_warn_exception.log.gz

2018-08-06T15:43:11,882 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-2 | ReliableDeliverySupervisor       | 41 - com.typesafe.akka.slf4j - 2.5.11 | Association with remote system [akka.tcp://opendaylight-cluster-data@10.30.170.219:2550] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://opendaylight-cluster-data@10.30.170.219:2550]] Caused by: [Connection refused: /10.30.170.219:2550]
2018-08-06T15:43:14,002 | WARN  | qtp1641743688-115 | ServletHandler                   | 164 - org.eclipse.jetty.util - 9.3.21.v20170918 | 
javax.servlet.ServletException: org.opendaylight.neutron.spi.ReadFailedRuntimeException: ReadFailedException{message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, errorList=[RpcError [message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-config currently has no leader. Try again later.]]}
Caused by: org.opendaylight.neutron.spi.ReadFailedRuntimeException: ReadFailedException{message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, errorList=[RpcError [message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-config currently has no leader. Try again later.]]}
Caused by: org.opendaylight.controller.md.sal.common.api.data.ReadFailedException: Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks
	at org.opendaylight.controller.sal.core.compat.ReadFailedExceptionAdapter.newWithCause(ReadFailedExceptionAdapter.java:28) ~[?:?]
	at org.opendaylight.controller.sal.core.compat.ReadFailedExceptionAdapter.newWithCause(ReadFailedExceptionAdapter.java:18) ~[?:?]
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:91) ~[?:?]
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:40) ~[?:?]
	at org.opendaylight.controller.md.sal.common.api.MappingCheckedFuture.mapException(MappingCheckedFuture.java:60) ~[?:?]
	at org.opendaylight.controller.md.sal.common.api.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:64) ~[?:?]
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:713) ~[?:?]
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:713) ~[?:?]
	at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:54) ~[?:?]
Caused by: org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-config currently has no leader. Try again later.
Caused by: org.opendaylight.controller.cluster.datastore.exceptions.NoShardLeaderException: Shard member-1-shard-default-config currently has no leader. Try again later.
	at org.opendaylight.controller.cluster.datastore.shardmanager.ShardManager.createNoShardLeaderException(ShardManager.java:955) ~[?:?]
2018-08-06T15:43:14,035 | WARN  | qtp1641743688-115 | HttpChannel                      | 164 - org.eclipse.jetty.util - 9.3.21.v20170918 | //10.30.170.217:8181/controller/nb/v2/neutron/networks
javax.servlet.ServletException: javax.servlet.ServletException: org.opendaylight.neutron.spi.ReadFailedRuntimeException: ReadFailedException{message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, errorList=[RpcError [message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-config currently has no leader. Try again later.]]}
Caused by: javax.servlet.ServletException: org.opendaylight.neutron.spi.ReadFailedRuntimeException: ReadFailedException{message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, errorList=[RpcError [message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-config currently has no leader. Try again later.]]}
Caused by: org.opendaylight.neutron.spi.ReadFailedRuntimeException: ReadFailedException{message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, errorList=[RpcError [message=Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks, severity=ERROR, errorType=APPLICATION, tag=operation-failed, applicationTag=null, info=null, cause=org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-config currently has no leader. Try again later.]]}
Caused by: org.opendaylight.controller.md.sal.common.api.data.ReadFailedException: Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/networks
	at org.opendaylight.controller.sal.core.compat.ReadFailedExceptionAdapter.newWithCause(ReadFailedExceptionAdapter.java:28) ~[?:?]
	at org.opendaylight.controller.sal.core.compat.ReadFailedExceptionAdapter.newWithCause(ReadFailedExceptionAdapter.java:18) ~[?:?]
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:91) ~[?:?]
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:40) ~[?:?]
	at org.opendaylight.controller.md.sal.common.api.MappingCheckedFuture.mapException(MappingCheckedFuture.java:60) ~[?:?]
	at org.opendaylight.controller.md.sal.common.api.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:64) ~[?:?]
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:713) ~[?:?]
	at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:713) ~[?:?]
	at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:54) ~[?:?]
Caused by: org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-config currently has no leader. Try again later.
Caused by: org.opendaylight.controller.cluster.datastore.exceptions.NoShardLeaderException: Shard member-1-shard-default-config currently has no leader. Try again later.
	at org.opendaylight.controller.cluster.datastore.shardmanager.ShardManager.createNoShardLeaderException(ShardManager.java:955) ~[?:?]
2018-08-06T15:43:15,212 | WARN  | qtp1641743688-116 | BrokerFacade                     | 328 - org.opendaylight.netconf.restconf-nb-bierman02 - 1.8.0 | Error reading /(urn:opendaylight:neutron?revision=2015-07-12)neutron/hostconfigs from datastore OPERATIONAL
org.opendaylight.controller.md.sal.common.api.data.ReadFailedException: Error executeRead ReadData for path /(urn:opendaylight:neutron?revision=2015-07-12)neutron/hostconfigs
	at org.opendaylight.controller.sal.core.compat.ReadFailedExceptionAdapter.newWithCause(ReadFailedExceptionAdapter.java:28) [234:org.opendaylight.controller.sal-core-compat:1.8.0]
	at org.opendaylight.controller.sal.core.compat.ReadFailedExceptionAdapter.newWithCause(ReadFailedExceptionAdapter.java:18) [234:org.opendaylight.controller.sal-core-compat:1.8.0]
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:91) [414:org.opendaylight.yangtools.util:2.0.9]
	at org.opendaylight.yangtools.util.concurrent.ExceptionMapper.apply(ExceptionMapper.java:40) [414:org.opendaylight.yangtools.util:2.0.9]
	at org.opendaylight.controller.md.sal.common.api.MappingCheckedFuture.mapException(MappingCheckedFuture.java:60) [229:org.opendaylight.controller.sal-common-api:1.8.0]
	at org.opendaylight.controller.md.sal.common.api.MappingCheckedFuture.wrapInExecutionException(MappingCheckedFuture.java:64) [229:org.opendaylight.controller.sal-common-api:1.8.0]
Caused by: org.opendaylight.mdsal.common.api.DataStoreUnavailableException: Shard member-1-shard-default-operational currently has no leader. Try again later.
Caused by: org.opendaylight.controller.cluster.datastore.exceptions.NoShardLeaderException: Shard member-1-shard-default-operational currently has no leader. Try again later.
	at org.opendaylight.controller.cluster.datastore.shardmanager.ShardManager.createNoShardLeaderException(ShardManager.java:955) ~[?:?]
Comment by Michael Vorburger [ 07/Aug/18 ]

> This error looks to be coming because the neutron northbound ds is down.
> The ODL nodes do not have a leader and are quarantined so the ds operations are failing.

Erm, wait; you're saying that the NatEvpnUtil error (above) is due to this? Are you sure? Or is this an entirely new thing now?

Anyway, this new error isn't anything to "fix" in Neutron - the question now is why the cluster died in this CSIT.

Comment by Sam Hague [ 07/Aug/18 ]

Yeah, I mean the whole cluster is down so reads are failing - it just happens that these are sometimes northbound reads. As to what to fix, that depends. If we think the lower layers can be changed to be more reliable, then that is the fix. If not, then the apps will need to change to handle the issue, for example by retrying the read later when the cluster is available, or maybe by using caches. I think this falls under the MD-SAL best-practices work, where the applications are not well designed to handle such problems.
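
One possible shape for the "retry the read later" mitigation mentioned above, written against the controller MD-SAL binding API that shows up in the stack traces. This is only an illustrative sketch: the RetryingReader class and readWithRetry helper are made up for this example, not code that exists in the project.

import com.google.common.base.Optional;
import java.util.concurrent.TimeUnit;
import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.md.sal.binding.api.ReadOnlyTransaction;
import org.opendaylight.controller.md.sal.common.api.data.LogicalDatastoreType;
import org.opendaylight.controller.md.sal.common.api.data.ReadFailedException;
import org.opendaylight.yangtools.yang.binding.DataObject;
import org.opendaylight.yangtools.yang.binding.InstanceIdentifier;

public final class RetryingReader {

    private RetryingReader() {
    }

    // Retries a datastore read a few times, sleeping between attempts, so a transient
    // "Shard ... currently has no leader" window shows up as a delay instead of an ERROR.
    public static <T extends DataObject> Optional<T> readWithRetry(DataBroker broker,
            LogicalDatastoreType store, InstanceIdentifier<T> path,
            int maxAttempts, long waitMillis) throws ReadFailedException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        ReadFailedException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            ReadOnlyTransaction tx = broker.newReadOnlyTransaction();
            try {
                return tx.read(store, path).checkedGet();
            } catch (ReadFailedException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    TimeUnit.MILLISECONDS.sleep(waitMillis);
                }
            } finally {
                tx.close();
            }
        }
        throw lastFailure;
    }
}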

Comment by Jamo Luhrsen [ 07/Aug/18 ]

here is the full karaf.log

Comment by Michael Vorburger [ 14/Aug/18 ]

shague and jluhrsen, re. the point we discussed during the weekly kernel projects call today about this issue: I promised I would look into improving the propagation of datastore errors from neutron to the driver so that the OpenStack driver can notify operators and retry. I just saw that we actually already did this, quite recently - that was c/72735 for NEUTRON-157.

Comment by Tom Pantelis [ 21/Aug/18 ]

That log is from odl1, which apparently was restarted 6 times in a 3-hour period:

2018-08-06T12:39:29,643 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-3 | Slf4jLogger                      | 41 - com.typesafe.akka.slf4j - 2.5.11 | Slf4jLogger started
2018-08-06T14:23:10,238 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-2 | Slf4jLogger                      | 41 - com.typesafe.akka.slf4j - 2.5.11 | Slf4jLogger started
2018-08-06T14:49:20,546 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-2 | Slf4jLogger                      | 41 - com.typesafe.akka.slf4j - 2.5.11 | Slf4jLogger started
2018-08-06T15:05:43,794 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-3 | Slf4jLogger                      | 41 - com.typesafe.akka.slf4j - 2.5.11 | Slf4jLogger started
2018-08-06T15:39:18,545 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-3 | Slf4jLogger                      | 41 - com.typesafe.akka.slf4j - 2.5.11 | Slf4jLogger started
2018-08-06T15:48:57,768 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-3 | Slf4jLogger                      | 41 - com.typesafe.akka.slf4j - 2.5.11 | Slf4jLogger started

IPs:

odl1: 10.30.170.226
odl2: 10.30.170.218
odl3: 10.30.170.219

After the first startup, the other 2 nodes joined and the cluster was formed around 12:39:48. At 14:23:10, odl1 restarted. At 14:25:41, odl1 lost connection with odl2 (10.30.170.218), and odl2 re-joined at 14:29:57. At 14:35:25, odl1 lost connection with odl3 (10.30.170.219), and then odl3 re-joined at 14:39:38 and odl2 re-joined. At 14:49:20, odl1 was restarted again.

The DataStoreUnavailableExceptions via neutron occurred between 15:42 and 15:44, after the 15:39:18 restart. odl1 re-joined odl2 at 15:39:21. At 15:41:25, odl1 lost connection to both odl2 and odl3:

2018-08-06T15:41:25,443 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-54 | ReliableDeliverySupervisor       | 41 - com.typesafe.akka.slf4j - 2.5.11 | Association with remote system [akka.tcp://opendaylight-cluster-data@10.30.170.218:2550] has failed, address is now gated for [5000] ms. Reason: [Disassociated] 
2018-08-06T15:41:26,275 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-54 | ReliableDeliverySupervisor       | 41 - com.typesafe.akka.slf4j - 2.5.11 | Association with remote system [akka.tcp://opendaylight-cluster-data@10.30.170.219:2550] has failed, address is now gated for [5000] ms. Reason: [Disassociated] 
2018-08-06T15:41:29,830 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-38 | ClusterCoreDaemon                | 41 - com.typesafe.akka.slf4j - 2.5.11 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.226:2550] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://opendaylight-cluster-data@10.30.170.218:2550, status = Up)]. Node roles [member-1, dc-default]
2018-08-06T15:41:30,485 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-38 | NettyTransport                   | 41 - com.typesafe.akka.slf4j - 2.5.11 | Remote connection to [null] failed with java.net.ConnectException: Connection refused: /10.30.170.218:2550
2018-08-06T15:41:30,491 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-38 | ReliableDeliverySupervisor       | 41 - com.typesafe.akka.slf4j - 2.5.11 | Association with remote system [akka.tcp://opendaylight-cluster-data@10.30.170.218:2550] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://opendaylight-cluster-data@10.30.170.218:2550]] Caused by: [Connection refused: /10.30.170.218:2550]
2018-08-06T15:41:30,826 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-54 | ClusterCoreDaemon                | 41 - com.typesafe.akka.slf4j - 2.5.11 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.226:2550] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://opendaylight-cluster-data@10.30.170.219:2550, status = Up)]. Node roles [member-1, dc-default]
2018-08-06T15:41:31,300 | WARN  | opendaylight-cluster-data-akka.actor.default-dispatcher-54 | NettyTransport                   | 41 - com.typesafe.akka.slf4j - 2.5.11 | Remote connection to [null] failed with java.net.ConnectException: Connection refused: /10.30.170.219:2550

odl2 re-joined at 15:45:06:

2018-08-06T15:45:06,921 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-65 | Cluster(akka://opendaylight-cluster-data) | 41 - com.typesafe.akka.slf4j - 2.5.11 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.226:2550] - Node [akka.tcp://opendaylight-cluster-data@10.30.170.218:2550] is JOINING, roles [member-2, dc-default]

odl3 re-joined at 15:45:20:

2018-08-06T15:45:20,125 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-63 | Cluster(akka://opendaylight-cluster-data) | 41 - com.typesafe.akka.slf4j - 2.5.11 | Cluster Node [akka.tcp://opendaylight-cluster-data@10.30.170.226:2550] - Node [akka.tcp://opendaylight-cluster-data@10.30.170.219:2550] is JOINING, roles [member-3, dc-default]

So the neutron requests failed as expected since odl1 lost connection to both odl2 and odl3. So is this expected for the tests?

Comment by Sam Hague [ 21/Aug/18 ]

Still present in jobs, need to add more triage info: https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-oxygen/32/odl_1/odl1_err_warn_exception.log.gz

Comment by Tom Pantelis [ 21/Aug/18 ]

What is the test suite actually doing? From my analysis above, it is taking down 2 nodes, in which case you would expect DataStoreUnavailableException/NoShardLeaderException failures from the datastore. Notice the "java.net.ConnectException: Connection refused" messages - that indicates it could reach the node's IP but the port isn't open, i.e. the process isn't running. Can someone verify that is the case?

Comment by Chetan Arakere Gowdru [ 23/Aug/18 ]

Hi All,

 

I analysed the points below (tracing the shard-default-config leader) and the steps performed as part of this suite, during which the exceptions below are observed.

Caused by: org.opendaylight.controller.cluster.datastore.exceptions.NoShardLeaderException: Shard member-3-shard-default-config currently has no leader. Try again later.

 

1) As part of the ha_l2 suite, ODL-1 (member-1) is first elected as the leader node for the default-config shard (as per [1] from the member-1, member-2 and member-3 section output).

2) First ODL-1 is brought down and ODL-2 (member-2) is elected as the leader node for the default-config shard (as per [2] of the member-2 and member-3 section output).

3) ODL-1 is brought up later and it also sees member-2 as the leader node for the default-config shard (as per [3] from the member-1 section output).

4) Now ODL-2 is brought down and member-1 is re-elected as the leader node (as per [4] of the member-1 section and [3] of the member-1 section output).

5) After this, ODL-2 is brought back and member-1 continues as the leader node (as per [3] from the member-2 section output).

6) Finally ODL-3 is brought down and, at this point, member-1 is still the leader node for the default-config shard.

7) Now ODL-3 is brought up and at this point I still see member-1 continuing as the leader node (as per [4] from the member-3 section output).

8) As part of ODL-3 coming up, all the listeners are registered and triggered, and READ operations are performed on the datastore.

9) In the meantime, the suite brought down both ODL-1 and ODL-2, and during this time ODL-3 was not elected as the leader node (as per [5] from the member-3 section output), so READ operations during this period threw the exception observed above.

10) These exceptions are observed until ODL-1 and ODL-2 are brought back and a new leader node, member-3, is re-elected (as per [6] from the member-3 and member-1 section output).

 

Basically, I'm seeing that when two nodes are brought down, the lone remaining node is not re-elected as the leader node, and any READ operations during this time result in this exception.

 

Below are a few key logs captured from the karaf logs from all ODLs.

 

Ref : https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-3node-0cmb-1ctl-2cmp-openstack-queens-upstream-stateful-fluorine/23/robot-plugin/log_06_ha_l2.html.gz

 

Thanks,

Chetan

 

 [1] Line 7203: 2018-08-06T14:21:18,827 | INFO  | pipe-log:log "ROBOT MESSAGE: Killing ODL1 10.30.170.226" | core                             | 122 - org.apache.karaf.log.core - 4.1.5 | ROBOT MESSAGE: Killing ODL1 10.30.170.226

[2] Line 7357: 2018-08-06T14:22:37,300 | INFO  | pipe-log:log "ROBOT MESSAGE: Starting test 06 ha l2.Bring Up ODL1" | core                             | 122 - org.apache.karaf.log.core - 4.1.5 | ROBOT MESSAGE: Starting test 06 ha l2.Bring Up ODL1

[3] Line 7476: 2018-08-06T14:25:30,247 | INFO  | pipe-log:log "ROBOT MESSAGE: Starting test 06 ha l2.Take Down ODL2" | core                             | 122 - org.apache.karaf.log.core - 4.1.5 | ROBOT MESSAGE: Starting test 06 ha l2.Take Down ODL2

[4] Line 7997: 2018-08-06T14:29:13,181 | INFO  | pipe-log:log "ROBOT MESSAGE: Starting test 06 ha l2.Bring Up ODL2" | core                             | 122 - org.apache.karaf.log.core - 4.1.5 | ROBOT MESSAGE: Starting test 06 ha l2.Bring Up ODL2

[5] Line 8199: 2018-08-06T14:35:05,041 | INFO  | pipe-log:log "ROBOT MESSAGE: Starting test 06 ha l2.Take Down ODL3" | core                             | 122 - org.apache.karaf.log.core - 4.1.5 | ROBOT MESSAGE: Starting test 06 ha l2.Take Down ODL3

[6] ODL3 is brought back up at 2018-08-06T14:38:55,105

 

 

member-1/ODL-1 (10.30.170.226)

 

[1] 2018-08-06T12:39:51,446 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-29 | RoleChangeNotifier               | 228 - org.opendaylight.controller.sal-clustering-commons - 1.8.0 | RoleChangeNotifier for member-1-shard-default-config , received role change from Candidate to Leader

[2] 2018-08-06T14:25:51,191 | INFO  | opendaylight-cluster-data-shard-dispatcher-56 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-1-shard-default-config, leaderId=member-1-shard-default-config, leaderPayloadVersion=9]

[3] 2018-08-06T14:23:13,749 | INFO  | opendaylight-cluster-data-shard-dispatcher-53 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-1-shard-default-config, leaderId=member-2-shard-default-config, leaderPayloadVersion=9]

[4] 2018-08-06T14:25:51,191 | INFO  | opendaylight-cluster-data-shard-dispatcher-56 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-1-shard-default-config, leaderId=member-1-shard-default-config, leaderPayloadVersion=9]

[5] 2018-08-06T14:49:25,702 | INFO  | opendaylight-cluster-data-shard-dispatcher-56 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-1-shard-default-config, leaderId=member-3-shard-default-config, leaderPayloadVersion=9]

[6] 2018-08-06T14:49:25,702 | INFO  | opendaylight-cluster-data-shard-dispatcher-56 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-1-shard-default-config, leaderId=member-3-shard-default-config, leaderPayloadVersion=9]

 

member-2/ODL-2 (10.30.170.218)

 

[1] 2018-08-06T12:39:43,561 | INFO  | opendaylight-cluster-data-akka.actor.default-dispatcher-20 | RoleChangeNotifier               | 228 - org.opendaylight.controller.sal-clustering-commons - 1.8.0 | RoleChangeNotifier for member-2-shard-default-config , received role change from Follower to Candidate

[2] 2018-08-06T14:21:28,922 | INFO  | opendaylight-cluster-data-shard-dispatcher-41 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-2-shard-default-config, leaderId=member-2-shard-default-config, leaderPayloadVersion=9]

[3] 2018-08-06T14:29:49,881 | INFO  | opendaylight-cluster-data-shard-dispatcher-58 | ShardManager                    | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-2-shard-default-config, leaderId=member-1-shard-default-config, leaderPayloadVersion=9]

[4] 2018-08-06T14:49:21,530 | INFO  | opendaylight-cluster-data-shard-dispatcher-62 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-2-shard-default-config, leaderId=member-3-shard-default-config, leaderPayloadVersion=9]

[5] 2018-08-06T14:49:21,530 | INFO  | opendaylight-cluster-data-shard-dispatcher-62 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-2-shard-default-config, leaderId=member-3-shard-default-config, leaderPayloadVersion=9]

 

 

member-3/ODL-3 (10.30.170.219)

 

[1] 2018-08-06T12:39:51,448 | INFO  | opendaylight-cluster-data-shard-dispatcher-25 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-3-shard-default-config, leaderId=member-1-shard-default-config, leaderPayloadVersion=9]

[2] 2018-08-06T14:21:28,926 | INFO  | opendaylight-cluster-data-shard-dispatcher-39 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-3-shard-default-config, leaderId=member-2-shard-default-config, leaderPayloadVersion=9]

[3] 2018-08-06T14:25:51,192 | INFO  | opendaylight-cluster-data-shard-dispatcher-41 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-3-shard-default-config, leaderId=member-1-shard-default-config, leaderPayloadVersion=9]

[4] 2018-08-06T14:39:29,755 | INFO  | opendaylight-cluster-data-shard-dispatcher-36 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-3-shard-default-config, leaderId=member-1-shard-default-config, leaderPayloadVersion=9]

[5] 2018-08-06T14:41:26,947 | INFO  | opendaylight-cluster-data-shard-dispatcher-27 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-3-shard-default-config, leaderId=null, leaderPayloadVersion=-1]

[6] 2018-08-06T14:49:21,526 | INFO  | opendaylight-cluster-data-shard-dispatcher-62 | ShardManager                     | 236 - org.opendaylight.controller.sal-distributed-datastore - 1.8.0 | shard-manager-config: Received LeaderStateChanged message: LeaderStateChanged [memberId=member-3-shard-default-config, leaderId=member-3-shard-default-config, leaderPayloadVersion=9]

Comment by Tom Pantelis [ 23/Aug/18 ]

> Basically, I'm seeing that when two nodes are brought down, the lone remaining node is not re-elected as the leader node, and any READ operations during this time result in this exception.

 

Yes - that is the expected behavior, as I mentioned in my prior comment. At least 2 nodes in a 3-node cluster are needed for consensus.
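
For context, the quorum arithmetic behind that statement (this is the standard Raft-style majority rule, not anything netvirt-specific): a shard leader must be acknowledged by a majority of its replica set,

majority(N) = floor(N/2) + 1, so for N = 3 the majority is 2

which is why a single surviving member can never elect itself leader, and reads keep failing with NoShardLeaderException until at least one more member comes back.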

Comment by Jamo Luhrsen [ 23/Aug/18 ]

OK, so we are learning and making progress here, but this raises the question: how is someone supposed to know that they are looking at ugly, scary logs like this from a node that happened to be the only one up in the cluster? At this point, it feels like this is just tribal knowledge.

Comment by Michael Vorburger [ 23/Aug/18 ]

Isn't another question one may ask here why we have a CSIT test for a totally unsupported scenario? So it takes two of three nodes down, and then expects to confirm that... what, actually? I'm not trying to be funny, but isn't this simply a bad test?

Comment by Jamo Luhrsen [ 23/Aug/18 ]

Even if two ODLs are down, the dataplane (as it was) is expected to still work, so OpenStack instances should still have connectivity, etc. It's a valid test, but our CSIT is also wired up to be more than just black-box testing. Every test case teardown is looking for unexpected exceptions, logging data models, etc. Those kinds of things might have to be dialed back, or ignored, if they produce failures that we are supposed to ignore. I still wish we didn't get such ugly exceptions and logs on a single node just because it happens to be the last one standing. But, as was mentioned on the TSC today, it sounds like others are fine with that, and people need to develop tooling/monitoring on top of the cluster to let admins understand when they need to ignore these scary types of logs.

Comment by Sam Hague [ 28/Aug/18 ]

Yes, this is a valid test for dataplane connectivity. We can do better in how the test is written to reduce the errors: gracefully shut down odl3, then bring it back and wait until all the MD-SAL updates have finished, and log this better so we know when it is happening. This will reduce the errors, since odl3 won't be doing any work while odl1 and odl2 are down. Once odl3 is fully up and has processed all the MD-SAL updates, we can take down odl1 and odl2 and check the dataplane.

To Jamo's point, though, this is still an issue if it happens with customers. All they will see are errors in the log and they will start to file cases. We need tooling to make it very clear that there is no leader and to bubble that status up as the cause of the errors. The logs are there, but it is difficult to relate them.
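
One way the test could implement the "wait until all the MD-SAL updates have finished" step suggested above is to poll each node's shard-manager SyncStatus over Jolokia before moving on. The sketch below is illustrative only: the MBean name follows the one documented for ODL clustering, but the port (8181), credentials handling and the crude string check on the JSON response are assumptions to adapt to the actual deployment.

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Scanner;

public final class WaitForShardSync {

    // Jolokia read of the config datastore's shard-manager SyncStatus attribute.
    private static final String JOLOKIA_PATH =
            "/jolokia/read/org.opendaylight.controller:type=DistributedConfigDatastore,"
            + "Category=ShardManager,name=shard-manager-config/SyncStatus";

    private WaitForShardSync() {
    }

    // Polls a node's Jolokia endpoint until the config shard manager reports
    // SyncStatus=true, i.e. its local shards have caught up with the current leaders,
    // or until the timeout expires.
    public static boolean waitForSync(String host, String basicAuthUserPass, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try {
                HttpURLConnection conn = (HttpURLConnection)
                        new URL("http://" + host + ":8181" + JOLOKIA_PATH).openConnection();
                conn.setRequestProperty("Authorization", "Basic " + Base64.getEncoder()
                        .encodeToString(basicAuthUserPass.getBytes(StandardCharsets.UTF_8)));
                try (Scanner scanner = new Scanner(conn.getInputStream(), "UTF-8")
                        .useDelimiter("\\A")) {
                    String body = scanner.hasNext() ? scanner.next() : "";
                    // Crude check; a real client would parse the JSON response properly.
                    if (body.contains("\"value\":true")) {
                        return true;
                    }
                }
            } catch (IOException e) {
                // Node not reachable yet (e.g. still restarting); retry below.
            }
            Thread.sleep(5000);
        }
        return false;
    }
}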
