[NETVIRT-1178] ODL L2 Agent is dead after restarting a compute node Created: 27/Mar/18  Updated: 23/May/18  Resolved: 23/May/18

Status: Resolved
Project: netvirt
Component/s: General
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: High
Reporter: Itzik Brown Assignee: Josh Hershberg
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: karaf.log, karaf.log
Issue Links:
Relates
relates to OVSDB-444 Port binding failure after rebooting ... Resolved

 Description   

After rebooting a compute node, OVS is connected to all the controllers, but the pseudo agent is down.

In Neutron log:
2018-03-27 07:31:09.202 34 WARNING neutron.db.agents_db [req-86b57593-85d6-4c20-bba1-d408151e94ef - - - - -] Agent healthcheck: found 1 dead agents out of 11:
                Type       Last heartbeat host
              ODL L2  2018-03-27 07:05:06 compute-0.localdomain

Version:
opendaylight-8.0.0-3.el7ost.noarch



 Comments   
Comment by Itzik Brown [ 11/Apr/18 ]

The second karaf.log was captured with OVSDB Trace enabled.

Comment by Josh Hershberg [ 23/Apr/18 ]

Root cause:

When an OVSDB client connects, a component named StalePassiveConnectionService checks whether there are any previous connections to that same client. If there are, they are pinged to determine whether they are still actually connected. The callback that handles the ping results does not fire properly in the event of a timeout, and the onFailure method does not call OvsdbConnectionService.notifyListenerForPassiveConnection for the new connection. As a result, the new connection is never reported "up the stack."
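
The failure mode above can be illustrated with a minimal, self-contained sketch. This is not the OpenDaylight code or the actual fix: the class, the OldConnection interface, and the checkStaleThenNotify method are hypothetical names, and plain java.util.concurrent.CompletableFuture (Java 9+) stands in for whatever future and callback types the real service uses. The point it shows is that the notification for the new connection has to run on every outcome of the stale-connection pings, including failure and timeout.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names come from the OVSDB code base.
public class StaleConnectionCheckSketch {

    /** Stand-in for a previous connection to the same client that can be pinged. */
    interface OldConnection {
        CompletableFuture<Void> ping();
    }

    /**
     * Pings every previous connection, then runs notifyNewConnection once,
     * whether the pings succeeded, failed, or timed out.
     */
    static CompletableFuture<Void> checkStaleThenNotify(
            List<OldConnection> oldConnections,
            long timeoutMillis,
            Runnable notifyNewConnection) {
        CompletableFuture<?>[] pings = oldConnections.stream()
                .map(c -> c.ping()
                        // Bound the wait so a hung peer cannot block the new connection.
                        .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                        // A failed or timed-out ping only means the old connection is
                        // stale; it must not prevent the notification below.
                        .exceptionally(t -> null))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(pings).thenRun(notifyNewConnection);
    }

    public static void main(String[] args) {
        AtomicInteger notified = new AtomicInteger();
        // Simulates a rebooted node: the old connection never answers the ping.
        OldConnection hung = () -> new CompletableFuture<Void>();
        checkStaleThenNotify(List.of(hung), 50, notified::incrementAndGet).join();
        System.out.println("notified=" + notified.get());
    }
}
```

In this sketch the timeout path and the success path converge on the same notification step, which is the property the root-cause description says was missing from the onFailure handling.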

Comment by Josh Hershberg [ 23/May/18 ]

https://git.opendaylight.org/gerrit/#/c/71203/
