[BGPCEP-850] ll-graceful-restart - transaction domchain failed after peer disconnect Created: 21/Nov/18  Updated: 26/Nov/18  Resolved: 26/Nov/18

Status: Verified
Project: bgpcep
Component/s: BGP
Affects Version/s: Neon
Fix Version/s: Neon

Type: Bug Priority: Medium
Reporter: Tomas Markovic Assignee: Matej Perina
Resolution: Done Votes: 0
Labels: bgp, csit:failures
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File configure.py     File karaf.log     File play.py     File start_play1.py     File start_play3.py    

 Description   

Overview: ODL is configured with ll-graceful-restart. We connect a peer that advertises one route, then disconnect it. After the graceful-restart timer (5 s in this case) runs out, an error appears in karaf.log, and play.py is unable to connect again.

Steps to reproduce
Configure ODL with ll-graceful-restart (one peer, IPv4 only)

./configure.py

Connect peer

./start_play1.py

Kill play.py

Ctrl+C

Wait 5 seconds and check karaf.log

At this point we wanted to start a different play.py on a different IP, so that we would get an advertisement from ODL with a route containing the ll-graceful-restart community.

To do this:

./start_play3.py
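When checking karaf.log after the 5-second wait, it can help to filter for ERROR entries logged after the moment the peer was killed. The following is a minimal sketch, not part of the attached scripts. It assumes the log lines start with a prefix like "2018-11-21T10:00:05,250 | ERROR | ..." (adjust the pattern if your Karaf logging layout differs); the helper names parse_line and errors_since are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed karaf.log line prefix: "2018-11-21T10:00:05,250 | ERROR | ..."
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}),(\d{3}) \| (\w+)\s*\|"
)


def parse_line(line):
    """Return (timestamp, level) for a log line, or None if it has no prefix."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # continuation line, e.g. part of a stack trace
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    timestamp += timedelta(milliseconds=int(match.group(2)))
    return timestamp, match.group(3)


def errors_since(lines, since):
    """Collect ERROR lines logged at or after `since` (e.g. the Ctrl+C time)."""
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and parsed[0] >= since and parsed[1] == "ERROR":
            found.append(line.rstrip("\n"))
    return found
```

A possible usage, assuming karaf.log is in the current directory and kill_time is the moment play.py was interrupted: open the file and call errors_since(log_file, kill_time). Stack-trace continuation lines are skipped, so only the first line of each error is returned.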


 Comments   
Comment by Claudio David Gasparini [ 21/Nov/18 ]

tomas.markovic An error? What error? Please attach some logs or a stack trace.

 

Regards,

Comment by Tomas Markovic [ 21/Nov/18 ]

I attached it as karaf.log. The log starts when I kill the peer, and then after 5 seconds the first error appears.

Comment by Matej Perina [ 21/Nov/18 ]

I pushed a new patch set that fixes the issue. However, during debugging I didn't see any ll-graceful-restart capability received. Should there be one?

This new patch set also includes changes from Robert's review, so please note that the graceful-restart RPC is now called "restart-gracefully".

Comment by Tomas Markovic [ 21/Nov/18 ]

There definitely should be an ll-gr capability received; I wouldn't be able to connect unless I had it in the peer's OPEN message.

Comment by Tomas Markovic [ 21/Nov/18 ]

OK, ll-gr might not have been enabled before, but I replaced the old scripts with new ones, and ll-gr is definitely enabled now.
But right now, after I kill the first peer, the route gets removed after the first 5 seconds, and even when I start the second peer I don't get any advertisement.

You can use the same scenario as before.
