[TRNSPRTPCE-713] Path restoration fails on multi-hop path Created: 15/Dec/22  Updated: 26/Sep/23  Resolved: 10/Mar/23

Status: Verified
Project: transportpce
Component/s: None
Affects Version/s: None
Fix Version/s: Chlorine, Argon

Type: Bug Priority: High
Reporter: Tianliang Zhang Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: 0 minutes
Time Spent: 2 days
Original Estimate: 0 minutes

Issue Links:
Relates
relates to TRNSPRTPCE-731 Path restoration fails on multi-hop p... Open

 Description   

When the path under protection (path restoration) has more than one hop, the path restoration mechanism fails.

For example, after a lightpath has been restored from a single-hop path to a longer path, if we repair the failed single-hop link and then cut the working (longer) path, path restoration is not triggered for the longer path.



 Comments   
Comment by Gilles Thouenon [ 05/Jan/23 ]

I tested this use case using the current version of the honeynode simulator available in the transportpce/tests repository.

The issue is not quite that simple. Restoration can be correctly triggered when we degrade the longer path, provided that the failure is detected on the input port of ROADM-A. So yes, restoration is not triggered when the failure is located on another input port of the longer path (ROADM-B or ROADM-C).

After analysis, it appears that this is because ROADM-B and ROADM-C do not have exactly the same initial XML configuration file as ROADM-A when we start the sims: the PM threshold configurations are missing (current-pm-description from the pm-handling YANG model).
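
A discrepancy of this kind can be spotted by checking each simulator's initial configuration for the element in question. The snippet below is only a minimal sketch: the sample XML strings, the namespace and the helper names (`local_name`, `has_element`) are hypothetical stand-ins, and only the element name `current-pm-description` comes from the analysis above.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def has_element(xml_text, name):
    """Return True if any element in xml_text has the given local name."""
    root = ET.fromstring(xml_text)
    return any(local_name(el.tag) == name for el in root.iter())


# Hypothetical, heavily simplified stand-ins for the simulator config files.
ROADM_A = '<data><current-pm-description xmlns="urn:example:pm"/></data>'
ROADM_B = "<data><org-openroadm-device/></data>"

configs = {"ROADM-A": ROADM_A, "ROADM-B": ROADM_B}

# List the nodes whose configuration lacks the PM threshold element.
missing = [
    node
    for node, xml_text in configs.items()
    if not has_element(xml_text, "current-pm-description")
]
print(missing)
```

With real files, the strings would be replaced by the content of each node's initial configuration file.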

Once that was fixed (see change 103928: https://git.opendaylight.org/gerrit/c/transportpce/+/103928), a second issue appeared:

When a failure occurred on one of the input ports of ROADM-B (so on the longest path) and was then cleared, the link remained in the operational state "out-of-service". Then, in case of a failure on the current (shortest) path, restoration can no longer take place because no alternative path is available.

After analysis, this second issue is due to a bug in the honeynode implementation (bad management of a list with streams).

Comment by Tianliang Zhang [ 08/Feb/23 ]

This patch has been tested and works on the testbed at UTD. The following tests have been done:

  1. Fiber break on a single-hop lightpath (primary): path restoration is triggered and the path is restored on the secondary path (two hops).
  2. If the secondary path is in service, connecting the primary fiber back triggers no restoration. The lightpath keeps working on the secondary path.
  3. If the secondary path is out of service, connecting the primary fiber back triggers restoration. The lightpath is restored to the primary path.
  4. Following number 2, if the fiber on the secondary path is broken, the path is restored back to the primary path.
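
The expected behaviour in these four tests can be summarised as a simple rule: the lightpath moves only when its current path is out of service and another path is in service. The following is a minimal sketch of that rule, not the transportpce implementation; the function `restore` and the path names are hypothetical.

```python
def restore(current, in_service):
    """Return the path the lightpath should use.

    current: name of the path currently carrying the lightpath
    in_service: dict mapping path name -> True if the path is in service
    """
    if in_service[current]:
        return current  # current path is healthy: no restoration
    for path, up in in_service.items():
        if up:
            return path  # first available alternative
    return current  # no alternative available: nothing to do


# Tests 1, 2 and 4 in sequence.
state = {"primary": True, "secondary": True}
path = "primary"

state["primary"] = False  # test 1: fiber break on the primary path
path = restore(path, state)
after_test_1 = path

state["primary"] = True  # test 2: primary fiber reconnected
path = restore(path, state)
after_test_2 = path

state["secondary"] = False  # test 4: fiber break on the secondary path
path = restore(path, state)
after_test_4 = path

# Test 3: secondary is out of service when the primary fiber comes back.
after_test_3 = restore("secondary", {"primary": True, "secondary": False})

print(after_test_1, after_test_2, after_test_4, after_test_3)
```

The same rule also illustrates the second issue described earlier: if a recovered path is still reported as out of service, `restore` finds no alternative and the lightpath stays on the failed path.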