[CONTROLLER-2004] EOS gossip propagation takes too long Created: 30/Sep/21  Updated: 21/Oct/21  Resolved: 21/Oct/21

Status: Resolved
Project: controller
Component/s: eos
Affects Version/s: 4.0.0
Fix Version/s: 4.0.4

Type: Bug Priority: High
Reporter: Robert Varga Assignee: Tomas Cere
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to BGPCEP-983 PCEPTopologyDeployerImpl restarts top... Resolved

 Description   

Since the switch to Akka Distributed Data, propagating EOS changes takes much longer, to the point of exposing BGPCEP-983, where the topology disappears for 2+ seconds.

According to the Akka Distributed Data documentation, this is because we use writeLocal(), which updates only the local replica and leaves the other nodes to learn of the change through gossip dissemination. Gossip is configured by default as:

akka.cluster.distributed-data {
  # How often the Replicator should send out gossip information
  gossip-interval = 2 s

  # How often the subscribers will be notified of changes, if any
  notify-subscribers-interval = 500 ms
}

These two intervals bound how quickly changes propagate, which is critical to our ability to converge.

Lower both so that propagation takes at most a few tens of milliseconds.
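
A rough sketch of what such an override could look like follows. The values are illustrative assumptions only; they are not taken from this issue and would need to be validated against cluster size and network load:

akka.cluster.distributed-data {
  # Assumed value: gossip far more often than the 2 s default
  gossip-interval = 100 ms

  # Assumed value: notify local subscribers far more often than the 500 ms default
  notify-subscribers-interval = 20 ms
}

Shorter intervals trade extra gossip traffic and CPU for faster convergence, so the exact numbers are a tuning decision.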

