-
Bug
-
Resolution: Done
-
Operating System: All
Platform: All
-
8858
The "flapping service" is a simple test service implemented as a cluster singleton service. Upon instantiation, a flapping service instance tries to de-register the flapping service from the current member; upon close, the instance tries to re-register it on the current member.
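To make the flapping cycle concrete, here is a minimal single-threaded sketch of the pattern described above. It does not use the MD-SAL API: `ToyProvider`, `Registration`, `Service` and the task queue are hypothetical stand-ins invented for illustration, and the real test service may differ in how it schedules its work.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the "flapping" pattern; none of these types are MD-SAL classes.
final class FlappingSketch {
    interface Registration { void close(); }

    interface Service {
        void instantiate();
        void closeInstance();
    }

    /** Toy single-member provider that allows at most one registered service. */
    static final class ToyProvider {
        private Service registered;

        Registration register(Service service) {
            if (registered != null) {
                throw new IllegalStateException("service already registered");
            }
            registered = service;
            service.instantiate();
            // Closing the registration clears the slot and closes the instance.
            return () -> {
                Service closing = registered;
                registered = null;
                if (closing != null) {
                    closing.closeInstance();
                }
            };
        }
    }

    /** De-registers itself once instantiated and re-registers once closed. */
    static final class FlappingService implements Service {
        private final ToyProvider provider;
        private final Deque<Runnable> tasks; // stands in for an executor
        private final int maxFlaps;
        private Registration registration;
        int flaps; // completed de-registrations

        FlappingService(ToyProvider provider, Deque<Runnable> tasks, int maxFlaps) {
            this.provider = provider;
            this.tasks = tasks;
            this.maxFlaps = maxFlaps;
        }

        void start() {
            registration = provider.register(this);
        }

        @Override
        public void instantiate() {
            // Deferred: the registration handle is only assigned after register() returns.
            tasks.add(() -> registration.close());
        }

        @Override
        public void closeInstance() {
            flaps++;
            if (flaps < maxFlaps) {
                tasks.add(this::start); // re-register on the same member
            }
        }
    }

    /** Runs the register/de-register cycle until maxFlaps de-registrations completed. */
    static int run(int maxFlaps) {
        Deque<Runnable> tasks = new ArrayDeque<>();
        FlappingService service = new FlappingService(new ToyProvider(), tasks, maxFlaps);
        service.start();
        while (!tasks.isEmpty()) {
            tasks.poll().run();
        }
        return service.flaps;
    }

    public static void main(String[] args) {
        System.out.println("completed flaps: " + run(5));
    }
}
```

In this toy version every re-registration succeeds because the provider's slot is always cleared before the instance is closed. The failure reported below occurs in the corresponding re-registration step of the real service.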
The robot test registers the flapping service on all members, waits a while, and then un-registers it on all members. The same test also runs as a longevity job.
This week the longevity job failed [0] after 22 hours (and 175814 successful de-registrations), during the re-registration phase. The Karaf log [1] shows a VerifyException:
2017-07-16 21:10:57,018 | WARN | pool-30-thread-1 | FlappingSingletonService | 257 - org.opendaylight.controller.samples.clustering-it-provider - 1.5.2.SNAPSHOT | There was a problem re-registering flapping singleton service.
java.lang.RuntimeException: com.google.common.base.VerifyException
at org.opendaylight.mdsal.singleton.dom.impl.ClusterSingletonServiceGroupImpl.registerService(ClusterSingletonServiceGroupImpl.java:180)
at org.opendaylight.mdsal.singleton.dom.impl.AbstractClusterSingletonServiceProviderImpl.registerClusterSingletonService(AbstractClusterSingletonServiceProviderImpl.java:107)
at Proxy057965cf_73f3_4f46_a035_5d06b8a2497b.registerClusterSingletonService(Unknown Source)
at Proxy1a1b6ecc_a451_4f57_8a21_81dd42385c83.registerClusterSingletonService(Unknown Source)
at org.opendaylight.controller.clustering.it.provider.impl.FlappingSingletonService.lambda$closeServiceInstance$1(FlappingSingletonService.java:78)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_131]
...
The garbage collection log [2] shows a short GC pause (0.41 s) just before the failure; it is unclear whether the two are related.
2017-07-16T21:10:56.605+0000: 78867.300: [GC (Allocation Failure) [PSYoungGen: 598048K->64416K(604160K)] 1060366K->526734K(1238528K), 0.4121114 secs] [Times: user=2.06 sys=0.00, real=0.41 secs]
[0] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-chasing-leader-longevity-only-carbon/13/log.html.gz#s1-s2-t3-k3-k2-k1-k1-k2-k1-k4-k7-k1
[1] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-chasing-leader-longevity-only-carbon/13/odl1_karaf.log.gz
[2] https://logs.opendaylight.org/releng/jenkins092/controller-csit-3node-cs-chasing-leader-longevity-only-carbon/13/gclogs-1/gc_1500160588.82.log.gz
- is blocked by
-
CONTROLLER-1755 RaftActor lastApplied index moves backwards
- Resolved
-
CONTROLLER-1757 Singleton leader chasing exhausts heap space in few hours
- Resolved
- is duplicated by
-
MDSAL-234 Cluster singleton service unable to handle closeServiceInstance() being called from instantiateServiceInstance()
- Resolved