Uploaded image for project: 'controller'
  1. controller
  2. CONTROLLER-1637

Intermittent failure in DistributedDataStoreIntegrationTest.testChangeListenerRegistration

    XMLWordPrintable

Details

    • Bug
    • Status: Resolved
    • Resolution: Done
    • None
    • None
    • clustering
    • None
    • Operating System: All
      Platform: All

    • 8231

    Description

      This test failed on jenkins:

      00:50:06 Tests run: 50, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 10.977 sec <<< FAILURE! - in org.opendaylight.controller.cluster.datastore.DistributedDataStoreIntegrationTest
      00:50:06 testChangeListenerRegistration[class org.opendaylight.controller.cluster.databroker.ClientBackedDataStore](org.opendaylight.controller.cluster.datastore.DistributedDataStoreIntegrationTest) Time elapsed: 0.108 sec <<< FAILURE!
      00:50:06 java.lang.AssertionError: Change 1 does not contain /(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test. Actual: {}
      00:50:06 at org.junit.Assert.fail(Assert.java:88)
      00:50:06 at org.junit.Assert.assertTrue(Assert.java:41)
      00:50:06 at org.opendaylight.controller.cluster.datastore.utils.MockDataChangeListener.waitForChangeEvents(MockDataChangeListener.java:66)
      00:50:06 at org.opendaylight.controller.cluster.datastore.DistributedDataStoreIntegrationTest$21.<init>(DistributedDataStoreIntegrationTest.java:1178)
      00:50:06 at org.opendaylight.controller.cluster.datastore.DistributedDataStoreIntegrationTest.testChangeListenerRegistration(DistributedDataStoreIntegrationTest.java:1162)

      It failed in the second run for the tell-based protocol. The test writes the test model container, registers a DCL, then expects the initial created data notification for the test container. Afterwards it writes a couple entries to the contained "outer-list" and verifies the notification updates.

      However, in the failed run, the initial notification received by the DCL was:

      12:50:04,36 AM [cluster-test-akka.actor.default-dispatcher-10] [DEBUG] DataChangeListener - Sending change notification DOMImmutableDataChangeEvent [created=[], updated=[/(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test], removed=[/(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test/outer-list/outer-list[

      {(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)id=1}

      ], /(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test/outer-list/outer-list[

      {(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)id=2}

      ]/id, /(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test/outer-list/outer-list[

      {(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)id=2}

      ], /(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test/outer-list, /(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test/outer-list/outer-list[

      {(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)id=1}

      ]/id]] to listener org.opendaylight.controller.cluster.datastore.utils.MockDataChangeListener@d65e2b4

      Right after it received the expected notification:

      12:50:04,38 AM [cluster-test-akka.actor.default-dispatcher-10] [DEBUG] DataChangeListener - Sending change notification DOMImmutableDataChangeEvent [created=[/(urn:opendaylight:params:xml:ns:yang:controller:md:sal:dom:store:test?revision=2014-03-13)test], updated=[], removed=[]] to listener org.opendaylight.controller.cluster.datastore.utils.MockDataChangeListener@d65e2b4

      There's 2 problems here. One is that the DistributedDataStoreIntegrationTest class doesn't clear the InMemoryJournal for each test case. So when testChangeListenerRegistration runs for the tell-based protocol it observes the data left over from the ask-based test run. So when the test overwrites the test model container, a remove notification is generated for the stale outer list entries.

      The second problem is why the DCL even received the remove notification in the first place since the initial write completes before the DCL is registered. It should just receive the created data notification. This is indirectly due to the sharing of the ListenerTree between the ShardDataTree and the ShardDataTreeNotificationPublisherActor. On registration, the listener is first added to the shared ListenerTree and then it sends a PublishNotifications message to the publisher actor to generate the initial notification using a local ListenerTree. The problem here is that listener is leaked to the shared ListenerTree prior to the initial notification so if the publisher actor already has a queued PublishNotifications message for a prior update, the listener will inadvertently observe that notification first.

      To alleviate this, we need to have the ListenerTree owned by the ShardDataTreeNotificationPublisherActor and not share it with the ShardDataTree. On registration, the ShardDataTree should send a message to the ShardDataTreeNotificationPublisherActor to perform the on-boarding of the new listener, ie atomically generate and send the initial notification and then add the listener to the ListenerTree.

      Attachments

        No reviews matched the request. Check your Options in the drop-down menu of this sections header.

        Activity

          People

            tpantelis Tom Pantelis
            tpantelis Tom Pantelis
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: