  1. OpenFlowPlugin
  2. OPNFLWPLUG-859

Internal SalService queue gets full; ODL stops performing OpenFlow actions on the switch and does not release the mastership


    • Type: Bug
    • Resolution: Done
    • Boron
    • None
    • clustering
    • None
    • Operating System: All
      Platform: All

    • 7846

      After ODL has been running for a few days, an internal queue appears to fill up and SalService no longer allows any action to be performed on the switch. Connectivity with the switch is still established, and hello messages from the switch are answered properly.

      Restarting the switch does not solve the problem: the controller that holds the mastership does not release the cluster singleton service. The other controllers, which hold "candidate" status on the cluster singleton (slave mode on the switch), do release the singleton service.

      The problem was solved only by restarting the controller that holds the mastership of the switch.

      I noticed exceptions that may point to the source of the problem. In short, the following code never obtains a requestContext (requestContext is always null).

      final RequestContext<O> requestContext = requestContextStack.createRequestContext();
      if (requestContext == null) {
          LOG.trace("Request context refused.");
          getMessageSpy().spyMessage(AbstractService.class, MessageSpy.STATISTIC_GROUP.TO_SWITCH_DISREGARDED);
          return failedFuture();
      }

      The following logs indicate that the queue is full.

      2017-02-24 09:47:34,239 | TRACE | Thread-101 | AbstractService | 287 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Handling general service call
      2017-02-24 09:47:34,239 | TRACE | Thread-101 | RpcContextImpl | 287 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Device queue org.opendaylight.openflowplugin.impl.rpc.RpcContextImpl@3a502d68 at capacity
      2017-02-24 09:47:34,239 | TRACE | Thread-101 | AbstractService | 287 - org.opendaylight.openflowplugin.impl - 0.3.1.Boron-SR1 | Request context refused.

      The message "Device queue org.opendaylight.openflowplugin.impl.rpc.RpcContextImpl@3a502d68 at capacity" is logged because the following code fails to acquire a permit from tracker, which is a semaphore.

      public <T> RequestContext<T> createRequestContext() {
          if (!tracker.tryAcquire()) {
              LOG.trace("Device queue {} at capacity", this);
              return null;
          } else {
              LOG.trace("Acquired semaphore for {}, available permits:{} ", nodeInstanceIdentifier.getKey().getId().getValue(), tracker.availablePermits());
          }
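
      To illustrate the mechanism described above, here is a minimal, self-contained sketch of a semaphore-bounded request tracker. It is not OpenFlowPlugin code: the class BoundedRequestTracker, its nested Context type, and the permit count are hypothetical names chosen for the example. It shows only that once every permit is held and none is returned, each further createRequestContext() call is refused, which would match the "at capacity" trace. Whether unreleased permits are what actually happens in this report is an assumption, not something the logs prove.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical stand-in for a semaphore-bounded request tracker (illustration only). */
final class BoundedRequestTracker {
    private final Semaphore tracker;

    BoundedRequestTracker(int maxOutstanding) {
        this.tracker = new Semaphore(maxOutstanding);
    }

    /** Returns a context, or null when every permit is already held. */
    Context createRequestContext() {
        if (!tracker.tryAcquire()) {
            return null; // the "Device queue ... at capacity" branch
        }
        return new Context();
    }

    int availablePermits() {
        return tracker.availablePermits();
    }

    /** One outstanding request; close() returns its permit, repeated calls are ignored. */
    final class Context implements AutoCloseable {
        private boolean closed;

        @Override
        public synchronized void close() {
            if (!closed) {
                closed = true;
                tracker.release();
            }
        }
    }
}
```

      With a capacity of 2, two contexts that are never closed make the third request return null indefinitely; closing either one makes a permit available again. In this sketch, any code path that drops a context without calling close() permanently shrinks the usable capacity.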

            JalpaModasiya Jalpa Modasiya
            castro.jon@gmail.com Jon Castro
