Uploaded image for project: 'controller'
  1. controller
  2. CONTROLLER-307

Lots of errors/exceptions on controller shutdown

    XMLWordPrintable

Details

    • Bug
    • Status: Resolved
    • Resolution: Done
    • Helium
    • None
    • adsal
    • None
    • Operating System: Linux
      Platform: Macintosh

    • 716

    Description

      When you shutdown all the bundles via the equinox console ('shutdown' command), there are many logged errors and exceptions that occur. While most of these are likely benign as the entire app is shutting down anyway, we should strive for clean shutdown.

      Many of the errors emanate from the config subsystem initiated by the ModuleFactoryBundleTracker on bundle shutdown. ModuleFactoryBundleTracker.removedBundle invokes the BlankTransactionServiceTracker to perform a commit. I assume this is done to cleanup the modules/services from the cache. There is a comment that states "workaround for service tracker not getting removed service". I assume this is referring to the ServiceTracker for ModuleFactory services which also uses the BlankTransactionServiceTracker. It seems the intent was for the BlankTransactionServiceTracker.removedService method to be invoked to do the commit but it doesn't because the addingService method returns null. The ServiceTracker won't actually track a service and subsequently call removedService/modifiedService unless a non-null instance is returned from addingService. So addingService should extract and return the service instance from the ServiceReference. This obviates the need for the ModuleFactoryBundleTracker to do the work.

      Either way, the BlankTransactionServiceTracker commit fails with an IllegalStateException from ServiceReferenceRegistryImpl.saveServiceReference when copying the previous mBeans entries from the readable ServiceReferenceRegistryImpl to a new writable ServiceReferenceRegistryImpl instance. This is initiated by ConfigRegistryImpl.beginConfig.

      The IllegalStateException is thrown from here:

      Set<String> serviceInterfaceQNames = factoryNamesToQNames.get(
      moduleIdentifier.getFactoryName());
      if (serviceInterfaceQNames == null) {
      ...
      throw new IllegalStateException("Possible error in code: ...");

      The factoryNamesToQNames map is dynamically populated with the current ModuleFactory services from the BundleContext at the start of the transaction. However since this code is being invoked due to a ModuleFactory service that was just removed from the OSGi registry, that service won't be found. Since the service no longer exists, it can't be populated to the new writable ServiceReferenceRegistryImpl so it seems reasonable to ignore it and not throw an ex and fail the entire transaction. This method is called from the copy method which passes 'true' for skipChecks so it also seems reasonable to ignore the lookup failure and return based on skipChecks:

      if (serviceInterfaceQNames == null) {
      if( skipChecks )

      { return null; }

      throw new IllegalStateException(...);
      }

      After getting by that error, validation errors occurred from Modules during the commit due to service dependencies that no longer existed which failed the entire transaction. This is a tricky one. It seems to me that if a service is removed, it should invalidate other services that depend on it. These services should be properly removed from the cache and cleaned up rather than failing the transaction and leaving them. So in the copyExistingModulesAndProcessFactoryDiff phase of beginConfig, I prototyped changes to validate the previously cached Module prior to copying it to the new registry. If the validation fails then call destroyModule to clean it up (although it first had to be added to the DependencyResolver for this to work).

      I'm not sure what ramifications this might have, especially when a bundle is simply stopped then restarted. I'm new to ODL and I'm not familiar with the config subsystem. I don't know if it is intended to handle individual bundle restarts. If not and if the purpose of the BlankTransactionServiceTracker handling removed services is to cleanup the config cache, then we can simply do nothing when a service is removed.

      There were also some other miscellaneous errors on shutdown:

      • FlowProvider, GroupProvider and MeterProvider: all throw an UnsupportedOperationEx from their close methods with a TODO message. This seems rather excessive. I'm not sure if anything actually needs to be done on close but, assuming the UnsupportedOperationEx was thrown as a reminder to eventually implement it, I think it's better to log a warning instead.
      • NetconfSSHServer: logs a socket closed exception due to closing the socket on shutdown. We can ignore this and not log the error (if 'up' is false).
      • RestconfProvider: NPE in stop method calling session.close. The 'session' member is never initialized or otherwise used. Either remove it or check for null in stop.
      • RuntimeMappingModuleServiceProxy: IllegalStateEx from bundleContext.ungetService due to bundle already shutdown. This can happen when cleaning up module services in the config subsystem so ignore the ex.

      Attachments

        Issue Links

          No reviews matched the request. Check your Options in the drop-down menu of this sections header.

          Activity

            People

              Unassigned Unassigned
              tpantelis Tom Pantelis
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: