CONTROLLER-1898

Improve NormalizedNodeData{Input,Output} QName coding


    • Type: Improvement
    • Resolution: Done
    • Priority: Medium
    • Sodium, Neon SR2
    • None
    • clustering
    • None

      After addressing CONTROLLER-1897, overall performance improved by about 18%, but subsequent profiling shows readQName() still accounting for 46% of CPU time.

      There are two components to this cost:

      • 30% is spent in readCodedString()
      • 61% is spent in QNameFactory.create()

      The sample (a 350MiB snapshot) invokes readQName() 1.7M times, yet yields only 506 unique QNames. QNames are therefore a good candidate for the same coding we already use for Strings.

      Implementing such coding will allow us to:

      • trim the snapshot, as an already-encoded QName needs one-third of the reads, i.e. a single read of 5 bytes instead of three reads
      • eliminate most of the QNameFactory.create() overhead, as repeated QNames will be looked up in a local List instead of a hash-based concurrent LoadingCache
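      The coding described above can be sketched roughly as follows. This is a minimal illustration and not the actual NormalizedNodeData{Input,Output} implementation: the class QNameCodingSketch, its Encoder and Decoder, the token values, and the use of a String[] triple in place of a real QName are all assumptions made for the example. The first occurrence of a QName is written in full and assigned the next integer code; every later occurrence is written as a 1-byte token plus a 4-byte code, and the reader resolves it with a plain List lookup.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of QName coding; names and token values are assumptions.
final class QNameCodingSketch {
    static final byte QNAME_DEFINITION = 1; // full QName follows
    static final byte QNAME_REFERENCE = 2;  // 4-byte code of an earlier QName follows

    static final class Encoder {
        private final Map<String, Integer> codes = new HashMap<>();
        private final DataOutputStream out;

        Encoder(DataOutputStream out) {
            this.out = out;
        }

        void writeQName(String namespace, String revision, String localName) throws IOException {
            // The joined key stands in for a real QName's equals()/hashCode().
            String key = namespace + '\n' + revision + '\n' + localName;
            Integer code = codes.get(key);
            if (code != null) {
                // Already seen: 1 token byte + 4 code bytes = 5 bytes.
                out.writeByte(QNAME_REFERENCE);
                out.writeInt(code);
                return;
            }
            // First occurrence: assign the next code and write all three parts.
            codes.put(key, codes.size());
            out.writeByte(QNAME_DEFINITION);
            out.writeUTF(namespace);
            out.writeUTF(revision);
            out.writeUTF(localName);
        }
    }

    static final class Decoder {
        // Index in this list is the QName's code; no hashing on the read path.
        private final List<String[]> table = new ArrayList<>();
        private final DataInputStream in;

        Decoder(DataInputStream in) {
            this.in = in;
        }

        String[] readQName() throws IOException {
            byte token = in.readByte();
            if (token == QNAME_REFERENCE) {
                return table.get(in.readInt());
            }
            if (token != QNAME_DEFINITION) {
                throw new IOException("Unexpected token " + token);
            }
            // A real reader would call the QName factory here, once per unique QName.
            String[] qname = { in.readUTF(), in.readUTF(), in.readUTF() };
            table.add(qname);
            return qname;
        }
    }

    // Convenience helpers for trying the sketch out in memory.
    static byte[] encode(String[][] qnames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            Encoder encoder = new Encoder(out);
            for (String[] q : qnames) {
                encoder.writeQName(q[0], q[1], q[2]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static List<String[]> decode(byte[] bytes, int count) {
        List<String[]> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            Decoder decoder = new Decoder(in);
            for (int i = 0; i < count; i++) {
                result.add(decoder.readQName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

      In this sketch a repeated QName decodes to the very same object that was created on its first occurrence, which is where the savings in factory calls would come from.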

      Finally, this also enables us to cache NodeIdentifier instances, as NodeIdentifier is only a wrapper around a QName. Adding a secondary lookup table for caching these wrappers should allow us to lower the memory footprint of the deserialized data.
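
      One way the secondary lookup table could look is sketched below. This is only an illustration under stated assumptions: NodeIdentifierCacheSketch and NodeId are hypothetical stand-ins (the real NodeIdentifier wraps a QName, a String is used here), and the wrapper is created lazily the first time a given QName code is requested.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a second table, indexed by the same code as the QName table.
final class NodeIdentifierCacheSketch {
    // Stand-in for NodeIdentifier: a thin wrapper around a QName (a String here).
    static final class NodeId {
        final String qname;

        NodeId(String qname) {
            this.qname = qname;
        }
    }

    private final List<String> qnames = new ArrayList<>();
    private final List<NodeId> nodeIds = new ArrayList<>();

    // Registers a newly decoded QName and returns its code.
    int define(String qname) {
        qnames.add(qname);
        nodeIds.add(null); // wrapper not created until first requested
        return qnames.size() - 1;
    }

    // Returns the shared wrapper for the given code, creating it on first use.
    NodeId nodeIdentifier(int code) {
        NodeId cached = nodeIds.get(code);
        if (cached == null) {
            cached = new NodeId(qnames.get(code));
            nodeIds.set(code, cached);
        }
        return cached;
    }
}
```

      With such a table every node that shares a QName would also share a single wrapper instance, instead of allocating one wrapper per deserialized node.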

       

            rovarga Robert Varga