OpenFlowPlugin / OPNFLWPLUG-962

Multiple "expired" flows take up the memory resource of CONFIG DS which leads to Controller shutdown.


    • Type: Bug
    • Resolution: Won't Do
    • Priority: High
    • Carbon-SR3, Nitrogen-SR2, Oxygen

      #security-status: confirmed

      Please Note: This issue is a possible security vulnerability, do not discuss outside of this Jira or stage any patches on gerrit until the embargo process reaches that stage.

      I am sending this to you in advance to give us some lead time to triage...

      ISSUE: Multiple "expired" flows take up the memory resource of CONFIG DS
      which leads to CONTROLLER shutdown.

      STEPS TO REPRODUCE:

      1. Start the controller.
      2. Connect OpenFlow switches (Mininet can be used).
      3. Write many distinct flows, each with "idle-timeout" and "hard-timeout"
      set, to the config datastore:
      (http://<CONTROLLER-IP>:8181/restconf/config/opendaylight-inventory:nodes/node/openflow:1/table/0)

      4. Verify that OpenFlowPlugin is working: the flows are
      pushed to the network and appear in the OPERATIONAL datastore.
      5. Depending on the JVM heap size, the accumulated expired flows
      eventually crash the controller.
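
      Step 3 can be scripted. The following is a minimal sketch, not the script used for this report. The base path is the one given above; the "/flow/<id>" suffix, the JSON body shape, the admin/admin credentials, and the helper names (flow_url, flow_body, put_flow) are assumptions for illustration and may need adjusting for a given controller release.

```python
import base64
import json
import urllib.request

# Base path taken from the report; the host is a placeholder.
BASE = "http://{host}:8181/restconf/config/opendaylight-inventory:nodes/node/openflow:1/table/0"


def flow_url(host, flow_id):
    """Config-datastore URL for one flow in table 0 of switch openflow:1."""
    return BASE.format(host=host) + "/flow/" + str(flow_id)


def flow_body(flow_id, idle_timeout=5, hard_timeout=10):
    """Assumed JSON shape for a flow entry; each id gets a distinct match."""
    return {
        "flow-node-inventory:flow": [{
            "id": str(flow_id),
            "table_id": 0,
            "priority": 100,
            "idle-timeout": idle_timeout,
            "hard-timeout": hard_timeout,
            "match": {
                "ethernet-match": {"ethernet-type": {"type": 2048}},
                "ipv4-destination": "10.%d.%d.%d/32" % (
                    (flow_id >> 16) & 255, (flow_id >> 8) & 255, flow_id & 255),
            },
            "instructions": {"instruction": [{
                "order": 0,
                "apply-actions": {"action": [{"order": 0, "drop-action": {}}]},
            }]},
        }]
    }


def put_flow(host, flow_id, user="admin", password="admin"):
    """PUT one flow to the config datastore (credentials are an assumption)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        flow_url(host, flow_id),
        data=json.dumps(flow_body(flow_id)).encode(),
        method="PUT",
        headers={"Content-Type": "application/json",
                 "Authorization": "Basic " + token},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

      Calling put_flow(host, i) in a loop over increasing i writes one distinct short-lived flow per call. Run this only against a disposable test controller.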

      Karaf crash LOGS are attached

      OBSERVATION:

      Although the installed flows (with timeouts set) are removed from the network (and thus also from the controller's operational DS), the expired entries remain in the CONFIG DS. This may be consistent with the design goals of the CONFIG datastore, but it
      exposes the controller to dangerous attacks.

      The attack can originate from either the NORTH or the SOUTH side. The description above covers the northbound attack. A southbound attack can arise when an
      attacker attempts a flow-table flooding attack: because the flows carry timeouts, the flooding itself does not succeed. However, the attacker still
      succeeds in a CONTROLLER overflow (resource consumption) attack, which is more severe and dangerous than the flow-table flooding attack itself.

      Although the network (the actual flow tables) and the operational DS are only about 1% occupied, the controller runs out of resources. This happens because the installed flows are removed from the network upon timeout, while their entries stay in the CONFIG DS.
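
      The imbalance can be illustrated with a toy model. This sketch does not use any controller code; the function name simulate and the assumption of one new flow per second are hypothetical. It only shows that the config side grows without bound while the active side stays capped by the timeout.

```python
def simulate(num_flows, hard_timeout, interval=1):
    """Write one flow every `interval` seconds.

    Returns (config_size, peak_operational_size).
    """
    config = {}        # desired state: entries are never removed on expiry
    operational = {}   # active state: flow id -> expiry time
    peak = 0
    for i in range(num_flows):
        now = i * interval
        # The switch drops flows whose hard timeout has elapsed.
        operational = {f: t for f, t in operational.items() if t > now}
        config[i] = {"hard-timeout": hard_timeout}
        operational[i] = now + hard_timeout
        peak = max(peak, len(operational))
    return len(config), peak
```

      With 10,000 flows and a 100-second hard timeout, simulate(10000, 100) returns (10000, 100): the active state never exceeds 1% of what the config side holds.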

      The error is not recoverable and shuts down the controller.

      MITIGATION:

      The expired flows should be removed from controller's CONFIG datastore.
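
      One way to picture this is to prune the config entry when the switch reports that a flow was removed because of a timeout (OpenFlow switches can send a flow-removed message carrying a reason). The sketch below is a toy stand-in, not OpenFlowPlugin code: ConfigStore, on_flow_removed, and the reason strings are hypothetical names.

```python
class ConfigStore:
    """Toy stand-in for the CONFIG datastore, keyed by (node, table, flow id)."""

    def __init__(self):
        self.flows = {}

    def put(self, node, table, flow_id, flow):
        self.flows[(node, table, flow_id)] = flow

    def on_flow_removed(self, node, table, flow_id, reason):
        """Drop the config entry only when the switch reports a timeout.

        Returns True if an entry was pruned.
        """
        if reason in ("idle-timeout", "hard-timeout"):
            return self.flows.pop((node, table, flow_id), None) is not None
        return False
```

      Flows removed for other reasons (for example, an explicit delete) are left alone here, so the config entry continues to express the desired state in those cases.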

      If the design goal is to keep the flow entries persistent, there should be a threshold on the number of stored entries, calculated from the JVM's heap size.
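
      A rough sketch of such a threshold follows. The per-flow memory estimate (2048 bytes), the share of the heap reserved for config flows (25%), and the function names are all assumptions for illustration; real values would have to be measured on the controller.

```python
def max_config_flows(heap_bytes, bytes_per_flow, budget_fraction=0.25):
    """Largest number of config flows that fits in the reserved heap share."""
    return int(heap_bytes * budget_fraction) // bytes_per_flow


def admit(current_count, heap_bytes, bytes_per_flow=2048):
    """Accept a new config flow only while the count is below the threshold."""
    return current_count < max_config_flows(heap_bytes, bytes_per_flow)
```

      Under these assumptions a 2 GiB heap gives max_config_flows(2 * 1024**3, 2048) == 262144, and writes beyond that count would be rejected instead of exhausting the heap.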

      Another thought: it makes sense for the operational DS (the active state of the network) to contain only as many tables as are present in the network, while the CONFIG DS (the desired state) can have a different size that scales up and down depending on resource usage.

            Assignee: Anil Vishnoi
            Reporter: Vaibhav Hemant Dixit