[OPNFLWPLUG-635] Lifecycle cleaning Created: 15/Mar/16 Updated: 27/Sep/21 Resolved: 16/May/16
| Status: | Resolved |
| Project: | OpenFlowPlugin |
| Component/s: | General |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug |
| Reporter: | Jozef Slezák | Assignee: | Jozef Bacigal |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Operating System: All |
| Attachments: | log20160316.zip |
| External issue ID: | 5523 |
| Description |
When restarting switches against a running controller, the controller has trouble cleaning up its state. We tested this on stable/beryllium SR1 using a loop: start_network.sh, sleep 0.5 s, stop_network.sh. Please also try to reproduce it on a 3-node cluster.
| Comments |
| Comment by Jozef Slezák [ 16/Mar/16 ] |
Please take a look at these log messages in the attached log20160316.zip. During the last round of testing on stable/beryllium I saw a node remain in the operational inventory after its network was disconnected, and after that not all flows were reconciled:

```
2016-03-16 18:42:04,174 | WARN | pool-33-thread-1 | RpcContextImpl | 157 - org.opendaylight.openflowplugin.impl - 0.2.1.SNAPSHOT | Xid cannot be reserved for new RequestContext, node:Uri [_value=openflow:18087942711149584731]
org.opendaylight.yangtools.yang.data.api.schema.tree.ModifiedNodeDoesNotExistException: Node /(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[{(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:18088224186126241907}]/AugmentationIdentifier{childNames=[(urn:opendaylight:flow:inventory?revision=2013-08-19)supported-match-types, (urn:opendaylight:flow:inventory?revision=2013-08-19)port-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)serial-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)group, (urn:opendaylight:flow:inventory?revision=2013-08-19)ip-address, (urn:opendaylight:flow:inventory?revision=2013-08-19)manufacturer, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-group, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-instructions, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-actions, (urn:opendaylight:flow:inventory?revision=2013-08-19)table, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-meter, (urn:opendaylight:flow:inventory?revision=2013-08-19)hardware, (urn:opendaylight:flow:inventory?revision=2013-08-19)switch-features, (urn:opendaylight:flow:inventory?revision=2013-08-19)description, (urn:opendaylight:flow:inventory?revision=2013-08-19)software, (urn:opendaylight:flow:inventory?revision=2013-08-19)meter]}/(urn:opendaylight:flow:inventory?revision=2013-08-19)group/group[{(urn:opendaylight:flow:inventory?revision=2013-08-19)group-id=3}] does not exist. Cannot apply modification to its children.
```
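The exception indicates a write landing under a node whose operational subtree has already been removed, i.e. the switch flapped between the write being scheduled and committed. Below is a minimal sketch, assuming the Beryllium-era MD-SAL binding API, of a guard that checks node presence before writing; the class and method names (GuardedInventoryWriter, writeIfNodePresent) are hypothetical illustrations, not the plugin's actual fix.

```java
import com.google.common.base.Optional;
import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.md.sal.binding.api.ReadWriteTransaction;
import org.opendaylight.controller.md.sal.common.api.data.LogicalDatastoreType;
import org.opendaylight.controller.md.sal.common.api.data.ReadFailedException;
import org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.NodeId;
import org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.Nodes;
import org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node;
import org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.NodeKey;
import org.opendaylight.yangtools.yang.binding.DataObject;
import org.opendaylight.yangtools.yang.binding.InstanceIdentifier;

// Hypothetical helper for illustration only.
public final class GuardedInventoryWriter {

    private final DataBroker broker;

    public GuardedInventoryWriter(final DataBroker broker) {
        this.broker = broker;
    }

    /**
     * Writes {@code data} under the given node only if the node is still
     * present in the operational datastore; returns false if it is gone.
     */
    public <T extends DataObject> boolean writeIfNodePresent(final NodeId nodeId,
            final InstanceIdentifier<T> path, final T data) throws ReadFailedException {
        final InstanceIdentifier<Node> nodeIid = InstanceIdentifier.create(Nodes.class)
                .child(Node.class, new NodeKey(nodeId));
        final ReadWriteTransaction tx = broker.newReadWriteTransaction();
        // If the switch flapped and its subtree was removed, skip the write
        // instead of letting the commit fail with the
        // ModifiedNodeDoesNotExistException seen in the log above.
        final Optional<Node> node =
                tx.read(LogicalDatastoreType.OPERATIONAL, nodeIid).checkedGet();
        if (!node.isPresent()) {
            tx.cancel();
            return false;
        }
        tx.put(LogicalDatastoreType.OPERATIONAL, path, data);
        // The commit is optimistic: a concurrent removal can still fail the
        // transaction, so callers must handle the returned future's errors.
        tx.submit();
        return true;
    }
}
```

Note that the read-check only narrows the window; a concurrent node removal between this transaction's snapshot and its commit can still surface as a commit failure, which is why proper lifecycle cleanup on disconnect matters.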
| Comment by Jozef Slezák [ 16/Mar/16 ] |
Attachment log20160316.zip has been added with description: Log files after restarting OVS 2.4 on a running controller.
| Comment by Jozef Slezák [ 17/Mar/16 ] |
Vaso, we have a robot test that reproduces this problem.
| Comment by A H [ 06/May/16 ] |
A patch for this bug was submitted: https://git.opendaylight.org/gerrit/#/c/38369/

To better assess the impact of this bug and its fix, could someone from your team please help us with the following? Severity: could you elaborate on the severity of this bug? Is this a BLOCKER, such that we cannot release Beryllium without it? Is there a workaround, such that we could write a release note and fix it in a future Beryllium SR3?
| Comment by Jozef Bacigal [ 09/May/16 ] |
This patch was made as part of the clustering changes that needed to be implemented. The changes were made and merged before the SR2 release, and we tested them in a cluster on our side. The priority was to be stable under connection flapping, so we rewrote the whole connect/disconnect logic and introduced a double-candidate approach in the plugin (sketched below). The three patches mentioned took too long to test, but we can now test on a cluster better than we could before, so that is a step forward. These patches were meant to be ready and merged before the SR2 code lock. Most of us agreed that the issue is a regression for SR2 in a cluster environment. I don't think we can provide a workaround; the patch is too complex to simplify. We have been testing it on our cluster every day for the last two weeks, but we have no robot or integration tests, only some unit tests. The fix doesn't impact any dependent project; it changes only the plugin's behavior in a cluster, such as mastership changes. It doesn't change statistics gathering, RPC services, or YANG models. The patches were already merged into Beryllium, so I will close this bug; if errors or bugs come out of further testing, we should raise a new bug.
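For context on the "double candidate" approach mentioned above, here is a minimal sketch assuming the Beryllium-era EntityOwnershipService API; the entity type strings and the DeviceMastership class are illustrative assumptions, not the plugin's actual implementation. The idea is that each node registers two ownership candidates per device: one for device mastership itself and a second one guarding cleanup, so ownership can move between cluster members without racing the datastore teardown.

```java
import org.opendaylight.controller.md.sal.common.api.clustering.CandidateAlreadyRegisteredException;
import org.opendaylight.controller.md.sal.common.api.clustering.Entity;
import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipCandidateRegistration;
import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipService;

// Hypothetical class for illustration only.
public class DeviceMastership implements AutoCloseable {

    // Hypothetical entity type names, chosen for this sketch.
    private static final String MAIN_ENTITY_TYPE = "openflow";
    private static final String TX_ENTITY_TYPE = "openflow-tx";

    private final EntityOwnershipCandidateRegistration mainCandidate;
    private final EntityOwnershipCandidateRegistration txCandidate;

    public DeviceMastership(final EntityOwnershipService eos, final String nodeId)
            throws CandidateAlreadyRegisteredException {
        // First candidate: competes for mastership of the device.
        mainCandidate = eos.registerCandidate(new Entity(MAIN_ENTITY_TYPE, nodeId));
        // Second candidate: guards transaction/cleanup handling, so the old
        // master can finish (or the new master can take over) datastore work
        // before the node's operational data is torn down.
        txCandidate = eos.registerCandidate(new Entity(TX_ENTITY_TYPE, nodeId));
    }

    @Override
    public void close() {
        // Closing the registrations withdraws this member's candidacy,
        // letting ownership move cleanly when the device disconnects.
        txCandidate.close();
        mainCandidate.close();
    }
}
```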
| Comment by A H [ 09/May/16 ] |
Have we verified that the bug has been fixed in the latest build for Beryllium SR2?