[CONTROLLER-1754] Carbon: Sporadic cluster failure when member is restarted in Netconf cluster test
| Created: 22/Aug/17 | Updated: 19/Oct/17 | Resolved: 09/Sep/17 |
| Status: | Resolved |
| Project: | controller |
| Component/s: | clustering |
| Affects Version/s: | Nitrogen |
| Fix Version/s: | None |
| Type: | Bug |
| Reporter: | Vratko Polak | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | Operating System: All |
| Issue Links: |
| External issue ID: | 9027 |
| Description |
|
This is probably a duplicate of an already existing Bug.
NETCONF-454 is perhaps the same bug, but from its description it looks like a superset of this Bug. According to a comment [1] there, this Bug can be fixed by [2], which is a fix for
This Bug is not easy to reproduce reliably, because the failure frequency of a single restart is small. It only affects the Netconf suite significantly, because that suite performs multiple restarts.
[0] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-only-carbon/630/log.html.gz#s1-s5-t13-k2-k2-k8-k1-k2-k1-k1-k3-k1 |
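The point about multiple restarts can be made concrete with a small calculation. This is a minimal sketch, assuming each restart fails independently with the same probability; the function name, the per-restart probability and the restart counts are hypothetical illustration values, not measurements from the CSIT jobs.

```python
def suite_failure_probability(per_restart_failure: float, restarts: int) -> float:
    """Return the chance that at least one of `restarts` restarts fails.

    Assumes restarts fail independently with the same probability,
    which is a simplification made for illustration only.
    """
    if not 0.0 <= per_restart_failure <= 1.0:
        raise ValueError("per_restart_failure must be within [0, 1]")
    if restarts < 0:
        raise ValueError("restarts must be non-negative")
    # P(at least one failure) = 1 - P(every restart succeeds)
    return 1.0 - (1.0 - per_restart_failure) ** restarts


if __name__ == "__main__":
    p = 0.02  # hypothetical per-restart failure rate
    for n in (1, 5, 20):  # hypothetical number of restarts in a suite
        print(f"{n:2d} restarts -> {suite_failure_probability(p, n):.3f}")
```

Under these assumptions, a failure that is rare for a single restart becomes a noticeable suite-level failure rate once a suite restarts members many times.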
| Comments |
| Comment by Vratko Polak [ 22/Aug/17 ] |
|
Waiting for Sandbox test results to see whether the cherry-picked fix [3] works. |
| Comment by Vratko Polak [ 25/Aug/17 ] |
|
> cherry-picked fix [3]
That does not work. I have also tried a change in test [4] adding hard resets. It perhaps reduces the frequency, but does not prevent this failure, as Sandbox shows [5]. I have run out of ideas.
[4] https://git.opendaylight.org/gerrit/62194 |
| Comment by Robert Varga [ 04/Sep/17 ] |
|
Well, the karaf restarts don't do enough to clear state and end up being the equivalent of bundle reload. I think the correct fix is to either auto-detect, or expose as a knob, the mechanism of JVM shutdown:
|
| Comment by Vratko Polak [ 05/Sep/17 ] |
|
It seems this bug is way less frequent on
This is important, because Releng/Builder decided [7] to drop
[6] https://logs.opendaylight.org/releng/jenkins092/netconf-csit-3node-clustering-all-carbon/385/log.html.gz#s1-s10-t13-k2-k2-k8-k1-k2-k1-k1-k2-k1-k4-k1 |
| Comment by Vratko Polak [ 05/Sep/17 ] |
|
>> adding hard resets
> karaf restarts don't do enough to clear state
The suite at this segment does: |