[NETCONF-422] java.lang.OutOfMemoryError: GC overhead limit exceeded generates "Master is down" exception Created: 15/May/17 Updated: 15/Mar/19 Resolved: 16/May/17 |
|
| Status: | Resolved |
| Project: | netconf |
| Component/s: | netconf |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | ||
| Reporter: | Matej Perina | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Operating System: All |
||
| External issue ID: | 8454 |
| Description |
|
I'm running a clustered 3-node ODL with 5 active NETCONF sessions to a Honeycomb NETCONF server. After some time, this generates the exception "Master is down". Please contact me (mperina@cisco.com) about the logs, since their size is over 1 GB. |
| Comments |
| Comment by A H [ 15/May/17 ] |
|
Is this a blocker bug for Carbon? If so, is there an ETA for when a fix can be completed? If not, could someone from the NETCONF team please retarget this bug for Nitrogen or Carbon SR1? |
| Comment by Andrej Mak [ 16/May/17 ] |
|
I've found this in the master node logs:

2017-05-15 12:29:51,897 | TRACE | lt-dispatcher-31 | NetconfDeviceCommunicator | 299 - org.opendaylight.netconf.sal-netconf-connector - 1.5.0.Carbon | RemoteDevice {overcloud-controller-0.opnfv.org}: Sending message <rpc message-id="m-3327" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
2017-05-15 12:29:51,898 | TRACE | oupCloseable-3-1 | NetconfDeviceCommunicator | 299 - org.opendaylight.netconf.sal-netconf-connector - 1.5.0.Carbon | Finished sending request <rpc message-id="m-3327" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
2017-05-15 12:30:04,211 | DEBUG | oupCloseable-3-1 | NetconfDeviceCommunicator | 299 - org.opendaylight.netconf.sal-netconf-connector - 1.5.0.Carbon | RemoteDevice{overcloud-controller-0.opnfv.org} : Message received <rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="m-3327"> : Matched request: <rpc message-id="m-3327" xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">

The master node sends an rpc to the device. Probably due to "GC overhead limit exceeded" on the device, the rpc-reply is sent from the device to ODL about 12 s later (12:29:51 to 12:30:04). The default ask timeout in odl-netconf-clustered-topology is 5 s, hence the "Master is down." message. The ask timeout can be set when the mount point is created, using the parameter "actor-response-wait-time"; see netconf-node-topology.yang. |
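The timing argument above can be checked directly from the two log timestamps. The following is a minimal sketch, not ODL code: the helper names (`parse_ts`, `reply_latency_s`, `suggest_wait_time_s`) and the 5-second safety margin are assumptions made for illustration. Only the timestamps and the 5 s default ask timeout come from this report.

```python
from datetime import datetime
import math

# Timestamp layout used in the log lines quoted above (comma before milliseconds).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# Default ask timeout of odl-netconf-clustered-topology, as stated in this comment.
DEFAULT_ASK_TIMEOUT_S = 5


def parse_ts(text: str) -> datetime:
    """Parse a log timestamp such as '2017-05-15 12:29:51,898'."""
    return datetime.strptime(text, LOG_TS_FORMAT)


def reply_latency_s(sent: str, received: str) -> float:
    """Seconds elapsed between sending the rpc and receiving the rpc-reply."""
    return (parse_ts(received) - parse_ts(sent)).total_seconds()


def suggest_wait_time_s(latency_s: float, margin_s: int = 5) -> int:
    """Hypothetical helper: round the observed latency up and add a margin.

    The margin is an arbitrary illustrative choice, not an ODL recommendation.
    """
    return math.ceil(latency_s) + margin_s


# "Finished sending request" and "Message received" for message-id m-3327.
latency = reply_latency_s("2017-05-15 12:29:51,898", "2017-05-15 12:30:04,211")
timed_out = latency > DEFAULT_ASK_TIMEOUT_S

print(f"rpc-reply latency: {latency:.3f} s")
print(f"exceeds default ask timeout of {DEFAULT_ASK_TIMEOUT_S} s: {timed_out}")
print(f"candidate actor-response-wait-time: {suggest_wait_time_s(latency)} s")
```

A value derived this way would only hide the symptom. If the device is stalling in garbage collection, raising "actor-response-wait-time" keeps the mount point usable but does not address the slow replies themselves.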
| Comment by Robert Varga [ 16/May/17 ] |
|
This is a problem on the southbound (SB) device, which is running Boron-SR3 code and is probably not sized properly. Lowering priority and targeting Boron-SR4. |