controller / CONTROLLER-1756

OOM due to huge Map in ShardDataTree


    • Type: Bug
    • Resolution: Done
    • Carbon
    • mdsal
    • Operating System: All
      Platform: All

    • 9034

      We're seeing an OOM in Red Hat internal scale testing:

      Our scenario is a cluster of 3 nodes with odl-netvirt-openstack being stress tested by OpenStack's rally benchmarking tool.

      The ODL version this is being seen with is a Carbon build from last Thursday; specifically https://nexus.opendaylight.org/content/repositories/opendaylight-carbon-epel-7-x86_64-devel/org/opendaylight/integration-packaging/opendaylight/6.2.0-0.1.20170817rel1931.el7.noarch/opendaylight-6.2.0-0.1.20170817rel1931.el7.noarch.rpm

      We've started testing by giving all 3 ODL node VMs just 2 GB, in an effort to better understand ODL memory requirements. If it's simply "normal" that we cannot run with 2 GB in such a "real world" scenario, we'll gradually increase Xmx in this environment - but we wanted community feedback on this OOM at 2 GB first.

      I'll attach, or provide links to, the usual HPROF, plus a "Leak Suspects" report produced by https://www.eclipse.org/mat/, plus the Karaf log.
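For anyone reproducing this: besides running the JVM with -XX:+HeapDumpOnOutOfMemoryError, an HPROF snapshot like the one attached can be captured programmatically on a HotSpot JVM. This is a minimal sketch (not part of ODL; the class name HeapDump and the output path are mine) using the standard HotSpotDiagnosticMXBean:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HeapDump {
    /** Writes an HPROF snapshot of live objects to the given file (HotSpot JVMs only). */
    public static void dump(String file) throws Exception {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file, true); // true = dump only live (reachable) objects
    }

    public static void main(String[] args) throws Exception {
        Path out = Paths.get("heap.hprof");
        Files.deleteIfExists(out); // dumpHeap refuses to overwrite an existing file
        dump(out.toString());
        System.out.println(Files.exists(out) && Files.size(out) > 0 ? "dump written" : "dump failed");
    }
}
```

The resulting heap.hprof can be opened directly in MAT to regenerate the Leak Suspects report.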

      Basically what we're seeing is a huge (>1 GB) Map in ShardDataTree (I'm not sure if that's its Map<LocalHistoryIdentifier, ShardDataTreeTransactionChain> transactionChains or its Map<Payload, Runnable> replicationCallbacks).
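To illustrate the kind of growth pattern MAT is flagging - this is a toy sketch, NOT the actual ShardDataTree code, with hypothetical HistoryId/Chain classes standing in for LocalHistoryIdentifier and ShardDataTreeTransactionChain - a map keyed per history leaks if entries are only ever added and never purged when a chain is closed:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy illustration of an unbounded per-history map (not real ODL code). */
public class ChainMapLeak {
    static final class HistoryId {
        final long id;
        HistoryId(long id) { this.id = id; }
        @Override public int hashCode() { return Long.hashCode(id); }
        @Override public boolean equals(Object o) {
            return o instanceof HistoryId && ((HistoryId) o).id == id;
        }
    }
    static final class Chain { /* would hold per-chain transaction state */ }

    private final Map<HistoryId, Chain> chains = new HashMap<>();

    void openChain(long historyId)  { chains.put(new HistoryId(historyId), new Chain()); }
    // Without this removal step, the map grows with every chain ever opened.
    void closeChain(long historyId) { chains.remove(new HistoryId(historyId)); }
    int size()                      { return chains.size(); }

    public static void main(String[] args) {
        ChainMapLeak leaky = new ChainMapLeak();
        for (long i = 0; i < 100_000; i++) leaky.openChain(i); // opened but never closed
        System.out.println("without purge: " + leaky.size());
        for (long i = 0; i < 100_000; i++) leaky.closeChain(i);
        System.out.println("after purge: " + leaky.size());
    }
}
```

If the >1 GB map is transactionChains, the question would be whether closed chains are being removed; if it's replicationCallbacks, whether callbacks are dropped once replication completes.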

      As far as I can tell from my limited understanding, this is not the same as CONTROLLER-1746 (there's nothing about "closedTransactions" anywhere..), and not CONTROLLER-1755 either?

      The Karaf log, among other errors which are less relevant in this context AFAIK, shows:

      (1) many "ERROR ShardDataTree org.opendaylight.controller.sal-distributed-datastore - 1.5.2.Carbon | member-0-shard-default-operational: Failed to commit transaction ... java.lang.IllegalStateException: Store tree org.opendaylight.yangtools.yang.data.api.schema.tree.spi.MaterializedContainerNode@78fe0203 and candidate base org.opendaylight.yangtools.yang.data.api.schema.tree.spi.MaterializedContainerNode@686861e8 differ" errors - this seems vaguely familiar from recent list posts; can someone remind me what those were all about?

      (2) at the very end, just before it blows up, genius' lockmanager seems unhappy: " Waiting for the lock ... is timed out. retrying again" - probably just an effect of this OOM? Or could the lockmanager somehow be related and actually be the cause rather than the effect - could "bad application code" (such as not closing a DataBroker transaction correctly) cause this OOM?
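On the "not closing a transaction correctly" theory: the sketch below is a hypothetical mini-broker (the Broker/Tx classes and their submit()/cancel() methods are mine, not the real MD-SAL DataBroker API) showing why a write transaction that is neither submitted nor cancelled can pin state on the broker side indefinitely:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of transaction lifecycle tracking (not real MD-SAL). */
public class TxLifecycle {
    static class Broker {
        final Set<Tx> open = new HashSet<>();
        Tx newWriteTx() { Tx tx = new Tx(this); open.add(tx); return tx; }
    }
    static class Tx {
        final Broker broker;
        Tx(Broker b) { broker = b; }
        void submit() { broker.open.remove(this); } // commit path releases the tx
        void cancel() { broker.open.remove(this); } // abandoning must release it too
    }

    public static void main(String[] args) {
        Broker broker = new Broker();
        Tx tx = broker.newWriteTx();
        try {
            // ... application writes would go here ...
            tx.submit();
        } finally {
            tx.cancel(); // no-op after submit; releases the tx if submit was skipped
        }
        Tx leaked = broker.newWriteTx(); // never submitted nor cancelled: stays pinned
        System.out.println("open transactions: " + broker.open.size());
    }
}
```

An application-side pattern like the try/finally above would rule this theory out; the leaked transaction at the end is what a misbehaving caller would leave behind.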

            Assignee: Unassigned
            Reporter: Michael Vorburger (vorburger)