[CONTROLLER-1870] Excessive Arrays.copyOf memory allocation from AbstractNormalizedNodeDataOutput Created: 12/Nov/18 Updated: 19/Dec/18 Resolved: 19/Dec/18 |
|
| Status: | Resolved |
| Project: | controller |
| Component/s: | None |
| Affects Version/s: | Oxygen SR3 |
| Fix Version/s: | Neon |
| Type: | Improvement | Priority: | Medium |
| Reporter: | Michael Vorburger | Assignee: | Tom Pantelis |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
I'm looking at a Java Flight Recording obtained from (internal) scale lab testing, and see extensive "TLAB Allocations" due to what appears to be very excessive Arrays.copyOf memory allocation from AbstractNormalizedNodeDataOutput:

byte[] java.util.Arrays.copyOf(byte[], int) 1017
void java.io.ByteArrayOutputStream.grow(int) 808
void java.io.ByteArrayOutputStream.ensureCapacity(int) 808
void java.io.ByteArrayOutputStream.write(int) 429
void java.io.DataOutputStream.writeInt(int) 373
void com.google.common.io.ByteStreams$ByteArrayDataOutputStream.writeInt(int) 373
void org.opendaylight.controller.cluster.datastore.node.utils.stream.AbstractNormalizedNodeDataOutput.writeInt(int) 373
void org.opendaylight.controller.cluster.datastore.node.utils.stream.NormalizedNodeOutputStreamWriter.writeString(String) 373
void org.opendaylight.controller.cluster.datastore.node.utils.stream.NormalizedNodeOutputStreamWriter.writeQName(QName) 373
void org.opendaylight.controller.cluster.datastore.node.utils.stream.AbstractNormalizedNodeDataOutput.startNode(QName, byte) 195
void org.opendaylight.controller.cluster.datastore.node.utils.stream.AbstractNormalizedNodeDataOutput.leafNode(YangInstanceIdentifier$NodeIdentifier, Object) 195
boolean org.opendaylight.yangtools.yang.data.api.schema.stream.NormalizedNodeWriter.wasProcessAsSimpleNode(NormalizedNode) 195
NormalizedNodeWriter org.opendaylight.yangtools.yang.data.api.schema.stream.NormalizedNodeWriter.write(NormalizedNode) 195
void org.opendaylight.controller.cluster.datastore.node.utils.stream.AbstractNormalizedNodeDataOutput.writeNormalizedNode(NormalizedNode) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 195
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 183
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 183
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 175
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 175
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeNode(NormalizedNodeDataOutput, DataTreeCandidateNode) 175
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeChildren(NormalizedNodeDataOutput, Collection) 175
void org.opendaylight.controller.cluster.datastore.persisted.DataTreeCandidateInputOutput.writeDataTreeCandidate(DataOutput, DataTreeCandidate) 175
CommitTransactionPayload org.opendaylight.controller.cluster.datastore.persisted.CommitTransactionPayload.create(TransactionIdentifier, DataTreeCandidate) 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree.startCommit(SimpleShardDataTreeCohort, DataTreeCandidate) 175
void org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.commit(FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.CohortEntry.commit(FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.finishCommit(ActorRef, CohortEntry) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$3.onSuccess(DataTreeCandidate) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$3.onSuccess(Object) 175
void org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.successfulPreCommit(DataTreeCandidateTip) 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree$1.onSuccess(Void) 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree$1.onSuccess(Object) 175
void org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.doUserPreCommit(FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.userPreCommit(DataTreeCandidate, FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree.startPreCommit(SimpleShardDataTreeCohort) 175
void org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.preCommit(FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.CohortEntry.preCommit(FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.doCommit(CohortEntry) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$2.onSuccess(Void) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$2.onSuccess(Object) 175
void org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.successfulCanCommit() 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree.lambda$processNextPendingTransaction$0(ShardDataTree$CommitEntry) 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree$$Lambda$868.1938452935.accept(Object) 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPending(Queue, ShardDataTreeCohort$State, Consumer) 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextPendingTransaction() 175
void org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(SimpleShardDataTreeCohort) 175
void org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.CohortEntry.canCommit(FutureCallback) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleCanCommit(CohortEntry) 175
void org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleReadyLocalTransaction(ReadyLocalTransaction, ActorRef, Shard) 175
void org.opendaylight.controller.cluster.datastore.Shard.handleReadyLocalTransaction(ReadyLocalTransaction) 175
void org.opendaylight.controller.cluster.datastore.Shard.handleNonRaftCommand(Object) 175
void org.opendaylight.controller.cluster.raft.RaftActor.handleCommand(Object) 175 |
| Comments |
| Comment by Tom Pantelis [ 12/Nov/18 ] |
|
We can pass in a larger initial size to ByteStreams.newDataOutput in CommitTransactionPayload.create to avoid excessive reallocations. |
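[Editorial note] Guava's ByteStreams.newDataOutput() is backed by a java.io.ByteArrayOutputStream, which defaults to a 32-byte buffer and reallocates (via Arrays.copyOf) by doubling whenever it overflows, which is exactly the grow()/copyOf() churn in the trace above. A plain-JDK sketch of the effect (class and method names here are illustrative, not from the patch) that counts reallocations while writing 100 ints, comparing the 32-byte default with a pre-sized 512-byte buffer:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Counts internal buffer reallocations (visible through the protected 'buf'
// field) while DataOutputStream.writeInt pushes bytes one at a time.
class CountingBaos extends ByteArrayOutputStream {
    int reallocations = 0;
    private int lastCapacity;

    CountingBaos(int initialSize) {
        super(initialSize);
        lastCapacity = buf.length;
    }

    @Override
    public synchronized void write(int b) {
        super.write(b);
        if (buf.length != lastCapacity) { // capacity changed => Arrays.copyOf happened
            reallocations++;
            lastCapacity = buf.length;
        }
    }
}

public class GrowthDemo {
    static int countGrows(int initialSize) throws IOException {
        CountingBaos baos = new CountingBaos(initialSize);
        try (DataOutputStream out = new DataOutputStream(baos)) {
            for (int i = 0; i < 100; i++) {
                out.writeInt(i); // 400 bytes total, written byte by byte
            }
        }
        return baos.reallocations;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countGrows(32));  // prints 4: grows 32->64->128->256->512
        System.out.println(countGrows(512)); // prints 0: no reallocation needed
    }
}
```

The same payload that costs four copies with the default buffer costs none when the initial size covers it.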
| Comment by Stephen Kitt [ 13/Nov/18 ] |
|
Note that the JFR “stack traces” only provide a partial view:

byte[] java.util.Arrays.copyOf(byte[], int) 1017
void java.io.ByteArrayOutputStream.grow(int) 808
void java.io.ByteArrayOutputStream.ensureCapacity(int) 808
void java.io.ByteArrayOutputStream.write(int) 429
void java.io.DataOutputStream.writeInt(int) 373
void com.google.common.io.ByteStreams$ByteArrayDataOutputStream.writeInt(int) 373

means that 1017 calls to copyOf() resulted in a new TLAB; of those 1017 calls, 808 were a result of grow(); of those 808, 429 were a result of write(); of those 429, 373 were a result of writeInt(). All the other calls aren’t explained by this output. |
| Comment by Michael Vorburger [ 13/Nov/18 ] |
|
You mean a fixed size? The required size will vary considerably based on the exact NormalizedNode, won't it? Or would it just be a configurable maximum? But then we'd over-allocate. Perhaps we could do something smarter and "guesstimate" the required size, e.g. some calculation on the NormalizedNode to pre-determine how many bytes it requires when streamed. (We know the format, and the calculation would be hard-coded to that format.) Or couldn't we rewrite things to avoid the byte[] entirely and only stream? (I haven't looked into the code.) |
| Comment by Tom Pantelis [ 13/Nov/18 ] |
|
A fixed (and configurable) size might suffice; say 512, which sounds reasonable and would eliminate 4 re-allocations. Doing a "guesstimate" would require pre-walking the NN tree, which would add expense. The reason it's converted to a byte[] is that the serialized format has a much smaller footprint; the payload is stored in the in-memory journal, so that can be significant. |
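[Editorial note] The "4 re-allocations" figure follows from ByteArrayOutputStream's doubling policy: for a payload larger than 256 but at most 512 bytes (400 is used below as an assumed example), a default 32-byte buffer is copied at capacities 64, 128, 256 and 512, all of which a 512-byte initial size skips. A small model of that policy:

```java
import java.util.ArrayList;
import java.util.List;

// Models java.io.ByteArrayOutputStream's growth for single-byte writes:
// capacity doubles on each overflow, and every doubling costs one
// Arrays.copyOf of the bytes written so far.
public class GrowthSteps {
    static List<Integer> growthSteps(int initialCapacity, int bytesWritten) {
        List<Integer> capacities = new ArrayList<>();
        int cap = initialCapacity;
        while (cap < bytesWritten) {
            cap <<= 1;            // double on overflow
            capacities.add(cap);  // one Arrays.copyOf per entry
        }
        return capacities;
    }

    public static void main(String[] args) {
        System.out.println(growthSteps(32, 400));  // prints [64, 128, 256, 512]
        System.out.println(growthSteps(512, 400)); // prints []
    }
}
```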
| Comment by Robert Varga [ 14/Nov/18 ] |
|
Yeah, guesstimating the size will lower TLAB allocation, but will bring down overall performance, as we have to walk the tree twice; that certainly is not worth it. |
| Comment by Tom Pantelis [ 04/Dec/18 ] |
| Comment by Michael Vorburger [ 19/Dec/18 ] |
|
This is in Neon now; let's close it for now. We can cherry-pick to Fluorine later if needed. |