Loading...

XML

Word

Printable

Type: Improvement
Resolution: Unresolved
Priority: Medium
Fix Version/s: 10.0.0
Affects Version/s: None
Component/s: clustering
Labels:
- pt

atomix-storage has a SparseJournal implementation of JournalIndex. This implementation retains every 1000th entry (by default) in the index.
While this works for small writes reasonably well, it makes no point if entry sizes are relatively large compared to maxSegmentSize.

This is a rather required trade-off with TreeMap, as TreeMap.Entry costs 32-64 bytes. Plus we need 2x16-32 bytes for an Integer and a Long (as they will not be efficiently interned). That makes we are using 64-128 bytes to what amounts to 12 bytes of data.

There is a bit more efficiency possible, as we have a JournalIndex per segment, each segment is limited to 2G, each entry has to have a 8 byte header and 1 byte value: therefore there cannot be more than 238 609 294 entries in a file. So instead of storing 8 bytes of index, we can store the first index and use relative addressing. That requires 28 bits, i.e. can be stored in 4 bytes with potentially 4 bits left for other things.

At maximum occupancy, SparseJournalIndex ends out eating ~15-30 megabytes on the index.
At minimum occupancy it is empty.

There are three operations on an index:

append
lookup floor entry for an index
truncate to an index

These are inherently linear operations and therefore let's use linear array to store entries.

// most significant 32 bits = relative index, unsigned
// least significant 32 bits = file offset, signed
long[] entries; 
int size;

This helps with per-entry overhead down by a factor of 8-16x. That means though, worst case would cost us 1.9GiB, or at least ~950MiB, if we did not introduce sparseness.

Given the our low cost of entry, we can provide many more of them. If we expend 2MiB memory we can store 262144 entries, which ends up being sparseness factor of 910, translating to read ~8KiB of data to locate the exact entry.

is duplicated by

CONTROLLER-2111 Add a smarter Journalndex implementation

Resolved

split to

CONTROLLER-2112 Persistent SegmentIndex

Confirmed

Assignee:: Unassigned

Reporter:: Robert Varga

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Created:: 21/Mar/24 4:34 AM

Updated:: 22/Mar/24 6:20 PM

Details

Description

Attachments

Issue Links

Activity

People

Dates