Class RspArray<T extends RspArray>

java.lang.Object
io.deephaven.engine.rowset.impl.RefCountedCow<T>
io.deephaven.engine.rowset.impl.rsp.RspArray<T>
Direct Known Subclasses:
RspBitmap

public abstract class RspArray<T extends RspArray> extends RefCountedCow<T>

A set representation for long values using Regular Space Partitioning (RSP) of the long space in "blocks" of (2^16) elements.

Modeled heavily after roaringbitmap.RoaringArray (keeping API method names and semantics as much as possible), with modifications for:

  1. Full "unsigned long" 64 bit range (as opposed to 32 bit in RoaringArray).
  2. Spans of all bits set ("AllSet") that can be arbitrarily big (ie, not constrained to 2^16 = RB Container size).

The handling of unsigned values follows RB; ie, key values are compared/sorted as unsigned longs.

Definitions:

  • A "block" is a particular interval [n*2^16, (n+1)*2^16 - 1] of the long domain.
  • A "span" is a partition of the domain consisting of one or more consecutive blocks; a span is a subset of the domain represented by an interval [n*2^16, (n+m)*2^16 - 1], m >= 1.
  • Full blocks are blocks whose domain are fully contained in the set, ie, the set contains every possible value in the block's interval (as a bitmap, it would be "all ones").
  • Spans of full blocks are represented by a single "full blocks span" object (just a Long) which knows how many 2^16 ranges it has (it's "full blocks span len" ("flen") is the number of full blocks in the span).
  • Individual blocks that are not completely full are stored in an RB Container; their "full blocks span len" is zero.

Our internal representation uses two parallel arrays:

  • a long[] spanInfos array that contains the information for the offset to the values in the span, which we call the span's "key". For instance, a full block span that represents all the long values in [65536, 196607] has as its key the value 65536.
  • an Object[] spans array that contains the actual spans. At the most basic level, a span can be either a full block span or a container of values (but there is nuance in exactly how to represent them, see below).

We use several optimizations to reduce memory utilization for sparse sets. Details follow.

The long[] spanInfos and Object[] spans data members of this class are used, combined, to represent the offset (key) and span values in the set, against that offset. The two arrays are used, together, as parallel arrays and the information for a given conceptual span is contained in both of them for the same corresponding rowSet i.

There are two basic cases for a span: it is either a full blocks span, containing a >=1 number of full blocks, or it is a container, containing individual values in the particular 2^16 block corresponding to the span's key.

There are four ways total that these two cases can be represented between the long in the `spanInfos` array and the Object in the `spans` array. Given a span at position `i`:

  1. If the corresponding Object spans[i] is of type Long, then the long spanInfos[i] value is the key for the span (with its lower 16 bits as zero), and the Long value represents how many full blocks are present. Example, the set [ 0, 2^50 - 1 ] is represented as spanInfo==0 and span==Long(2^34).
  2. As an optimization to conserve memory, if the Object spans[i] is the Object reference with value FULL_BLOCK_SPAN_MARKER (a singleton and final marker Object defined statically in this file). then the upper 48 bits of the long spanInfo[i] value represent the key for the span, and the lower 16 bits of the long spanInfo[i] value represent the full block span length. Example, the set [ 65536, 196607 ] is represented by spanInfo==65538 and span==FULL_BLOCK_SPAN_MARKER (note 196607 == 65536*3 - 1, so the set is 2 full blocks, and 65538 == 65536 | 2.
  3. If the corresponding Object spans[i] is null, then the long spanInfos[i] represents the single value present in the span (note in this case, its upper 16 bits still corresponds to its key). Example, the set { 65537 } is represented by spanInfo==65537 and span==null.
  4. If the corresponding Object spans[i] is of type short[] or of type Container, then it represents a container of multiple values in a single block (but not all of the possible values in the block, since in that case it would be a full block span as above). In this case the higher 48 bits of its corresponding spanInfo represent the key for the span. Depending on the actual type of span there are two subcases:
    1. If spans[i] is of type Container, then the values in the roaringbitmaps container object are part of the set, considered against its key offset. The key is represented in the higher 48 bits of its corresponding spaninfo. The lower 16 bits of spanInfo are zero in this case. Example, the set [ 100,000-100,010, 100,020-100,030 ] is represented by spaInfo==65536, span==RunContainer({34464-34474, 34484-34494})
    2. If spans[i] is of type short[], then an ArrayContainer with the short[] contents needs to be reconstructed. The lower 16 bits of the spanInfo value are used to represent the other data members of ArrayContainer. This case exists as an optimization to reduce memory utilization for sparse blocks. For details of this reconstruction please see the code for the definition of the SpanView class below.

Notes:

  • Our version of RB Container supports a "shared" boolean flag that is used to implement copy-on-write (COW) semantics and allow operation results to share containers in COW fashion.
  • We extended the Container class hierarchy to include specializations for empty, single value, single range, and two values containers. These are immutable; empty and singleton are used only as a way to return empty and singleton results, and are never actually stored in the spans array. For details, please see the Container class definition and derived class hierarchy.
  • Field Details

    • debug

      public static final boolean debug
    • BLOCK_SIZE

      public static final int BLOCK_SIZE
      See Also:
    • BLOCK_LAST

      public static final int BLOCK_LAST
      See Also:
    • BITS_PER_BLOCK

      public static final int BITS_PER_BLOCK
    • spanInfos

      protected long[] spanInfos
      Array of keys (in the long's higher 48 bits) and other span data (in the long's lower 16 bits) parallel to the spans array, mapping the long value in a given array position to the corresponding span in the same position. Please see the documentation for this class for details of the different cases for the lower 16 bits, depending on the type of span. Values are kept in unsigned sorted order according to higher 16 bits to enable binary search of keys.
    • spans

      protected Object[] spans
      Array of Spans parallel to the spanInfos array, mapping the same index to the corresponding span for the spanInfo. Please see the documentation for this class for the different possible types allowed and their meanings.
    • FULL_BLOCK_SPAN_MARKER

      public static final Object FULL_BLOCK_SPAN_MARKER
    • workDataPerThread

      protected static final ThreadLocal<io.deephaven.engine.rowset.impl.rsp.RspArray.WorkData> workDataPerThread
  • Constructor Details

    • RspArray

      protected RspArray(RspArray other)
    • RspArray

      public RspArray(RspArray src, int startIdx, int endIdx)
    • RspArray

      public RspArray(RspArray src, int startIdx, long startOffset, int endIdx, long endOffset)
  • Method Details

    • shareContainers

      protected boolean shareContainers()
    • make

      protected abstract T make(RspArray src, int startIdx, long startOffset, int endIdx, long endOffset)
    • make

      protected abstract T make()
    • highBits

      public static long highBits(long val)
    • lowBits

      public static short lowBits(long val)
    • lowBitsAsInt

      public static int lowBitsAsInt(long val)
    • divBlockSize

      public static long divBlockSize(long v)
    • modBlockSize

      public static int modBlockSize(long v)
    • getSpanInfo

      protected final long getSpanInfo(int i)
    • spanInfoToKey

      protected static long spanInfoToKey(long spanInfo)
    • getKey

      protected final long getKey(int i)
    • getSingletonSpanValue

      protected final long getSingletonSpanValue(int i)
    • spanInfoToSingletonSpanValue

      protected static long spanInfoToSingletonSpanValue(long spanInfo)
    • setSingletonSpanRaw

      protected final void setSingletonSpanRaw(int i, long value)
    • setSingletonSpan

      protected final void setSingletonSpan(int i, long value)
    • applyKeyOffset

      protected final void applyKeyOffset(int i, long offset)
    • setFullBlockSpanRaw

      protected void setFullBlockSpanRaw(int i, long key, long flen)
    • setFullBlockSpanRaw

      protected static void setFullBlockSpanRaw(int i, long[] spanInfos, Object[] spans, long key, long flen)
    • setFullBlockSpan

      protected void setFullBlockSpan(int i, long key, long flen)
    • getPackedInfoLowBits

      public static long getPackedInfoLowBits(ArrayContainer ac)
    • setContainerSpanRaw

      protected static void setContainerSpanRaw(long[] spanInfos, Object[] spans, int i, long key, Container container)
    • setContainerSpanRaw

      protected void setContainerSpanRaw(int i, long key, Container container)
    • appendSharedContainer

      protected void appendSharedContainer(RspArray other, long otherSpanInfo, Container container)
    • appendSharedContainerMaybePacked

      protected void appendSharedContainerMaybePacked(RspArray other, int otherIdx, long otherSpanInfo, Object otherContainer)
    • setSharedContainerMaybePackedRaw

      protected void setSharedContainerMaybePackedRaw(int i, RspArray src, int srcIdx, long srcSpanInfo, Object srcContainer)
    • insertSharedContainer

      protected void insertSharedContainer(int i, RspArray other, long otherSpanInfo, Container otherContainer)
    • setSharedContainerRaw

      protected void setSharedContainerRaw(int i, RspArray other, long key, Container container)
    • copyKeyAndSpanStealingContainers

      protected void copyKeyAndSpanStealingContainers(int srcIdx, long[] srcSpanInfos, Object[] srcSpans, int dstIdx, long[] dstSpanInfos, Object[] dstSpans)
    • copyKeyAndSpanMaybeSharing

      protected void copyKeyAndSpanMaybeSharing(long shiftAmount, RspArray src, int srcIdx, long[] dstSpanInfos, Object[] dstSpans, int dstIdx, boolean tryShare)
    • copyKeyAndSpanMaybeSharing

      protected void copyKeyAndSpanMaybeSharing(RspArray src, int srcIdx, long[] dstSpanInfos, Object[] dstSpans, int dstIdx)
    • setContainerSpan

      protected void setContainerSpan(int i, long key, Container c)
    • setContainerSpan

      protected void setContainerSpan(Container oldContainer, int i, long key, Container newContainer)
    • isSingletonSpan

      protected static boolean isSingletonSpan(Object o)
    • size

      public int size()
    • isEmpty

      public boolean isEmpty()
    • firstValue

      public long firstValue()
    • firstValueAtIndex

      public long firstValueAtIndex(int i)
    • keyForFirstBlock

      public long keyForFirstBlock()
    • keyForLastBlock

      public long keyForLastBlock()
    • lastValue

      public long lastValue()
    • getRangeIterator

      public RspRangeIterator getRangeIterator()
    • getRangeBatchIterator

      public RspRangeBatchIterator getRangeBatchIterator(long initialSeek, long maxCount)
    • getIterator

      public RspIterator getIterator()
    • getReverseIterator

      public RspReverseIterator getReverseIterator()
    • ensureSizeCanGrowBy

      protected void ensureSizeCanGrowBy(int n)
      Make sure there is capacity for at least n more spans
    • tryCompactUnsafe

      public void tryCompactUnsafe(int compactFactor)
      Parameters:
      compactFactor - if k == 0, compact if count < capacity. k > 0, compact if (capacity - count > (capacity >> k).
    • tryCompact

      public void tryCompact(int compactFactor)
    • keySearch

      public int keySearch(int startPos, long key)
    • keySearch

      public int keySearch(int startPos, int endPosExclusive, long key)
    • isFullBlockSpan

      public static boolean isFullBlockSpan(Object s)
    • isContainer

      public static boolean isContainer(Object s)
    • getFullBlockSpanLen

      public static long getFullBlockSpanLen(long spanInfo, Object span)
    • getSpanIndex

      public int getSpanIndex(long key)
      Returns:
      if the key is included in some existing span, returns the index of that span. if the key is not included in any existing span, returns -(p - 1) where p is the position a span for the key would be inserted. Note that, since a span's covered interval may include multiple blocks, a key contained by a span may be different than its first key (if the span includes more than one block).
    • getSpanIndex

      public int getSpanIndex(int fromIndex, long key)
    • getSpanIndex

      public int getSpanIndex(int fromIndex, int endIndexExclusive, long key)
    • searchSpanIndex

      public int searchSpanIndex(int startPos, RowSetUtils.Comparator comp)
    • getSpanCardinalityAtIndexMaybeAcc

      public long getSpanCardinalityAtIndexMaybeAcc(int i)
    • getSpanCardinalityAtIndex

      public long getSpanCardinalityAtIndex(int i)
    • getSpanCardinalityAtIndex

      public long getSpanCardinalityAtIndex(int i, boolean optimizeContainers)
    • getCardinality

      public long getCardinality()
    • nextKey

      public static long nextKey(long key)
    • distanceInBlocks

      public static long distanceInBlocks(long blockKeyStart, long blockKeyEnd)
      blockKeyEnd is exclusive. Assumption on entry: blockKeyStart <= blockKeyEnd
      Parameters:
      blockKeyStart - inclusive start block key (only high 48 bits set).
      blockKeyEnd - exclusive end block key (only high 48 bits set).
      Returns:
      distance in blocks between blockKeyStart and blockKeyEnd
    • setOrInsertFullBlockSpanAtIndex

      public int setOrInsertFullBlockSpanAtIndex(int newSpanIdx, long newSpanKey, long newSpanFlen, org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu)
      Parameters:
      newSpanIdx - an index, as returned by getSpanAtIndex(k). Note this can be negative, in which case this is an insertion (existing elements pushed to the right as necessary).
      newSpanKey - the key.
      newSpanFlen - the number of 2^16 intervals.
      Returns:
      the (positive) index where the span was actually inserted.
    • setLastFullBlockSpan

      public void setLastFullBlockSpan(long k, long slen)
    • appendSingletonSpan

      public void appendSingletonSpan(long v)
    • appendContainer

      public void appendContainer(long k, Container c)
    • appendFullBlockSpan

      public void appendFullBlockSpan(long k, long slen)
    • insertFullBlockSpanAtIndex

      public void insertFullBlockSpanAtIndex(int i, long key, long flen)
      Insert a full block span at position i with key k, pushing the existing elements to the right. The caller should ensure that the key order is preserved by this operation.
      Parameters:
      i - position in which to insert
      key - key for the span to be inserted
      flen - the full block len for the span inserted.
    • insertSingletonAtIndex

      public void insertSingletonAtIndex(int i, long value)
      Insert a new singleton span at position i with key k, pushing the existing elements to the right. The caller should ensure that the key order is preserved by this operation.
      Parameters:
      i - position in which to insert
      value - the singleton value for the span to be inserted
    • insertContainerAtIndex

      public void insertContainerAtIndex(int i, long key, Container c)
      Insert a container at position i with key k, pushing the existing elements to the right. The caller should ensure that the key order is preserved by this operation.
      Parameters:
      i - position in which to insert
      key - key for the span to be inserted
      c - the container to be inserted
    • removeSpanAtIndex

      public void removeSpanAtIndex(int i)
    • replaceSpanAtIndex

      public void replaceSpanAtIndex(int i, RspArray.ArraysBuf buf)
      Replace the span at index i with the keys and spans from buf,
    • binarySearchKeys

      public int binarySearchKeys(int fromIndex, int toIndex, RowSetUtils.Comparator comp)
    • unsignedShortToLong

      public static long unsignedShortToLong(short x)
    • unsignedShortToInt

      public static int unsignedShortToInt(short x)
    • paste

      public static long paste(long highBits, short lowBits)
    • get

      public long get(long pos)
    • getKeysForPositions

      public void getKeysForPositions(PrimitiveIterator.OfLong inputPositions, LongConsumer outputKeys)
    • find

      public long find(long val)
    • subsetOf

      public boolean subsetOf(RspArray other)
    • containsRange

      public boolean containsRange(long start, long end)
    • overlaps

      public boolean overlaps(RspArray other)
    • overlapsRange

      public boolean overlapsRange(long first, long last)
      Returns true if any value in this RspArray is contained inside the range [first, last], false otherwise.
      Parameters:
      first - First value in the range to check
      last - Last value in the range to check
      Returns:
    • overlapsRange

      public int overlapsRange(int iStart, long start, long end)
    • orEqualsUnsafeNoWriteCheck

      public void orEqualsUnsafeNoWriteCheck(RspArray other)
      For every element in other, add element to this RspArray. The argument won't be modified (with the possible exclusion of sharing some of its containers Copy On Write).
      Parameters:
      other - the RspArray to add to this.
    • orEqualsShiftedUnsafeNoWriteCheck

      public void orEqualsShiftedUnsafeNoWriteCheck(long shiftAmount, RspArray other)
      For every element in other, add (element + shiftAmount) to this RspArray. Note shiftAmount is assumed to be a multiple of BLOCK_SIZE. The argument won't be modified (with the possible exclusion of sharing some of its containers Copy On Write).
      Parameters:
      shiftAmount - the amount to add to each key in other before insertion
      other - the base keys to add in the (key + shiftAmount) formula for insertion.
    • markIndexAsRemoved

      protected void markIndexAsRemoved(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu, int index)
    • markIndexRangeAsRemoved

      protected void markIndexRangeAsRemoved(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu, int iFirst, int iLast)
    • collectRemovedIndicesIfAny

      protected void collectRemovedIndicesIfAny(org.apache.commons.lang3.mutable.MutableObject<SortedRanges> madeNullSpansMu)
    • getWorkSortedRangesMutableObject

      protected org.apache.commons.lang3.mutable.MutableObject<SortedRanges> getWorkSortedRangesMutableObject(io.deephaven.engine.rowset.impl.rsp.RspArray.WorkData wd)
    • andNotEqualsUnsafeNoWriteCheck

      public void andNotEqualsUnsafeNoWriteCheck(RspArray other)
      Remove every element from argument from this RspArray. Argument won't be modified.
      Parameters:
      other - the elements to remove.
    • andEqualsUnsafeNoWriteCheck

      public void andEqualsUnsafeNoWriteCheck(RspArray other)
      Intersects this RspArray with the argument, leaving the result on this RspArray. The argument won't be modified.
      Parameters:
      other - an RspArray.
    • applyKeyOffset

      public void applyKeyOffset(long offset)
    • forEachLong

      public boolean forEachLong(LongAbortableConsumer lc)
    • forEachLongRangeInSpanWithOffsetAndMaxCardinality

      public boolean forEachLongRangeInSpanWithOffsetAndMaxCardinality(int i, long offset, long maxCardinality, LongRangeAbortableConsumer larc)
    • forEachLongRange

      public boolean forEachLongRange(LongRangeAbortableConsumer lrac)
    • subrangeByKeyInternal

      protected T subrangeByKeyInternal(long start, long end)
    • subrangeByPosInternal

      protected T subrangeByPosInternal(long firstPos, long lastPos)
    • removeRangeUnsafeNoWriteCheck

      public void removeRangeUnsafeNoWriteCheck(long start, long end)
    • removeRangesUnsafeNoWriteCheck

      public void removeRangesUnsafeNoWriteCheck(RowSet.RangeIterator rit)
    • uMax

      public static long uMax(long k1, long k2)
    • uMin

      public static long uMin(long k1, long k2)
    • uLess

      public static boolean uLess(long k1, long k2)
    • uLessOrEqual

      public static boolean uLessOrEqual(long k1, long k2)
    • uGreater

      public static boolean uGreater(long k1, long k2)
    • uGreaterOrEqual

      public static boolean uGreaterOrEqual(long k1, long k2)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • getRowSequenceByPosition

      public RowSequence getRowSequenceByPosition(long startPositionInclusive, long length)
    • getRowSequenceByKeyRange

      public RowSequence getRowSequenceByKeyRange(long startValueInclusive, long endValueInclusive)
    • asRowSequence

      public RspRowSequence asRowSequence()
    • getRowSequenceIterator

      public RowSequence.Iterator getRowSequenceIterator()
    • getAverageRunLengthEstimate

      public long getAverageRunLengthEstimate()
    • getAverageRunLengthEstimate

      public long getAverageRunLengthEstimate(int startIdx, int endIdx)
    • rangesCountUpperBound

      public long rangesCountUpperBound()
    • rangesCountUpperBound

      public long rangesCountUpperBound(int startIdx, int endIdx)
    • valuesToString

      public String valuesToString()
    • containerOverhead

      public double containerOverhead()
    • sampleMetrics

      public void sampleMetrics(LongConsumer rspParallelArraysSizeUsed, LongConsumer rspParallelArraysSizeUnused, LongConsumer arrayContainersBytesAllocated, LongConsumer arrayContainersBytesUnused, LongConsumer arrayContainersCardinality, LongConsumer arrayContainersCount, LongConsumer bitmapContainersBytesAllocated, LongConsumer bitmapContainersBytesUnused, LongConsumer bitmapContainersCardinality, LongConsumer bitmapContainersCount, LongConsumer runContainersBytesAllocated, LongConsumer runContainersBytesUnused, LongConsumer runContainersCardinality, LongConsumer runContainersCount, LongConsumer runContainersRunsCount, LongConsumer singleRangeContainersCount, LongConsumer singleRangeContainerCardinality, LongConsumer singletonContainersCount, LongConsumer twoValuesContainerCount, LongConsumer fullBlockSpansCount, LongConsumer fullBlockSpansLen)
    • tryCompact

      protected final OrderedLongSet tryCompact()