Interface UpdateByOperator

All Known Implementing Classes:
BaseByteUpdateByOperator, BaseCharUpdateByOperator, BaseDoubleUpdateByOperator, BaseFloatUpdateByOperator, BaseIntUpdateByOperator, BaseLongUpdateByOperator, BaseObjectBinaryOperator, BaseObjectUpdateByOperator, BasePrimitiveEMAOperator, BaseShortUpdateByOperator, BigDecimalCumProdOperator, BigDecimalCumSumOperator, BigDecimalEMAOperator, BigIntegerCumProdOperator, BigIntegerCumSumOperator, BigIntegerEMAOperator, BigNumberEMAOperator, BooleanFillByOperator, ByteCumMinMaxOperator, ByteCumProdOperator, ByteCumSumOperator, ByteEMAOperator, ByteFillByOperator, CharFillByOperator, ComparableCumMinMaxOperator, DoubleCumMinMaxOperator, DoubleCumProdOperator, DoubleCumSumOperator, DoubleEMAOperator, DoubleFillByOperator, FloatCumMinMaxOperator, FloatCumProdOperator, FloatCumSumOperator, FloatEMAOperator, FloatFillByOperator, IntCumMinMaxOperator, IntCumProdOperator, IntCumSumOperator, IntEMAOperator, IntFillByOperator, LongCumMinMaxOperator, LongCumProdOperator, LongCumSumOperator, LongEMAOperator, LongFillByOperator, LongRecordingUpdateByOperator, ObjectFillByOperator, ShortCumMinMaxOperator, ShortCumProdOperator, ShortCumSumOperator, ShortEMAOperator, ShortFillByOperator

public interface UpdateByOperator
  • Field Details

  • Method Details

    • isAppendOnly

      static boolean isAppendOnly​(@NotNull TableUpdate update, long lastKnownKey)
      Check if the specified TableUpdate was append-only given the last known key within some other index.
      Parameters:
      update - the update to check
      lastKnownKey - the last known key from some other index.
      Returns:
      if the update was append-only given the last known key
    • determineSmallestVisitedKey

      static long determineSmallestVisitedKey​(@NotNull TableUpdate upstream, @NotNull RowSet affected)
      Find the smallest valued key that participated in the upstream TableUpdate.
      Parameters:
      upstream - the update
      Returns:
      the smallest key that participated in any part of the update.
    • determineSmallestVisitedKey

      static long determineSmallestVisitedKey​(@NotNull RowSet added, @NotNull RowSet modified, @NotNull RowSet removed, @NotNull RowSetShiftData shifted, @NotNull RowSet affectedIndex)
      Find the smallest valued key that participated in the upstream TableUpdate.
      Parameters:
      added - the added rows
      modified - the modified rows
      removed - the removed rows
      shifted - the shifted rows
      Returns:
      the smallest key that participated in any part of the update.
    • setBucketCapacity

      void setBucketCapacity​(int capacity)
      Notify the operator of the current maximum bucket.
      Parameters:
      capacity - the capacity
    • getInputColumnName

      @NotNull String getInputColumnName()
      Get the name of the input column this operator depends on.
      Returns:
      the name of the input column
    • getAffectingColumnNames

      @NotNull String[] getAffectingColumnNames()
      Get an array of column names that, when modified, affect the result of this computation.
      Returns:
      an array of column names that affect this operator.
    • getOutputColumnNames

      @NotNull String[] getOutputColumnNames()
      Get an array of the output column names.
      Returns:
      the output column names.
    • getOutputColumns

      @NotNull Map<String,​ColumnSource<?>> getOutputColumns()
      Get a map of outputName to output ColumnSource for this operation.
      Returns:
      a map of output column name to output column source
    • startTrackingPrev

      void startTrackingPrev()
      Indicate that the operation should start tracking previous values for ticking updates.
    • makeUpdateContext

      @NotNull UpdateByOperator.UpdateContext makeUpdateContext​(int chunkSize)
      Make an UpdateByOperator.UpdateContext suitable for use with non-bucketed updates.
      Parameters:
      chunkSize - The expected size of chunks that will be provided during the update,
      Returns:
      a new context
    • initializeForUpdate

      void initializeForUpdate​(@NotNull UpdateByOperator.UpdateContext context, @NotNull TableUpdate upstream, @NotNull RowSet resultSourceIndex, boolean usingBuckets, boolean isUpstreamAppendOnly)
      Initialize the operator for an update cycle. This is invoked before any other update processing occurs.
      Parameters:
      context - the context object
      upstream - the upstream update to process
      resultSourceIndex - the result index of the source table
      usingBuckets - if the update is bucketed or not
      isUpstreamAppendOnly - if the upstream update was detected to be append-only.
    • initializeFor

      void initializeFor​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSet updateIndex, @NotNull UpdateBy.UpdateType type)

      Initialize the context for the specified stage of the update process. This will always be followed by a call to finishFor(UpdateContext, UpdateBy.UpdateType) at the end of each successful update.

      Parameters:
      context - the context object
      updateIndex - the index of rows associated with the update.
      type - the type of update being applied
    • finishFor

      void finishFor​(@NotNull UpdateByOperator.UpdateContext context, @NotNull UpdateBy.UpdateType type)
      Perform and bookkeeping required at the end of a single part of the update. This is always preceded with a call to initializeFor(UpdateContext, RowSet, UpdateBy.UpdateType)
      Parameters:
      context - the context object
      type - the type of update being applied
    • getAdditionalModifications

      @NotNull RowSet getAdditionalModifications​(@NotNull UpdateByOperator.UpdateContext context)
      Get an index of rows that were modified beyond the input set of modifications from the upstream. This is invoked once at the end of a complete update cycle (that is, after all adds, removes, modifies and shifts have been processed) if anyModified(UpdateContext) has returned true.
      Parameters:
      context - the context object
      Returns:
      a index of additional rows that were modified
    • anyModified

      boolean anyModified​(@NotNull UpdateByOperator.UpdateContext context)
      Check if the update has modified any rows for this operator. This is invoked once at the end of a complete update cycle (that is, after all adds, removes, modifies and shifts have been processed).
      Parameters:
      context - the context object
      Returns:
      true if the update modified any rows.
    • requiresKeys

      boolean requiresKeys()
      Query if the operator requires key values for the current stage. This method will always be invoked after an appropriate invocation of initializeFor(UpdateContext, RowSet, UpdateBy.UpdateType)
      Returns:
      true if the operator requires keys for this operation
    • requiresValues

      boolean requiresValues​(@NotNull UpdateByOperator.UpdateContext context)
      Query if the operator requires values for the current stage.
      Parameters:
      context - the context object
      Returns:
      true if values are required for compuitation
    • canProcessNormalUpdate

      boolean canProcessNormalUpdate​(@NotNull UpdateByOperator.UpdateContext context)
      Query if this operator can process the update normally, or if it can only reprocess. This method is guaranteed to be invoked after initializeForUpdate(UpdateContext, TableUpdate, RowSet, boolean, boolean) so the operator is aware of the upstream TableUpdate.
      Parameters:
      context - the context
      Returns:
      true if this operator can process the update normally.
    • setChunkSize

      void setChunkSize​(@NotNull UpdateByOperator.UpdateContext context, int chunkSize)
      Set the chunk size to be used for operations. This is used during the UpdateBy.UpdateType.Reprocess phase when the chunks allocated during the normal processing phase may not be large enough.
      Parameters:
      context - the context object
      chunkSize - the new chunk size
    • onBucketsRemoved

      void onBucketsRemoved​(@NotNull RowSet removedBuckets)
      Called when some buckets have been completely emptied. Operators can use this to reset internal states.
      Parameters:
      removedBuckets - an index of removed bucket positions.
    • addChunk

      void addChunk​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSequence inputKeys, @Nullable LongChunk<OrderedRowKeys> keyChunk, @NotNull Chunk<Values> values, long bucketPosition)
      Add a chunk of non-bucketed items to the operation.
      Parameters:
      context - the context object
      keyChunk - a chunk of keys for the rows being added. If the operator returns false for requiresKeys() this will be null.
      values - the chunk of values for the rows being added
      bucketPosition - the group position
    • addChunk

      void addChunk​(@NotNull UpdateByOperator.UpdateContext context, @NotNull Chunk<Values> values, @NotNull LongChunk<? extends RowKeys> keyChunk, @NotNull IntChunk<RowKeys> bucketPositions, @NotNull IntChunk<ChunkPositions> startPositions, @NotNull IntChunk<ChunkLengths> runLengths)
      Add a chunk of bucketed items to the operation.
      Parameters:
      context - the context object
      values - the value chunk
      keyChunk - a chunk of keys for the rows being added
      bucketPositions - a chunk of hash bucket positions for each key
      runLengths - the runLengths of each run of bucket values
      startPositions - the start position of a run within the chunk
    • modifyChunk

      void modifyChunk​(@NotNull UpdateByOperator.UpdateContext context, @Nullable LongChunk<OrderedRowKeys> prevKeyChunk, @Nullable LongChunk<OrderedRowKeys> keyChunk, @NotNull Chunk<Values> prevValuesChunk, @NotNull Chunk<Values> postValuesChunk, long bucketPosition)
      Modify a chunk of values with the operation.
      Parameters:
      context - the context object
      prevKeyChunk - a chunk of pre-shift keys. This will be equal to keyChunk if no shift is present
      keyChunk - a chunk of post-shift space keys for the update.
      prevValuesChunk - a chunk of previous values for the update
      postValuesChunk - a chunk of current values for the update
      bucketPosition - the position of the current group being processed
    • removeChunk

      void removeChunk​(@NotNull UpdateByOperator.UpdateContext context, @Nullable LongChunk<OrderedRowKeys> keyChunk, @NotNull Chunk<Values> prevValuesChunk, long bucketPosition)
      Remove a chunk of values from the operation.
      Parameters:
      context - the context object
      keyChunk - a chunk of keys being removed.
      prevValuesChunk - the chunk of values being removed
      bucketPosition - the position of the current group being processed
    • applyShift

      void applyShift​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSet prevIndex, @NotNull RowSetShiftData shifted)
      Apply a shift to the operation.
      Parameters:
      context - the context object
      prevIndex - the pre-shifted index
      shifted - the shifts being applied
    • applyOutputShift

      void applyOutputShift​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSet subIndexToShift, long delta)
      Apply a shift to the operation.
      Parameters:
      context - the context object
    • reprocessChunk

      void reprocessChunk​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSequence inputKeys, @Nullable LongChunk<OrderedRowKeys> keyChunk, @NotNull Chunk<Values> valuesChunk, @NotNull RowSet postUpdateSourceIndex)
      Reprocess a chunk of data for a non-bucketed table.
      Parameters:
      context - the context object
      inputKeys - the keys contained in the chunk
      keyChunk - a LongChunk containing the keys if requested by requiresKeys() or null.
      valuesChunk - the current chunk of working values.
      postUpdateSourceIndex - the resulting source index af
    • reprocessChunk

      void reprocessChunk​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSequence inputKeys, @NotNull Chunk<Values> values, @NotNull LongChunk<? extends RowKeys> keyChunk, @NotNull IntChunk<RowKeys> bucketPositions, @NotNull IntChunk<ChunkPositions> runStartPositions, @NotNull IntChunk<ChunkLengths> runLengths)
      Reprocess a chunk of data for a bucketed table.
      Parameters:
      context - the context object
      inputKeys - the keys contained in the chunk
      values - the current chunk of working values.
      keyChunk - a LongChunk containing the keys.
      bucketPositions - a IntChunk containing the bucket position of each key. Parallel to `keyChunk` and `values
      runStartPositions - the starting positions of each run within the key and value chunk
      runLengths - the run runLengths of each run in the key and value chunks. Parallel to `runStartPositions`
    • resetForReprocess

      void resetForReprocess​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSet sourceIndex, long firstUnmodifiedKey)
      Reset the operator to the state at the `firstModifiedKey` for non-bucketed operation. This is invoked immediately prior to calls to reprocessChunk(UpdateContext, RowSequence, LongChunk, Chunk, RowSet).

      A `firstUnmodifiedKey` of RowSequence.NULL_ROW_KEY indicates that the entire table needs to be recomputed.
      Parameters:
      context - the context object
      sourceIndex - the current index of the source table
      firstUnmodifiedKey - the first unmodified key after which we will reprocess rows.
    • resetForReprocess

      void resetForReprocess​(@NotNull UpdateByOperator.UpdateContext context, @NotNull RowSet bucketIndex, long bucketPosition, long firstUnmodifiedKey)
      Reset the operator to the state at the `firstModifiedKey` for the specified bucket. This is invoked immediately prior to calls to reprocessChunk(UpdateContext, RowSequence, Chunk, LongChunk, IntChunk, IntChunk, IntChunk).
      Parameters:
      context - the context object
      bucketIndex - the current index of the specified bucket
      firstUnmodifiedKey - the first unmodified key in the bucket after which we will reprocess rows.