Class TailInitializationFilter

java.lang.Object
io.deephaven.engine.table.impl.util.TailInitializationFilter

public class TailInitializationFilter extends Object
For an intraday restart, we often know that all data of interest falls within a fixed period of time. Rather than processing all of the data, we can binary search each partition to find the relevant rows based on a timestamp column.

This class is designed to operate only against a source table. If any rows are modified or removed from the table, the ShiftObliviousListener throws an IllegalStateException. Each contiguous range of indices is assumed to be a partition. If you filter or otherwise alter the source table before calling TailInitializationFilter, this assumption is violated and the resulting table will not be filtered as desired.
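To make the contiguous-range assumption concrete, the following is a minimal sketch in plain Java, not the Deephaven implementation. The class name ContiguousRanges and the method partitions are hypothetical, and row keys are modeled as an ascending long array. It shows how a run of consecutive keys is treated as one partition, and why removing a row upstream splits one partition into two.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; Deephaven's own row-set types are not used here.
public class ContiguousRanges {
    /** Splits ascending row keys into inclusive [first, last] runs of consecutive keys. */
    static List<long[]> partitions(long[] rowKeys) {
        List<long[]> ranges = new ArrayList<>();
        if (rowKeys.length == 0) {
            return ranges;
        }
        long start = rowKeys[0];
        long prev = rowKeys[0];
        for (int i = 1; i < rowKeys.length; i++) {
            if (rowKeys[i] != prev + 1) {
                // A gap in the keys closes the current run and starts a new one.
                ranges.add(new long[] {start, prev});
                start = rowKeys[i];
            }
            prev = rowKeys[i];
        }
        ranges.add(new long[] {start, prev});
        return ranges;
    }

    public static void main(String[] args) {
        // Three runs of consecutive keys, so three assumed partitions.
        long[] unfiltered = {0, 1, 2, 1000, 1001, 5000};
        // Dropping key 1 (for example, by filtering first) splits the first run in two.
        long[] filtered = {0, 2, 1000, 1001, 5000};
        System.out.println(partitions(unfiltered).size());
        System.out.println(partitions(filtered).size());
    }
}
```

In this model, a filter applied before the tail filter changes the number of detected ranges, so thresholds would be computed against the wrong "last row".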

Once initialized, the filter passes through all new rows. Rows that have already been passed are not removed or modified.

The input must be sorted by Timestamp; otherwise, the contents of the resulting table are undefined. Null timestamps are not permitted.

For consistency, the last value of each partition is used to determine the threshold for that partition.
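The per-partition search described above can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the class's actual code: the class TailThresholdSketch and the method firstRetained are hypothetical, timestamps are modeled as a sorted long array of nanoseconds, and a row exactly at the threshold is assumed to be retained.

```java
// Hypothetical illustration of a threshold binary search within one partition.
public class TailThresholdSketch {
    /**
     * Returns the index of the first row in [from, to] (inclusive bounds) whose
     * timestamp is at or after (last timestamp - nanos). Assumes ts is sorted
     * ascending within the range and nanos is non-negative.
     */
    static int firstRetained(long[] ts, int from, int to, long nanos) {
        // The threshold is derived from the partition's last row, not from the clock.
        long threshold = ts[to] - nanos;
        int lo = from;
        int hi = to; // ts[to] >= threshold always holds when nanos >= 0
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (ts[mid] < threshold) {
                lo = mid + 1; // mid is too old, so the answer lies to the right
            } else {
                hi = mid; // mid qualifies, but an earlier row may qualify too
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        long[] ts = {100, 200, 300, 400, 500};
        // Threshold is 500 - 150 = 350, so rows at indices 3 and 4 are retained.
        System.out.println(firstRetained(ts, 0, ts.length - 1, 150));
    }
}
```

Because the search relies on ascending order, an unsorted partition would make the returned index meaningless, which is consistent with the sorted-input requirement stated above.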

  • Constructor Details

    • TailInitializationFilter

      public TailInitializationFilter()
  • Method Details

    • mostRecent

      public static Table mostRecent(Table table, String timestampName, String period)
      Get the most recent rows from each partition in the source table.
      Parameters:
      table - the source table to filter
      timestampName - the name of the timestamp column
      period - the interval before the last row in a partition within which rows are retained (as converted by DateTimeUtils.parseDurationNanos(String))
      Returns:
      a table with only the most recent values in each partition
    • mostRecent

      public static Table mostRecent(Table table, String timestampName, long nanos)
      Get the most recent rows from each partition in the source table.
      Parameters:
      table - the source table to filter
      timestampName - the name of the timestamp column
      nanos - the interval before the last row in a partition within which rows are retained, in nanoseconds
      Returns:
      a table with only the most recent values in each partition
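
The two overloads differ only in how the interval is expressed: one takes a period string, the other a nanosecond count. The sketch below uses the standard java.time.Duration class to show the kind of conversion involved. It is an assumption that an ISO-8601 string such as "PT10M" is representative; the exact syntax accepted by DateTimeUtils.parseDurationNanos(String) is not described on this page, and the class name PeriodToNanosSketch is hypothetical.

```java
import java.time.Duration;

// Hypothetical illustration using only the JDK; this does not call Deephaven's DateTimeUtils.
public class PeriodToNanosSketch {
    /** Converts an ISO-8601 duration string to nanoseconds using the JDK parser. */
    static long toNanos(String isoDuration) {
        return Duration.parse(isoDuration).toNanos();
    }

    public static void main(String[] args) {
        // Ten minutes expressed as a nanosecond count, the unit the long overload expects.
        System.out.println(toNanos("PT10M"));
    }
}
```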