Class CrossJoinHelper

java.lang.Object
io.deephaven.engine.table.impl.CrossJoinHelper

public class CrossJoinHelper extends Object
Implementation for chunk-oriented joins that produce multiple RHS rows per-LHS row, including TableOperations.join(TABLE) (referred to as simply join or "cross join") and a left outer join. The left outer join does not currently have any user visible API.

When there are zero keys, the result table uses BitShiftingColumnSources for the columns derived from the left table and a BitMaskingColumnSource for the columns derived from the right table. The rowkey space reserves some number of bits for shifting: the low order bits are translated directly to the right table's rowset, and the high order bits are shifted to the right and used to index into the left table. For example, if the right table has a rowset of {0, 1, 7}, then 3 bits are required to address the right table and the remainder of the bits are used for the left table.
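The zero-key encoding described above can be illustrated with a small, self-contained sketch. It is not the CrossJoinHelper implementation; the class ZeroKeyRowKeySketch and its methods are hypothetical names chosen for illustration, and the minimum of one right bit is an assumption of the sketch.

```java
// Hypothetical sketch of the zero-key row key layout; not Deephaven code.
final class ZeroKeyRowKeySketch {
    // Number of low order bits needed to address the largest right row key.
    // Reserving at least one bit is an assumption made for this sketch.
    static int bitsFor(long maxRightKey) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxRightKey));
    }

    // Pack a left row key (high order bits) and a right row key (low order bits).
    static long combine(long leftKey, long rightKey, int numRightBits) {
        return (leftKey << numRightBits) | rightKey;
    }

    // Shift right to recover the left row key.
    static long leftKey(long resultKey, int numRightBits) {
        return resultKey >>> numRightBits;
    }

    // Mask to recover the right row key.
    static long rightKey(long resultKey, int numRightBits) {
        return resultKey & ((1L << numRightBits) - 1);
    }

    public static void main(String[] args) {
        int bits = bitsFor(7);                 // right rowset {0, 1, 7} needs 3 bits
        long resultKey = combine(2, 7, bits);  // (2 << 3) | 7
        System.out.println(bits + " " + resultKey + " "
                + leftKey(resultKey, bits) + " " + rightKey(resultKey, bits));
        // prints "3 23 2 7"
    }
}
```

In this sketch the shift plays the role described for the left-side column sources and the mask plays the role described for the right-side column source.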

For a bucketed cross join, the number of RHS bits is determined by the size of the largest group. The LHS sources are similarly shifted using a BitShiftingColumnSource, but instead of using the rowkey space of the right hand side directly, each group is flattened. The RHS uses a CrossJoinRightColumnSource to handle these flat indices. So in the case where we had a group containing a rowset of {0, 1, 7}, it only requires 2 bits (for 3 values), not 3 bits (to represent the key 7). When values are added or removed from the RHS, the remaining values must appropriately shift to maintain the flat space. If the size of the largest right hand side group increases, then we must increase the number of bits dedicated to the RHS, and all of the groups require a shift.
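As a rough illustration of the flattened, per-group addressing described above, the following sketch computes the RHS bit count from a group size and resolves a flat position back to a right row key. The class BucketedFlatKeySketch and its methods are hypothetical, and the minimum of one bit is an assumption; the real CrossJoinRightColumnSource is considerably more involved.

```java
// Hypothetical sketch of flat per-group addressing; not Deephaven code.
final class BucketedFlatKeySketch {
    // Bits needed to address flat positions 0..groupSize-1 of the largest group.
    // Reserving at least one bit is an assumption made for this sketch.
    static int bitsForGroupSize(int groupSize) {
        if (groupSize <= 1) {
            return 1;
        }
        return 64 - Long.numberOfLeadingZeros(groupSize - 1L);
    }

    // Pack a left row key with a flat position inside the matching right group.
    static long resultKey(long leftKey, long flatPosition, int numRightBits) {
        return (leftKey << numRightBits) | flatPosition;
    }

    // Resolve the low order bits (a flat position) to the group's actual row key.
    static long rightRowKey(long[] groupRowSet, long resultKey, int numRightBits) {
        long position = resultKey & ((1L << numRightBits) - 1);
        return groupRowSet[(int) position];
    }

    public static void main(String[] args) {
        long[] group = {0, 1, 7};                    // sorted right row keys of one group
        int bits = bitsForGroupSize(group.length);   // 2 bits for 3 values
        long key = resultKey(2, 2, bits);            // (2 << 2) | 2
        System.out.println(bits + " " + key + " " + rightRowKey(group, key, bits));
        // prints "2 10 7"

        // If the largest group grows to 5 rows, 3 bits are needed and the same
        // (left key, flat position) pair maps to a different result key.
        int grownBits = bitsForGroupSize(5);
        System.out.println(grownBits + " " + resultKey(2, 2, grownBits));
        // prints "3 18"
    }
}
```

The second half of main shows why a growing largest group forces every group to shift: the result key for an unchanged pair moves from 10 to 18 once a third right bit is reserved.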

The difference between a cross join and a left outer join is that when there are zero matching RHS rows, the cross join does not produce any output for that state. For the left outer join, a single row with null RHS values is produced. When a tick causes a transition from empty to non-empty or vice-versa, the matched row is added or removed and the corresponding null row is removed or added in the downstream update (as opposed to being represented as a modification).
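The difference in output between the two join types can be shown with a static, list-based sketch. It is only an illustration of the row semantics on a single snapshot: the class OuterVersusCrossSketch is hypothetical, it uses a naive nested loop rather than chunk-oriented state, and it does not model ticking updates.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of cross join versus left outer join output rows.
final class OuterVersusCrossSketch {
    // Emits "leftIndex:rightIndex" for every key match. When leftOuter is true,
    // a left row with no matches emits a single "leftIndex:null" row instead.
    static List<String> join(String[] leftKeys, String[] rightKeys, boolean leftOuter) {
        List<String> out = new ArrayList<>();
        for (int l = 0; l < leftKeys.length; l++) {
            boolean matched = false;
            for (int r = 0; r < rightKeys.length; r++) {
                if (leftKeys[l].equals(rightKeys[r])) {
                    out.add(l + ":" + r);
                    matched = true;
                }
            }
            if (!matched && leftOuter) {
                out.add(l + ":null");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] left = {"a", "b"};
        String[] right = {"a", "a"};
        System.out.println(join(left, right, false)); // prints [0:0, 0:1]
        System.out.println(join(left, right, true));  // prints [0:0, 0:1, 1:null]
    }
}
```

If a right row with key "b" later appeared, the left outer result would drop 1:null and gain a matched row for left index 1, which corresponds to the remove-plus-add behavior described above.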

From a user perspective, when the operation can be suitably performed using TableOperations.naturalJoin(TABLE, java.lang.String), that operation should be preferred. The LHS columns and RowSet are passed through unchanged in a naturalJoin, and the right columns have a simpler redirection. The simpler naturalJoin is likely to provide better performance, though it cannot handle results that require multiple RHS rows.

  • Field Details

    • DEFAULT_NUM_RIGHT_BITS_TO_RESERVE

      public static final int DEFAULT_NUM_RIGHT_BITS_TO_RESERVE
  • Method Details