update_by
`update_by` performs one or more `UpdateByOperations`, grouped by zero or more key columns, to calculate cumulative or window-based aggregations of columns in a source table. Operations include cumulative sums, moving averages, exponential moving averages (EMAs), etc.
The aggregations are defined by the provided operations, which support incremental aggregations over the corresponding rows in the source table. Cumulative aggregations use every row from the start of the table (or group) through the current row, whereas rolling aggregations apply position-based or time-based windowing relative to the current row. Calculations are performed over all rows, or within each row group identified by the provided key columns.
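To make the distinction between cumulative and rolling aggregations concrete, here is a plain-Python sketch over a single ungrouped column. It is an illustration only, not Deephaven code: the helper names `cumulative_sum` and `rolling_sum` and the `back`/`ahead` parameters are invented for this sketch.

```python
from itertools import accumulate


def cumulative_sum(values):
    """Sum of every value from the first row through the current row."""
    return list(accumulate(values))


def rolling_sum(values, back, ahead):
    """Sum of a window from `back` rows before to `ahead` rows after each row."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - back)                 # clip the window at the first row
        hi = min(len(values), i + ahead + 1)  # clip the window at the last row
        out.append(sum(values[lo:hi]))
    return out


x = [0, 1, 2, 3, 4]
print(cumulative_sum(x))     # [0, 1, 3, 6, 10]
print(rolling_sum(x, 1, 1))  # [1, 3, 6, 9, 7]
```

The cumulative result keeps growing because every earlier row contributes, while the rolling result depends only on the rows near the current one.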
Syntax
update_by(ops: list[UpdateByOperation], by: list[str] = []) -> Table
Parameters
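Parameter | Type | Description |
---|---|---|
ops | list[UpdateByOperation] | A list containing one or more `UpdateByOperations` to apply. |
by optional | list[str] | Zero or more key columns that group rows of the table. The default is an empty list, as shown in the syntax above, in which case the operations are applied over all rows as a single group. |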
Returns
A new table with the cumulative or rolling-window operations applied to the specified column(s).
Examples
In the following example, a `source` table is created with two columns: `Letter` and `X`. `update_by` is applied to the `source` table to calculate the cumulative sum of the `X` column, with `Letter` given as the `by` column. `Letter` is `A` when the row index `i` is even, and `B` when it is odd. Thus, the `result` table contains a new column, `SumX`, which holds the cumulative sum of the `X` column, grouped by `Letter`.
from deephaven.updateby import cum_sum
from deephaven import empty_table
source = empty_table(10).update(["Letter = (i % 2 == 0) ? `A` : `B`", "X = i"])
result = source.update_by(ops=cum_sum(cols=["SumX = X"]), by=["Letter"])
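As a cross-check on what a grouped cumulative sum produces, the plain-Python sketch below mirrors the example's data (`X` from 0 to 9, `Letter` alternating `A` and `B`). It is not Deephaven's implementation, and `grouped_cum_sum` is a hypothetical helper written only for this illustration.

```python
from collections import defaultdict


def grouped_cum_sum(keys, values):
    """Running total of `values`, tracked separately for each key."""
    running = defaultdict(int)  # one running total per distinct key
    out = []
    for key, value in zip(keys, values):
        running[key] += value
        out.append(running[key])
    return out


letters = ["A" if i % 2 == 0 else "B" for i in range(10)]
x = list(range(10))

print(grouped_cum_sum(letters, x))
# [0, 1, 2, 4, 6, 9, 12, 16, 20, 25]
```

Rows keep their original order; only the running total each row sees depends on its group.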
The following example takes the same source data, but instead computes a rolling sum using `rolling_sum_tick`. With `rev_ticks=3` and `fwd_ticks=2`, the window covers the current row, the two rows before it, and the two rows after it within the same `Letter` group. Thus, `RollingSumX` holds the sum of a five-row window centered on each row. Rows at the beginning and end of each group don't have enough data above and below them, respectively, so their sums cover fewer rows.
from deephaven.updateby import rolling_sum_tick
from deephaven import empty_table
source = empty_table(10).update(["Letter = (i % 2 == 0) ? `A` : `B`", "X = i"])
result = source.update_by(
    ops=rolling_sum_tick(cols=["RollingSumX = X"], rev_ticks=3, fwd_ticks=2),
    by=["Letter"],
)
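The window arithmetic can be sketched in plain Python. This is an illustration under one stated assumption, taken from the description above: `rev_ticks` counts the current row, so `rev_ticks=3` reaches two rows back. The helper `grouped_rolling_sum` is hypothetical and is not part of Deephaven; it also assumes `rev_ticks >= 1` and `fwd_ticks >= 0`.

```python
from collections import defaultdict


def grouped_rolling_sum(keys, values, rev_ticks, fwd_ticks):
    """Tick-based rolling sum where windows never cross group boundaries."""
    # Record the row positions that belong to each key, in table order.
    positions = defaultdict(list)
    for row, key in enumerate(keys):
        positions[key].append(row)

    out = [0] * len(values)
    for rows in positions.values():
        for j, row in enumerate(rows):
            lo = max(0, j - (rev_ticks - 1))        # assumed: rev_ticks includes the current row
            hi = min(len(rows), j + fwd_ticks + 1)  # fwd_ticks rows after the current row
            out[row] = sum(values[r] for r in rows[lo:hi])
    return out


letters = ["A" if i % 2 == 0 else "B" for i in range(10)]
x = list(range(10))

print(grouped_rolling_sum(letters, x, rev_ticks=3, fwd_ticks=2))
# [6, 9, 12, 16, 20, 25, 20, 24, 18, 21]
```

Only the middle row of each five-row group sees a full window (20 for `A`, 25 for `B`); the other rows sum a truncated window.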
The following example builds on the previous examples by adding a second data column, `Y`, to the `source` table. The `cum_sum` `UpdateByOperation` is then given two columns, so that the cumulative sums of both the `X` and `Y` columns are calculated.
from deephaven.updateby import cum_sum
from deephaven import empty_table
source = empty_table(10).update(
    ["Letter = (i % 2 == 0) ? `A` : `B`", "X = i", "Y = randomInt(0, 10)"]
)
result = source.update_by(ops=cum_sum(cols=["SumX = X", "SumY = Y"]), by=["Letter"])
The following example modifies the previous example by performing two separate `UpdateByOperations`. The first uses `cum_sum` on the `X` column, as in the previous example. The second performs a tick-based rolling sum on the `Y` column with `rolling_sum_tick`.
from deephaven.updateby import cum_sum, rolling_sum_tick
from deephaven import empty_table
source = empty_table(10).update(
    ["Letter = (i % 2 == 0) ? `A` : `B`", "X = i", "Y = randomInt(0, 10)"]
)
result = source.update_by(
    ops=[
        cum_sum(cols=["SumX = X"]),
        rolling_sum_tick(cols=["RollingSumY = Y"], rev_ticks=2, fwd_ticks=1),
    ],
    by=["Letter"],
)
The following example builds on previous examples by adding a second key column, `Truth`, which contains boolean values. Thus, groups are defined by unique combinations of the `Letter` and `Truth` columns.
from deephaven.updateby import cum_sum, rolling_sum_tick
from deephaven import empty_table
source = empty_table(10).update(
    [
        "Letter = (i % 2 == 0) ? `A` : `B`",
        "Truth = randomBool()",
        "X = i",
        "Y = randomInt(0, 10)",
    ]
)
result = source.update_by(
    ops=[
        cum_sum(cols=["SumX = X"]),
        rolling_sum_tick(cols=["RollingSumY = Y"], rev_ticks=2, fwd_ticks=1),
    ],
    by=["Letter", "Truth"],
)
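Grouping by several key columns can be pictured as keying each running total by a tuple of the key values. The plain-Python sketch below uses fixed `Truth` values instead of `randomBool()` so that the output is reproducible. The helper `cum_sum_by` and the sample rows are made up for illustration and are not Deephaven code.

```python
from collections import defaultdict


def cum_sum_by(rows, key_cols, value_col):
    """Cumulative sum of `value_col` per unique combination of `key_cols`."""
    running = defaultdict(int)
    out = []
    for row in rows:
        key = tuple(row[col] for col in key_cols)  # composite group key
        running[key] += row[value_col]
        out.append(running[key])
    return out


rows = [
    {"Letter": "A", "Truth": True, "X": 0},
    {"Letter": "B", "Truth": True, "X": 1},
    {"Letter": "A", "Truth": False, "X": 2},
    {"Letter": "B", "Truth": True, "X": 3},
    {"Letter": "A", "Truth": True, "X": 4},
    {"Letter": "B", "Truth": False, "X": 5},
]

print(cum_sum_by(rows, ["Letter", "Truth"], "X"))  # [0, 1, 2, 4, 4, 5]
print(cum_sum_by(rows, ["Letter"], "X"))           # [0, 1, 2, 4, 6, 9]
```

Adding `Truth` as a key splits each `Letter` group in two, so rows that shared a running total under one key column may no longer do so.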