agg_by
agg_by applies a list of aggregations to table data.
Syntax
agg_by(
    aggs: Union[Aggregation, Sequence[Aggregation]],
    by: Union[str, list[str]] = None,
    preserve_empty: bool = False,
    initial_groups: Table = None,
) -> Table
Parameters
Parameter | Type | Description |
---|---|---|
aggs | Union[Aggregation, Sequence[Aggregation]] | A single aggregation or a list of aggregations to compute. Aggregations are created with functions from the deephaven.agg module, such as agg.first, agg.group, and agg.max_. |
by optional | Union[str, list[str]] | The name(s) of the column(s) by which to group data. The default is None. |
preserve_empty optional | bool | Whether to keep result rows for groups that are initially empty or become empty as a result of updates. Each aggregation operator defines its own value for empty groups. The default is False. |
initial_groups optional | Table | A table whose distinct combinations of values for the grouping column(s) should be used to create an initial set of aggregation groups. All other columns are ignored. |
If an aggregation does not rename its result, the aggregated column appears in the output table under the input column's name, in place of the input column. If multiple aggregations on the same column do not rename their results, an error occurs because they all try to create a column with the same name. For example, in table.agg_by([agg.sum_("X"), agg.avg("X")]), both the sum and the average aggregations produce a column named X, which results in an error.
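The naming rule above can be modeled in a few lines of plain Python. This is only an illustrative sketch, not Deephaven's implementation: the helpers `output_name` and `find_collisions` are hypothetical names, and the sketch assumes that a column spec is either a bare column name or the `Out = In` form that the examples on this page pass to agg.group and agg.max_.

```python
def output_name(spec: str) -> str:
    """Return the output column name for a spec such as "SumX = X" or "X"."""
    # Hypothetical helper: everything left of the first "=" is the output
    # name; a bare spec keeps the input column's name.
    return spec.split("=", 1)[0].strip()


def find_collisions(specs: list[str]) -> list[str]:
    """Return output names that more than one aggregation would produce."""
    seen: set[str] = set()
    dupes: list[str] = []
    for spec in specs:
        name = output_name(spec)
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


# A sum and an average on "X" without renaming both target column "X".
colliding = find_collisions(["X", "X"])

# Renaming each result gives every output column a distinct name.
renamed = find_collisions(["SumX = X", "AvgX = X"])

print(colliding)  # ['X']
print(renamed)    # []
```

Under the same assumption, giving each aggregation its own output name, for example `"SumX = X"` for the sum and `"AvgX = X"` for the average, would avoid the clash.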
Returns
Aggregated table data based on the aggregations specified in aggs.
Examples
In this example, agg.first returns the first Y value as grouped by X.
from deephaven import new_table
from deephaven.column import string_col, int_col
from deephaven import agg

source = new_table(
    [
        string_col("X", ["A", "B", "A", "C", "B", "A", "B", "B", "C"]),
        string_col("Y", ["M", "N", "O", "N", "P", "M", "O", "P", "M"]),
        int_col("Number", [55, 76, 20, 130, 230, 50, 73, 137, 214]),
    ]
)

# Keep the first Y value seen for each distinct value of X.
result = source.agg_by([agg.first(cols=["Y"])], by=["X"])
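To make the expected values concrete, the same grouping can be traced by hand in plain Python. This is a sketch of the semantics only, using the X and Y data from the example above. It does not call Deephaven, and the variable name `first_y` is an illustrative choice.

```python
# X and Y columns copied from the example table.
xs = ["A", "B", "A", "C", "B", "A", "B", "B", "C"]
ys = ["M", "N", "O", "N", "P", "M", "O", "P", "M"]

# For each X value, remember only the first Y value encountered.
first_y: dict[str, str] = {}
for x, y in zip(xs, ys):
    first_y.setdefault(x, y)

print(first_y)  # {'A': 'M', 'B': 'N', 'C': 'N'}
```

Each key corresponds to one group of X, and its value is the Y that agg.first would be expected to report for that group.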
In this example, agg.group returns an array of values from the Number column (Numbers), and agg.max_ returns the maximum value from the Number column (MaxNumber), as grouped by X.
from deephaven import new_table
from deephaven.column import string_col, int_col
from deephaven import agg

source = new_table(
    [
        string_col("X", ["A", "B", "A", "C", "B", "A", "B", "B", None]),
        string_col("Y", ["M", "N", None, "N", "P", "M", None, "P", "M"]),
        int_col("Number", [55, 76, 20, 130, 230, 50, 73, 137, 214]),
    ]
)

# Rename each result ("Out = In") so the two aggregations on Number
# produce distinct output columns.
result = source.agg_by(
    [agg.group(cols=["Numbers = Number"]), agg.max_(cols=["MaxNumber = Number"])],
    by=["X"],
)
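The two aggregations in this example can also be traced in plain Python. This is a hedged sketch of the semantics rather than Deephaven code: the names `numbers_by_x` and `max_number_by_x` are illustrative, and treating the missing X value (None) as a group of its own is an assumption of this sketch.

```python
# X and Number columns copied from the example table; the last X is missing.
xs = ["A", "B", "A", "C", "B", "A", "B", "B", None]
numbers = [55, 76, 20, 130, 230, 50, 73, 137, 214]

# Model of agg.group: collect every Number value belonging to each X.
numbers_by_x: dict = {}
for x, n in zip(xs, numbers):
    numbers_by_x.setdefault(x, []).append(n)

# Model of agg.max_: the largest Number value within each group.
max_number_by_x = {x: max(ns) for x, ns in numbers_by_x.items()}

print(numbers_by_x)
# {'A': [55, 20, 50], 'B': [76, 230, 73, 137], 'C': [130], None: [214]}
print(max_number_by_x)
# {'A': 55, 'B': 230, 'C': 130, None: 214}
```

The first dictionary plays the role of the Numbers column and the second the role of MaxNumber, with one entry per group of X.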