rollup
The Deephaven rollup
method creates a rollup table from a source table given zero or more aggregations and zero or more grouping columns to create a hierarchy from.
Syntax
result = source.rollup(
aggs: list[Aggregation],
by: list[str] = None,
include_constituents: bool = False,
) -> RollupTable
Parameters
Parameter | Type | Description |
---|---|---|
aggs | list[Aggregation] | A list of aggregations. If |
by | list[str] | Zero or more column names to group on and create hierarchy from. If |
include_constituents optional | bool | Whether or not to include constituent rows at the leaf level. Default is False. |
Returns
A rollup table.
Examples
The following example creates two rollup tables from a source table of insurance expense data. The first performs no aggregations, but creates heirarchy from the region
and age
columns. The second performs two aggregations: the aggregated average of the bmi
and expenses
columns are calculated, then the same by
columns are given as the first. The optional argument include_constituents
is set to True
so that members of the lowest-level nodes (individual cells) can be expanded.
from deephaven import read_csv, agg
insurance = read_csv(
"https://media.githubusercontent.com/media/deephaven/examples/main/Insurance/csv/insurance.csv"
)
agg_list = [agg.avg(cols=["bmi", "expenses"])]
by_list = ["region", "age"]
test_rollup = insurance.rollup(aggs=None, by=by_list, include_constituents=True)
insurance_rollup = insurance.rollup(
aggs=agg_list, by=by_list, include_constituents=True
)
- insurance
- insurance_rollup
The following example creates a rollup table from real-time source data. The source data updates 10,000 times per second. The result
rollup table can be expanded by the N
column to show unique values of M
for each N
. The aggregated average and sum are calculated for the Value
and Weight
, respectively.
from deephaven import empty_table, time_table
from deephaven import agg
agg_list = [agg.avg(cols=["Value"]), agg.sum_(cols=["Weight"])]
by_list = ["N", "M"]
rows = empty_table(1_000_000).update_view(["Group = i", "N = i % 347", "M = i % 29"])
changes = (
time_table("PT0.0001S")
.view(
[
"Group = i % 1_000_000",
"LastModified = Timestamp",
"Value = (i * Math.sin(i)) % 6977",
"Weight = (i * Math.sin(i)) % 7151",
]
)
.last_by("Group")
)
source = rows.join(changes, "Group")
result = source.rollup(aggs=agg_list, by=by_list)