pct
agg.pct returns an aggregator that computes the designated percentile of values, within an aggregation group, for each input column.
Syntax
pct(percentile: float, cols: Union[str, list[str]], average_evenly_divided: bool = False) -> Aggregation
Parameters
Parameter | Type | Description |
---|---|---|
percentile | float | The percentile to calculate, given as a fraction between 0 and 1 (for example, 0.68 for the 68th percentile). |
cols | Union[str, list[str]] | The source column(s) for the calculations. |
average_evenly_divided | bool | When the percentile splits the group into two halves, whether to average the two middle values for the output value. Defaults to False. |
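To make the percentile and average_evenly_divided parameters more concrete, here is a minimal pure-Python sketch of one common percentile-selection rule. It is an illustration under stated assumptions, not Deephaven's implementation: the helper name pct_sketch, the nearest-rank index rule, and the handling of the even split are all hypothetical, and the values produced by agg.pct may differ.

```python
import math


def pct_sketch(values, percentile, average_evenly_divided=False):
    """Hypothetical helper: select the value at a percentile given as 0..1.

    Assumption: a nearest-rank rule on the sorted values. This is NOT
    taken from Deephaven's source; it only illustrates the parameters.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= percentile <= 1.0:
        raise ValueError("percentile must be between 0 and 1")

    ordered = sorted(values)
    n = len(ordered)

    # Where the percentile cuts the sorted group, measured in elements.
    position = percentile * n

    # Assumed even-split case: the cut falls exactly between two elements,
    # so the two neighbours can be averaged when the flag is set.
    if average_evenly_divided and position == int(position) and 0 < position < n:
        k = int(position)
        return (ordered[k - 1] + ordered[k]) / 2

    # Otherwise use the nearest-rank element (the lower neighbour on an even split).
    rank = max(1, math.ceil(position))
    return ordered[rank - 1]


print(pct_sketch([1, 2, 3, 4], 0.5))                               # lower middle value
print(pct_sketch([1, 2, 3, 4], 0.5, average_evenly_divided=True))  # mean of the two middle values
```

In this sketch the flag only changes the result when the cut lands exactly between two elements; for any other percentile both calls return the same element.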
If an aggregation does not rename the resulting column, the aggregation column will appear in the output table, not the input column. If multiple aggregations on the same column do not rename the resulting columns, an error will result, because the aggregations are trying to create multiple columns with the same name. For example, in table.agg_by([agg.sum_(cols=["X"]), agg.avg(cols=["X"])]), both the sum and the average aggregators produce column X, which results in an error.
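The naming rule above can be illustrated without a running Deephaven session. The sketch below is a hypothetical pure-Python helper (find_collisions and output_column are not part of the Deephaven API). It assumes column specs are either a bare column name or use the "Output = Input" form seen in the examples on this page, and it reports every output name that is requested more than once.

```python
def output_column(spec):
    """Hypothetical helper: output column name for "Out = In" or a bare "In"."""
    return spec.split("=", 1)[0].strip()


def find_collisions(specs):
    """Hypothetical helper: list output names requested by more than one spec."""
    seen = set()
    collisions = []
    for spec in specs:
        name = output_column(spec)
        if name in seen and name not in collisions:
            collisions.append(name)
        seen.add(name)
    return collisions


# Two aggregations that both write to X collide.
print(find_collisions(["X", "X"]))
# Renaming each output gives every aggregation its own column.
print(find_collisions(["SumX = X", "AvgX = X"]))
```

Applied to the case above, giving each aggregation its own output name, for example cols=["SumX = X"] and cols=["AvgX = X"], is the kind of rename that avoids the clash.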
Returns
An aggregator that computes the designated percentile value, within an aggregation group, for each input column.
Examples
In this example, agg.pct returns the 68th percentile value of Number, as grouped by X.
from deephaven import new_table
from deephaven.column import string_col, int_col
from deephaven import agg

source = new_table(
    [
        string_col("X", ["A", "B", "A", "C", "B", "A", "B", "B", "C"]),
        string_col("Y", ["M", "N", "O", "N", "P", "M", "O", "P", "M"]),
        int_col("Number", [55, 76, 20, 130, 230, 50, 73, 137, 214]),
    ]
)

# 68th percentile of Number for each value of X
result = source.agg_by(
    [agg.pct(percentile=0.68, cols=["PctNumber = Number"])], by=["X"]
)
In this example, agg.pct returns the 68th percentile value of Number and the 99th percentile value of Number, as grouped by X.
from deephaven import new_table
from deephaven.column import string_col, int_col
from deephaven import agg

source = new_table(
    [
        string_col("X", ["A", "B", "A", "C", "B", "A", "B", "B", "C"]),
        string_col("Y", ["M", "N", "O", "N", "P", "M", "O", "P", "M"]),
        int_col("Number", [55, 76, 20, 130, 230, 50, 73, 137, 214]),
    ]
)

# Each aggregation renames its output so the two columns do not collide
result = source.agg_by(
    [
        agg.pct(percentile=0.68, cols=["Pct68Number = Number"]),
        agg.pct(percentile=0.99, cols=["Pct99Number = Number"]),
    ],
    by=["X"],
)
In this example, agg.pct returns the 97th percentile value of Number, and agg.median returns the median Number, as grouped by X. A second result table is then created to show the difference in output when average_evenly_divided is set to True.
from deephaven import new_table
from deephaven.column import string_col, int_col
from deephaven import agg

source = new_table(
    [
        string_col("X", ["A", "B", "A", "C", "B", "A", "B", "B", "C"]),
        string_col("Y", ["M", "P", "O", "N", "P", "M", "O", "P", "N"]),
        int_col("Number", [55, 76, 20, 130, 230, 50, 73, 137, 214]),
    ]
)

# Default behavior: average_evenly_divided is False
result = source.agg_by(
    [
        agg.pct(percentile=0.97, cols=["Pct97Number = Number"]),
        agg.median(cols=["MedNumber = Number"]),
    ],
    by=["X"],
)

# Second table, under its own name so it does not overwrite the first result
result_avg = source.agg_by(
    [
        agg.pct(
            percentile=0.97, cols=["Pct97Number = Number"], average_evenly_divided=True
        ),
        agg.median(cols=["MedNumber = Number"]),
    ],
    by=["X"],
)