pydeephaven.agg
This module defines the Aggregation class and provides factory functions to create specific Aggregation instances.
- class Aggregation
Bases:
ABC
An Aggregation object represents an aggregation operation.
Note: It should not be instantiated directly by user code but rather through the factory functions in the module.
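As a minimal sketch of the factory-function pattern described above, the helper below builds a few aggregations without instantiating Aggregation directly. It assumes the pydeephaven client package is installed; the helper name build_example_aggs and the column names (Price, TotalPrice, AvgPrice, RowCount) are hypothetical. Applying the result to a table requires a live session and is not shown here.

```python
def build_example_aggs():
    """Build a list of aggregations through the pydeephaven.agg factory functions."""
    # Imported inside the function so that defining this helper does not
    # require the pydeephaven package; calling it does.
    from pydeephaven import agg

    return [
        # Renaming expression: TotalPrice receives the sum of Price.
        agg.sum_(cols=["TotalPrice = Price"]),
        # Renaming expression: AvgPrice receives the average of Price.
        agg.avg(cols=["AvgPrice = Price"]),
        # count_ takes the name of the output column that holds the group size.
        agg.count_(col="RowCount"),
    ]
```

The renaming expressions keep the output column names distinct when several aggregations read the same input column.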
- abs_sum(cols=None)
Creates an Absolute-sum aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- avg(cols=None)
Creates an Average aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- count_(col)
Creates a Count aggregation. This is not supported in ‘Table.agg_all_by’.
- Parameters:
col (str) – the column to hold the counts of each distinct group
- Return type:
Aggregation
- Returns:
an aggregation
- count_distinct(cols=None, count_nulls=False)
Creates a Count Distinct aggregation, which computes the count of distinct values within an aggregation group for each of the given columns.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
count_nulls (bool) – whether null values should be counted, default is False
- Return type:
Aggregation
- Returns:
an aggregation
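The effect of count_nulls can be illustrated with a plain-Python sketch of the rule described above. This is not the library's implementation: it works on a single group held in a list, and None stands in for a null value.

```python
def count_distinct(values, count_nulls=False):
    """Count the distinct values in one group; None stands in for null."""
    distinct = set(values)
    if not count_nulls:
        # Nulls are ignored unless the caller asks for them to be counted.
        distinct.discard(None)
    return len(distinct)


group = ["a", "b", "a", None, None]
print(count_distinct(group))                    # 2
print(count_distinct(group, count_nulls=True))  # 3
```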
- distinct(cols=None, include_nulls=False)
Creates a Distinct aggregation, which computes the distinct values within an aggregation group for each of the given columns and stores them as vectors.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
include_nulls (bool) – whether nulls should be included as distinct values, default is False
- Return type:
Aggregation
- Returns:
an aggregation
- first(cols=None)
Creates a First aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- formula(formula, formula_param, cols=None)
Creates a user-defined formula aggregation. The formula can contain a combination of any of the following:
- Built-in functions such as min, max, etc.
- Mathematical arithmetic such as *, +, /, etc.
- User-defined functions
- Parameters:
formula (str) – the user-defined formula to apply to each group
formula_param (str) – the parameter name for the input column’s vector within the formula. If formula is max(each), then each is the formula_param.
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
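The relationship between formula and formula_param can be sketched with the max(each) example from the parameter description. The sketch assumes the pydeephaven client package is installed; the helper name build_formula_agg and the column names (Price, MaxPrice) are hypothetical.

```python
def build_formula_agg():
    """Build a formula aggregation equivalent to taking the maximum per group."""
    # Imported inside the function so that defining this helper does not
    # require the pydeephaven package; calling it does.
    from pydeephaven import agg

    return agg.formula(
        formula="max(each)",         # expression evaluated once per group
        formula_param="each",        # name that refers to the input column's vector
        cols=["MaxPrice = Price"],   # renaming expression: output = input
    )
```

The name given as formula_param must match the name used inside the formula string; here both are each.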
- group(cols=None)
Creates a Group aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- last(cols=None)
Creates a Last aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- max_(cols=None)
Creates a Max aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- median(cols=None, average_evenly_divided=True)
Creates a Median aggregation, which computes the median value within an aggregation group for each of the given columns.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
average_evenly_divided (bool) – when the group size is an even number, whether to average the two middle values for the output value. When set to True, average the two middle values. When set to False, use the smaller value. The default is True. This flag is only valid for numeric types.
- Return type:
Aggregation
- Returns:
an aggregation
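The average_evenly_divided rule can be made concrete with a plain-Python sketch. It follows the parameter description above for a single numeric group; it is not the library's implementation, and returning None for an empty group is an assumption of the sketch.

```python
def group_median(values, average_evenly_divided=True):
    """Median of one numeric group, following the flag described above."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        return None  # assumption: an empty group has no median
    mid = n // 2
    if n % 2 == 1:
        # Odd group size: a single middle value exists.
        return ordered[mid]
    lower, upper = ordered[mid - 1], ordered[mid]
    # Even group size: average the two middle values, or take the smaller one.
    return (lower + upper) / 2 if average_evenly_divided else lower


print(group_median([4, 1, 3, 2]))                                # 2.5
print(group_median([4, 1, 3, 2], average_evenly_divided=False))  # 2
print(group_median([3, 1, 2]))                                   # 2
```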
- min_(cols=None)
Creates a Min aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- partition(col, include_by_columns=True)
Creates a Partition aggregation. This is not supported in ‘Table.agg_all_by’.
- Parameters:
col (str) – the column to hold the sub tables
include_by_columns (bool) – whether to include the group by columns in the result, default is True
- Return type:
Aggregation
- Returns:
an aggregation
- pct(percentile, cols=None, average_evenly_divided=False)
Creates a Percentile aggregation, which computes the percentile value within an aggregation group for each of the given columns.
- Parameters:
percentile (float) – the percentile used for calculation
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
average_evenly_divided (bool) – when the percentile splits the group into two halves, whether to average the two middle values for the output value. When set to True, average the two middle values. When set to False, use the smaller value. The default is False. This flag is only valid for numeric types.
- Return type:
Aggregation
- Returns:
an aggregation
- sorted_first(order_by, cols=None)
Creates a SortedFirst aggregation.
- Parameters:
order_by (str) – the column to sort by
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- sorted_last(order_by, cols=None)
Creates a SortedLast aggregation.
- Parameters:
order_by (str) – the column to sort by
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
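One reading of the SortedFirst and SortedLast aggregations is sketched below in plain Python. It assumes that each takes the values of cols from the row that would come first (or last) if the group were sorted by order_by; that interpretation, the row layout as dictionaries, and the column names Time and Price are assumptions of the sketch, and tie-breaking is not modelled.

```python
def sorted_first(rows, order_by, cols):
    """Values of cols from the row with the smallest order_by value."""
    first_row = min(rows, key=lambda row: row[order_by])
    return {col: first_row[col] for col in cols}


def sorted_last(rows, order_by, cols):
    """Values of cols from the row with the largest order_by value."""
    last_row = max(rows, key=lambda row: row[order_by])
    return {col: last_row[col] for col in cols}


# One group of rows, deliberately not in Time order.
rows = [
    {"Time": 3, "Price": 10.5},
    {"Time": 1, "Price": 9.0},
    {"Time": 2, "Price": 11.0},
]
print(sorted_first(rows, "Time", ["Price"]))  # {'Price': 9.0}
print(sorted_last(rows, "Time", ["Price"]))   # {'Price': 10.5}
```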
- std(cols=None)
Creates a Std (sample standard deviation) aggregation.
Sample standard deviation is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- sum_(cols=None)
Creates a Sum aggregation.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- unique(cols=None, include_nulls=False, non_unique_sentinel=None)
Creates a Unique aggregation, which computes the single unique value within an aggregation group for each of the given columns. If all values in a column are null, or if there is more than one distinct value in a column, the result is the specified non_unique_sentinel value (defaults to null).
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
include_nulls (bool) – whether null is treated as a value for the purpose of determining if the values in the aggregation group are unique, default is False.
non_unique_sentinel (Union[np.number, str, bool]) – the non-null sentinel value when no unique value exists, default is None. Must be a non-None value when include_nulls is True. When passed in as a numpy scalar number value, it must be of one of these types: np.int8, np.int16, np.uint16, np.int32, np.int64(int), np.float32, np.float64(float). Please note that np.uint16 is interpreted as a Deephaven/Java char.
- Raises:
TypeError –
- Return type:
Aggregation
- Returns:
an aggregation
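The interaction between include_nulls and non_unique_sentinel can be sketched in plain Python for a single group, with None standing in for null. This illustrates the rule described above and is not the library's implementation. One assumption: when include_nulls is True and every value is null, the sketch returns null as the single unique value, a case the description does not spell out.

```python
def unique_value(values, include_nulls=False, non_unique_sentinel=None):
    """Single unique value of one group, or the sentinel when there is none."""
    candidates = set(values)
    if not include_nulls:
        # Nulls do not take part in the uniqueness check by default.
        candidates.discard(None)
    if len(candidates) == 1:
        return candidates.pop()
    # Zero candidates (empty or all-null group) or several distinct values.
    return non_unique_sentinel


print(unique_value(["x", "x", None]))                               # x
print(unique_value(["x", "y"], non_unique_sentinel="MULTIPLE"))     # MULTIPLE
print(unique_value([None, None], non_unique_sentinel="MULTIPLE"))   # MULTIPLE
print(unique_value(["x", None], include_nulls=True,
                   non_unique_sentinel="MULTIPLE"))                 # MULTIPLE
```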
- var(cols=None)
Creates a sample Variance aggregation.
Sample variance is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.
- Parameters:
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
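Bessel's correction, which both the Std and Var descriptions refer to, divides the sum of squared deviations by n - 1 instead of n. The plain-Python sketch below shows the arithmetic for one group. It is not the library's implementation, and returning None for groups with fewer than two values is an assumption of the sketch.

```python
import math


def sample_variance(values):
    """Sample variance with Bessel's correction (divide by n - 1)."""
    n = len(values)
    if n < 2:
        return None  # assumption: undefined for fewer than two values
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1)


def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    variance = sample_variance(values)
    return None if variance is None else math.sqrt(variance)


data = [2, 4, 4, 4, 5, 5, 7, 9]
# The mean is 5 and the squared deviations sum to 32, so the sample
# variance is 32 / 7, while the population variance would be 32 / 8.
print(sample_variance(data))
print(sample_std(data))
```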
- weighted_avg(wcol, cols=None)
Creates a Weighted-average aggregation.
- Parameters:
wcol (str) – the name of the weight column
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
- weighted_sum(wcol, cols=None)
Creates a Weighted-sum aggregation.
- Parameters:
wcol (str) – the name of the weight column
cols (Union[str, List[str]]) – the column(s) to aggregate on; each can be a renaming expression, e.g. “new_col = col”. The default is None, which is only valid when used in the Table agg_all_by operation
- Return type:
Aggregation
- Returns:
an aggregation
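The two weighted aggregations can be illustrated with the standard textbook definitions in plain Python: the weighted sum is the sum of value times weight, and the weighted average divides that by the total weight. This is a sketch under those definitions, not the library's implementation; the variable names prices and sizes are hypothetical, and returning None when the weights sum to zero is an assumption.

```python
def weighted_sum(values, weights):
    """Sum of value * weight over one group."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_avg(values, weights):
    """Weighted sum divided by the total weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        return None  # assumption: undefined when the weights sum to zero
    return weighted_sum(values, weights) / total_weight


prices = [10.0, 20.0, 30.0]  # values from an aggregated column
sizes = [1, 1, 2]            # values from the weight column (wcol)
print(weighted_sum(prices, sizes))  # 90.0
print(weighted_avg(prices, sizes))  # 22.5
```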