deephaven.table¶
This module implements the Table, PartitionedTable and PartitionedTableProxy classes which are the main instruments for working with Deephaven refreshing and static data.
- class MultiJoinInput(table, on, joins=None)[source]¶
Bases:
JObjectWrapper
A MultiJoinInput represents the input tables, key columns and additional columns to be used in the multi-table natural join.
Creates a new MultiJoinInput containing the table to include for the join, the key columns from the table to match with other table keys plus additional columns containing data from the table. Rows containing unique keys will be added to the output table, otherwise the data from these columns will be added to the existing output rows.
- Parameters:
table (Table) – the right table to include in the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Raises:
DHError –
- j_object_type¶
alias of
MultiJoinInput
- class MultiJoinTable(input, on=None)[source]¶
Bases:
JObjectWrapper
A MultiJoinTable is an object that contains the result of a multi-table natural join. To retrieve the underlying result Table, use the table property.
Creates a new MultiJoinTable. The join can be specified in terms of either tables or MultiJoinInputs.
- Parameters:
input (Union[Table, Sequence[Table], MultiJoinInput, Sequence[MultiJoinInput]]) – the input objects specifying the tables and columns to include in the join.
on (Union[str, Sequence[str]], optional) – the column(s) to match, can be a common name or an equality expression that matches every input table, i.e. “col_a = col_b” to rename output column names. Note: When MultiJoinInput objects are supplied, this parameter must be omitted.
- Raises:
DHError –
- j_object_type¶
alias of
MultiJoinTable
- property table¶
Returns the Table containing the multi-table natural join output.
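Example (a minimal sketch; the tables t1 and t2 and the column names Key, Val1, and Val2 are hypothetical, and the snippet assumes a running Deephaven session):
    from deephaven import empty_table
    from deephaven.table import MultiJoinInput, MultiJoinTable

    # two hypothetical tables sharing the key column "Key"
    t1 = empty_table(5).update(["Key = i", "Val1 = i * 10"])
    t2 = empty_table(5).update(["Key = i", "Val2 = i * 100"])

    # describe each input: the table, its key column(s), and the columns to add
    inputs = [
        MultiJoinInput(table=t1, on="Key", joins="Val1"),
        MultiJoinInput(table=t2, on="Key", joins="Val2"),
    ]

    # when MultiJoinInput objects are supplied, the 'on' argument must be omitted
    mj = MultiJoinTable(inputs)

    # the underlying result Table of the multi-table natural join
    result = mj.table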
- class NodeType(value)[source]¶
Bases:
Enum
An enum of node types for RollupTable
- AGGREGATED = io.deephaven.engine.table.hierarchical.RollupTable$NodeType(objectRef=0x55eb66986bb2)¶
Nodes at an aggregated (rolled up) level in the RollupTable. An aggregated level is above the constituent (leaf) level. These nodes have column names and types that result from applying aggregations on the source table of the RollupTable.
- CONSTITUENT = io.deephaven.engine.table.hierarchical.RollupTable$NodeType(objectRef=0x55eb66986bba)¶
Nodes at the leaf level when the rollup() method is called with include_constituents=True. The constituent level is the lowest in a rollup table. These nodes have column names and types from the source table of the RollupTable.
- class PartitionedTable(j_partitioned_table)[source]¶
Bases:
JObjectWrapper
A partitioned table is a table containing tables, known as constituent tables. Each constituent table has the same schema.
The partitioned table contains:
1. one column containing constituent tables
2. key columns (optional)
3. non-key columns (optional)
Key values can be used to retrieve constituent tables from the partitioned table and can be used to perform operations with other like-keyed partitioned tables.
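Example (a minimal sketch; the source table, the key column Group, and the key value "A" are hypothetical):
    from deephaven import empty_table

    # hypothetical source table with a grouping column
    source = empty_table(10).update(["Group = i % 2 == 0 ? `A` : `B`", "Value = i"])

    # one constituent table per distinct Group value
    pt = source.partition_by("Group")

    keys = pt.keys()                      # a Table of the distinct key values
    group_a = pt.get_constituent(["A"])   # the constituent for key value "A"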
- property constituent_changes_permitted¶
Can the constituents of the underlying partitioned table change? Specifically, can the values of the constituent column change?
If constituent changes are not permitted, the underlying partitioned table:
1. has no adds
2. has no removes
3. has no shifts
4. has no modifies that include the constituent column
Note, it is possible for constituent changes to not be permitted even if constituent tables are refreshing or if the underlying partitioned table is refreshing. Also note that the underlying partitioned table must be refreshing if it contains any refreshing constituents.
- property constituent_column¶
The name of the column containing constituent tables.
- property constituent_table_columns¶
The column definitions for constituent tables. All constituent tables in a partitioned table have the same column definitions.
- property constituent_table_definition¶
The table definition for constituent tables. All constituent tables in a partitioned table have the same table definition.
- property constituent_tables¶
Returns all the current constituent tables.
- filter(filters)[source]¶
The filter method creates a new partitioned table containing only the rows meeting the filter criteria. Filters can not use the constituent column.
- classmethod from_constituent_tables(tables, constituent_table_columns=None)[source]¶
Creates a PartitionedTable with a single column named ‘__CONSTITUENT__’ containing the provided constituent tables.
The result PartitionedTable has no key columns, and both its unique_keys and constituent_changes_permitted properties are set to False. When constituent_table_columns isn’t provided, it will be set to the table definition of the first table in the provided constituent tables.
- classmethod from_partitioned_table(table, key_cols=None, unique_keys=None, constituent_column=None, constituent_table_columns=None, constituent_changes_permitted=None)[source]¶
Creates a PartitionedTable from the provided underlying partitioned Table.
Note: key_cols, unique_keys, constituent_column, constituent_table_columns, constituent_changes_permitted must either be all None or all have values. When they are None, their values will be inferred as follows:
* key_cols: the names of all columns with a non-Table data type
* unique_keys: False
* constituent_column: the name of the first column with a Table data type
* constituent_table_columns: the table definition of the first cell (constituent table) in the constituent column. Consequently, the constituent column can’t be empty.
* constituent_changes_permitted: the value of table.is_refreshing
- Parameters:
table (Table) – the underlying partitioned table
key_cols (Union[str, List[str]]) – the key column name(s) of ‘table’
unique_keys (bool) – whether the keys in ‘table’ are guaranteed to be unique
constituent_column (str) – the constituent column name in ‘table’
constituent_table_columns (Optional[TableDefinitionLike]) – the table definitions of the constituent table
constituent_changes_permitted (bool) – whether the values of the constituent column can change
- Return type:
- Returns:
a PartitionedTable
- Raises:
DHError –
- get_constituent(key_values)[source]¶
Gets a single constituent table by its corresponding key column value(s). If there are no matching rows, the result is None. If there are multiple matching rows, a DHError is thrown.
- property is_refreshing¶
Whether the underlying partitioned table is refreshing.
- j_object_type¶
alias of
PartitionedTable
- property key_columns¶
The partition key column names.
- keys()[source]¶
Returns a Table containing all the keys of the underlying partitioned table.
- Return type:
- merge()[source]¶
Makes a new Table that contains all the rows from all the constituent tables. In the merged result, data from a constituent table is contiguous, and data from constituent tables appears in the same order the constituent table appears in the PartitionedTable. Basically, merge stacks constituent tables on top of each other in the same relative order as the partitioned table.
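Example (continuing the hypothetical pt from the sketch above):
    # stack all constituents back into one Table; rows from each constituent stay
    # contiguous and follow the constituent order of the partitioned table
    combined = pt.merge()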
- partitioned_transform(other, func, dependencies=None)[source]¶
Join the underlying partitioned Tables from this PartitionedTable and other on the key columns, then apply the provided function to all pairs of constituent Tables with the same keys in order to produce a new PartitionedTable with the results as its constituents, keeping the same data for all other columns from this PartitionedTable’s underlying partitioned Table.
Note that if the Tables underlying this PartitionedTable or other change, a corresponding change will propagate to the result.
- Parameters:
other (PartitionedTable) – the other Partitioned table whose constituent tables will be passed in as the 2nd argument to the provided function
func (Callable[[Table, Table], Table]) – a function which takes two Tables as input and returns a new Table
dependencies (Optional[Sequence[Union[Table, PartitionedTable]]]) – additional dependencies that must be satisfied before applying the provided transform function to added, modified, or newly-matched constituents during update processing. If the transform function uses any other refreshing Table or refreshing Partitioned Table, they must be included in this argument. Defaults to None.
- Return type:
- Returns:
a PartitionedTable
- Raises:
DHError –
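Example (a minimal sketch; the tables, the key column Group, and the columns Id and Extra are hypothetical; the transform function is wrapped in an execution context, which refreshing inputs generally require):
    from deephaven import empty_table
    from deephaven.execution_context import get_exec_ctx

    ctx = get_exec_ctx()

    left_pt = empty_table(6).update(["Group = i % 2", "Id = i", "X = i * 2"]).partition_by("Group")
    right_pt = empty_table(6).update(["Group = i % 2", "Id = i", "Extra = i * 10"]).partition_by("Group")

    # applied to each pair of like-keyed constituents
    def join_constituents(t1, t2):
        with ctx:
            return t1.natural_join(t2, on="Id", joins="Extra")

    joined_pt = left_pt.partitioned_transform(right_pt, join_constituents)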
- proxy(require_matching_keys=True, sanity_check_joins=True)[source]¶
Makes a proxy that allows table operations to be applied to the constituent tables of this PartitionedTable.
- Parameters:
require_matching_keys (bool) – whether to ensure that both partitioned tables have all the same keys present when an operation uses this PartitionedTable and another PartitionedTable as inputs for a partitioned_transform(), default is True
sanity_check_joins (bool) – whether to check that for proxied join operations, a given join key only occurs in exactly one constituent table of the underlying partitioned table. If the other table argument is also a PartitionedTableProxy, its constituents will also be subjected to this constraint.
- Return type:
- sort(order_by, order=None)[source]¶
The sort method creates a new partitioned table where the rows are ordered based on values in a specified set of columns. Sort can not use the constituent column.
- Parameters:
order_by (Union[str, Sequence[str]]) – the column(s) to be sorted on. Can’t include the constituent column.
order (Union[SortDirection, Sequence[SortDirection]], optional) – the corresponding sort directions for each sort column, default is None, meaning ascending order for all the sort columns.
- Return type:
- Returns:
a new PartitionedTable
- Raises:
DHError –
- property table¶
The underlying partitioned table.
- transform(func, dependencies=None)[source]¶
Apply the provided function to all constituent Tables and produce a new PartitionedTable with the results as its constituents, with the same data for all other columns in the underlying partitioned Table. Note that if the Table underlying this PartitionedTable changes, a corresponding change will propagate to the result.
- Parameters:
func (Callable[[Table], Table]) – a function which takes a Table as input and returns a new Table
dependencies (Optional[Sequence[Union[Table, PartitionedTable]]]) – additional dependencies that must be satisfied before applying the provided transform function to added or modified constituents during update processing. If the transform function uses any other refreshing Table or refreshing Partitioned Table, they must be included in this argument. Defaults to None.
- Return type:
- Returns:
a PartitionedTable
- Raises:
DHError –
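Example (a minimal sketch with hypothetical column names; the function is wrapped in an execution context, which is generally required when the transform creates formula columns during update processing):
    from deephaven import empty_table
    from deephaven.execution_context import get_exec_ctx

    ctx = get_exec_ctx()

    pt = empty_table(6).update(["Group = i % 3", "Value = i"]).partition_by("Group")

    # applied to every constituent table
    def add_doubled(t):
        with ctx:
            return t.update("Doubled = Value * 2")

    doubled_pt = pt.transform(add_doubled)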
- property unique_keys¶
Whether the keys in the underlying table must always be unique. If keys must be unique, one can expect that self.table.select_distinct(self.key_columns) and self.table.view(self.key_columns) operations always produce equivalent tables.
- property update_graph¶
The underlying partitioned table’s update graph.
- class PartitionedTableProxy(j_pt_proxy)[source]¶
Bases:
JObjectWrapper
A PartitionedTableProxy is a table operation proxy for the underlying partitioned table. It provides methods that apply table operations to the constituent tables of the underlying partitioned table, produce a new partitioned table from the resulting constituent tables, and return a proxy of it.
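Example (a minimal sketch with hypothetical column names):
    from deephaven import empty_table

    pt = empty_table(6).update(["Group = i % 2", "Value = i"]).partition_by("Group")

    # table operations on the proxy run against every constituent table
    proxy = pt.proxy()
    scaled_proxy = proxy.update("Scaled = Value * 100")

    # unwrap the proxy back into a PartitionedTable
    scaled_pt = scaled_proxy.target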
- target¶
the underlying partitioned table of the proxy
- Type:
- require_matching_keys¶
whether to ensure that both partitioned tables have all the same keys present when an operation uses this PartitionedTable and another PartitionedTable as inputs for a partitioned_transform(), default is True
- Type:
bool
- sanity_check_joins¶
whether to check that for proxied join operations, a given join key only occurs in exactly one constituent table of the underlying partitioned table. If the other table argument is also a PartitionedTableProxy, its constituents will also be subjected to this constraint.
- Type:
bool
- abs_sum_by(by=None)[source]¶
Applies the abs_sum_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- agg_all_by(agg, by=None)[source]¶
Applies the agg_all_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
Note, because agg_all_by applies the aggregation to all the columns of the table, it will ignore any column names specified for the aggregation.
- Parameters:
agg (Aggregation) – the aggregation
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- agg_by(aggs, by=None)[source]¶
Applies the agg_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
aggs (Union[Aggregation, Sequence[Aggregation]]) – the aggregation(s)
by (Union[str, Sequence[str]]) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- aj(table, on, joins=None)[source]¶
Applies the aj() table operation to all constituent tables of the underlying partitioned table with the provided right table or PartitionedTableProxy, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
In the case of the right table being another PartitionedTableProxy, the aj() table operation is applied to the matching pairs of the constituent tables from both underlying partitioned tables.
- Parameters:
table (Union[Table, PartitionedTableProxy]) – the right table or PartitionedTableProxy of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘>’ or ‘>=’. If a common name is used for the inexact match, ‘>=’ is used for the comparison.
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- avg_by(by=None)[source]¶
Applies the avg_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- count_by(col, by=None)[source]¶
Applies the count_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
col (str) – the name of the column to store the counts
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- exact_join(table, on, joins=None)[source]¶
Applies the exact_join() table operation to all constituent tables of the underlying partitioned table with the provided right table or PartitionedTableProxy, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
In the case of the right table being another PartitionedTableProxy, the exact_join() table operation is applied to the matching pairs of the constituent tables from both underlying partitioned tables.
- Parameters:
table (Union[Table, PartitionedTableProxy]) – the right table or PartitionedTableProxy of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- first_by(by=None)[source]¶
Applies the first_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- group_by(by=None)[source]¶
Applies the group_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- head(num_rows)[source]¶
Applies the head() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
num_rows (int) – the number of rows at the head of the constituent tables
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- property is_refreshing¶
Whether this proxy represents a refreshing partitioned table.
- j_object_type¶
alias of
PartitionedTable$Proxy
- join(table, on=None, joins=None)[source]¶
Applies the join() table operation to all constituent tables of the underlying partitioned table with the provided right table or PartitionedTableProxy, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
In the case of the right table being another PartitionedTableProxy, the join() table operation is applied to the matching pairs of the constituent tables from both underlying partitioned tables.
- Parameters:
table (Union[Table, PartitionedTableProxy]) – the right table or PartitionedTableProxy of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names; default is None
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- last_by(by=None)[source]¶
Applies the last_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- max_by(by=None)[source]¶
Applies the max_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- median_by(by=None)[source]¶
Applies the median_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- min_by(by=None)[source]¶
Applies the min_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- natural_join(table, on, joins=None)[source]¶
Applies the natural_join() table operation to all constituent tables of the underlying partitioned table with the provided right table or PartitionedTableProxy, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
In the case of the right table being another PartitionedTableProxy, the natural_join() table operation is applied to the matching pairs of the constituent tables from both underlying partitioned tables.
- Parameters:
table (Union[Table, PartitionedTableProxy]) – the right table or PartitionedTableProxy of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- raj(table, on, joins=None)[source]¶
Applies the raj() table operation to all constituent tables of the underlying partitioned table with the provided right table or PartitionedTableProxy, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
In the case of the right table being another PartitionedTableProxy, the raj() table operation is applied to the matching pairs of the constituent tables from both underlying partitioned tables.
- Parameters:
table (Union[Table, PartitionedTableProxy]) – the right table or PartitionedTableProxy of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘<’ or ‘<=’. If a common name is used for the inexact match, ‘<=’ is used for the comparison.
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- reverse()[source]¶
Applies the reverse() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- select(formulas=None)[source]¶
Applies the select() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
formulas (Union[str, Sequence[str]], optional) – the column formula(s), default is None
- Return type:
- Returns:
A new PartitionedTableProxy
- Raises:
DHError –
- select_distinct(formulas=None)[source]¶
Applies the select_distinct() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
formulas (Union[str, Sequence[str]], optional) – the column formula(s), default is None
- Return type:
- Returns:
A new PartitionedTableProxy
- Raises:
DHError –
- snapshot()[source]¶
Applies the snapshot() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- snapshot_when(trigger_table, stamp_cols=None, initial=False, incremental=False, history=False)[source]¶
Applies the snapshot_when() table operation to all constituent tables of the underlying partitioned table with the provided trigger table or PartitionedTableProxy, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
In the case of the trigger table being another PartitionedTableProxy, the snapshot_when() table operation is applied to the matching pairs of the constituent tables from both underlying partitioned tables.
- Parameters:
trigger_table (Union[Table, PartitionedTableProxy]) – the trigger Table or PartitionedTableProxy
stamp_cols (Union[str, Sequence[str]]) – The columns from trigger_table that form the “stamp key”, may be renames. None, or empty, means that all columns from trigger_table form the “stamp key”.
initial (bool) – Whether to take an initial snapshot upon construction, default is False. When False, the resulting table will remain empty until trigger_table first updates.
incremental (bool) – Whether the resulting table should be incremental, default is False. When False, all rows of this table will have the latest “stamp key”. When True, only the rows of this table that have been added or updated will have the latest “stamp key”.
history (bool) – Whether the resulting table should keep history, default is False. A history table appends a full snapshot of this table and the “stamp key” as opposed to updating existing rows. The history flag is currently incompatible with initial and incremental: when history is True, incremental and initial must be False.
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- sort(order_by, order=None)[source]¶
Applies the sort() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
order_by (Union[str, Sequence[str]]) – the column(s) to be sorted on
order (Union[SortDirection, Sequence[SortDirection]], optional) – the corresponding sort directions for each sort column, default is None, meaning ascending order for all the sort columns.
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- sort_descending(order_by)[source]¶
Applies the sort_descending() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
order_by (Union[str, Sequence[str]]) – the column(s) to be sorted on
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- std_by(by=None)[source]¶
Applies the std_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- sum_by(by=None)[source]¶
Applies the sum_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- tail(num_rows)[source]¶
Applies the tail() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
num_rows (int) – the number of rows at the end of the constituent tables
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- update(formulas)[source]¶
Applies the update() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
formulas (Union[str, Sequence[str]]) – the column formula(s)
- Return type:
- Returns:
A new PartitionedTableProxy
- Raises:
DHError –
- update_by(ops, by=None)[source]¶
Applies the update_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
ops (Union[UpdateByOperation, List[UpdateByOperation]]) – the update-by operation definition(s)
by (Union[str, List[str]]) – the key column name(s) to group the rows of the table
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- property update_graph¶
The underlying partitioned table proxy’s update graph.
- update_view(formulas)[source]¶
Applies the update_view() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
formulas (Union[str, Sequence[str]]) – the column formula(s)
- Return type:
- Returns:
A new PartitionedTableProxy
- Raises:
DHError –
- var_by(by=None)[source]¶
Applies the var_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- view(formulas)[source]¶
Applies the view() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
formulas (Union[str, Sequence[str]]) – the column formula(s)
- Return type:
- Returns:
A new PartitionedTableProxy
- Raises:
DHError –
- weighted_avg_by(wcol, by=None)[source]¶
Applies the weighted_avg_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
wcol (str) – the name of the weight column
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- weighted_sum_by(wcol, by=None)[source]¶
Applies the weighted_sum_by() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- Parameters:
wcol (str) – the name of the weight column
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new PartitionedTableProxy
- Raises:
DHError –
- where(filters=None)[source]¶
Applies the where() table operation to all constituent tables of the underlying partitioned table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- where_in(filter_table, cols)[source]¶
Applies the where_in() table operation to all constituent tables of the underlying partitioned table with the provided filter table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- where_not_in(filter_table, cols)[source]¶
Applies the where_not_in() table operation to all constituent tables of the underlying partitioned table with the provided filter table, and produces a new PartitionedTableProxy with the result tables as the constituents of its underlying partitioned table.
- class RollupNodeOperationsRecorder(j_node_ops_recorder)[source]¶
Bases:
JObjectWrapper, _FormatOperationsRecorder, _SortOperationsRecorder
Recorder for node-level operations to be applied when gathering snapshots of RollupTable. Supported operations include column formatting and sorting.
Note: It should not be instantiated directly. User code must call node_operation_recorder() to create an instance of the recorder.
- format_column(formulas)¶
Returns a new recorder with the format_columns() operation applied to nodes.
- format_column_where(col, cond, formula)¶
Returns a new recorder with the format_column_where() operation applied to nodes.
- format_row_where(cond, formula)¶
Returns a new recorder with the format_row_where() operation applied to nodes.
- j_object_type¶
alias of
RollupTable$NodeOperationsRecorder
- sort_descending(order_by)¶
Returns a new recorder with the sort_descending() applied to nodes.
- class RollupTable(j_rollup_table, aggs, include_constituents, by)[source]¶
Bases:
JObjectWrapper
A RollupTable is generated as a result of applying the rollup() operation on a Table.
A RollupTable aggregates by the grouping columns, and then creates a hierarchical table which re-aggregates using one less grouping column on each level.
Note: RollupTable should not be instantiated directly by user code.
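Example (a minimal sketch with hypothetical column names; assumes the Table.rollup() signature with aggs, by, and include_constituents):
    from deephaven import empty_table, agg

    source = empty_table(100).update(["Region = i % 3", "Dept = i % 5", "Amount = i"])

    # aggregate by Region and Dept, then re-aggregate with one less grouping column per level
    rollup_table = source.rollup(
        aggs=[agg.sum_("TotalAmount = Amount"), agg.avg("AvgAmount = Amount")],
        by=["Region", "Dept"],
        include_constituents=True,
    )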
- j_object_type¶
alias of
RollupTable
- node_operation_recorder(node_type)[source]¶
Creates a RollupNodeOperationsRecorder for per-node operations to apply during Deephaven UI driven snapshotting of this RollupTable. The recorded node operations will be applied only to the node of the provided NodeType. See NodeType for details.
- Parameters:
node_type (NodeType) – the type of node tables that the recorded operations will be applied to; if it is NodeType.CONSTITUENT, the RollupTable must be created with include_constituents=True.
- Return type:
- Returns:
a RollupNodeOperationsRecorder
- Raises:
DHError –
- with_filters(filters)[source]¶
Returns a new RollupTable by applying the given set of filters to the group-by columns of this RollupTable.
- with_node_operations(recorders)[source]¶
Returns a new RollupTable that will apply the recorded node operations to nodes when gathering snapshots requested by the Deephaven UI.
- Parameters:
recorders (List[RollupNodeOperationsRecorder]) – a list of RollupNodeOperationsRecorder containing the node operations to be applied, they must be ones created by calling the ‘node_operation_recorder’ method on the same table.
- Return type:
- Returns:
a new RollupTable
- Raises:
DHError –
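Example (continuing the hypothetical rollup_table above; the recorded sort is applied to aggregated nodes when the UI gathers snapshots):
    from deephaven.table import NodeType

    # record a descending sort for aggregated (rolled-up) nodes
    recorder = rollup_table.node_operation_recorder(NodeType.AGGREGATED)
    recorder = recorder.sort_descending("TotalAmount")

    # a new RollupTable that applies the recorded operations during UI snapshots
    rollup_with_ops = rollup_table.with_node_operations([recorder])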
- class SearchDisplayMode(value)[source]¶
Bases:
Enum
An enum of search display modes for layout hints
- DEFAULT = io.deephaven.engine.util.LayoutHintBuilder$SearchDisplayModes(objectRef=0x55eb6696e29a)¶
Use the system default. This may depend on your user and/or system settings.
- HIDE = io.deephaven.engine.util.LayoutHintBuilder$SearchDisplayModes(objectRef=0x55eb6696e2aa)¶
Hide the search bar, regardless of user or system settings.
- SHOW = io.deephaven.engine.util.LayoutHintBuilder$SearchDisplayModes(objectRef=0x55eb6696e2a2)¶
Permit the search bar to be displayed, regardless of user or system settings.
- class SortDirection(value)[source]¶
Bases:
Enum
An enum defining the sorting orders.
- ASCENDING = 2¶
- DESCENDING = 1¶
- class Table(j_table)[source]¶
Bases:
JObjectWrapper
A Table represents a Deephaven table. It allows applications to perform powerful Deephaven table operations.
Note: It should not be instantiated directly by user code. Tables are mostly created by factory methods, data ingestion operations, queries, aggregations, joins, etc.
- abs_sum_by(by=None)[source]¶
The abs_sum_by method creates a new table containing the absolute sum for each group.
- agg_all_by(agg, by=None)[source]¶
The agg_all_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregation specified.
Note, because agg_all_by applies the aggregation to all the columns of the table, it will ignore any column names specified for the aggregation.
- Parameters:
agg (Aggregation) – the aggregation
by (Union[str, Sequence[str]], optional) – the group-by column name(s), default is None
- Return type:
- Returns:
a new table
- Raises:
DHError –
- agg_by(aggs, by=None, preserve_empty=False, initial_groups=None)[source]¶
The agg_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregations specified.
- Parameters:
aggs (Union[Aggregation, Sequence[Aggregation]]) – the aggregation(s)
by (Union[str, Sequence[str]]) – the group-by column name(s), if not provided, all rows from this table are grouped into a single group of rows before the aggregations are applied to the result, default is None.
preserve_empty (bool) – whether to keep result rows for groups that are initially empty or become empty as a result of updates. Each aggregation operator defines its own value for empty groups. Default is False.
initial_groups (Table) – a table whose distinct combinations of values for the group-by column(s) should be used to create an initial set of aggregation groups. All other columns are ignored. This is useful in combination with preserve_empty=True to ensure that particular groups appear in the result table, or with preserve_empty=False to control the encounter order for a collection of groups and thus their relative order in the result. Changes to this table are not expected or handled; if this table is a refreshing table, only its contents at instantiation time will be used. Default is None, the result will be the same as if a table is provided but no rows were supplied. When it is provided, the ‘by’ argument must be provided to explicitly specify the grouping columns.
- Return type:
- Returns:
a new table
- Raises:
DHError –
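Example (a minimal sketch with hypothetical column names):
    from deephaven import empty_table, agg

    t = empty_table(20).update(["Region = i % 4", "Amount = i"])

    # one result row per Region with the requested aggregations
    summary = t.agg_by(
        aggs=[agg.sum_("Total = Amount"), agg.count_("N")],
        by="Region",
    )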
- aj(table, on, joins=None)[source]¶
The aj (as-of join) method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going over. If there is no matching key in the right table, appended row values are NULL.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘>’ or ‘>=’. If a common name is used for the inexact match, ‘>=’ is used for the comparison.
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new table
- Raises:
DHError –
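Example (a minimal sketch; the trade/quote tables and column names are hypothetical; the last match term is the inexact one):
    from deephaven import empty_table

    trades = empty_table(5).update(["Sym = `AAPL`", "TradeTime = i * 10", "Price = 100 + i"])
    quotes = empty_table(20).update(["Sym = `AAPL`", "QuoteTime = i * 3", "Bid = 99.0 + i"])

    # exact match on Sym, then the most recent quote at or before each trade time
    priced = trades.aj(quotes, on=["Sym", "TradeTime >= QuoteTime"], joins=["QuoteTime", "Bid"])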
- avg_by(by=None)[source]¶
The avg_by method creates a new table containing the average for each group.
- await_update(timeout=None)[source]¶
Waits until either this refreshing Table is updated or the timeout elapses if provided.
- Parameters:
timeout (int) – the maximum time to wait in milliseconds, default is None, meaning no timeout
- Return type:
bool
- Returns:
True when the table is updated or False when the timeout has been reached.
- Raises:
DHError –
- property column_names¶
The column names of the table.
- property columns¶
The column definitions of the table.
- count_by(col, by=None)[source]¶
The count_by method creates a new table containing the number of rows for each group.
- property definition¶
The table definition.
- drop_columns(cols)[source]¶
The drop_columns method creates a new table with the same size as this table but omits the specified columns.
- exact_join(table, on, joins=None)[source]¶
The exact_join method creates a new table containing all the rows and columns of this table plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new table
- Raises:
DHError –
- first_by(by=None)[source]¶
The first_by method creates a new table containing the first row for each group.
- flatten()[source]¶
Returns a new version of this table with a flat row set, i.e. from 0 to number of rows - 1.
- Return type:
- format_column_where(col, cond, formula)[source]¶
Applies color formatting to a column of the table conditionally.
- Parameters:
col (str) – the column name
cond (str) – the condition expression
formula (str) – the formatting string in the form of assignment expression “column=color expression” where color_expression can be a color name or a Java ternary expression that results in a color.
- Return type:
- Returns:
a new table
- Raises:
DHError –
- format_row_where(cond, formula)[source]¶
Applies color formatting to rows of the table conditionally.
- Parameters:
cond (str) – the condition expression
formula (str) – the formatting string in the form of assignment expression “column=color expression” where color_expression can be a color name or a Java ternary expression that results in a color.
- Return type:
- Returns:
a new table
- Raises:
DHError –
- group_by(by=None)[source]¶
The group_by method creates a new table containing grouping columns and grouped data, column content is grouped into vectors.
- has_columns(cols)[source]¶
Whether this table contains a column for each of the provided names; returns False if any of the columns is not in the table.
- Parameters:
cols (Union[str, Sequence[str]]) – the column name(s)
- Returns:
bool
- head(num_rows)[source]¶
The head method creates a new table with a specific number of rows from the beginning of the table.
- head_by(num_rows, by=None)[source]¶
The head_by method creates a new table containing the first number of rows for each group.
- head_pct(pct)[source]¶
The head_pct method creates a new table with a specific percentage of rows from the beginning of the table.
- property is_blink¶
Whether this table is a blink table.
- property is_flat¶
Whether this table is guaranteed to be flat, i.e. its row set will be from 0 to number of rows - 1.
- property is_refreshing¶
Whether this table is refreshing.
- iter_chunk_dict(cols=None, chunk_size=2048)[source]¶
Returns a generator that reads one chunk of rows at a time from the table into a dictionary. The dictionary is a map of column names to numpy arrays of the column data type.
If the table is refreshing and no update graph locks are currently being held, the generator will try to acquire the shared lock of the update graph before reading the table data. This provides a consistent view of the data. The side effect of this is that the table will not be able to refresh while the table is being iterated on. Additionally, the generator internally maintains a fill context. The auto acquired shared lock and the fill context will be released after the generator is destroyed. That can happen implicitly when the generator is used in a for-loop. When the generator is not used in a for-loop, to prevent resource leaks, it must be closed after use by either (1) setting it to None, (2) using the del statement, or (3) calling the close() method on it.
- Parameters:
cols (Optional[Union[str, Sequence[str]]]) – The columns to read. If None, all columns are read.
chunk_size (int) – The number of rows to read at a time. Default is 2048.
- Return type:
Generator[Dict[str, ndarray], None, None]
- Returns:
A generator that yields a dictionary of column names to numpy arrays.
- Raises:
ValueError –
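Example (a usage sketch with hypothetical columns; each yielded chunk maps column names to numpy arrays):
    from deephaven import empty_table

    t = empty_table(10000).update(["X = i", "Y = i * 0.5"])

    total = 0.0
    for chunk in t.iter_chunk_dict(cols=["X", "Y"], chunk_size=4096):
        # chunk["X"] and chunk["Y"] are numpy arrays of up to 4096 rows each
        total += float(chunk["Y"].sum())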
- iter_chunk_tuple(cols=None, tuple_name='Deephaven', chunk_size=2048)[source]¶
Returns a generator that reads one chunk of rows at a time from the table into a named tuple. The named tuple is made up of fields with their names being the column names and their values being numpy arrays of the column data types.
If the table is refreshing and no update graph locks are currently being held, the generator will try to acquire the shared lock of the update graph before reading the table data. This provides a consistent view of the data. The side effect of this is that the table will not be able to refresh while the table is being iterated on. Additionally, the generator internally maintains a fill context. The auto acquired shared lock and the fill context will be released after the generator is destroyed. That can happen implicitly when the generator is used in a for-loop. When the generator is not used in a for-loop, to prevent resource leaks, it must be closed after use by either (1) setting it to None, (2) using the del statement, or (3) calling the close() method on it.
- Parameters:
cols (Optional[Union[str, Sequence[str]]]) – The columns to read. If None, all columns are read.
tuple_name (str) – The name of the named tuple. Default is ‘Deephaven’.
chunk_size (int) – The number of rows to read at a time. Default is 2048.
- Return type:
Generator[Tuple[ndarray, ...], None, None]
- Returns:
A generator that yields a named tuple for each row in the table.
- Raises:
ValueError –
- iter_dict(cols=None, *, chunk_size=2048)[source]¶
Returns a generator that reads one row at a time from the table into a dictionary. The dictionary is a map of column names to scalar values of the column data type.
If the table is refreshing and no update graph locks are currently being held, the generator will try to acquire the shared lock of the update graph before reading the table data. This provides a consistent view of the data. The side effect of this is that the table will not be able to refresh while the table is being iterated on. Additionally, the generator internally maintains a fill context. The auto acquired shared lock and the fill context will be released after the generator is destroyed. That can happen implicitly when the generator is used in a for-loop. When the generator is not used in a for-loop, to prevent resource leaks, it must be closed after use by either (1) setting it to None, (2) using the del statement, or (3) calling the close() method on it.
- Parameters:
cols (Optional[Union[str, Sequence[str]]]) – The columns to read. If None, all columns are read.
chunk_size (int) – The number of rows to read at a time internally to reduce the number of Java/Python boundary crossings. Default is 2048.
- Return type:
Generator[Dict[str, Any], None, None]
- Returns:
A generator that yields a dictionary of column names to scalar values.
- Raises:
ValueError –
- iter_tuple(cols=None, *, tuple_name='Deephaven', chunk_size=2048)[source]¶
Returns a generator that reads one row at a time from the table into a named tuple. The named tuple is made up of fields with their names being the column names and their values being of the column data types.
If the table is refreshing and no update graph locks are currently being held, the generator will try to acquire the shared lock of the update graph before reading the table data. This provides a consistent view of the data. The side effect of this is that the table will not be able to refresh while the table is being iterated on. Additionally, the generator internally maintains a fill context. The auto acquired shared lock and the fill context will be released after the generator is destroyed. That can happen implicitly when the generator is used in a for-loop. When the generator is not used in a for-loop, to prevent resource leaks, it must be closed after use by either (1) setting it to None, (2) using the del statement, or (3) calling the close() method on it.
- Parameters:
cols (Optional[Union[str, Sequence[str]]]) – The columns to read. If None, all columns are read. Default is None.
tuple_name (str) – The name of the named tuple. Default is ‘Deephaven’.
chunk_size (int) – The number of rows to read at a time internally to reduce the number of Java/Python boundary crossings. Default is 2048.
- Return type:
Generator[Tuple[Any, ...], None, None]
- Returns:
A generator that yields a named tuple for each row in the table
- Raises:
ValueError –
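Example (a row-wise sketch with hypothetical columns; each yielded value is a named tuple whose fields are the column names):
    from deephaven import empty_table

    t = empty_table(5).update(["Sym = `AAPL`", "Price = 100 + i"])

    for row in t.iter_tuple(cols=["Sym", "Price"]):
        # fields are accessible by column name
        print(row.Sym, row.Price)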
- j_object_type¶
alias of
Table
- join(table, on=None, joins=None)[source]¶
The join method creates a new table containing rows that have matching values in both tables. Rows that do not have matching criteria will not be included in the result. If there are multiple matches between a row from the left table and rows from the right table, all matching combinations will be included. If no columns to match (on) are specified, every combination of left and right table rows is included.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names; default is None
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new table
- Raises:
DHError –
- last_by(by=None)[source]¶
The last_by method creates a new table containing the last row for each group.
- layout_hints(front=None, back=None, freeze=None, hide=None, column_groups=None, search_display_mode=None)[source]¶
Sets layout hints on the Table
- Parameters:
front (Union[str, List[str]]) – the columns to show at the front.
back (Union[str, List[str]]) – the columns to show at the back.
freeze (Union[str, List[str]]) – the columns to freeze to the front. These will not be affected by horizontal scrolling.
hide (Union[str, List[str]]) – the columns to hide.
column_groups (List[Dict]) –
A list of dicts specifying which columns should be grouped in the UI. The dicts can specify the following:
name (str): The group name
children (List[str]): The column names in the group
color (Optional[str]): The hex color string or Deephaven color name
search_display_mode (SearchDisplayMode) – set the search bar to explicitly be accessible or inaccessible, or use the system default. SearchDisplayMode.SHOW will show the search bar, SearchDisplayMode.HIDE will hide the search bar, and SearchDisplayMode.DEFAULT will use the default value configured by the user and system settings.
- Return type:
- Returns:
a new table with the layout hints set
- Raises:
DHError –
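Example (a minimal sketch with hypothetical column names):
    from deephaven import empty_table
    from deephaven.table import SearchDisplayMode

    t = empty_table(10).update(["Sym = `AAPL`", "Bid = i", "Ask = i + 1", "Note = `misc`"])

    t_hinted = t.layout_hints(
        front="Sym",
        freeze="Sym",
        hide="Note",
        column_groups=[{"name": "Prices", "children": ["Bid", "Ask"], "color": "#336699"}],
        search_display_mode=SearchDisplayMode.SHOW,
    )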
- lazy_update(formulas)[source]¶
The lazy_update method creates a new table containing a new, cached, formula column for each formula.
- max_by(by=None)[source]¶
The max_by method creates a new table containing the maximum value for each group.
- median_by(by=None)[source]¶
The median_by method creates a new table containing the median for each group.
- property meta_table¶
The column definitions of the table in a Table form.
- min_by(by=None)[source]¶
The min_by method creates a new table containing the minimum value for each group.
- move_columns(idx, cols)[source]¶
The move_columns method creates a new table with specified columns moved to a specific column index value. Columns may be renamed with the same semantics as rename_columns. The renames are simultaneous and unordered, enabling direct swaps between column names. Specifying a source or destination more than once is prohibited.
- move_columns_down(cols)[source]¶
The move_columns_down method creates a new table with specified columns appearing last in order, to the far right. Columns may be renamed with the same semantics as rename_columns. The renames are simultaneous and unordered, enabling direct swaps between column names. Specifying a source or destination more than once is prohibited.
- move_columns_up(cols)[source]¶
The move_columns_up method creates a new table with specified columns appearing first in order, to the far left. Columns may be renamed with the same semantics as rename_columns. The renames are simultaneous and unordered, enabling direct swaps between column names. Specifying a source or destination more than once is prohibited.
- natural_join(table, on, joins=None)[source]¶
The natural_join method creates a new table containing all the rows and columns of this table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal. If there is no matching key in the right table, appended row values are NULL.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new table
- Raises:
DHError –
- partition_by(by, drop_keys=False)[source]¶
Creates a PartitionedTable from this table, partitioned according to the specified key columns.
- Parameters:
by (Union[str, Sequence[str]]) – the column(s) by which to group data
drop_keys (bool) – whether to drop key columns in the constituent tables, default is False
- Return type:
- Returns:
A PartitionedTable containing a sub-table for each group
- Raises:
DHError –
- partitioned_agg_by(aggs, by=None, preserve_empty=False, initial_groups=None)[source]¶
The partitioned_agg_by method is a convenience method that performs an agg_by operation on this table and wraps the result in a PartitionedTable. If the argument ‘aggs’ does not include a partition aggregation created by calling agg.partition(), one will be added automatically with the default constituent column name __CONSTITUENT__.
- Parameters:
aggs (Union[Aggregation, Sequence[Aggregation]]) – the aggregation(s)
by (Union[str, Sequence[str]]) – the group-by column name(s), default is None
preserve_empty (bool) – whether to keep result rows for groups that are initially empty or become empty as a result of updates. Each aggregation operator defines its own value for empty groups. Default is False.
initial_groups (Table) – a table whose distinct combinations of values for the group-by column(s) should be used to create an initial set of aggregation groups. All other columns are ignored. This is useful in combination with preserve_empty=True to ensure that particular groups appear in the result table, or with preserve_empty=False to control the encounter order for a collection of groups and thus their relative order in the result. Changes to this table are not expected or handled; if this table is a refreshing table, only its contents at instantiation time will be used. Default is None, the result will be the same as if a table is provided but no rows were supplied. When it is provided, the ‘by’ argument must be provided to explicitly specify the grouping columns.
- Return type:
- Returns:
a PartitionedTable
- Raises:
DHError –
- raj(table, on, joins=None)[source]¶
The reverse-as-of join method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going under. If there is no matching key in the right table, appended row values are NULL.
- Parameters:
table (Table) – the right-table of the join
on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘<’ or ‘<=’. If a common name is used for the inexact match, ‘<=’ is used for the comparison.
joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None
- Return type:
- Returns:
a new table
- Raises:
DHError –
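Example
A minimal sketch of a reverse-as-of join (the tables, columns, and integer “times” below are illustrative only):
>>> from deephaven import empty_table
>>> quotes = empty_table(10).update(["Sym = `AAPL`", "QuoteTime = i * 10", "Bid = 100 + i"])
>>> trades = empty_table(3).update(["Sym = `AAPL`", "TradeTime = i * 25 + 5"])
>>> # for each trade, take the first quote at or after the trade time
>>> result = trades.raj(quotes, on=["Sym", "TradeTime <= QuoteTime"], joins="Bid")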
- range_join(table, on, aggs)[source]¶
The range_join method creates a new table containing all the rows and columns of the left table, plus additional columns containing aggregated data from the right table. For columns appended to the left table (joins), cell values equal aggregations over vectors of values from the right table. These vectors are formed from all values in the right table where the right table keys fall within the ranges of keys defined by the left table (responsive ranges).
range_join is a join plus aggregation that (1) joins arrays of data from the right table onto the left table, and then (2) aggregates over the joined data. Oftentimes this is used to join data for a particular time range from the right table onto the left table.
Rows from the right table with null or NaN key values are discarded; that is, they are never included in the vectors used for aggregation. For all rows that are not discarded, the right table must be sorted according to the right range column for all rows within a group.
Join key ranges, specified by the ‘on’ argument, are defined by zero-or-more exact join matches and a single range join match. The range join match must be the last match in the list.
The exact match expressions are parsed as in other join operations. That is, they are either a column name common to both tables or a column name from the left table followed by an equals sign followed by a column name from the right table.
Examples
- Match on the same column name in both tables:
“common_column”
- Match on different column names in each table:
“left_column = right_column” or “left_column == right_column”
The range match expression is expressed as a ternary logical expression, expressing the relationship between the left start column, the right range column, and the left end column. Each column name pair is separated by a logical operator, either < or <=. Additionally, the entire expression may be preceded by a left arrow <- and/or followed by a right arrow ->. The arrows indicate that range match can ‘allow preceding’ or ‘allow following’ to match values outside the explicit range. ‘Allow preceding’ means that if no matching right range column value is equal to the left start column value, the immediately preceding matching right row should be included in the aggregation if such a row exists. ‘Allow following’ means that if no matching right range column value is equal to the left end column value, the immediately following matching right row should be included in the aggregation if such a row exists.
Examples
- For less than paired with greater than:
“left_start_column < right_range_column < left_end_column”
- For less than or equal paired with greater than or equal:
“left_start_column <= right_range_column <= left_end_column”
- For less than or equal (allow preceding) paired with greater than or equal (allow following):
“<- left_start_column <= right_range_column <= left_end_column ->”
- Special Cases
In order to produce aggregated output, range match expressions must define a range of values to aggregate over. There are a few noteworthy special cases of ranges.
Empty Range
An empty range occurs for any left row with no matching right rows. That is, no non-null, non-NaN right rows were found using the exact join matches, or none were in range according to the range join match.
Single-value Ranges
A single-value range is a range where the left row’s values for the left start column and left end column are equal and both relative matches are inclusive (<= and >=, respectively). For a single-value range, only rows within the bucket where the right range column matches the single value are included in the output aggregations.
Invalid Ranges
An invalid range occurs in two scenarios:
When the range is inverted, i.e., when the value of the left start column is greater than the value of the left end column.
When either relative-match is exclusive (< or >) and the value in the left start column is equal to the value in the left end column.
For invalid ranges, the result row will be null for all aggregation output columns.
Undefined Ranges
An undefined range occurs when either the left start column or the left end column is NaN. For rows with an undefined range, the corresponding output values will be null (as with invalid ranges).
Unbounded Ranges
A partially or fully unbounded range occurs when either the left start column or the left end column is null. If the left start column value is null and the left end column value is non-null, the range is unbounded at the beginning, and only the left end column subexpression will be used for the match. If the left start column value is non-null and the left end column value is null, the range is unbounded at the end, and only the left start column subexpression will be used for the match. If the left start column and left end column values are null, the range is unbounded, and all rows will be included.
Note: At this time, implementations only support static tables. This operation remains under active development.
- Parameters:
table (Table) – the right table of the join
on (Union[str, List[str]]) – the match expression(s), which must include zero or more exact match expressions and exactly one range match expression, as described above
aggs (Union[Aggregation, List[Aggregation]]) – the aggregation(s) to perform over the responsive ranges from the right table for each row from this Table
- Return type:
- Returns:
a new table
- Raises:
DHError –
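Example
A minimal sketch over static tables, using a single group aggregation (the tables and column names below are illustrative only):
>>> from deephaven import empty_table
>>> from deephaven import agg
>>> left = empty_table(3).update(["Bucket = `A`", "Start = i * 10", "End = i * 10 + 5"])
>>> right = empty_table(30).update(["Bucket = `A`", "RangeCol = i", "Val = i * 2"]).sort("RangeCol")
>>> result = left.range_join(right, on=["Bucket", "Start <= RangeCol <= End"], aggs=agg.group("Val"))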
- remove_blink()[source]¶
Returns a non-blink child table, or this table if it is not a blink table.
- Return type:
- rename_columns(cols)[source]¶
The rename_columns method creates a new table with the specified columns renamed. The renames are simultaneous and unordered, enabling direct swaps between column names. Specifying a source or destination more than once is prohibited.
- restrict_sort_to(cols)[source]¶
The restrict_sort_to method adjusts the input table to produce an output table that only allows sorting on specified table columns. This can be useful to prevent users from accidentally performing expensive sort operations as they interact with tables in the UI.
- Parameters:
cols (Union[str, Sequence[str]]) – the column name(s)
- Returns:
a new table
- Raises:
DHError –
- reverse()[source]¶
The reverse method creates a new table with all of the rows from this table in reverse order.
- rollup(aggs, by=None, include_constituents=False)[source]¶
Creates a rollup table.
A rollup table aggregates by the specified columns, and then creates a hierarchical table which re-aggregates using one less by column on each level. The column that is no longer part of the aggregation key is replaced with null on each level.
Note that some aggregations cannot be used when creating a rollup table; these include: group, partition, median, pct, weighted_avg
- Parameters:
aggs (Union[Aggregation, Sequence[Aggregation]]) – the aggregation(s)
by (Union[str, Sequence[str]]) – the group-by column name(s), default is None
include_constituents (bool) – whether to include the constituent rows at the leaf level, default is False
- Return type:
- Returns:
a new RollupTable
- Raises:
DHError –
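Example
A minimal sketch (the table, columns, and aggregation below are illustrative only):
>>> from deephaven import empty_table
>>> from deephaven import agg
>>> t = empty_table(20).update(["Region = i % 2 == 0 ? `East` : `West`", "City = `C` + (i % 4)", "Sales = i * 10"])
>>> rt = t.rollup(aggs=[agg.sum_("Sales")], by=["Region", "City"], include_constituents=True)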
- select(formulas=None)[source]¶
The select method creates a new in-memory table that includes one column for each formula. If no formula is specified, all columns will be included.
- select_distinct(formulas=None)[source]¶
The select_distinct method creates a new table containing all the unique values for a set of key columns. When select_distinct is used on multiple columns, it looks for distinct sets of values in the selected columns.
- property size¶
The current number of rows in the table.
- slice(start, stop)[source]¶
Extracts a subset of a table by row positions into a new Table.
If both the start and the stop are positive, then both are counted from the beginning of the table. The start is inclusive, and the stop is exclusive. slice(0, N) is equivalent to head(N). The start must be less than or equal to the stop.
If the start is positive and the stop is negative, then the start is counted from the beginning of the table, inclusively. The stop is counted from the end of the table. For example, slice(1, -1) includes all rows but the first and last. If the stop is before the start, the result is an empty table.
If the start is negative, and the stop is zero, then the start is counted from the end of the table, and the end of the slice is the size of the table. slice(-N, 0) is equivalent to tail(N).
If the start is negative and the stop is negative, they are both counted from the end of the table. For example, slice(-2, -1) returns the second to last row of the table.
- Parameters:
start (int) – the first row position to include in the result
stop (int) – the last row position to include in the result
- Return type:
- Returns:
a new Table
- Raises:
DHError –
Examples
>>> table.slice(0, 5)     # first 5 rows
>>> table.slice(-5, 0)    # last 5 rows
>>> table.slice(2, 6)     # rows from index 2 to 5
>>> table.slice(6, 2)     # ERROR: cannot slice start after end
>>> table.slice(-6, -2)   # rows from 6th last to 2nd last (exclusive)
>>> table.slice(-2, -6)   # ERROR: cannot slice start after end
>>> table.slice(2, -3)    # all rows except the first 2 and the last 3
>>> table.slice(-6, 8)    # rows from 6th last to index 8 (exclusive)
- slice_pct(start_pct, end_pct)[source]¶
Extracts a subset of a table by row percentages.
Returns a subset of table in the range [floor(start_pct * size_of_table), floor(end_pct * size_of_table)). For example, for a table of size 10, slice_pct(0.1, 0.7) will return a subset from the second row to the seventh row. Similarly, slice_pct(0, 1) would return the entire table (because row positions run from 0 to size - 1). The percentage arguments must be in range [0, 1], otherwise the function returns an error.
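Example
A minimal sketch matching the description above (the column name is illustrative only):
>>> from deephaven import empty_table
>>> t = empty_table(10).update(["X = i"])
>>> t.slice_pct(0.1, 0.7)  # rows at positions 1 through 6, i.e. the 2nd through 7th rows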
- snapshot_when(trigger_table, stamp_cols=None, initial=False, incremental=False, history=False)[source]¶
Returns a table that captures a snapshot of this table whenever trigger_table updates.
When trigger_table updates, a snapshot of this table and the “stamp key” from trigger_table form the resulting table. The “stamp key” is the last row of the trigger_table, limited by the stamp_cols. If trigger_table is empty, the “stamp key” will be represented by NULL values.
Note: the trigger_table must be append-only when the history flag is set to True. If the trigger_table is not append-only and has modified or removed rows in its updates, the result snapshot table will be put in a failure state and become unusable.
- Parameters:
trigger_table (Table) – the trigger table
stamp_cols (Union[str, Sequence[str]]) – The columns from trigger_table that form the “stamp key”, may be renames. None, or empty, means that all columns from trigger_table form the “stamp key”.
initial (bool) – Whether to take an initial snapshot upon construction, default is False. When False, the resulting table will remain empty until trigger_table first updates.
incremental (bool) – Whether the resulting table should be incremental, default is False. When False, all rows of this table will have the latest “stamp key”. When True, only the rows of this table that have been added or updated will have the latest “stamp key”.
history (bool) – Whether the resulting table should keep history, default is False. A history table appends a full snapshot of this table and the “stamp key” as opposed to updating existing rows. The history flag is currently incompatible with initial and incremental: when history is True, incremental and initial must be False.
- Return type:
- Returns:
a new table
- Raises:
DHError –
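Example
A minimal sketch using refreshing time tables (the table names and the stamp column rename below are illustrative only):
>>> from deephaven import time_table
>>> source = time_table("PT1S").update(["X = i"])
>>> trigger = time_table("PT5S")
>>> snap = source.snapshot_when(trigger, stamp_cols="SnapTime = Timestamp")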
- sort(order_by, order=None)[source]¶
The sort method creates a new table where the rows are ordered based on values in a specified set of columns.
- Parameters:
order_by (Union[str, Sequence[str]]) – the column(s) to be sorted on
order (Union[SortDirection, Sequence[SortDirection]], optional) – the corresponding sort directions for each sort column, default is None, meaning ascending order for all the sort columns.
- Return type:
- Returns:
a new table
- Raises:
DHError –
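Example
A minimal sketch of a multi-column sort with mixed directions (the table and column names below are illustrative only):
>>> from deephaven import empty_table, SortDirection
>>> t = empty_table(6).update(["Group = i % 2", "Value = 10 - i"])
>>> result = t.sort(order_by=["Group", "Value"], order=[SortDirection.ASCENDING, SortDirection.DESCENDING])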
- sort_descending(order_by)[source]¶
The sort_descending method creates a new table where rows in a table are sorted in descending order based on the order_by column(s).
- std_by(by=None)[source]¶
The std_by method creates a new table containing the sample standard deviation for each group.
Sample standard deviation is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.
- tail(num_rows)[source]¶
The tail method creates a new table with a specific number of rows from the end of the table.
- tail_by(num_rows, by=None)[source]¶
The tail_by method creates a new table containing the last number of rows for each group.
- tail_pct(pct)[source]¶
The tail_pct method creates a new table with a specific percentage of rows from the end of the table.
- to_string(num_rows=10, cols=None)[source]¶
Returns the first few rows of a table as a pipe-delimited string.
- Parameters:
num_rows (int) – the number of rows at the beginning of the table
cols (Union[str, Sequence[str]]) – the column name(s), default is None
- Return type:
str
- Returns:
string
- Raises:
DHError –
- tree(id_col, parent_col, promote_orphans=False)[source]¶
Creates a hierarchical tree table.
The structure of the table is encoded by an “id” and a “parent” column. The id column should represent a unique identifier for a given row, and the parent column indicates which row is the parent for a given row. Rows that have a None parent are part of the “root” table.
It is possible for rows to be “orphaned” if their parent is non-None and does not exist in the table. These rows will not be present in the resulting tree. If this is not desirable, they can be promoted to become children of the root table by setting the ‘promote_orphans’ argument to True.
- Parameters:
id_col (str) – the name of a column containing a unique identifier for a particular row in the table
parent_col (str) – the name of a column containing the parent’s identifier, None for rows that are part of the root table
promote_orphans (bool) – whether to promote node tables whose parents don’t exist to be children of the root node, default is False
- Return type:
- Returns:
a new TreeTable organized according to the parent-child relationships expressed by id_col and parent_col
- Raises:
DHError –
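Example
A minimal sketch building a small parent/child hierarchy (the table and column names below are illustrative only; NULL_INT marks the root row’s parent):
>>> from deephaven import empty_table
>>> t = empty_table(7).update(["Id = i", "Parent = i == 0 ? NULL_INT : (i - 1) / 2"])
>>> tt = t.tree(id_col="Id", parent_col="Parent")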
- ungroup(cols=None)[source]¶
The ungroup method creates a new table in which array columns from the source table are unwrapped into separate rows.
- update(formulas)[source]¶
The update method creates a new table containing a new, in-memory column for each formula.
- update_by(ops, by=None)[source]¶
Creates a table with additional columns calculated from window-based aggregations of columns in this table. The aggregations are defined by the provided operations, which support incremental aggregations over the corresponding rows in the table. The aggregations will apply position or time-based windowing and compute the results over the entire table or each row group as identified by the provided key columns.
- Parameters:
ops (Union[UpdateByOperation, List[UpdateByOperation]]) – the update-by operation definition(s)
by (Union[str, List[str]]) – the key column name(s) to group the rows of the table
- Return type:
- Returns:
a new Table
- Raises:
DHError –
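Example
A minimal sketch applying a cumulative-sum operation per group (the table and column names below are illustrative only):
>>> from deephaven import empty_table
>>> from deephaven.updateby import cum_sum
>>> t = empty_table(10).update(["Key = i % 2", "Value = i"])
>>> result = t.update_by(ops=cum_sum(cols="Total = Value"), by="Key")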
- property update_graph¶
The update graph of the table.
- update_view(formulas)[source]¶
The update_view method creates a new table containing a new, formula column for each formula.
- var_by(by=None)[source]¶
The var_by method creates a new table containing the sample variance for each group.
Sample variance is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.
- view(formulas)[source]¶
The view method creates a new formula table that includes one column for each formula.
- weighted_avg_by(wcol, by=None)[source]¶
The weighted_avg_by method creates a new table containing the weighted average for each group.
- weighted_sum_by(wcol, by=None)[source]¶
The weighted_sum_by method creates a new table containing the weighted sum for each group.
- where(filters=None)[source]¶
The where method creates a new table with only the rows meeting the filter criteria in the column(s) of the table.
- where_in(filter_table, cols)[source]¶
The where_in method creates a new table containing rows from the source table, where the rows match values in the filter table. The filter is updated whenever either table changes.
- where_not_in(filter_table, cols)[source]¶
The where_not_in method creates a new table containing rows from the source table, where the rows do not match values in the filter table.
- where_one_of(filters=None)[source]¶
The where_one_of method creates a new table containing rows from the source table, where the rows match at least one filter.
- with_attributes(attrs)[source]¶
Returns a new Table that has the provided attributes defined on it and shares the underlying data and schema with this table.
Note, the table attributes are immutable once defined, and are mostly used internally by the Deephaven engine. For advanced users, certain predefined plug-in attributes provide a way to extend Deephaven with custom-built plug-ins.
- class TableDefinition(table_definition)[source]¶
Bases:
JObjectWrapper
,Mapping
A Deephaven table definition, as a mapping from column name to ColumnDefinition.
Construct a TableDefinition.
- Parameters:
table_definition (TableDefinitionLike) – The table definition like object
- Returns:
A new TableDefinition
- Raises:
DHError –
- get(k[, d]) → D[k] if k in D, else d. d defaults to None.¶
- j_object_type¶
alias of
TableDefinition
- property table¶
This table definition as a table.
- TableDefinitionLike¶
A Union representing objects that can be coerced into a TableDefinition.
alias of
Union[TableDefinition, Mapping[str, DType], Iterable[ColumnDefinition], JType]
- class TreeNodeOperationsRecorder(j_node_ops_recorder)[source]¶
Bases:
JObjectWrapper
,_FormatOperationsRecorder
,_SortOperationsRecorder
,_FilterOperationsRecorder
Recorder for node-level operations to be applied when gathering snapshots of TreeTable. Supported operations include column formatting, sorting, and filtering.
Note: It should not be instantiated directly. User code must call
node_operation_recorder()
to create an instance of the recorder.
- format_column(formulas)¶
Returns a new recorder with the
format_columns()
operation applied to nodes.
- format_column_where(col, cond, formula)¶
Returns a new recorder with the
format_column_where()
operation applied to nodes.
- format_row_where(cond, formula)¶
Returns a new recorder with the
format_row_where()
operation applied to nodes.
- j_object_type¶
alias of
TreeTable$NodeOperationsRecorder
- sort_descending(order_by)¶
Returns a new recorder with the
sort_descending()
applied to nodes.
- class TreeTable(j_tree_table, id_col, parent_col)[source]¶
Bases:
JObjectWrapper
A TreeTable is generated as a result of applying the
tree()
method on a Table.
A TreeTable presents a hierarchically structured “tree” view of a table where parent-child relationships are expressed by an “id” and a “parent” column. The id column should represent a unique identifier for a given row, and the parent column indicates which row is the parent for a given row.
Note: TreeTable should not be instantiated directly by user code.
- j_object_type¶
alias of
TreeTable
- node_operation_recorder()[source]¶
Creates a TreeNodeOperationsRecorder for per-node operations to apply during Deephaven UI-driven snapshotting of this TreeTable.
- Return type:
- Returns:
a TreeNodeOperationsRecorder
- with_filters(filters)[source]¶
Returns a new TreeTable by applying the given set of filters to the columns of this TreeTable.
- with_node_operations(recorder)[source]¶
Returns a new TreeTable that will apply the recorded node operations to nodes when gathering snapshots requested by the Deephaven UI.
- Parameters:
recorder (TreeNodeOperationsRecorder) – the TreeNodeOperationsRecorder containing the node operations to be applied, it must be created by calling the ‘node_operation_recorder’ method on the same table.
- Return type:
- Returns:
a new TreeTable
- Raises:
DHError –
- multi_join(input, on=None)[source]¶
The multi_join method creates a new table by performing a multi-table natural join on the input tables. The result consists of the set of distinct keys from the input tables natural joined to each input table. Input tables need not have a matching row for each key, but they may not have multiple matching rows for a given key.
- Parameters:
input (Union[Table, Sequence[Table], MultiJoinInput, Sequence[MultiJoinInput]]) – the input objects specifying the tables and columns to include in the join.
on (Union[str, Sequence[str]], optional) – the column(s) to match, can be a common name or an equality expression that matches every input table, i.e. “col_a = col_b” to rename output column names. Note: When MultiJoinInput objects are supplied, this parameter must be omitted.
- Returns:
- the result of the multi-table natural join operation. To access the underlying Table, use the
table
property.
- Return type:
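Example
A minimal sketch (the tables and column names below are illustrative only; the import path is assumed from this module):
>>> from deephaven import empty_table
>>> from deephaven.table import multi_join
>>> t1 = empty_table(3).update(["Key = i", "A = i * 2"])
>>> t2 = empty_table(3).update(["Key = i + 1", "B = i * 3"])
>>> result = multi_join(input=[t1, t2], on="Key").table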
- table_diff(t1, t2, max_diffs=1, floating_comparison='exact', ignore_column_order=False)[source]¶
Returns the differences between this table and the provided table as a string. If the two tables are the same, an empty string is returned. The differences are returned in a human-readable format.
This method starts by comparing the table sizes, and then the schema of the two tables, such as the number of columns, column names, column types, column orders. If the schemas are different, the comparison stops and the differences are returned. If the schemas are the same, the method proceeds to compare the data in the tables. The method compares the data in the tables column by column (not row by row) and only records the first difference found in each column.
Note, inexact comparison of floating-point numbers may sometimes be desirable due to their inherent imprecision. When that is the case, the floating_comparison should be set to either ‘absolute’ or ‘relative’. When it is set to ‘absolute’, the absolute value of the difference between two floating-point numbers is used to compare against a threshold. The threshold is set to 0.0001 for Doubles and 0.005 for Floats. Only differences that are greater than the threshold are recorded. When floating_comparison is set to ‘relative’, the relative difference between two floating-point numbers is used to compare against the threshold. The relative difference is calculated as the absolute difference divided by the smaller absolute value between the two numbers.
- Parameters:
t1 (Table) – the table to compare
t2 (Table) – the table to compare against
max_diffs (int) – the maximum number of differences to return, default is 1
floating_comparison (Literal['exact', 'absolute', 'relative']) – the type of comparison to use for floating-point numbers, default is ‘exact’
ignore_column_order (bool) – whether columns that exist in both tables but in different orders are treated as differences. False indicates that column order matters (default), and True indicates that column order does not matter.
- Return type:
str
- Returns:
string
- Raises:
DHError –
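Example
A minimal sketch comparing two nearly identical tables with an absolute floating-point tolerance (the tables and column names below are illustrative only; the import path is assumed from this module):
>>> from deephaven import empty_table
>>> from deephaven.table import table_diff
>>> t1 = empty_table(5).update(["X = i", "Y = i * 1.0"])
>>> t2 = empty_table(5).update(["X = i", "Y = i * 1.0 + 0.00001"])
>>> diffs = table_diff(t1, t2, max_diffs=10, floating_comparison='absolute')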