pydeephaven.table

This module implements the Table and InputTable classes which are the main instruments to work with Deephaven data.

class InputTable(session, ticket, schema_header=b'', size=None, is_static=None, schema=None)[source]

Bases: Table

InputTable is a subclass of Table that allows the users to dynamically add/delete/modify data in it. There are two types of InputTable - append-only and keyed.

The append-only input table is not keyed, all rows are added to the end of the table, and deletions and edits are not permitted.

The keyed input tablet has keys for each row and supports addition/deletion/modification of rows by the keys.

add(table)[source]

Writes rows from the provided table to this input table. If this is a keyed input table, added rows with keys that match existing rows will replace those rows.

Parameters:

table (Table) – the table that provides the rows to write

Raises:

DHError

Return type:

None

agg_all_by(agg, by)

The agg_all_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregation specified.

Note, because agg_all_by applies the aggregation to all the columns of the table, it will ignore any column names specified for the aggregation.

Parameters:
  • agg (Aggregation) – the aggregation to be applied

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

agg_by(aggs, by)

The agg_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregation(s) specified.

Parameters:
  • aggs (Union[Aggregation, List[Aggregation]]) – the aggregation(s) to be applied

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

aj(table, on, joins=None)

The aj (as-of join) method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going over. If there is no matching key in the right table, appended row values are NULL.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘>’ or ‘>=’. If a common name is used for the inexact match, ‘>=’ is used for the comparison.

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

avg_by(by=None)

The avg_by method creates a new table containing the average for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

close()

Close the table reference on the server.

Raises:

DHError

Return type:

None

count_by(col, by=None)

The count_by method creates a new table containing the number of rows for each group. The count of each group is stored in a new column named after the ‘col’ parameter.

Parameters:
  • col (str) – the name of the column to store the counts

  • by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

delete(table)[source]

Deletes the keys contained in the provided table from this keyed input table. If this method is called on an append-only input table, a PermissionError will be raised.

Parameters:

table (Table) – the table with the keys to delete

Raises:

DHError, PermissionError

Return type:

None

drop_columns(cols)

The drop_column method creates a new table with the same size as this table but omits any of the specified columns.

Parameters:

cols (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

exact_join(table, on, joins=None)

The exact_join method creates a new table containing all the rows and columns of this table plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

first_by(by=None)

The first_by method creates a new table which contains the first row of each distinct group.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

group_by(by=None)

The group_by method creates a new table containing grouping columns and grouped data, column content is grouped into arrays.

If no group-by column is given, the content of each column is grouped into its own array.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

head(num_rows)

The head method creates a new table with a specific number of rows from the beginning of the table.

Parameters:

num_rows (int) – the number of rows at the head of table

Return type:

Table

Returns:

a Table object

Raises:

DHError

head_by(num_rows, by)

The head_by method creates a new table containing the first number of rows for each group.

Parameters:
  • num_rows (int) – the number of rows at the beginning of each group

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

property is_closed

Whether this table is closed on the server.

property is_refreshing

Whether this table is refreshing.

join(table, on=None, joins=None, reserve_bits=10)

The join method creates a new table containing rows that have matching values in both tables. Rows that do not have matching criteria will not be included in the result. If there are multiple matches between a row from the left table and rows from the right table, all matching combinations will be included. If no columns to match (on) are specified, every combination of left and right table rows is included.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names, default is None

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

  • reserve_bits (int, optional) – the number of bits of key-space to initially reserve per group; default is 10

Return type:

Table

Returns:

a Table object

Raises:

DHError

last_by(by=None)

The last_by method creates a new table which contains the last row of each distinct group.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

lazy_update(formulas)

The lazy_update method creates a new table containing a new, cached, formula column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

max_by(by=None)

The max_by method creates a new table containing the maximum value for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

median_by(by=None)

The median_by method creates a new table containing the median for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

property meta_table

The column definitions of the table in a Table form.

min_by(by=None)

The min_by method creates a new table containing the minimum value for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

natural_join(table, on, joins=None)

The natural_join method creates a new table containing all the rows and columns of this table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal. If there is no matching key in the right table, appended row values are NULL.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

property pb_ticket

Returns the ticket as a gRPC protobuf ticket object.

property pb_typed_ticket

Returns a protobuf typed ticket, suitable for use in communicating with an ObjectType plugin on the server.

raj(table, on, joins=None)

The raj (reverse as-of join) method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table ( joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going under. If there is no matching key in the right table, appended row values are NULL.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘<’ or ‘<=’. If a common name is used for the inexact match, ‘<=’ is used for the comparison.

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

select(formulas=None)

The select method creates a new in-memory table that includes one column for each formula. If no formula is specified, all columns will be included.

Parameters:

formulas (Union[str, List[str]], optional) – the column formula(s), default is None

Return type:

Table

Returns:

a Table object

Raises:

DHError

select_distinct(cols=None)

The select_distinct method creates a new table containing all the unique values for a set of key columns. When the selectDistinct method is used on multiple columns, it looks for distinct sets of values in the selected columns.

Parameters:

cols (Union[str, List[str]], optional) – the column name(s), default is None

Return type:

Table

Returns:

a Table object

Raises:

DHError

slice(start, stop)

Extracts a subset of a table by row positions into a new Table.

If both the start and the stop are positive, then both are counted from the beginning of the table. The start is inclusive, and the stop is exclusive. slice(0, N) is equivalent to head() (N) The start must be less than or equal to the stop.

If the start is positive and the stop is negative, then the start is counted from the beginning of the table, inclusively. The stop is counted from the end of the table. For example, slice(1, -1) includes all rows but the first and last. If the stop is before the start, the result is an empty table.

If the start is negative, and the stop is zero, then the start is counted from the end of the table, and the end of the slice is the size of the table. slice(-N, 0) is equivalent to tail() (N).

If the start is negative and the stop is negative, they are both counted from the end of the table. For example, slice(-2, -1) returns the second to last row of the table.

Parameters:
  • start (int) – the first row position to include in the result

  • stop (int) – the last row position to include in the result

Return type:

Table

Returns:

a new Table

Raises:

DHError

Examples

>>> table.slice(0, 5)    # first 5 rows
>>> table.slice(-5, 0)   # last 5 rows
>>> table.slice(2, 6)    # rows from index 2 to 5
>>> table.slice(6, 2)    # ERROR: cannot slice start after end
>>> table.slice(-6, -2)  # rows from 6th last to 2nd last (exclusive)
>>> table.slice(-2, -6)  # ERROR: cannot slice start after end
>>> table.slice(2, -3)   # all rows except the first 2 and the last 3
>>> table.slice(-6, 8)   # rows from 6th last to index 8 (exclusive)
snapshot()

The snapshot method creates a static snapshot table.

Return type:

Table

Returns:

a Table object

Raises:

DHError

snapshot_when(trigger_table, stamp_cols=None, initial=False, incremental=False, history=False)

The snapshot_when creates a table that captures a snapshot of this table whenever trigger_table updates.

When trigger_table updates, a snapshot of this table and the “stamp key” from trigger_table form the resulting table. The “stamp key” is the last row of the trigger_table, limited by the stamp_cols. If trigger_table is empty, the “stamp key” will be represented by NULL values.

Parameters:
  • trigger_table (Table) – the trigger table

  • stamp_cols (Union[str, List[str]]) – The column(s) from trigger_table that form the “stamp key”, may be renames, default is None, meaning that all columns from trigger_table form the “stamp key”.

  • initial (bool) – Whether to take an initial snapshot upon construction, default is False. When False, the resulting table will remain empty until trigger_table first updates.

  • incremental (bool) – Whether the resulting table should be incremental, default is False. When False, all rows of this table will have the latest “stamp key”. When True, only the rows of this table that have been added or updated will have the latest “stamp key”.

  • history (bool) – Whether the resulting table should keep history, default is False. A history table appends a full snapshot of this table and the “stamp key” as opposed to updating existing rows. The history flag is currently incompatible with initial and incremental: when history is True, incremental and initial must be False.

Return type:

Table

Returns:

a Table object

Raises:

DHError

sort(order_by, order=None)

The sort method creates a new table where the rows are ordered based on values in the specified set of columns.

Parameters:
  • order_by (Union[str, List[str]]) – the column(s) to be sorted on

  • order (Union[SortDirection, List[SortDirection]], optional) – the corresponding sort direction(s) for each sort column, default is None. In the absence of explicit sort directions, data will be sorted in the ascending order.

Return type:

Table

Returns:

a Table object

Raises:

DHError

sort_descending(order_by)

The sort_descending method creates a new table where rows in a table are sorted in descending order based on the order_by column(s).

Parameters:

order_by (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

std_by(by=None)

The std_by method creates a new table containing the sample standard deviation for each group. Columns not used in the grouping must be of numeric types.

Sample standard deviation is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.

Parameters:

by (Union[str, List[str]]) – the group-by column names(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

sum_by(by=None)

The sum_by method creates a new table containing the sum for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]]) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

tail(num_rows)

The tail method creates a new table with a specific number of rows from the end of the table.

Parameters:

num_rows (int) – the number of rows at the end of table

Return type:

Table

Returns:

a Table object

Raises:

DHError

tail_by(num_rows, by)

The tail_by method creates a new table containing the last number of rows for each group.

Parameters:
  • num_rows (int) – the number of rows at the end of each group

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

ticket

The ticket that points to the object on the server.

to_arrow()

Takes a snapshot of the table and returns a pyarrow Table.

Return type:

Table

Returns:

a pyarrow.Table

Raises:

DHError

type

The type of the object. May be None, indicating that the instance cannot be connected to or otherwise directly used from the client.

ungroup(cols=None, null_fill=True)

The ungroup method creates a new table in which array columns from the source table are unwrapped into separate rows. The ungroup columns should be of array types.

Parameters:
  • cols (Union[str, List[str]], optional) – the array column(s), default is None, meaning all array columns will be ungrouped, default is None, meaning all array columns will be ungrouped

  • null_fill (bool, optional) – indicates whether null should be used to fill missing cells, default is True

Return type:

Table

Returns:

a Table object

Raises:

DHError

update(formulas)

The update method creates a new table containing a new, in-memory column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

update_by(ops, by)

The update_by method creates a table with additional columns calculated from window-based aggregations of columns in this table. The aggregations are defined by the provided operations, which support incremental aggregations over the corresponding rows in the table. The aggregations will apply position or time-based windowing and compute the results over the entire table or each row group as identified by the provided key columns.

Parameters:
  • ops (Union[UpdateByOperatoin, List[UpdateByOperation]]) – the UpdateByOperation(s) to be applied

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

update_view(formulas)

The update_view method creates a new table containing a new, formula column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

var_by(by=None)

The var_by method creates a new table containing the sample variance for each group. Columns not used in the grouping must be of numeric types.

Sample variance is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

view(formulas)

The view method creates a new formula table that includes one column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

where(filters)

The where method creates a new table with only the rows meeting the filter criteria in the column(s) of the table.

Parameters:

filters (Union[str, List[str]]) – the filter condition expression(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

where_in(filter_table, cols)

The where_in method creates a new table containing rows from the source table, where the rows match values in the filter table.

Parameters:
  • filter_table (Table) – the table containing the set of values to filter on

  • cols (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

where_not_in(filter_table, cols)

The where_not_in method creates a new table containing rows from the source table, where the rows do not match values in the filter table.

Parameters:
  • filter_table (Table) – the table containing the set of values to filter on

  • cols (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

class MultiJoinInput(table, on, joins=None)[source]

Bases: object

A MultiJoinInput represents the input tables, key columns and additional columns to be used in the multi-table natural join.

Initializes a MultiJoinInput object.

Parameters:
  • table (Table) – the right table to include in the join

  • on (Union[str, Sequence[str]]) – the column(s) to match, can be a common name or an equality expression that matches every input table, i.e. “col_a = col_b” to rename output column names.

  • joins (Union[str, Sequence[str]], optional) – the column(s) to be added from the table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None

class MultiJoinTable(table)[source]

Bases: object

A MultiJoinTable is an object that contains the result of a multi-table natural join. To retrieve the underlying result Table, use the table property.

property table

Returns the Table containing the multi-table natural join output.

class Table(session, ticket, schema_header=b'', size=None, is_static=None, schema=None)[source]

Bases: TableInterface, ServerObject

A Table object represents a reference to a table on the server. It is the core data structure of Deephaven and supports a rich set of operations such as filtering, sorting, aggregating, joining, snapshotting etc.

Note, an application should never instantiate a Table object directly. Table objects are always provided through factory methods such as Session.empty_table(), or import/export methods such as Session.import_table(), open_table(), or any of the Table operations.

is_closed

check if the table has been closed on the server

Type:

bool

agg_all_by(agg, by)[source]

The agg_all_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregation specified.

Note, because agg_all_by applies the aggregation to all the columns of the table, it will ignore any column names specified for the aggregation.

Parameters:
  • agg (Aggregation) – the aggregation to be applied

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

agg_by(aggs, by)[source]

The agg_by method creates a new table containing grouping columns and grouped data. The resulting grouped data is defined by the aggregation(s) specified.

Parameters:
  • aggs (Union[Aggregation, List[Aggregation]]) – the aggregation(s) to be applied

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

aj(table, on, joins=None)[source]

The aj (as-of join) method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going over. If there is no matching key in the right table, appended row values are NULL.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘>’ or ‘>=’. If a common name is used for the inexact match, ‘>=’ is used for the comparison.

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

avg_by(by=None)[source]

The avg_by method creates a new table containing the average for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

close()[source]

Close the table reference on the server.

Raises:

DHError

Return type:

None

count_by(col, by=None)[source]

The count_by method creates a new table containing the number of rows for each group. The count of each group is stored in a new column named after the ‘col’ parameter.

Parameters:
  • col (str) – the name of the column to store the counts

  • by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

drop_columns(cols)[source]

The drop_column method creates a new table with the same size as this table but omits any of the specified columns.

Parameters:

cols (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

exact_join(table, on, joins=None)[source]

The exact_join method creates a new table containing all the rows and columns of this table plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

first_by(by=None)[source]

The first_by method creates a new table which contains the first row of each distinct group.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

group_by(by=None)[source]

The group_by method creates a new table containing grouping columns and grouped data, column content is grouped into arrays.

If no group-by column is given, the content of each column is grouped into its own array.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

head(num_rows)[source]

The head method creates a new table with a specific number of rows from the beginning of the table.

Parameters:

num_rows (int) – the number of rows at the head of table

Return type:

Table

Returns:

a Table object

Raises:

DHError

head_by(num_rows, by)[source]

The head_by method creates a new table containing the first number of rows for each group.

Parameters:
  • num_rows (int) – the number of rows at the beginning of each group

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

property is_closed

Whether this table is closed on the server.

property is_refreshing

Whether this table is refreshing.

join(table, on=None, joins=None, reserve_bits=10)[source]

The join method creates a new table containing rows that have matching values in both tables. Rows that do not have matching criteria will not be included in the result. If there are multiple matches between a row from the left table and rows from the right table, all matching combinations will be included. If no columns to match (on) are specified, every combination of left and right table rows is included.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names, default is None

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

  • reserve_bits (int, optional) – the number of bits of key-space to initially reserve per group; default is 10

Return type:

Table

Returns:

a Table object

Raises:

DHError

last_by(by=None)[source]

The last_by method creates a new table which contains the last row of each distinct group.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

lazy_update(formulas)[source]

The lazy_update method creates a new table containing a new, cached, formula column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

max_by(by=None)[source]

The max_by method creates a new table containing the maximum value for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

median_by(by=None)[source]

The median_by method creates a new table containing the median for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

property meta_table

The column definitions of the table in a Table form.

min_by(by=None)[source]

The min_by method creates a new table containing the minimum value for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

natural_join(table, on, joins=None)[source]

The natural_join method creates a new table containing all the rows and columns of this table, plus additional columns containing data from the right table. For columns appended to the left table (joins), row values equal the row values from the right table where the key values in the left and right tables are equal. If there is no matching key in the right table, appended row values are NULL.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or an equal expression, i.e. “col_a = col_b” for different column names

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

property pb_ticket

Returns the ticket as a gRPC protobuf ticket object.

property pb_typed_ticket

Returns a protobuf typed ticket, suitable for use in communicating with an ObjectType plugin on the server.

raj(table, on, joins=None)[source]

The raj (reverse as-of join) method creates a new table containing all the rows and columns of the left table, plus additional columns containing data from the right table. For columns appended to the left table ( joins), row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table without going under. If there is no matching key in the right table, appended row values are NULL.

Parameters:
  • table (Table) – the right-table of the join

  • on (Union[str, List[str]]) – the column(s) to match, can be a common name or a match condition of two columns, e.g. ‘col_a = col_b’. The first ‘N-1’ matches are exact matches. The final match is an inexact match. The inexact match can use either ‘<’ or ‘<=’. If a common name is used for the inexact match, ‘<=’ is used for the comparison.

  • joins (Union[str, List[str]], optional) – the column(s) to be added from the right table to the result table, can be renaming expressions, i.e. “new_col = col”; default is None, which means all the columns from the right table, excluding those specified in ‘on’

Return type:

Table

Returns:

a Table object

Raises:

DHError

select(formulas=None)[source]

The select method creates a new in-memory table that includes one column for each formula. If no formula is specified, all columns will be included.

Parameters:

formulas (Union[str, List[str]], optional) – the column formula(s), default is None

Return type:

Table

Returns:

a Table object

Raises:

DHError

select_distinct(cols=None)[source]

The select_distinct method creates a new table containing all the unique values for a set of key columns. When the selectDistinct method is used on multiple columns, it looks for distinct sets of values in the selected columns.

Parameters:

cols (Union[str, List[str]], optional) – the column name(s), default is None

Return type:

Table

Returns:

a Table object

Raises:

DHError

slice(start, stop)[source]

Extracts a subset of a table by row positions into a new Table.

If both the start and the stop are positive, then both are counted from the beginning of the table. The start is inclusive, and the stop is exclusive. slice(0, N) is equivalent to head() (N) The start must be less than or equal to the stop.

If the start is positive and the stop is negative, then the start is counted from the beginning of the table, inclusively. The stop is counted from the end of the table. For example, slice(1, -1) includes all rows but the first and last. If the stop is before the start, the result is an empty table.

If the start is negative, and the stop is zero, then the start is counted from the end of the table, and the end of the slice is the size of the table. slice(-N, 0) is equivalent to tail() (N).

If the start is negative and the stop is negative, they are both counted from the end of the table. For example, slice(-2, -1) returns the second to last row of the table.

Parameters:
  • start (int) – the first row position to include in the result

  • stop (int) – the last row position to include in the result

Return type:

Table

Returns:

a new Table

Raises:

DHError

Examples

>>> table.slice(0, 5)    # first 5 rows
>>> table.slice(-5, 0)   # last 5 rows
>>> table.slice(2, 6)    # rows from index 2 to 5
>>> table.slice(6, 2)    # ERROR: cannot slice start after end
>>> table.slice(-6, -2)  # rows from 6th last to 2nd last (exclusive)
>>> table.slice(-2, -6)  # ERROR: cannot slice start after end
>>> table.slice(2, -3)   # all rows except the first 2 and the last 3
>>> table.slice(-6, 8)   # rows from 6th last to index 8 (exclusive)
snapshot()[source]

The snapshot method creates a static snapshot table.

Return type:

Table

Returns:

a Table object

Raises:

DHError

snapshot_when(trigger_table, stamp_cols=None, initial=False, incremental=False, history=False)[source]

The snapshot_when creates a table that captures a snapshot of this table whenever trigger_table updates.

When trigger_table updates, a snapshot of this table and the “stamp key” from trigger_table form the resulting table. The “stamp key” is the last row of the trigger_table, limited by the stamp_cols. If trigger_table is empty, the “stamp key” will be represented by NULL values.

Parameters:
  • trigger_table (Table) – the trigger table

  • stamp_cols (Union[str, List[str]]) – The column(s) from trigger_table that form the “stamp key”, may be renames, default is None, meaning that all columns from trigger_table form the “stamp key”.

  • initial (bool) – Whether to take an initial snapshot upon construction, default is False. When False, the resulting table will remain empty until trigger_table first updates.

  • incremental (bool) – Whether the resulting table should be incremental, default is False. When False, all rows of this table will have the latest “stamp key”. When True, only the rows of this table that have been added or updated will have the latest “stamp key”.

  • history (bool) – Whether the resulting table should keep history, default is False. A history table appends a full snapshot of this table and the “stamp key” as opposed to updating existing rows. The history flag is currently incompatible with initial and incremental: when history is True, incremental and initial must be False.

Return type:

Table

Returns:

a Table object

Raises:

DHError

sort(order_by, order=None)[source]

The sort method creates a new table where the rows are ordered based on values in the specified set of columns.

Parameters:
  • order_by (Union[str, List[str]]) – the column(s) to be sorted on

  • order (Union[SortDirection, List[SortDirection]], optional) – the corresponding sort direction(s) for each sort column, default is None. In the absence of explicit sort directions, data will be sorted in the ascending order.

Return type:

Table

Returns:

a Table object

Raises:

DHError

sort_descending(order_by)[source]

The sort_descending method creates a new table where rows in a table are sorted in descending order based on the order_by column(s).

Parameters:

order_by (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

std_by(by=None)[source]

The std_by method creates a new table containing the sample standard deviation for each group. Columns not used in the grouping must be of numeric types.

Sample standard deviation is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.

Parameters:

by (Union[str, List[str]]) – the group-by column names(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

sum_by(by=None)[source]

The sum_by method creates a new table containing the sum for each group. Columns not used in the grouping must be of numeric types.

Parameters:

by (Union[str, List[str]]) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

tail(num_rows)[source]

The tail method creates a new table with a specific number of rows from the end of the table.

Parameters:

num_rows (int) – the number of rows at the end of table

Return type:

Table

Returns:

a Table object

Raises:

DHError

tail_by(num_rows, by)[source]

The tail_by method creates a new table containing the last number of rows for each group.

Parameters:
  • num_rows (int) – the number of rows at the end of each group

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

ticket

The ticket that points to the object on the server.

to_arrow()[source]

Takes a snapshot of the table and returns a pyarrow Table.

Return type:

Table

Returns:

a pyarrow.Table

Raises:

DHError

type

The type of the object. May be None, indicating that the instance cannot be connected to or otherwise directly used from the client.

ungroup(cols=None, null_fill=True)[source]

The ungroup method creates a new table in which array columns from the source table are unwrapped into separate rows. The ungroup columns should be of array types.

Parameters:
  • cols (Union[str, List[str]], optional) – the array column(s), default is None, meaning all array columns will be ungrouped, default is None, meaning all array columns will be ungrouped

  • null_fill (bool, optional) – indicates whether null should be used to fill missing cells, default is True

Return type:

Table

Returns:

a Table object

Raises:

DHError

update(formulas)[source]

The update method creates a new table containing a new, in-memory column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

update_by(ops, by)[source]

The update_by method creates a table with additional columns calculated from window-based aggregations of columns in this table. The aggregations are defined by the provided operations, which support incremental aggregations over the corresponding rows in the table. The aggregations will apply position or time-based windowing and compute the results over the entire table or each row group as identified by the provided key columns.

Parameters:
  • ops (Union[UpdateByOperatoin, List[UpdateByOperation]]) – the UpdateByOperation(s) to be applied

  • by (Union[str, List[str]]) – the group-by column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

update_view(formulas)[source]

The update_view method creates a new table containing a new, formula column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

var_by(by=None)[source]

The var_by method creates a new table containing the sample variance for each group. Columns not used in the grouping must be of numeric types.

Sample variance is computed using Bessel’s correction, which ensures that the sample variance will be an unbiased estimator of population variance.

Parameters:

by (Union[str, List[str]], optional) – the group-by column name(s), default is None, meaning grouping all the rows into one group

Return type:

Table

Returns:

a Table object

Raises:

DHError

view(formulas)[source]

The view method creates a new formula table that includes one column for each formula.

Parameters:

formulas (Union[str, List[str]]) – the column formula(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

where(filters)[source]

The where method creates a new table with only the rows meeting the filter criteria in the column(s) of the table.

Parameters:

filters (Union[str, List[str]]) – the filter condition expression(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

where_in(filter_table, cols)[source]

The where_in method creates a new table containing rows from the source table, where the rows match values in the filter table.

Parameters:
  • filter_table (Table) – the table containing the set of values to filter on

  • cols (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

where_not_in(filter_table, cols)[source]

The where_not_in method creates a new table containing rows from the source table, where the rows do not match values in the filter table.

Parameters:
  • filter_table (Table) – the table containing the set of values to filter on

  • cols (Union[str, List[str]]) – the column name(s)

Return type:

Table

Returns:

a Table object

Raises:

DHError

multi_join(input, on=None)[source]

The multi_join method creates a new table by performing a multi-table natural join on the input tables. The result consists of the set of distinct keys from the input tables natural joined to each input table. Input tables need not have a matching row for each key, but they may not have multiple matching rows for a given key.

Parameters:
  • input (Union[Table, Sequence[Table], MultiJoinInput, Sequence[MultiJoinInput]]) – the input objects specifying the tables and columns to include in the join.

  • on (Union[str, Sequence[str]], optional) – the column(s) to match, can be a common name or an equality expression that matches every input table, i.e. “col_a = col_b” to rename output column names. Note: When MultiJoinInput objects are supplied, this parameter must be omitted.

Returns:

the result of the multi-table natural join operation. To access the underlying Table, use the

table property.

Return type:

MultiJoinTable

Raises:

DHError