deephaven.experimental.table_data_service
This module defines a table service backend interface, TableDataServiceBackend, that users can implement to provide external data, as pyarrow tables, to Deephaven tables. Pass the backend implementation to the TableDataService constructor to create a TableDataService instance, which can then be used to create Deephaven tables backed by the backend service.
- class TableDataService(backend, *, chunk_reader_factory=None, stream_reader_options=None, page_size=None)
Bases: JObjectWrapper
A TableDataService serves as a wrapper around a tightly-coupled Deephaven TableDataService implementation (Java class PythonTableDataService) that delegates to a Python TableDataServiceBackend for TableKey creation, TableLocationKey discovery, and data subscription/retrieval operations. It supports the creation of Deephaven tables from the Python backend service that provides table data and table data locations to the Deephaven tables.
Creates a new TableDataService with the given user-implemented backend service.
- Parameters:
backend (TableDataServiceBackend) – the user-implemented backend service implementation
chunk_reader_factory (Optional[jpy.JType]) – the Barrage chunk reader factory, default is None
stream_reader_options (Optional[jpy.JType]) – the Barrage stream reader options, default is None
page_size (Optional[int]) – the page size for the table service, default is None, meaning to use the configurable JVM property PythonTableDataService.defaultPageSize, which defaults to 64K.
- j_object_type
alias of PythonTableDataService
- class TableDataServiceBackend
Bases: ABC
An interface for a backend service that provides access to table data.
- abstract column_values(table_key, table_location_key, col, offset, min_rows, max_rows, values_cb, failure_cb)
Provides the data values for the column with the given name for the table location with the given table key and table location key via the values_cb callback. The values_cb callback should be called with a single-column pyarrow.Table that contains the data values for the given column within the specified range requirement.
The failure callback should be invoked when a failure to provide the column values occurs.
The column_values caller will block until one of the values or failure callbacks is called.
Note that asynchronous calls to any callback may block until this method has returned.
- Parameters:
table_key (TableKey) – the table key
table_location_key (TableLocationKey) – the table location key
col (str) – the column name
offset (int) – the starting row index
min_rows (int) – the minimum number of rows to return; min_rows is always <= page size
max_rows (int) – the maximum number of rows to return
values_cb (Callable[[pa.Table], None]) – the callback function with one argument: the pyarrow.Table that contains the data values for the column within the specified range
failure_cb (Callable[[Exception], None]) – the failure callback function
- Return type:
None
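The range requirement described above (start at offset, return at least min_rows and at most max_rows) can be sketched with a small helper. This is only an illustration: rows_to_serve, serve, and the available/data arguments are hypothetical names that are not part of this module, and treating a request that cannot reach min_rows as a failure is an assumption. Plain lists stand in for the data; a real backend would wrap the selected rows in a single-column pyarrow.Table before calling values_cb.

```python
def rows_to_serve(offset: int, min_rows: int, max_rows: int, available: int) -> range:
    """Return the row indices to send for one column_values request.

    `available` is the number of rows the backend currently holds for the
    location (a hypothetical input; where it comes from is up to the backend).
    """
    if offset < 0 or min_rows > max_rows:
        raise ValueError("invalid request range")
    end = min(offset + max_rows, available)  # never send more than max_rows
    if end - offset < min_rows:
        # Assumption: a request that cannot be satisfied is reported as a failure.
        raise ValueError("fewer than min_rows rows available at offset")
    return range(offset, end)


def serve(offset, min_rows, max_rows, data, values_cb, failure_cb):
    """Call exactly one of the two callbacks so the caller stops blocking."""
    try:
        rows = rows_to_serve(offset, min_rows, max_rows, len(data))
    except Exception as e:
        failure_cb(e)
        return
    # A real backend would pass a single-column pa.Table here, not a list.
    values_cb([data[i] for i in rows])
```

Returning up to max_rows when the data is available lets the caller fill a page in fewer round trips, while min_rows sets the floor below which the request counts as unmet in this sketch.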
- abstract subscribe_to_table_location_size(table_key, table_location_key, size_cb, success_cb, failure_cb)
Provides the current and future sizes of the table location with the given table key and table location key via the size_cb callback. The size is the number of rows in the table location.
The success callback should be called when the subscription is established successfully and after the current table location size has been delivered to the size callback.
The failure callback should be invoked at initial failure to establish a subscription, or on a permanent failure to keep the subscription active (e.g. failure with no reconnection possible, or failure to reconnect/resubscribe before a timeout).
This is called for tables created when TableDataService.make_table() is called with refreshing=True.
Note that asynchronous calls to any callback will block until this method has returned.
- Parameters:
table_key (TableKey) – the table key
table_location_key (TableLocationKey) – the table location key
size_cb (Callable[[int], None]) – the table location size callback function
success_cb (Callable[[], None]) – the success callback function
failure_cb (Callable[[Exception], None]) – the failure callback function
- Returns:
a function that can be called to unsubscribe from this subscription
- Return type:
Callable[[], None]
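The subscription contract above (deliver the current size, report success, keep pushing later sizes, and return an unsubscribe function) might be sketched as follows. SizeFeed and its grow method are hypothetical names invented for this illustration, and the in-memory feed is an assumption standing in for whatever external source a real backend would watch.

```python
import threading


class SizeFeed:
    """Hypothetical in-memory source of row counts for one table location."""

    def __init__(self, initial_size: int):
        self._size = initial_size
        self._listeners = []
        self._lock = threading.Lock()

    def subscribe(self, size_cb, success_cb, failure_cb):
        def unsubscribe():
            # Stop delivering sizes to this subscriber.
            with self._lock:
                if size_cb in self._listeners:
                    self._listeners.remove(size_cb)

        try:
            with self._lock:
                current = self._size
                self._listeners.append(size_cb)
            size_cb(current)  # deliver the current size first
            success_cb()      # then report the subscription as established
        except Exception as e:
            unsubscribe()
            failure_cb(e)
        return unsubscribe

    def grow(self, new_size: int):
        """Record a new size and push it to all active subscribers."""
        with self._lock:
            self._size = new_size
            listeners = list(self._listeners)
        for cb in listeners:  # callbacks run outside the lock
            cb(new_size)
```

A backend's subscribe_to_table_location_size could look up the feed for the given keys and return the result of its subscribe call. Callbacks are invoked outside the lock so that a blocked callback cannot stall other subscribers.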
- abstract subscribe_to_table_locations(table_key, location_cb, success_cb, failure_cb)
Provides the table locations, existing and new, for the table with the given table key via the location_cb callback.
The location callback should be called with the table location key and an optional pyarrow.Table that contains the partitioning values for the location. The schema of the table must match the optional partitioning column schema returned by table_schema() for the table_key. The table must have a single row for the particular table location key provided in the 1st argument, with values for each partitioning column in the row.
The success callback should be called when the subscription is established successfully and after all existing table locations have been delivered to the table location callback.
The failure callback should be invoked at initial failure to establish a subscription, or on a permanent failure to keep the subscription active (e.g. failure with no reconnection possible, or failure to reconnect/resubscribe before a timeout).
This is called for tables created when TableDataService.make_table() is called with refreshing=True.
Note that asynchronous calls to any callback will block until this method has returned.
- Parameters:
table_key (TableKey) – the table key
location_cb (Callable[[TableLocationKey, Optional[pa.Table]], None]) – the table location callback function
success_cb (Callable[[], None]) – the success callback function
failure_cb (Callable[[Exception], None]) – the failure callback function
- Returns:
a function that can be called to unsubscribe from this subscription
- Return type:
Callable[[], None]
- abstract table_location_size(table_key, table_location_key, size_cb, failure_cb)
Provides the size of the table location with the given table key and table location key via the size_cb callback. The size is the number of rows in the table location.
The failure callback should be invoked when a failure to provide the table location size occurs.
The table_location_size caller will block until one of the size or failure callbacks is called.
This is called for tables created when TableDataService.make_table() is called with refreshing=False.
Note that asynchronous calls to any callback may block until this method has returned.
- Parameters:
table_key (TableKey) – the table key
table_location_key (TableLocationKey) – the table location key
size_cb (Callable[[int], None]) – the table location size callback function
failure_cb (Callable[[Exception], None]) – the failure callback function
- Return type:
None
- abstract table_locations(table_key, location_cb, success_cb, failure_cb)
Provides the existing table locations for the table with the given table key via the location_cb callback.
The location callback should be called with the table location key and an optional pyarrow.Table that contains the partitioning values for the location. The schema of the table must match the optional partitioning column schema returned by table_schema() for the table_key. The table must have a single row for the particular table location key provided in the 1st argument, with values for each partitioning column in the row.
The success callback should be called when all existing table locations have been delivered to the table location callback.
The failure callback should be invoked when failure to provide existing table locations occurs.
The table_locations caller will block until one of the success or failure callbacks is called.
This is called for tables created when TableDataService.make_table() is called with refreshing=False.
Note that asynchronous calls to any callback may block until this method has returned.
- Parameters:
table_key (TableKey) – the table key
location_cb (Callable[[TableLocationKey, Optional[pa.Table]], None]) – the callback function
success_cb (Callable[[], None]) – the success callback function
failure_cb (Callable[[Exception], None]) – the failure callback function
- Return type:
None
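The callback ordering for a static table (one location_cb call per existing location, then exactly one of success_cb or failure_cb) might look like the sketch below. deliver_locations and its keys argument are hypothetical names, and passing None as the partitioning table assumes a table without partitioning columns.

```python
def deliver_locations(keys, location_cb, success_cb, failure_cb):
    """Report every known location key, then signal completion.

    `keys` is a hypothetical iterable of location keys held by the backend.
    The second argument to location_cb is None here, which assumes the
    table has no partitioning columns.
    """
    try:
        for key in keys:
            location_cb(key, None)
    except Exception as e:
        failure_cb(e)  # report the failure instead of success
        return
    success_cb()       # all existing locations were delivered
```

Because the table_locations caller blocks until success or failure is reported, the sketch makes sure every code path ends in one of those two calls.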
- abstract table_schema(table_key, schema_cb, failure_cb)
Provides the table data schema and the partitioning column schema for the table with the given table key via the schema_cb callback. The table data schema is not required to include the partitioning columns defined in the partitioning column schema.
The failure callback should be invoked when a failure to provide the schemas occurs.
The table_schema caller will block until one of the schema or failure callbacks is called.
Note that asynchronous calls to any callback may block until this method has returned.
- Parameters:
table_key (TableKey) – the table key
schema_cb (Callable[[pa.Schema, Optional[pa.Schema]], None]) – the callback function with two arguments: the table data schema and the optional partitioning column schema
failure_cb (Callable[[Exception], None]) – the failure callback function
- Return type:
None
- class TableKey
Bases: ABC
A key that identifies a table. The key should be unique for each table. The key can be any Python object and should include sufficient information to uniquely identify the table for the backend service. The __hash__ method must be implemented to ensure that the key is hashable.
- class TableLocationKey
Bases: ABC
A key that identifies a specific location of a table. The key should be unique for each table location of the table. The key can be any Python object and should include sufficient information to uniquely identify the location for the backend service to fetch the data values and data size. The __hash__ method must be implemented to ensure that the key is hashable.
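A minimal sketch of hashable keys is shown below. NamedTableKey and DatedLocationKey, along with their fields, are hypothetical. The two base classes are local stand-ins so the sketch runs without a Deephaven server; real code would import TableKey and TableLocationKey from this module instead. Defining __eq__ alongside __hash__ is ordinary Python practice rather than a requirement stated here; it makes two keys built from the same values compare as equal.

```python
from abc import ABC


# Stand-ins for the classes in deephaven.experimental.table_data_service,
# defined locally so the sketch runs outside a Deephaven server.
class TableKey(ABC):
    pass


class TableLocationKey(ABC):
    pass


class NamedTableKey(TableKey):
    """Hypothetical key that identifies a table by name."""

    def __init__(self, name: str):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        if not isinstance(other, NamedTableKey):
            return NotImplemented
        return self.name == other.name


class DatedLocationKey(TableLocationKey):
    """Hypothetical key that identifies a location by date and shard."""

    def __init__(self, date: str, shard: int):
        self.date = date
        self.shard = shard

    def __hash__(self):
        return hash((self.date, self.shard))

    def __eq__(self, other):
        if not isinstance(other, DatedLocationKey):
            return NotImplemented
        return (self.date, self.shard) == (other.date, other.shard)
```

Hashing the same fields that __eq__ compares keeps the two methods consistent, so the keys behave predictably in dictionaries and sets.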