pydeephaven
Deephaven Python Client (pydeephaven) is a Python API built on top of Deephaven’s highly efficient Open API, which is based on gRPC and Apache Arrow. It allows Python applications to connect remotely to Deephaven data servers, exchange data with the server, run Python scripts on the server, and execute powerful queries on data tables.
Because Deephaven data servers and Deephaven clients, including pydeephaven, exchange data in the Apache Arrow format, pydeephaven can use pyarrow, the Python bindings of Arrow (https://arrow.apache.org/docs/python/), for data representation and for integration with other data analytics tools such as NumPy and pandas.
Examples
>>> from pydeephaven import Session
>>> from pyarrow import csv
>>> session = Session() # assuming Deephaven Community Edition is running locally with the default configuration
>>> table1 = session.import_table(csv.read_csv("data1.csv"))
>>> table2 = session.import_table(csv.read_csv("data2.csv"))
>>> joined_table = table1.join(table2, on=["key_col_1", "key_col_2"], joins=["data_col1"])
>>> df = joined_table.to_arrow().to_pandas()
>>> print(df)
>>> session.close()
- exception DHError(message='')
Bases:
Exception
A custom exception class used by pydeephaven.
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class Session(host=None, port=None, auth_type='Anonymous', auth_token='', never_timeout=True, session_type='python', use_tls=False, tls_root_certs=None, client_cert_chain=None, client_private_key=None, client_opts=None, extra_headers=None)
Bases:
object
A Session object represents a connection to the Deephaven data server. It contains a number of convenience methods for asking the server to create tables, import Arrow data into tables, merge tables, run Python scripts, and execute queries.
Session objects can be used in a Python with statement, so that they are guaranteed to be closed on exit regardless of what happens inside the with block.
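As a minimal sketch of the with-statement usage described above, the hypothetical helper below opens a session, creates a small table, and returns its row count. The `session_factory` parameter is an assumption added for illustration (it allows a stand-in to be substituted where no server is running); `empty_table` and `to_arrow` are pydeephaven calls that are not described in this part of the page.

```python
def row_count(size, session_factory=None):
    """Create an empty table with `size` rows and return its row count."""
    if session_factory is None:
        # Imported lazily so this module loads even where pydeephaven is absent.
        from pydeephaven import Session
        session_factory = Session  # defaults: localhost:10000, anonymous auth

    # The session is closed when the block exits, normally or via an exception.
    with session_factory() as session:
        table = session.empty_table(size)
        return table.to_arrow().num_rows
```

Calling `row_count(5)` assumes a Deephaven server is listening locally with the default configuration.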
- tables
names of the global tables available in the server after running scripts
- Type:
list[str]
- is_alive
check if the session is still alive (may refresh the session)
- Type:
bool
Initializes a Session object that connects to the Deephaven server.
- Parameters:
host (str) – the host name or IP address of the remote machine, default is ‘localhost’
port (int) – the port number that Deephaven server is listening on, default is 10000
auth_type (str) – the authentication type string, can be ‘Anonymous’, ‘Basic’, or any custom-built authenticator in the server, such as “io.deephaven.authentication.psk.PskAuthenticationHandler”, default is ‘Anonymous’.
auth_token (str) – the authentication token string. When auth_type is ‘Basic’, it must be “user:password”; when auth_type is ‘Anonymous’, it will be ignored; when auth_type is a custom-built authenticator, it must conform to the specific requirement of the authenticator
never_timeout (bool) – never allow the session to timeout, default is True
session_type (str) – the Deephaven session type. Defaults to ‘python’
use_tls (bool) – if True, use a TLS connection. Defaults to False
tls_root_certs (bytes) – PEM encoded root certificates to use for the TLS connection, or None to use system defaults. A value other than None implies a TLS connection, and the use_tls argument should also be passed as True. Defaults to None
client_cert_chain (bytes) – PEM encoded client certificate if using mutual TLS. Defaults to None, which implies not using mutual TLS.
client_private_key (bytes) – PEM encoded client private key for client_cert_chain if using mutual TLS. Defaults to None, which implies not using mutual TLS.
client_opts (List[Tuple[str, Union[int, str]]]) – list of tuples for name and value of options to the underlying grpc channel creation. Defaults to None, which implies not using any channel options. See https://grpc.github.io/grpc/cpp/group__grpc__arg__keys.html for a list of valid options. Example options: [('grpc.target_name_override', 'idonthaveadnsforthishost'), ('grpc.min_reconnect_backoff_ms', 2000)]
extra_headers (Dict[bytes, bytes]) – additional headers (and values) to add to server requests. Defaults to None, which implies not using any extra headers.
- Raises:
DHError –
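The constructor parameters above can be assembled as plain keyword arguments. The sketch below is illustrative only: `basic_auth_kwargs`, `tls_kwargs`, and `connect` are hypothetical helper names, not part of pydeephaven, and the channel option shown is the one taken from the example options above.

```python
def basic_auth_kwargs(user, password, host="localhost", port=10000):
    """Keyword arguments for a Session that uses 'Basic' authentication."""
    return {
        "host": host,
        "port": port,
        "auth_type": "Basic",
        # For 'Basic', the token must have the form "user:password".
        "auth_token": f"{user}:{password}",
    }


def tls_kwargs(root_certs_pem=None, reconnect_backoff_ms=2000):
    """Keyword arguments for a TLS connection with one gRPC channel option."""
    return {
        "use_tls": True,
        # None means the system default root certificates are used.
        "tls_root_certs": root_certs_pem,
        "client_opts": [("grpc.min_reconnect_backoff_ms", reconnect_backoff_ms)],
    }


def connect(**kwargs):
    """Open a Session; the constructor is documented as able to raise DHError."""
    # Imported lazily so this module loads even where pydeephaven is absent.
    from pydeephaven import Session
    return Session(**kwargs)
```

A call such as `connect(**basic_auth_kwargs("alice", "s3cret"), **tls_kwargs())` would then open a TLS session with basic authentication, assuming the server is configured for both; the user name and password here are placeholders.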
- bind_table(name, table)
Binds a table to the given name on the server so that it can be referenced by that name.
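One use of a bound name is to reference the table from a script that runs on the server. The following is a sketch under assumptions: `bind_and_filter` is a hypothetical helper, the column `X` and the table names are invented for illustration, and `empty_table`, `update`, and `open_table` are pydeephaven calls not described in this part of the page.

```python
def bind_and_filter(session, source_name="source", result_name="evens"):
    """Bind a client-created table, then reference it by name in a server script."""
    # Build a 10-row table with one column X holding the row index.
    table = session.empty_table(10).update(["X = i"])
    session.bind_table(source_name, table)
    # On the server, the bound name can be used like any other global variable.
    session.run_script(f'{result_name} = {source_name}.where("X % 2 == 0")')
    return session.open_table(result_name)
```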
- close()
Closes the Session object if it hasn’t timed out already.
- Raises:
DHError –
- Return type:
None
- fetch_table(ticket)
Fetches a table by ticket.
- Parameters:
ticket (SharedTicket) – a ticket
- Return type:
Table
- Returns:
a Table object
- Raises:
DHError –
- import_table(data)
Imports the pyarrow table as a new Deephaven table on the server.
Deephaven supports most of the Arrow data types. However, if the pyarrow table contains any field with a data type not supported by Deephaven, the import operation will fail.
- input_table(schema=None, init_table=None, key_cols=None, blink_table=False)
Creates an InputTable from either Arrow schema or initial table. When blink_table is True, the InputTable will be a blink table. When blink_table is False (default), the InputTable will be keyed if key columns are provided, otherwise it will be append-only.
- Parameters:
schema (pa.Schema) – the schema for the InputTable
init_table (Table) – the initial table
key_cols (Union[str, Sequence[str]]) – the name(s) of the key column(s)
blink_table (bool) – whether the InputTable should be a blink table, default is False
- Return type:
InputTable
- Returns:
an InputTable
- Raises:
DHError, ValueError –
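A minimal sketch of creating an input table from a schema and then writing rows to it. `create_input_table` is a hypothetical helper; `schema` is assumed to be a pyarrow Schema built by the caller, `rows` is assumed to be a Deephaven Table with matching columns, and `InputTable.add` is a pydeephaven method not described in this part of the page.

```python
def create_input_table(session, schema, key_cols=None, rows=None):
    """Create an InputTable from an Arrow schema and optionally add rows to it."""
    # With key_cols the table is keyed; without them it is append-only
    # (blink_table is left at its default of False).
    input_table = session.input_table(schema=schema, key_cols=key_cols)
    if rows is not None:
        # `rows` is a Table whose columns match the schema.
        input_table.add(rows)
    return input_table
```

For example, a schema could be built with `pa.schema([("Sym", pa.string()), ("Price", pa.float64())])` and passed with `key_cols="Sym"` to get a table keyed on `Sym`; both column names are placeholders.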
- property is_alive
Whether the session is alive.
- plugin_client(exportable_obj)
Wraps a ticket as a PluginClient. Capabilities here vary based on the server implementation of the ObjectType, but most will at least send a response payload to the client, possibly including references to other objects. In some cases, depending on the server implementation, the client will also be able to send the same sort of messages back to the server.
Part of the experimental plugin API.
- Return type:
PluginClient
- publish_table(ticket, table)
Publishes a table to the given shared ticket. The ticket can then be used by another session to fetch the table.
Note that the shared ticket can be fetched by other sessions to access the table only as long as the table is not released. Once the table is released, whether through an explicit call to its close method, implicitly through garbage collection, or through the closing of the publishing session, the shared ticket is no longer valid.
- Parameters:
ticket (SharedTicket) – a SharedTicket object
table (Table) – a Table object
- Raises:
DHError –
- Return type:
None
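Together with fetch_table, publish_table lets two sessions share a table. The sketch below assumes the caller has already created a SharedTicket (how to construct one is not covered in this part of the page); `share_table` is a hypothetical helper name.

```python
def share_table(publisher, consumer, ticket, table):
    """Publish `table` from one session and fetch it from another."""
    publisher.publish_table(ticket, table)
    # The ticket stays valid only while `table` is not released, so the caller
    # should keep `table` and the publishing session open while it is in use.
    return consumer.fetch_table(ticket)
```

Here `publisher` and `consumer` are two Session objects connected to the same server, and the returned Table belongs to the consuming session.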
- query(table)
Creates a Query object to define a sequence of operations on a Deephaven table.
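A brief sketch of chaining operations on a Query and then executing it. `doubled_above` is a hypothetical helper, the columns `X` and `Y` are assumptions about the input table, and the chaining relies on the `update`, `where`, and `exec` methods of Query, which are not described in this part of the page.

```python
def doubled_above(session, table, threshold):
    """Add Y = 2 * X to `table` and keep the rows where Y exceeds `threshold`."""
    query = (
        session.query(table)
        .update(["Y = X * 2"])        # add a derived column
        .where([f"Y > {threshold}"])  # filter on the derived column
    )
    # Nothing is returned until exec() is called; it yields the resulting Table.
    return query.exec()
```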
- run_script(script)
Runs the supplied Python script on the server.
- Parameters:
script (str) – the Python script code
- Raises:
DHError –
- Return type:
None
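Tables that a script creates as global variables appear in the session's tables attribute and can then be opened on the client. The following sketch illustrates that flow; `run_and_open` is a hypothetical helper, and `open_table` is a pydeephaven call not described in this part of the page.

```python
def run_and_open(session, script, table_name):
    """Run `script` on the server, then open the global table it created."""
    session.run_script(script)
    # session.tables lists the names of the global tables on the server.
    if table_name not in session.tables:
        raise LookupError(f"script did not create a table named {table_name!r}")
    return session.open_table(table_name)
```

A script such as `'from deephaven import empty_table\nprices = empty_table(5).update(["X = i"])'` with `table_name="prices"` would be a typical input; the table name and column are placeholders.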
- time_table(period, start_time=None, blink_table=False)
Creates a time table on the server.
- Parameters:
period (Union[int, str]) – the interval at which the time table ticks (adds a row); units are nanoseconds or a time interval string, e.g. “PT00:00:.001” or “PT1S”
start_time (Union[int, str]) – the start time for the time table in nanoseconds or as a date time formatted string; default is None (meaning now)
blink_table (bool, optional) – if the time table should be a blink table, defaults to False
- Return type:
Table
- Returns:
a Table object
- Raises:
DHError –
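Because an integer period is interpreted as nanoseconds, a small conversion helper can make call sites easier to read. This is a sketch only: `seconds_to_nanos` and `ticking_table` are hypothetical names, not part of pydeephaven.

```python
def seconds_to_nanos(seconds):
    """Convert a period in seconds to whole nanoseconds."""
    return round(seconds * 1_000_000_000)


def ticking_table(session, seconds=1, blink=False):
    """Create a time table that adds one row every `seconds` seconds."""
    # start_time is omitted, so the table starts ticking from now.
    return session.time_table(period=seconds_to_nanos(seconds), blink_table=blink)
```

The same one-second period could instead be passed as the interval string "PT1S".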