pydeephaven

The Deephaven Python Client (pydeephaven) is a Python API built on top of Deephaven’s Open API, which is based on gRPC and Apache Arrow. It allows Python applications to connect remotely to Deephaven data servers, import data to and export data from the server, run Python scripts on the server, and execute queries on data tables.

Because Deephaven data servers and Deephaven clients, including pydeephaven, exchange data in the Apache Arrow format, pydeephaven can use pyarrow, the Python bindings for Arrow (https://arrow.apache.org/docs/python/), for data representation and for integration with other data analysis tools such as NumPy and pandas.

Examples

>>> from pydeephaven import Session
>>> from pyarrow import csv
>>> session = Session() # assuming Deephaven Community Edition is running locally with the default configuration
>>> table1 = session.import_table(csv.read_csv("data1.csv"))
>>> table2 = session.import_table(csv.read_csv("data2.csv"))
>>> joined_table = table1.join(table2, on=["key_col_1", "key_col_2"], joins=["data_col1"])
>>> df = joined_table.to_arrow().to_pandas()
>>> print(df)
>>> session.close()

exception DHError(message='')

Bases: Exception

A custom exception class used by pydeephaven.

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class Session(host=None, port=None, auth_type='Anonymous', auth_token='', never_timeout=True, session_type='python', use_tls=False, tls_root_certs=None, client_cert_chain=None, client_private_key=None, client_opts=None, extra_headers=None)

Bases: object

A Session object represents a connection to the Deephaven data server. It contains a number of convenience methods for asking the server to create tables, import Arrow data into tables, merge tables, run Python scripts, and execute queries.

Session objects can be used in a Python with statement, which guarantees that the session is closed on exit from the block, whatever happens inside it.
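
A minimal sketch of the with-statement usage described above. The helper name list_server_tables and its make_session parameter are hypothetical, introduced only so the example can be exercised without a live server. With no argument it assumes a Deephaven server is reachable at the default localhost:10000.

```python
def list_server_tables(make_session=None):
    """Return the names of the server's global tables, then close the session."""
    if make_session is None:
        # Imported lazily so the sketch loads even where the client is not installed.
        from pydeephaven import Session
        make_session = Session  # default arguments: localhost, port 10000, anonymous

    # Leaving the with block closes the session, even if the body raises.
    with make_session() as session:
        return list(session.tables)
```

Because the session is closed on exit, any Table objects obtained inside the block should be converted (for example with to_arrow()) before the block ends.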

tables

names of the global tables available in the server after running scripts

Type:

list[str]

is_alive

check if the session is still alive (may refresh the session)

Type:

bool

Initializes a Session object that connects to the Deephaven server

Parameters:
  • host (str) – the host name or IP address of the remote machine, default is ‘localhost’

  • port (int) – the port number that Deephaven server is listening on, default is 10000

  • auth_type (str) – the authentication type string; can be ‘Anonymous’, ‘Basic’, or any custom-built authenticator in the server, such as ‘io.deephaven.authentication.psk.PskAuthenticationHandler’; default is ‘Anonymous’

  • auth_token (str) – the authentication token string. When auth_type is ‘Basic’, it must be ‘user:password’; when auth_type is ‘Anonymous’, it is ignored; when auth_type is a custom-built authenticator, it must conform to the specific requirements of that authenticator

  • never_timeout (bool) – never allow the session to time out, default is True

  • session_type (str) – the Deephaven session type. Defaults to ‘python’

  • use_tls (bool) – if True, use a TLS connection. Defaults to False

  • tls_root_certs (bytes) – PEM encoded root certificates to use for the TLS connection, or None to use system defaults. A value other than None implies a TLS connection, so the use_tls argument should also be passed as True. Defaults to None

  • client_cert_chain (bytes) – PEM encoded client certificate if using mutual TLS. Defaults to None, which implies not using mutual TLS.

  • client_private_key (bytes) – PEM encoded client private key for client_cert_chain if using mutual TLS. Defaults to None, which implies not using mutual TLS.

  • client_opts (List[Tuple[str, Union[int, str]]]) –

    list of (name, value) tuples of options for creating the underlying gRPC channel. Defaults to None, which implies not using any channel options. See https://grpc.github.io/grpc/cpp/group__grpc__arg__keys.html for a list of valid options. Example options:

    [ ('grpc.target_name_override', 'idonthaveadnsforthishost'),
      ('grpc.min_reconnect_backoff_ms', 2000) ]

  • extra_headers (Dict[bytes, bytes]) – additional headers (and values) to add to server requests. Defaults to None, which implies not using any extra headers.

Raises:

DHError
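
The authentication and TLS parameters above interact (a Basic token has a fixed form, and root certificates imply TLS), so a small sketch may help. The helper build_session_kwargs is hypothetical and not part of pydeephaven; it only assembles keyword arguments, and the credentials shown are placeholders.

```python
from typing import Any, Dict, Optional


def build_session_kwargs(
    host: str = "localhost",
    port: int = 10000,
    user: Optional[str] = None,
    password: Optional[str] = None,
    tls_root_certs: Optional[bytes] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for Session (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"host": host, "port": port}
    if user is not None:
        if password is None:
            raise ValueError("Basic authentication needs both user and password")
        # 'Basic' authentication expects the token in "user:password" form.
        kwargs["auth_type"] = "Basic"
        kwargs["auth_token"] = f"{user}:{password}"
    if tls_root_certs is not None:
        # Supplying root certificates implies TLS, so set use_tls as well.
        kwargs["use_tls"] = True
        kwargs["tls_root_certs"] = tls_root_certs
    return kwargs


# Typical use, assuming a reachable server that accepts these credentials:
#     from pydeephaven import Session
#     session = Session(**build_session_kwargs(user="alice", password="s3cret"))
```

When neither credentials nor certificates are given, the result contains only host and port, so Session falls back to its documented defaults (anonymous authentication, no TLS).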

bind_table(name, table)

Binds a table to the given name on the server so that it can be referenced by that name.

Parameters:
  • name (str) – name for the table

  • table (Table) – a Table object

Raises:

DHError

Return type:

None
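
As an illustration of binding, the sketch below creates a small table and publishes it under a name. The function publish_sequence and the column name X are hypothetical; session is assumed to be an open Session, and the formula relies on the Deephaven query-language row index i.

```python
def publish_sequence(session, name, rows):
    """Create a table of `rows` rows with an index column and bind it to `name`."""
    # empty_table creates rows without columns; update adds the column X.
    table = session.empty_table(rows).update(["X = i"])
    # After binding, the table can be referenced by name on the server,
    # for example through session.open_table(name).
    session.bind_table(name, table)
    return table
```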

close()

Closes the Session object if it hasn’t timed out already.

Raises:

DHError

Return type:

None

empty_table(size)

Creates an empty table on the server.

Parameters:

size (int) – the size of the empty table in number of rows

Return type:

Table

Returns:

a Table object

Raises:

DHError

import_table(data)

Imports the pyarrow table as a new Deephaven table on the server.

Deephaven supports most of the Arrow data types. However, if the pyarrow table contains any field with a data type not supported by Deephaven, the import operation will fail.

Parameters:

data (pa.Table) – a pyarrow Table object

Return type:

Table

Returns:

a Table object

Raises:

DHError

input_table(schema=None, init_table=None, key_cols=None)

Creates an InputTable from either Arrow schema or initial table. When key columns are provided, the InputTable will be keyed, otherwise it will be append-only.

Parameters:
  • schema (pa.Schema) – the schema for the InputTable

  • init_table (Table) – the initial table

  • key_cols (Union[str, Sequence[str]]) – the name(s) of the key column(s)

Return type:

InputTable

Returns:

an InputTable

Raises:

DHError, ValueError

property is_alive

Whether the session is alive.

merge_tables(tables, order_by=None)

Merges several tables into one table on the server.

Parameters:
  • tables (list[Table]) – the list of Table objects to merge

  • order_by (str, optional) – if specified, the resulting table is sorted on this column

Return type:

Table

Returns:

a Table object

Raises:

DHError

open_table(name)

Opens a table in the global scope with the given name on the server.

Parameters:

name (str) – the name of the table

Return type:

Table

Returns:

a Table object

Raises:

DHError

plugin_client(exportable_obj)

Wraps a ticket as a PluginClient. Capabilities here vary based on the server implementation of the ObjectType, but most will at least send a response payload to the client, possibly including references to other objects. In some cases, depending on the server implementation, the client will also be able to send the same sort of messages back to the server.

Part of the experimental plugin API.

Return type:

PluginClient

query(table)

Creates a Query object to define a sequence of operations on a Deephaven table.

Parameters:

table (Table) – a Table object

Return type:

Query

Returns:

a Query object

Raises:

DHError
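
A hedged sketch of chaining operations on a Query object follows. The function name last_even_rows is hypothetical, and the table is assumed to have a numeric column X. The chained operations are expected to be sent to the server together when exec() is called.

```python
def last_even_rows(session, table, n=5):
    """Define update -> where -> tail as one query and return the resulting table."""
    query = (
        session.query(table)
        .update(["Y = X * 2"])    # add a derived column
        .where(["X % 2 == 0"])    # keep rows with even X
        .tail(n)                  # keep the last n rows
    )
    # The operations above are only defined until exec() runs the query.
    return query.exec()
```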

run_script(script)

Runs the supplied Python script on the server.

Parameters:

script (str) – the Python script code

Raises:

DHError

Return type:

None
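
One way to combine run_script with open_table is sketched below: the script creates a global table on the server, and the client then opens it by name. The helper create_and_open and the table name demo are hypothetical; the script text assumes the server-side deephaven package provides empty_table, as in a default Python session.

```python
def create_and_open(session, name="demo", rows=5):
    """Create a table in a server-side script, then open it from the client."""
    script = (
        "from deephaven import empty_table\n"
        f"{name} = empty_table({rows}).update(['X = i'])\n"
    )
    # The script runs in the server's Python session, not in this process.
    session.run_script(script)
    # The script's global variable is now available by name.
    return session.open_table(name)
```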

time_table(period, start_time=None, blink_table=False)

Creates a time table on the server.

Parameters:
  • period (Union[int, str]) – the interval at which the time table ticks (adds a row); units are nanoseconds or a time interval string, e.g. “PT00:00:.001” or “PT1S”

  • start_time (Union[int, str]) – the start time for the time table in nanoseconds or as a date time formatted string; default is None (meaning now)

  • blink_table (bool, optional) – if the time table should be a blink table, defaults to False

Return type:

Table

Returns:

a Table object

Raises:

DHError
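
Since an integer period is given in nanoseconds, a small conversion helper can make calls easier to read. This is only a sketch: seconds_to_nanos and start_ticker are hypothetical names, and session is assumed to be an open Session.

```python
NANOS_PER_SECOND = 1_000_000_000


def seconds_to_nanos(seconds):
    """Convert a period in seconds to whole nanoseconds."""
    return round(seconds * NANOS_PER_SECOND)


def start_ticker(session, seconds=1.0, blink=False):
    """Create a time table that ticks every `seconds` seconds, starting now."""
    return session.time_table(period=seconds_to_nanos(seconds), blink_table=blink)
```

A duration string such as "PT1S" passed as period is the documented alternative to the integer form.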

class SortDirection(value)

Bases: Enum

An enum defining the sort ordering.

ASCENDING = 1

Ascending sort direction

DESCENDING = -1

Descending sort direction