deephaven¶
- agg
- appmode
- arrow
- barrage
- calendar
- column
- constants
MAX_BYTE
MAX_CHAR
MAX_DOUBLE
MAX_FINITE_DOUBLE
MAX_FINITE_FLOAT
MAX_FLOAT
MAX_INT
MAX_LONG
MAX_SHORT
MIN_BYTE
MIN_CHAR
MIN_DOUBLE
MIN_FINITE_DOUBLE
MIN_FINITE_FLOAT
MIN_FLOAT
MIN_INT
MIN_LONG
MIN_POS_DOUBLE
MIN_POS_FLOAT
MIN_SHORT
NAN_DOUBLE
NAN_FLOAT
NEG_INFINITY_DOUBLE
NEG_INFINITY_FLOAT
NULL_BOOLEAN
NULL_BYTE
NULL_CHAR
NULL_DOUBLE
NULL_FLOAT
NULL_INT
NULL_LONG
NULL_SHORT
POS_INFINITY_DOUBLE
POS_INFINITY_FLOAT
- csv
- dbc
- dherror
- dtypes
BigDecimal
BigInteger
BusinessCalendar
Character
DType
Duration
Instant
JObject
LocalDate
LocalTime
Period
PyObject
StringSet
TimeZone
ZonedDateTime
array()
bool_
bool_array
boolean_array
byte
byte_array
char
char_array
double
double_array
float32
float32_array
float64
float64_array
from_jtype()
from_np_dtype()
instant_array
int16
int16_array
int32
int32_array
int64
int64_array
int8
int8_array
long
long_array
null_remap()
short
short_array
single
single_array
string
string_array
zdt_array
- execution_context
- experimental
- filters
- html
- jcompat
- json
- jackson
JsonValue
JsonValueType
ObjectField
RepeatedFieldBehavior
any_val()
array_val()
big_decimal_val()
big_integer_val()
bool_val()
byte_val()
char_val()
double_val()
float_val()
instant_val()
int_val()
json_val()
long_val()
object_entries_val()
object_val()
short_val()
skip_val()
string_val()
tuple_val()
typed_object_val()
- learn
- liveness_scope
- numpy
- pandas
- pandasplugin
- parquet
- perfmon
metrics_get_counters()
metrics_reset_counters()
process_info()
process_info_log()
process_metrics_log()
query_operation_performance()
query_operation_performance_log()
query_operation_performance_tree_table()
query_performance()
query_performance_log()
query_performance_tree_table()
query_update_performance()
query_update_performance_map()
server_state()
server_state_log()
update_performance_log()
- plot
- axisformat
- axistransform
- color
- figure
Figure
Figure.axes()
Figure.axis()
Figure.chart()
Figure.chart_legend()
Figure.chart_title()
Figure.figure()
Figure.figure_title()
Figure.func()
Figure.j_object_type
Figure.line()
Figure.new_axes()
Figure.new_chart()
Figure.plot_cat()
Figure.plot_cat_hist()
Figure.plot_ohlc()
Figure.plot_pie()
Figure.plot_treemap()
Figure.plot_xy()
Figure.plot_xy_hist()
Figure.point()
Figure.save()
Figure.series()
Figure.show()
Figure.ticks()
Figure.ticks_minor()
Figure.twin()
Figure.x_axis()
Figure.x_ticks()
Figure.x_ticks_minor()
Figure.x_twin()
Figure.y_axis()
Figure.y_ticks()
Figure.y_ticks_minor()
Figure.y_twin()
- font
- linestyle
- plotstyle
- selectable_dataset
- shape
- plugin
- query_library
- replay
- server
- stream
- kafka
- cdc
- consumer
- producer
topics()
- table_publisher
blink_to_append_only()
stream_to_append_only()
- table
MultiJoinInput
MultiJoinTable
NodeType
PartitionedTable
PartitionedTable.constituent_changes_permitted
PartitionedTable.constituent_column
PartitionedTable.constituent_table_columns
PartitionedTable.constituent_table_definition
PartitionedTable.constituent_tables
PartitionedTable.filter()
PartitionedTable.from_constituent_tables()
PartitionedTable.from_partitioned_table()
PartitionedTable.get_constituent()
PartitionedTable.is_refreshing
PartitionedTable.j_object_type
PartitionedTable.key_columns
PartitionedTable.keys()
PartitionedTable.merge()
PartitionedTable.partitioned_transform()
PartitionedTable.proxy()
PartitionedTable.sort()
PartitionedTable.table
PartitionedTable.transform()
PartitionedTable.unique_keys
PartitionedTable.update_graph
PartitionedTableProxy
PartitionedTableProxy.target
PartitionedTableProxy.require_matching_keys
PartitionedTableProxy.sanity_check_joins
PartitionedTableProxy.abs_sum_by()
PartitionedTableProxy.agg_all_by()
PartitionedTableProxy.agg_by()
PartitionedTableProxy.aj()
PartitionedTableProxy.avg_by()
PartitionedTableProxy.count_by()
PartitionedTableProxy.exact_join()
PartitionedTableProxy.first_by()
PartitionedTableProxy.group_by()
PartitionedTableProxy.head()
PartitionedTableProxy.is_refreshing
PartitionedTableProxy.j_object_type
PartitionedTableProxy.join()
PartitionedTableProxy.last_by()
PartitionedTableProxy.max_by()
PartitionedTableProxy.median_by()
PartitionedTableProxy.min_by()
PartitionedTableProxy.natural_join()
PartitionedTableProxy.raj()
PartitionedTableProxy.reverse()
PartitionedTableProxy.select()
PartitionedTableProxy.select_distinct()
PartitionedTableProxy.snapshot()
PartitionedTableProxy.snapshot_when()
PartitionedTableProxy.sort()
PartitionedTableProxy.sort_descending()
PartitionedTableProxy.std_by()
PartitionedTableProxy.sum_by()
PartitionedTableProxy.tail()
PartitionedTableProxy.update()
PartitionedTableProxy.update_by()
PartitionedTableProxy.update_graph
PartitionedTableProxy.update_view()
PartitionedTableProxy.var_by()
PartitionedTableProxy.view()
PartitionedTableProxy.weighted_avg_by()
PartitionedTableProxy.weighted_sum_by()
PartitionedTableProxy.where()
PartitionedTableProxy.where_in()
PartitionedTableProxy.where_not_in()
RollupNodeOperationsRecorder
RollupTable
SearchDisplayMode
SortDirection
Table
Table.abs_sum_by()
Table.agg_all_by()
Table.agg_by()
Table.aj()
Table.attributes()
Table.avg_by()
Table.await_update()
Table.coalesce()
Table.column_names
Table.columns
Table.count_by()
Table.definition
Table.drop_columns()
Table.exact_join()
Table.first_by()
Table.flatten()
Table.format_column_where()
Table.format_columns()
Table.format_row_where()
Table.group_by()
Table.has_columns()
Table.head()
Table.head_by()
Table.head_pct()
Table.is_blink
Table.is_flat
Table.is_refreshing
Table.iter_chunk_dict()
Table.iter_chunk_tuple()
Table.iter_dict()
Table.iter_tuple()
Table.j_object_type
Table.join()
Table.last_by()
Table.layout_hints()
Table.lazy_update()
Table.max_by()
Table.median_by()
Table.meta_table
Table.min_by()
Table.move_columns()
Table.move_columns_down()
Table.move_columns_up()
Table.natural_join()
Table.partition_by()
Table.partitioned_agg_by()
Table.raj()
Table.range_join()
Table.rename_columns()
Table.restrict_sort_to()
Table.reverse()
Table.rollup()
Table.select()
Table.select_distinct()
Table.size
Table.slice()
Table.slice_pct()
Table.snapshot()
Table.snapshot_when()
Table.sort()
Table.sort_descending()
Table.std_by()
Table.sum_by()
Table.tail()
Table.tail_by()
Table.tail_pct()
Table.to_string()
Table.tree()
Table.ungroup()
Table.update()
Table.update_by()
Table.update_graph
Table.update_view()
Table.var_by()
Table.view()
Table.weighted_avg_by()
Table.weighted_sum_by()
Table.where()
Table.where_in()
Table.where_not_in()
Table.where_one_of()
Table.with_attributes()
Table.without_attributes()
TableDefinition
TableDefinitionLike
TreeNodeOperationsRecorder
TreeTable
multi_join()
table_diff()
- table_factory
DynamicTableWriter
InputTable
InputTable.abs_sum_by()
InputTable.add()
InputTable.agg_all_by()
InputTable.agg_by()
InputTable.aj()
InputTable.attributes()
InputTable.avg_by()
InputTable.await_update()
InputTable.coalesce()
InputTable.column_names
InputTable.columns
InputTable.count_by()
InputTable.definition
InputTable.delete()
InputTable.drop_columns()
InputTable.exact_join()
InputTable.first_by()
InputTable.flatten()
InputTable.format_column_where()
InputTable.format_columns()
InputTable.format_row_where()
InputTable.group_by()
InputTable.has_columns()
InputTable.head()
InputTable.head_by()
InputTable.head_pct()
InputTable.is_blink
InputTable.is_flat
InputTable.is_refreshing
InputTable.iter_chunk_dict()
InputTable.iter_chunk_tuple()
InputTable.iter_dict()
InputTable.iter_tuple()
InputTable.j_object_type
InputTable.join()
InputTable.key_names
InputTable.last_by()
InputTable.layout_hints()
InputTable.lazy_update()
InputTable.max_by()
InputTable.median_by()
InputTable.meta_table
InputTable.min_by()
InputTable.move_columns()
InputTable.move_columns_down()
InputTable.move_columns_up()
InputTable.natural_join()
InputTable.partition_by()
InputTable.partitioned_agg_by()
InputTable.raj()
InputTable.range_join()
InputTable.rename_columns()
InputTable.restrict_sort_to()
InputTable.reverse()
InputTable.rollup()
InputTable.select()
InputTable.select_distinct()
InputTable.size
InputTable.slice()
InputTable.slice_pct()
InputTable.snapshot()
InputTable.snapshot_when()
InputTable.sort()
InputTable.sort_descending()
InputTable.std_by()
InputTable.sum_by()
InputTable.tail()
InputTable.tail_by()
InputTable.tail_pct()
InputTable.to_string()
InputTable.tree()
InputTable.ungroup()
InputTable.update()
InputTable.update_by()
InputTable.update_graph
InputTable.update_view()
InputTable.value_names
InputTable.var_by()
InputTable.view()
InputTable.weighted_avg_by()
InputTable.weighted_sum_by()
InputTable.where()
InputTable.where_in()
InputTable.where_not_in()
InputTable.where_one_of()
InputTable.with_attributes()
InputTable.without_attributes()
empty_table()
function_generated_table()
input_table()
merge()
merge_sorted()
new_table()
ring_table()
time_table()
- table_listener
- time
dh_now()
dh_time_zone()
dh_today()
simple_date_format()
time_zone_alias_add()
time_zone_alias_rm()
to_date()
to_datetime()
to_j_duration()
to_j_instant()
to_j_local_date()
to_j_local_time()
to_j_period()
to_j_time_zone()
to_j_zdt()
to_np_datetime64()
to_np_timedelta64()
to_pd_timedelta()
to_pd_timestamp()
to_time()
to_timedelta()
- update_graph
- updateby
BadDataBehavior
DeltaControl
MathContext
OperationControl
UpdateByOperation
cum_max()
cum_min()
cum_prod()
cum_sum()
delta()
ema_tick()
ema_time()
emmax_tick()
emmax_time()
emmin_tick()
emmin_time()
ems_tick()
ems_time()
emstd_tick()
emstd_time()
forward_fill()
rolling_avg_tick()
rolling_avg_time()
rolling_count_tick()
rolling_count_time()
rolling_formula_tick()
rolling_formula_time()
rolling_group_tick()
rolling_group_time()
rolling_max_tick()
rolling_max_time()
rolling_min_tick()
rolling_min_time()
rolling_prod_tick()
rolling_prod_time()
rolling_std_tick()
rolling_std_time()
rolling_sum_tick()
rolling_sum_time()
rolling_wavg_tick()
rolling_wavg_time()
- uri
The Deephaven Python Integration Package provides native access to Deephaven's query engine and thus unlocks the unique power of Deephaven for the Python community.
- exception DHError(cause=None, message='')[source]¶
Bases: Exception
The custom exception class for the Deephaven Python package.
This exception can be raised due to user or system errors when Deephaven resources and functions are accessed, for example, when reading a CSV/Parquet file into a Deephaven table or when performing an aggregation or join operation on Deephaven tables. It is good practice for Python code to catch this exception and handle it appropriately.
- property compact_traceback¶
The compact traceback of the exception.
- property root_cause¶
The root cause of the exception.
- property traceback¶
The traceback of the exception.
- with_traceback()¶
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
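A minimal sketch of catching DHError and inspecting its properties (the file path is hypothetical):

```python
from deephaven import read_csv, DHError

try:
    t = read_csv("/data/example.csv")  # hypothetical path
except DHError as e:
    print(e.root_cause)         # the underlying engine error, if any
    print(e.compact_traceback)  # condensed form of the full traceback
```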
- class DynamicTableWriter(col_defs)[source]¶
Bases: JObjectWrapper
The DynamicTableWriter creates a new in-memory table and supports writing data to it.
This class implements the context manager protocol and thus can be used in with statements.
Initializes the writer and creates a new in-memory table.
- Parameters:
col_defs (Dict[str, DType]) – a map of column names to the types of the new table
- Raises:
DHError –
- j_object_type¶
alias of DynamicTableWriter
- write_row(*values)[source]¶
Writes a row to the newly created table.
The type of each value must be convertible (safely or unsafely, e.g. losing precision or overflowing) to the type of the corresponding column.
- Parameters:
*values (Any) – the values of the new row; the data types of these values must match the column definitions of the table
- Raises:
DHError –
- Return type:
None
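A minimal sketch of creating a writer and appending rows (the column names and values are illustrative):

```python
from deephaven import DynamicTableWriter
import deephaven.dtypes as dht

# Map column names to Deephaven types for the new in-memory table.
writer = DynamicTableWriter({"Sym": dht.string, "Price": dht.double})
t = writer.table  # the live table backed by this writer

# Each call appends one row; the value types must be convertible
# to the corresponding column types.
writer.write_row("AAPL", 191.5)
writer.write_row("MSFT", 415.25)
```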
- class SortDirection(value)[source]¶
Bases: Enum
An enum defining the sorting orders.
- ASCENDING = 2¶
- DESCENDING = 1¶
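The enum values are typically passed to sorting methods such as Table.sort; a minimal sketch:

```python
from deephaven import empty_table, SortDirection

t = empty_table(10).update(["X = i % 3", "Y = i"])
# Sort X ascending, breaking ties with Y descending.
sorted_t = t.sort(order_by=["X", "Y"],
                  order=[SortDirection.ASCENDING, SortDirection.DESCENDING])
```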
- class TableReplayer(start_time, end_time)[source]¶
Bases: JObjectWrapper
The TableReplayer is used to replay historical data.
Tables to be replayed are registered with the replayer. The resulting dynamic replay tables all update in sync, using the same simulated clock. Each registered table must contain a timestamp column.
Initializes the replayer.
- Parameters:
start_time (Union[dtypes.Instant, int, str, datetime.datetime, np.datetime64, pd.Timestamp]) – replay start time. Integer values are nanoseconds since the Epoch.
end_time (Union[dtypes.Instant, int, str, datetime.datetime, np.datetime64, pd.Timestamp]) – replay end time. Integer values are nanoseconds since the Epoch.
- Raises:
DHError –
- add_table(table, col)[source]¶
Registers a table for replaying and returns the associated replay table.
- j_object_type¶
alias of Replayer
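A minimal sketch of replaying a historical table (the source table, times, and column name are illustrative; start() begins the replay):

```python
from deephaven import read_csv
from deephaven.replay import TableReplayer

hist = read_csv("/data/ticks.csv")  # hypothetical table with a "Timestamp" column

rp = TableReplayer("2024-01-02T09:30:00 ET", "2024-01-02T16:00:00 ET")
replayed = rp.add_table(hist, "Timestamp")  # the dynamic replay table
rp.start()  # all registered tables now tick on the same simulated clock
```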
- function_generated_table(table_generator, source_tables=None, refresh_interval_ms=None, exec_ctx=None, args=(), kwargs={})[source]¶
Creates an abstract table that is generated by running the table_generator() function. The function is first run to generate the table when this method is called, and is then rerun either (a) whenever one of the ‘source_tables’ ticks or (b) each time ‘refresh_interval_ms’ elapses. Either ‘refresh_interval_ms’ or ‘source_tables’ must be set (but not both).
Function-generated tables can be used to produce dynamic tables from sources outside Deephaven. For example, function-generated tables can create tables that are produced by arbitrary Python logic (including using Pandas or numpy). They can also be used to retrieve data from external sources (such as files or websites).
The table definition must not change between invocations of the ‘table_generator’ function, or an exception will be raised.
Note that the ‘table_generator’ may access data in the ‘source_tables’ but should not perform further table operations on them without careful handling. Table operations may be memoized, and it is possible that a table operation will return a table created by a previous invocation of the same operation. Since that result will not have been included in ‘source_tables’, it is not automatically treated as a dependency for purposes of determining when it is safe to invoke ‘table_generator’, allowing races to exist between accessing the operation result and that result’s own update processing. It is best to include all dependencies directly in ‘source_tables’, or to compute on-demand inputs only under a LivenessScope.
- Parameters:
table_generator (Callable[..., Table]) – The table generator function. This function must return a Table.
source_tables (Union[Table, List[Table]]) – Source tables used by the ‘table_generator’ function. The ‘table_generator’ is rerun when any of these tables tick.
refresh_interval_ms (int) – Interval (in milliseconds) at which the ‘table_generator’ function is rerun.
exec_ctx (ExecutionContext) – A custom execution context. If ‘None’, the current execution context is used. If there is no current execution context, a ValueError is raised.
args (Tuple) – Optional tuple of positional arguments to pass to table_generator. Defaults to ()
kwargs (Dict) – Optional dictionary of keyword arguments to pass to table_generator. Defaults to {}
- Return type:
Table
- Returns:
a new table
- Raises:
DHError –
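A minimal sketch that regenerates a one-row table on a fixed interval (names are illustrative):

```python
import time
from deephaven import function_generated_table, new_table
from deephaven.column import double_col

def make_snapshot():
    # Arbitrary Python logic; the returned Table's definition must not
    # change between invocations.
    return new_table([double_col("EpochSec", [time.time()])])

# Rerun make_snapshot() every second; alternatively, pass source_tables
# instead of refresh_interval_ms to rerun whenever a source table ticks.
fgt = function_generated_table(table_generator=make_snapshot,
                               refresh_interval_ms=1000)
```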
- garbage_collect()[source]¶
Runs full garbage collection in Python first and then requests the JVM to run its garbage collector twice due to the cross-referencing nature of the Python/Java integration in Deephaven. Since there is no way to force the Java garbage collector to run, the effect of calling this function is non-deterministic. Users also need to be mindful of the overhead that running garbage collection generally incurs.
- Raises:
DHError –
- Return type:
None
- input_table(col_defs=None, init_table=None, key_cols=None)[source]¶
Creates an in-memory InputTable from either column definitions or an initial table. When key columns are provided, the InputTable will be keyed; otherwise it will be append-only.
There are two types of in-memory InputTable: append-only and keyed.
The append-only input table is not keyed; all rows are added to the end of the table, and deletions and edits are not permitted.
The keyed input table has keys for each row and supports addition, deletion, and modification of rows by key.
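A minimal sketch of a keyed input table (column names and values are illustrative):

```python
from deephaven import input_table, new_table
from deephaven.column import string_col, double_col
import deephaven.dtypes as dht

# Keyed by "Sym": rows may be added, modified by key, and deleted.
it = input_table(col_defs={"Sym": dht.string, "Price": dht.double},
                 key_cols=["Sym"])

it.add(new_table([string_col("Sym", ["AAPL"]), double_col("Price", [191.5])]))
it.delete(new_table([string_col("Sym", ["AAPL"])]))  # keyed tables only
```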
- merge(tables)[source]¶
Combines two or more tables into one aggregate table. This essentially appends the tables one on top of the other. Null tables are ignored.
- merge_sorted(tables, order_by)[source]¶
Combines two or more tables into one sorted, aggregate table. This essentially stacks the tables one on top of the other and sorts the result. Null tables are ignored. merge_sorted is more efficient than merge followed by sort.
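A minimal sketch of both functions:

```python
from deephaven import empty_table, merge, merge_sorted

t1 = empty_table(3).update("X = i")      # X = 0, 1, 2
t2 = empty_table(3).update("X = i + 3")  # X = 3, 4, 5

stacked = merge([t1, t2])                # t2's rows appended below t1's
ordered = merge_sorted([t2, t1], "X")    # single table sorted by X
```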
- new_table(cols)[source]¶
Creates an in-memory table from a list of input columns or a Dict (mapping) of column names and column data. Each column must have an equal number of elements.
When the input is a mapping, an intermediary Pandas DataFrame is created from the mapping and then converted to an in-memory table. In this case, as opposed to when the input is a list of InputColumns, the column types are determined by Pandas’ type inference logic.
- Parameters:
cols (Union[List[InputColumn], Mapping[str, Sequence]]) – a list of InputColumns or a mapping of column names to column data.
- Return type:
Table
- Returns:
a Table
- Raises:
DHError –
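A minimal sketch of both input forms:

```python
from deephaven import new_table
from deephaven.column import int_col, string_col

# From InputColumns: the column types are stated explicitly.
t1 = new_table([string_col("Sym", ["AAPL", "MSFT"]),
                int_col("Qty", [100, 200])])

# From a mapping: types are inferred via an intermediary Pandas DataFrame.
t2 = new_table({"Sym": ["AAPL", "MSFT"], "Qty": [100, 200]})
```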
- read_csv(path, header=None, headless=False, header_row=0, skip_rows=0, num_rows=9223372036854775807, ignore_empty_lines=False, allow_missing_columns=False, ignore_excess_columns=False, delimiter=',', quote='"', ignore_surrounding_spaces=True, trim=False)¶
Read the CSV data specified by the path parameter as a table.
- Parameters:
path (str) – a file path or a URL string
header (Dict[str, DType]) – a dict to define the table columns with key being the name, value being the data type
headless (bool) – whether the CSV file lacks a header row, default is False
header_row (int) – the header row number; all rows before it will be skipped, default is 0. Must be 0 if headless is True, otherwise an exception will be raised
skip_rows (long) – number of data rows to skip before processing data. This is useful when you want to parse data in chunks. Defaults to 0
num_rows (long) – max number of rows to process. This is useful when you want to parse data in chunks. Defaults to the maximum 64bit integer value
ignore_empty_lines (bool) – whether to ignore empty lines, default is False
allow_missing_columns (bool) – whether the library should allow missing columns in the input. If this flag is set, then rows that are too short (that have fewer columns than the header row) will be interpreted as if the missing columns contained the empty string. Defaults to False.
ignore_excess_columns (bool) – whether the library should allow excess columns in the input. If this flag is set, then rows that are too long (that have more columns than the header row) will have those excess columns dropped. Defaults to False.
delimiter (str) – the delimiter used by the CSV, default is the comma
quote (str) – the quote character for the CSV, default is double quote
ignore_surrounding_spaces (bool) – whether surrounding white space should be ignored for unquoted text fields, default is True
trim (bool) – indicates whether to trim white space inside a quoted string, default is False
- Return type:
Table
- Returns:
a table
- Raises:
DHError –
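A minimal sketch that pins the column types instead of relying on inference (the path and column names are hypothetical):

```python
from deephaven import read_csv
import deephaven.dtypes as dht

t = read_csv(
    "/data/trades.csv",  # hypothetical path; a URL string also works
    header={"Sym": dht.string, "Price": dht.double, "Qty": dht.int64},
    ignore_empty_lines=True,
)
```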
- read_sql(conn, query, driver='connectorx')[source]¶
Executes the provided SQL query via a supported driver and returns a Deephaven table.
- Parameters:
conn (Any) – must either be a connection string for the given driver or a Turbodbc/ADBC DBAPI Connection object; when it is a Connection object, the driver argument will be ignored.
query (str) – SQL query statement
driver (Literal['odbc', 'adbc', 'connectorx']) – the driver to use; supported drivers are “odbc”, “adbc”, “connectorx”, default is “connectorx”
- Return type:
Table
- Returns:
a new Table
- Raises:
DHError –
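A minimal sketch using the default connectorx driver (the connection URI and query are hypothetical):

```python
from deephaven import read_sql

# With connectorx, conn is a connection-string URI; with "odbc" or "adbc",
# conn may instead be a Turbodbc/ADBC DBAPI Connection object, in which
# case the driver argument is ignored.
t = read_sql(
    conn="postgresql://user:password@localhost:5432/mydb",  # hypothetical
    query="SELECT sym, price FROM trades LIMIT 1000",
    driver="connectorx",
)
```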
- ring_table(parent, capacity, initialize=True)[source]¶
Creates a ring table that retains the latest ‘capacity’ rows from the parent table. The latest rows are determined solely by new rows added to the parent table; deleted rows are ignored, and updated rows are not expected and will raise an exception.
Ring tables are most often used with blink tables, which do not retain their own data for more than one update cycle.
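A minimal sketch pairing a blink source with a ring table:

```python
from deephaven import time_table, ring_table

# A blink table only keeps rows added during the current update cycle.
src = time_table("PT0.5s", blink_table=True)

# Retain the latest 100 rows that have ever appeared in src.
rt = ring_table(parent=src, capacity=100)
```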
- time_table(period, start_time=None, blink_table=False)[source]¶
Creates a table that adds a new row on a regular interval.
- Parameters:
period (Union[dtypes.Duration, int, str, datetime.timedelta, np.timedelta64, pd.Timedelta]) – time interval between new row additions, can be expressed as an integer in nanoseconds, a time interval string, e.g. “PT00:00:00.001” or “PT1s”, or other time duration types.
start_time (Union[None, Instant, int, str, datetime.datetime, np.datetime64, pd.Timestamp], optional) – start time for adding new rows, defaults to None which means use the current time as the start time.
blink_table (bool, optional) – if the time table should be a blink table, defaults to False
- Return type:
- Returns:
a Table
- Raises:
DHError –
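A minimal sketch of both forms (the start time is illustrative):

```python
from deephaven import time_table

# One new row per second; the table has a single "Timestamp" column.
t = time_table("PT1s")

# Blink variant starting at an explicit time.
tb = time_table("PT1s", start_time="2024-01-02T09:30:00 ET", blink_table=True)
```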