Query Monitoring

Internal Tables

Core+

Deephaven's internal tables contain details about processes and workers and are intended to aid in troubleshooting and performance monitoring. In Community Core, these tables are only available in the memory of one worker, but in Enterprise Core+, they are persisted into the database. This enables performance analysis from another worker, to avoid perturbing your workload or to analyze a terminated worker.

The following snippet shows how to retrieve the raw data for the current date. Typically, you should filter by ProcessUniqueId to get relevant data and use the performance overview tools for analysis.

Groovy:

```groovy
date = today(timeZone("ET"))
qpl = db.liveTable("DbInternal", "QueryPerformanceLogCommunity").where("Date=date")
qopl = db.liveTable("DbInternal", "QueryOperationPerformanceLogCommunity").where("Date=date")
ssl = db.liveTable("DbInternal", "ServerStateLogCommunity").where("Date=date")
pil = db.liveTable("DbInternal", "ProcessInfoLogCommunity").where("Date=date")
pml = db.liveTable("DbInternal", "ProcessMetricsLogCommunity").where("Date=date")
upl = db.liveTable("DbInternal", "UpdatePerformanceLogCommunity").where("Date=date")
```

Python:

```python
from deephaven.time import format_date, now, time_zone

date = format_date(now(), time_zone("ET"))
qpl = db.live_table("DbInternal", "QueryPerformanceLogCommunity").where("Date=date")
qopl = db.live_table("DbInternal", "QueryOperationPerformanceLogCommunity").where("Date=date")
ssl = db.live_table("DbInternal", "ServerStateLogCommunity").where("Date=date")
pil = db.live_table("DbInternal", "ProcessInfoLogCommunity").where("Date=date")
pml = db.live_table("DbInternal", "ProcessMetricsLogCommunity").where("Date=date")
upl = db.live_table("DbInternal", "UpdatePerformanceLogCommunity").where("Date=date")
```

Common to Core+ and Enterprise Legacy

Enterprise Core+ and Enterprise Legacy workers both write to the Process Event Log and Audit Event Log tables.

Process Event Log

The Process Event Log contains text logs for Deephaven processes and workers. Output from Core+ workers is included in the ProcessEventLog alongside output from Enterprise Legacy workers, which is useful for investigating behavior and diagnosing failures or crashes. You should filter by the Process or ProcessInfoId columns to retrieve the rows of interest. You must also sort by the Timestamp column to view data in order, because data logged by the worker is stored separately from the STDOUT and STDERR data captured by the query or merge servers.

Groovy:

```groovy
date = today(timeZone("ET"))
pel = db.liveTable("DbInternal", "ProcessEventLog").where("Date=date").sort("Timestamp")
```

Python:

```python
from deephaven.time import format_date, now, time_zone

date = format_date(now(), time_zone("ET"))
pel = db.live_table("DbInternal", "ProcessEventLog").where("Date=date").sort("Timestamp")
```
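The Timestamp sort matters because the worker's own log rows and the captured STDOUT/STDERR rows arrive as separate streams. Viewing them in order is conceptually a timestamp merge, which the following self-contained sketch illustrates (the sample rows are invented):

```python
import heapq

# Two independently logged streams of (timestamp, message), each already
# ordered within itself -- analogous to worker-logged rows vs. the
# STDOUT/STDERR rows captured by the query or merge server.
worker_rows = [(1, "worker: query started"), (4, "worker: query finished")]
stdout_rows = [(2, "stdout: step 1"), (3, "stdout: step 2")]

# Interleave by timestamp, which is what sort("Timestamp") achieves on
# the combined ProcessEventLog table.
merged = list(heapq.merge(worker_rows, stdout_rows))
```

Without the merge (i.e., without the sort), each stream's rows would appear grouped by source rather than in the order events actually happened.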

Audit Event Log

Core+ workers also log their own kinds of audit events to the Audit Event Log.

Groovy:

```groovy
date = today(timeZone("ET"))
ael = db.liveTable("DbInternal", "AuditEventLog").where("Date=date").sort("Timestamp")
```

Python:

```python
from deephaven.time import format_date, now, time_zone

date = format_date(now(), time_zone("ET"))
ael = db.live_table("DbInternal", "AuditEventLog").where("Date=date").sort("Timestamp")
```

The types of Core+ audit events are specified below by the values found in the "Event" column:

| Event | Description |
| --- | --- |
| Historical Table Access | Requesting a historical table |
| Live Table Access | Requesting a live table |
| Historical Partitioned Table Access | Requesting a historical partitioned table |
| Live Partitioned Table Access | Requesting a live partitioned table |
| Unpartitioned User Table Write | Writing an unpartitioned user table |
| Unpartitioned User Table Deletion | Deleting an unpartitioned user table |
| Partitioned User Table Schema Addition | Adding a partitioned user table schema |
| Partitioned User Table Schema Update | Updating a partitioned user table's schema |
| Partitioned User Table Partition Write | Writing a direct partition to a partitioned user table |
| Partitioned User Table Partition Deletion | Deleting a direct partition from a partitioned user table |
| Live User Table Append | Appending rows to a live partition from a partitioned user table |
| Live User Table Incremental Updates | Appending rows incrementally to a live partition from a partitioned user table |
| Live User Table Partition Deletion | Deleting a live partition from a partitioned user table |
| Partitioned User Table Deletion | Deleting a partitioned user table, including its schema, direct partitions, and live partitions |

The "Details" column provides additional information, such as whether an operation was allowed, rejected, or completed, or the column partition value associated with an operation.

Performance Overview

Use the Core+ performance overview tools to retrieve, process, decorate, and visualize the DHC internal tables for a given worker.

Run the following to retrieve the performance overview by process info id, which can be found from the Code Studio the Core+ worker was launched from, the Core+ console, or the logs of the parent dispatcher:

Groovy:

```groovy
performanceOverviewByPiid("52e806dd-af75-412c-a286-ec29aa5571d2")
```

Python:

```python
performance_overview("52e806dd-af75-412c-a286-ec29aa5571d2")
```

Note

The full process info id is not required; any unique substring should be sufficient. When more than one process info id matches, an error is raised that lists the possible alternatives.
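This matching behavior can be pictured with a small helper (an illustrative sketch with a hypothetical function name, not Deephaven's implementation):

```python
def resolve_process_info_id(fragment, known_ids):
    """Resolve a process info id from any unique substring.

    Mirrors the documented behavior: a unique match resolves, no match
    fails, and multiple matches raise an error listing the alternatives.
    Illustrative only -- not the actual Deephaven lookup.
    """
    matches = [piid for piid in known_ids if fragment in piid]
    if not matches:
        raise ValueError(f"No process info id matches '{fragment}'")
    if len(matches) > 1:
        raise ValueError(
            f"Ambiguous fragment '{fragment}'; candidates: {sorted(matches)}")
    return matches[0]
```

For example, a fragment like "52e8" resolves to the full id as long as no other known process info id contains it.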

Run the following to retrieve the performance overview by worker name, which can be found from the Code Studio the Core+ worker was launched from, the Core+ console, or the logs of the parent dispatcher:

Groovy:

```groovy
performanceOverviewByWorkerName("worker_12")
```

Python:

```python
performance_overview(worker_name="worker_12")
```

Run the following to retrieve the performance overview by PQ name:

Groovy:

```groovy
performanceOverviewByPqName("Core PQ")
```

Python:

```python
performance_overview(pq_name="Core PQ")
```

If the same PQ ran multiple times in the day, provide a date time string representing an as-of time, which will narrow the search to the latest PQ run at or before the specified time:

Groovy:

```groovy
performanceOverviewByPqName("Core PQ", "2023-04-28T15:57:45 NY")
```

Python:

```python
performance_overview(pq_name="Core PQ", as_of_time_string="2023-04-28T15:57:45 NY")
```
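The as-of narrowing can be modeled as picking the latest run whose start time is at or before the given instant. A minimal sketch with invented run data (not Deephaven's implementation):

```python
from datetime import datetime

def latest_run_at_or_before(runs, as_of):
    """Pick the latest (start_time, run_id) pair with start_time <= as_of.

    Illustrative model of the documented as-of behavior; `runs` is a
    hypothetical list of PQ run start times for one day.
    """
    eligible = [run for run in runs if run[0] <= as_of]
    if not eligible:
        raise ValueError("No run started at or before the given time")
    return max(eligible, key=lambda run: run[0])
```

With runs starting at 09:00, 14:30, and 16:00, an as-of time of 15:57:45 selects the 14:30 run.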

Note

By default, the current date will be used, as well as live, intraday tables (as opposed to historical). These behaviors can be modified, as seen in the Python Notes and Groovy Notes sections.

Important

You must use the Core+ performance overview to analyze a Core+ worker, as opposed to the Enterprise performance overview.

Python Notes

A single function retrieves the performance overview via Python; the variants above differ only in which positional and/or keyword arguments are supplied. The following shows the parameter names and default values:

```python
def performance_overview(process_info_id=None, worker_name=None, host_name=None, pq_name=None, as_of_time=None, as_of_time_string=None, owner=None, date=None, is_intraday=True, is_live=True): ...
```

Below are explanations of the parameters not yet mentioned:

  • host_name refers to the host running the Core+ worker.
  • as_of_time accepts an actual DateTime object rather than a string representation.
  • owner refers to a PQ owner; this is only relevant if pq_name is specified.
  • date is the date for which to retrieve data. If not specified, it defaults to the current NY date.
  • is_intraday determines whether table retrieval is intraday or historical.
  • is_live determines whether an intraday table is live or static. This is only relevant when is_intraday is True.
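Since every variant funnels through this one signature, the selection among lookup paths can be pictured with a simplified stand-in (hypothetical name and return values, not the actual implementation):

```python
def performance_overview_sketch(process_info_id=None, worker_name=None,
                                pq_name=None, **kwargs):
    """Illustrative model of dispatching on the supplied identifier.

    Returns a tag naming which lookup path would be taken; the real
    function instead retrieves, decorates, and visualizes the tables.
    """
    if process_info_id is not None:
        return ("by_process_info_id", process_info_id)
    if worker_name is not None:
        return ("by_worker_name", worker_name)
    if pq_name is not None:
        return ("by_pq_name", pq_name)
    raise ValueError("Provide process_info_id, worker_name, or pq_name")
```

For instance, performance_overview_sketch(worker_name="worker_12") takes the worker-name path, mirroring the performance_overview(worker_name="worker_12") call shown earlier.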

Groovy Notes

See below for the full closure signatures and default parameter values for each variant.

By process info id:

```groovy
performanceOverviewByPiid = { final String processInfoId, final String date = null, final boolean isIntraday = true, final boolean isLive = true -> ...
```

By worker name:

```groovy
performanceOverviewByWorkerName = { final String workerName, final String hostName = null, final String date = null, final boolean isIntraday = true, final boolean isLive = true -> ...
```

By PQ name:

```groovy
performanceOverviewByPqName = { final String pqName, final String asOfTimeString = null, final String owner = null, final String date = null, final boolean isIntraday = true, final boolean isLive = true -> ...
```

Index Tables

Certain high-volume internal tables have associated index tables used to optimize administrative queries commonly performed against those tables. These index tables are produced by the Data Import Server as it imports data, making it possible to identify the set of internal partitions that have data for a given key. For further details, see Indexing Intraday Partitions.
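The role of an index table can be sketched as a mapping from each key to the set of internal partitions containing rows for that key (invented sample data; this models the lookup such a table enables, not its on-disk format):

```python
from collections import defaultdict

def build_partition_index(rows):
    """Map each key to the internal partitions holding rows for it.

    `rows` is an iterable of (internal_partition, key) pairs -- an
    illustrative model of what a DIS-produced index table provides.
    """
    index = defaultdict(set)
    for partition, key in rows:
        index[key].add(partition)
    return index
```

A query filtering on a key then only needs to read the partitions listed for that key, rather than scanning every internal partition.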

Core+

The following snippet shows how to retrieve the raw data for the current date.

Groovy:

```groovy
date = today(timeZone("ET"))
qpli = db.liveTable("DbInternal", "QueryPerformanceLogCommunityIndex").where("Date=date")
qopli = db.liveTable("DbInternal", "QueryOperationPerformanceLogCommunityIndex").where("Date=date")
ssli = db.liveTable("DbInternal", "ServerStateLogCommunityIndex").where("Date=date")
upli = db.liveTable("DbInternal", "UpdatePerformanceLogCommunityIndex").where("Date=date")
```

Python:

```python
from deephaven.time import format_date, now, time_zone

date = format_date(now(), time_zone("ET"))
qpli = db.live_table("DbInternal", "QueryPerformanceLogCommunityIndex").where("Date=date")
qopli = db.live_table("DbInternal", "QueryOperationPerformanceLogCommunityIndex").where("Date=date")
ssli = db.live_table("DbInternal", "ServerStateLogCommunityIndex").where("Date=date")
upli = db.live_table("DbInternal", "UpdatePerformanceLogCommunityIndex").where("Date=date")
```

Common to Core+ and Enterprise

See Log Index Tables for details regarding Enterprise Index Tables.