Unexpected results
Although your query may not have crashed, you may still get unexpected results. Debugging these situations takes advantage of the Deephaven query engine's composable nature. The most effective way to understand why your query is not producing the desired results is to examine the results of each query operation independently by assigning intermediate results to variables.
Consider the following query that shows the in-use heap for each Persistent Query (PQ) at a specific timestamp:
stateLog = db.liveTable("DbInternal", "PersistentQueryStateLog").lastBy("SerialNumber")
withState = stateLog.where("Status in `Running`, `Initializing`, `AcquiringWorker`, `Connecting`, `Stopping`, `Executing`").view("SerialNumber", "Name", "VersionNumber", "Status")
withConfiguration = withState.naturalJoin(db.liveTable("DbInternal", "PersistentQueryConfigurationLogV2").lastBy("SerialNumber", "VersionNumber"), "SerialNumber,VersionNumber", "HeapSizeInGB")
state_log = db.live_table("DbInternal", "PersistentQueryStateLog").last_by(
"SerialNumber"
)
with_state = state_log.where(
"Status in `Running`, `Initializing`, `AcquiringWorker`, `Connecting`, `Stopping`, `Executing`"
).view(["SerialNumber", "Name", "VersionNumber", "Status"])
with_configuration = with_state.natural_join(
db.live_table("DbInternal", "PersistentQueryConfigurationLogV2").last_by(
["SerialNumber", "VersionNumber"]
),
"SerialNumber,VersionNumber",
"HeapSizeInGB",
)
This produces the following table in a Code Studio:
Notice that the HeapSizeInGB
column contains a null
value for a running query. We expect that every running query has a heap size. To debug this issue, we break the query into constituent operations to find the first operation that does not contain our expected data. Here is an equivalent query that splits out the PersistentQueryConfigurationLogV2
into a separately named variable called config_log
:
stateLog = db.liveTable("DbInternal", "PersistentQueryStateLog").lastBy("SerialNumber")
withState = stateLog.where("Status in `Running`, `Initializing`, `AcquiringWorker`, `Connecting`, `Stopping`, `Executing`").view("SerialNumber", "Name", "VersionNumber", "Status")
configLog = db.liveTable("DbInternal", "PersistentQueryConfigurationLogV2").lastBy("SerialNumber", "VersionNumber")
withConfiguration = withState.naturalJoin(configLog, "SerialNumber,VersionNumber", "HeapSizeInGB")
state_log = db.live_table("DbInternal", "PersistentQueryStateLog").last_by(
"SerialNumber"
)
with_state = state_log.where(
"Status in `Running`, `Initializing`, `AcquiringWorker`, `Connecting`, `Stopping`, `Executing`"
).view(["SerialNumber", "Name", "VersionNumber", "Status"])
config_log = db.live_table("DbInternal", "PersistentQueryConfigurationLogV2").last_by(
["SerialNumber", "VersionNumber"]
)
with_configuration = with_state.natural_join(
config_log, "SerialNumber,VersionNumber", "HeapSizeInGB"
)
From this table, we can see that "Sleep on Startup" is not present, so the error is that config_log
does not have the desired data. There is no filter for the table, but only data for today's date is present. The easiest way to see that there is only one value is to hover over the Date column header in the UI to display the statistics, which confirms there is only one unique value:
The live_table
method we used for retrieving the DbInternal.PersistentQueryConfigurationLogV2
table only contains data served by the Data Import Server, not data that has been merged to historical storage. If we combine the historical configuration data with live data, then we achieve the desired result:
stateLog = db.liveTable("DbInternal", "PersistentQueryStateLog").lastBy("SerialNumber")
withState = stateLog.where("Status in `Running`, `Initializing`, `AcquiringWorker`, `Connecting`, `Stopping`, `Executing`").view("SerialNumber", "Name", "VersionNumber", "Status")
configLog =merge(db.historicalTable("DbInternal", "PersistentQueryConfigurationLogV2"), db.liveTable("DbInternal", "PersistentQueryConfigurationLogV2")).lastBy("SerialNumber", "VersionNumber")
withConfiguration = withState.naturalJoin(configLog, "SerialNumber,VersionNumber", "HeapSizeInGB")
from deephaven import merge
state_log = db.live_table("DbInternal", "PersistentQueryStateLog").last_by(
"SerialNumber"
)
with_state = state_log.where(
"Status in `Running`, `Initializing`, `AcquiringWorker`, `Connecting`, `Stopping`, `Executing`"
).view(["SerialNumber", "Name", "VersionNumber", "Status"])
config_log = merge(
[
db.historical_table("DbInternal", "PersistentQueryConfigurationLogV2"),
db.live_table("DbInternal", "PersistentQueryConfigurationLogV2"),
]
).last_by(["SerialNumber", "VersionNumber"])
with_configuration = with_state.natural_join(
config_log, "SerialNumber,VersionNumber", "HeapSizeInGB"
)
Rerunning the query with the historical data included results in a value for our heap size.