Query Monitor
The Query Monitor is used to manage and create persistent queries on the web. Persistent queries operate just like console-based queries. However, persistent queries are created and saved, thereby enabling the automation of repetitive and/or timed operations. Also, persistent queries can be shared with other individuals or teams.
The Query Monitor Table stores information about each query you are authorized to view, and the panel on the right displays summary information about the highlighted query, as well as tabs for configuration options and the query's script.
Tables and plots generated by Persistent Queries are available in Code Studios and Dashboards from the Panels menu.
Users can also open any Persistent Query script available to them in a new Notebook panel using the PQ Explorer in the Code Studio.
Query Monitor Table
The Query Monitor table contains information about all of the persistent queries you are authorized to view in Deephaven. To open a Query Monitor, click the New button, then the New Query Monitor button.
In the Query Monitor, you can view, manage, and create persistent queries. The Query Monitor table below provides information about each persistent query you are authorized to view, including the owner's name, the query name, whether it is enabled, its status and type, etc. The content in the table can be sorted and filtered like any other table. You can sort through these queries by clicking on the column name - once to sort in ascending order, twice to sort in descending order. Right-clicking on a column header will open the Deephaven header context menu, with options to hide, filter, or sort columns. Right-clicking on an individual cell opens the Deephaven cell data context menu, which allows you to filter the table by that cell's value, or copy data.
Note
Depending on the user's permissions, the Query Monitor may show three queries that help Deephaven run properly: ImportHelperQuery, RevertHelperQuery, and WebClientData.
- The Import Helper query assists with import, merge, and validation queries.
- The Revert Helper query assists Deephaven when a query is reverted to a previous version.
- The Web Client Data query must be running for the Deephaven web console to function. If this query is stopped, a Restart WebClientData button will appear for any superuser. The button allows you to restart the query and thus reinitialize your console session.
Query Monitor Buttons
The three buttons at the top allow users to create new queries, and start or stop existing queries.
Note that the Restart button will restart all selected queries, whereas the Start button only starts queries with Non-Running Status:
- Uninitialized
- Stopped
- Error
- Failed
- Disconnected
- Completed
- None (null)
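The difference between the two buttons can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Deephaven code: the function names, the "Running" status label, and the sample query names other than DemoQuery are assumptions, and None stands in for a null status.

```python
# Statuses the documentation lists as Non-Running; None represents a null status.
NON_RUNNING_STATUSES = {
    "Uninitialized", "Stopped", "Error", "Failed",
    "Disconnected", "Completed", None,
}

def queries_to_start(selected):
    """Start acts only on selected queries whose status is non-running."""
    return [name for name, status in selected if status in NON_RUNNING_STATUSES]

def queries_to_restart(selected):
    """Restart acts on every selected query, whatever its status."""
    return [name for name, _ in selected]

# Hypothetical selection of (query name, status) pairs.
selected = [("DemoQuery", "Running"), ("NightlyImport", "Failed"), ("NewDraft", None)]
print(queries_to_start(selected))
print(queries_to_restart(selected))
```

In this sketch, Start would skip DemoQuery because it is already running, while Restart would act on all three selected queries.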
Convenience Filters
Users can take advantage of convenience filters built into the header of the Query Monitor.
Clicking on a displayed status will filter the Status column for that status. Multiple statuses can be selected:
Below the statuses, the Quick Filter bar includes the following search fields:
- Owner / Name
- My Queries
- Query Type
- Worker Host
- DB Server Name
- Drafts
- Enabled Only
Query Monitor Table Data Menu
The Query Monitor table data menu includes several options to start and stop your queries, as well as to copy the selected query's script into a new persistent query, or delete the selected query. Right-click within the table data to open this menu:
Query Summary / Configuration Panel
The panel at the right of the Query Monitor displays summary information about the highlighted query, as well as tabs for configuration options and the query's script. These tabs will be discussed in detail in the subsequent sections.
Query Monitors persist in the All Dashboard menu, and will retain any filters or sorts applied to the Query Monitor table, as well as any Markdown Widgets associated with the panel.
To rename, close, or make an identical copy (including applied filters and sorts) of the Query Monitor panel, right-click its tab, as shown below:
Query Summary/Configuration
When a query name is selected in the table, the panel on the right side of the Query Monitor will reveal further information about that query. This information is presented in the following five tabs: Summary, Settings, Permissions, Scheduling, and Script.
Note
If you are not authorized to edit the query, only the query summary will be available.
Summary
In the example below, the Summary tab displays details for the highlighted query named "DemoQuery", presenting the same information as in the Query Monitor table on one easy-to-read screen. Details include the query's start and end time, the last time the query was updated or modified, worker settings, scheduling information, its permissions, as well as full exception details when applicable.
Creating Persistent Queries
To create a new Persistent Query, select the New button at the top of the Query Monitor.
This opens a new instance of the Persistent Query Configuration Editor.
Settings
Configuration Fields
The Settings tab includes the following configuration fields:
Enabled
This button indicates whether the query is enabled. When Enabled is set to "On", the query will attempt to run according to its schedule. If you do not want the query to run, toggle the button to "Off".
Name
Query names can be any combination of characters and numbers. However, each query in a given installation of Deephaven must have a unique name.
Type
This is the query configuration type. At this time, only the BatchQuery and LiveQuery configuration types are available in Deephaven on the web.
DB Server
The DB Server setting allows you to select the database server associated with your installation of Deephaven and the chosen query configuration type.
Server classes determine what types of queries can be run on each server, and are used by the console to determine which database servers a user can select for each query type.
A Deephaven installation has two server classes configured by default:
- Merge - this class is for servers and queries that need to load data into the database, either intraday (such as the import queries), or historical (the merge query).
- Query - this class is for queries that query data but do not need to write it, such as scripts.
When creating or editing a persistent query, the Persistent Query Configuration Editor will automatically populate the DB server field with appropriate options: e.g., "Query_1" for Live Query script queries.
Heap Size
How much memory (in GB) to provision for the query.
Show Advanced
This opens advanced settings for the query:
- Data Memory Ratio - Data Memory Ratio is a memory tuning parameter that specifies what percent of the memory heap is reserved for caching data read from persistent storage. This enables Deephaven to cache data frames into RAM on an as-needed basis, which in turn speeds up the processing. The default value of ".25" in that field means 25% of the memory noted in the setting for Memory (Heap) Usage is allocated to the database buffer cache.
- JVM Profile - The Java Virtual Machine (JVM) contains a garbage collector to automatically free unreachable memory. There are multiple garbage collection algorithms, and many tuning parameters available for each algorithm. See Remote Processing Profiles for more information. The JVM Profile drop-down menu provides the following options:
- Default - use the default garbage collection parameters for your Deephaven system. This defaults to CMS GC, but the value can be changed by the Deephaven administrator.
- CMS GC - use Java's CMS (Concurrent Mark Sweep) garbage collection
- G1 GC - use Java's newer G1 (Garbage First) garbage collection
- G1 MarkStackSize 128M - use the Custom MarkStackSize profile
- None - do not use any garbage collection parameters. Desired garbage collection parameters must be manually defined in the query's "Extra JVM Args" settings (see next topic).
- Log GC Details - Garbage Collection (GC) is a JVM memory management program that frees unreachable memory by getting rid of objects not being used by a Java application. When the check box is selected (the default setting), Garbage Collection information will be included in the server-side dispatcher logs.
- Extra JVM Arguments - This field allows users to access Java utilities that are not included in the core Deephaven installation. For example, one may want to run a different profiler or debugging processor. These items can be included in the Deephaven configuration by typing the extra JVM arguments in this field.
- Extra Environment Vars - Extra Environment Variables are used to pass additional configuration information to Deephaven.
- Extra Classpaths - The Extra Classpaths field is used to tell Deephaven where to look on the file system (server) for additional class files.
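As a worked example of the Data Memory Ratio setting described above, the following Python sketch computes how much of the heap would be set aside for the database buffer cache. The function name and the 0 to 1 range check are assumptions made for illustration; only the 0.25 default and the "ratio times heap" relationship come from the description above.

```python
def buffer_cache_gb(heap_gb: float, data_memory_ratio: float = 0.25) -> float:
    """Return the share of the heap (in GB) reserved for caching persisted data."""
    # Assumption: the ratio is a fraction of the heap, so it must lie in [0, 1].
    if not 0.0 <= data_memory_ratio <= 1.0:
        raise ValueError("data_memory_ratio must be between 0 and 1")
    return heap_gb * data_memory_ratio

# An 8 GB heap with the default ratio of 0.25.
print(buffer_cache_gb(8))
```

With an 8 GB heap and the default ratio, 2 GB would be allocated to the buffer cache, leaving 6 GB for the rest of the query's working memory.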
Buttons
At the bottom of the Settings tab are the following buttons:
Delete
This deletes the persistent query.
Copy
This copies the configuration settings and the script, which creates a draft of a new, identical query in the Query Monitor table.
Revert
This will undo any changes made since saving the settings, and restore the settings to the last saved version.
Save
This saves the current configuration settings.
Permissions
The Permissions tab opens the Access Control Settings, where users or user groups can be authorized to view and/or restart queries.
When creating a new query, your username is automatically added as the Query Owner for that query.
Note
This field can be edited by superusers only.
To add additional viewers to the query, select the Enter Name field. Type or select a name from the list, and click Add.
Once the user or user group has been added to the authorized list, you can assign Viewer or Admin privileges using the drop-down next to the name.
As the name implies, a viewer will be able to view the name of the query and its associated tables and plots, but they will not be able to edit the query code nor start/stop the query (unless granted permission).
To the right of the authorized user list, you can choose which group(s) can restart the query:
Scheduling
The Scheduling tab is where you set the conditions upon which the query will run.
Schedule Type
The first section in the panel allows you to set the schedule for your query and set the respective parameters based on the scheduling option chosen. These options include:
Daily
When Daily is selected, the top right part of the panel will show options for each day of the week. You can click on each day individually to include them in or remove them from the schedule, or you can select Weekdays or Toggle All to update the options accordingly. If Business Days is selected, the appropriate business calendar from the adjacent drop-down menu will also need to be selected.
If Live Query is selected as the query configuration type, the middle of the panel will show a range slider for start and end time, as well as Timezone options and the Overnight checkbox. To adjust the start time and end time, you can drag and drop the sliders to the desired position, or type values in directly.
If Batch Query is selected as the query configuration type, the middle of the panel will show the Repeat on Interval checkbox, and Start Time and Timezone options.
The Repeat on Interval setting enables you to set a timed interval (in minutes) for when the query should run again. When selected, the following options also appear:
Selecting Skips Repeats If Unsuccessful tells Deephaven to skip the Repeat on Interval process if the query did not run to completion.
Monthly
When Monthly is selected, the top right part of the panel will show options for each month. The bottom of the Schedule Type section shows options for Specific Days of the Month. There are also options for First Business Day and Last Business Day, which require an appropriate business-calendar to be selected from the accompanying drop-down menu.
If Live Query is selected as the query configuration type, the middle of the panel will show a range slider for start and end time, as well as Timezone options and the Overnight checkbox. To adjust the start time and end time, you can drag and drop the sliders to the desired position, or type values in directly.
If Batch Query is selected as the query configuration type, the middle of the panel will show the Repeat on Interval checkbox, and Start Time and Timezone options.
The Repeat on Interval setting enables you to set a timed interval (in minutes) for when the query should run again. When selected, the following options also appear:
Selecting Skips Repeats If Unsuccessful tells Deephaven to skip the Repeat on Interval process if the query did not run to completion.
Dependent
Dependent scheduling means the running of this query is dependent upon another query. The first part of the panel prompts you to select a query and then configure the conditions on which the dependent query should run.
In the Dependency field, select the appropriate query. To select one or more additional dependencies, click the Add button at the bottom of the drop-down.
Additional dependencies can be deleted by clicking the trash can icon next to the query name:
The panel provides options to configure when to run or repeat the query.
The Run if field provides the following options:
- Any dependency succeeds - This means the dependent query will run when any one of the selected queries has successfully completed.
- Any dependency fails - This means the dependent query will run if any of the selected queries tries to run but fails.
- All dependencies succeed - This means the dependent query will run when all of the selected queries have successfully completed.
- All dependencies fail - This means the dependent query will run if all of the selected queries tried to run but failed.
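The four Run if conditions map naturally onto "any" and "all" checks. The Python sketch below is an illustration only, not Deephaven's scheduler: the function name and the outcome dictionary are hypothetical, and it assumes every selected dependency has already finished an attempt (True for success, False for failure).

```python
from typing import Dict

def should_run(run_if: str, outcomes: Dict[str, bool]) -> bool:
    """Decide whether the dependent query runs, given each dependency's outcome."""
    results = list(outcomes.values())
    if not results:
        # Assumption: with no dependencies selected there is nothing to trigger on.
        return False
    rules = {
        "Any dependency succeeds": any(results),
        "Any dependency fails": any(not ok for ok in results),
        "All dependencies succeed": all(results),
        "All dependencies fail": all(not ok for ok in results),
    }
    return rules[run_if]

# Hypothetical dependencies: one succeeded, one failed.
outcomes = {"ImportA": True, "ImportB": False}
for option in ("Any dependency succeeds", "Any dependency fails",
               "All dependencies succeed", "All dependencies fail"):
    print(option, "->", should_run(option, outcomes))
```

With one success and one failure, both "Any" options would trigger the dependent query and both "All" options would not.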
The Repeat field provides the following options:
- Run only once
- Run each time conditions are met
The Restart... checkbox will restart the dependent query upon the selected trigger condition if it is already running.
The Delay start... checkbox will re-enable Query Availability options for the dependent query's configuration type. The query will not start until its dependent condition is met AND the start time has passed. If Overnight is checked, the live query will be scheduled the same way as a typical overnight query.
Temporary
Note
This schedule type currently applies to Batch Queries only.
A Temporary query is one that runs one time based on available resources and once completed, permanently deletes itself after a set time. These are especially useful when copying queries that import or merge data, as a temporary query can be defined to load data into a specific partition.
When Temporary is selected, the panel will present two drop-down menus:
- Temporary Queue - the name of the temporary queue that will run the query. Temporary queues are allocated by the system administrator and will run their queries when resources are available.
- Deletion Delay - the length of time before this query will delete itself once it has been run (whether the run was successful or not).
If you choose to add an optional dependency, you will be prompted to select a query as shown:
Selecting a dependency means that the query will only run after the chosen query completes successfully. A temporary query can only be dependent on another temporary query, not a query with another scheduling type. The trash can icon can be used to delete dependencies.
Continuous
Note
This schedule type currently applies to Live Queries only.
Continuous scheduling is an option for queries that would normally have an end time, such as Live Query (Script). With continuous scheduling, the query does not have a defined start and stop time, but is instead defined to run continuously, with an optional daily restart.
If the Restart Daily option is chosen, then a restart time must be selected. The query will be stopped and restarted at this time every day.
Disabled
When Disabled is selected, the rest of the options in this panel will become unselectable. You will then need to start and stop the query manually.
Overnight Scheduling
Overnight scheduling is an option for queries that require an end time and are scheduled to run Daily or Monthly.
When Overnight is selected, the day(s) selected under Daily/Monthly Options (days of the week or calendar days) apply to when the query starts. Also, the Start Time selected for the query must be later than the End Time.
For example, if an overnight query is scheduled to run on Monday and Tuesday from 17:00 - 16:00, the query would run on the following schedule every week:
- It will start Monday at 17:00
- It will stop Tuesday at 16:00
- It will start Tuesday at 17:00
- It will stop Wednesday at 16:00
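The example above can be reproduced with a short Python sketch. It is a simplified model, not Deephaven's scheduler: the function name and parameters are hypothetical, time zones are ignored, and the chosen week (starting Monday, January 1, 2024) is arbitrary.

```python
from datetime import date, datetime, time, timedelta

def run_windows(week_start: date, weekdays, start: time, end: time, overnight: bool):
    """Return (start, stop) datetimes for each selected weekday (0 = Monday)."""
    if overnight and not start > end:
        raise ValueError("overnight schedules need a start time later than the end time")
    windows = []
    for offset in range(7):
        day = week_start + timedelta(days=offset)
        if day.weekday() in weekdays:
            # The selected day is the START day; an overnight query stops the next day.
            stop_day = day + timedelta(days=1) if overnight else day
            windows.append((datetime.combine(day, start), datetime.combine(stop_day, end)))
    return windows

monday = date(2024, 1, 1)  # an arbitrary Monday
for begin, stop in run_windows(monday, {0, 1}, time(17, 0), time(16, 0), overnight=True):
    print(begin.isoformat(), "->", stop.isoformat())
```

Each selected day contributes one window that begins on that day and ends on the following day, which is why a Monday and Tuesday selection produces a stop on Wednesday.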
Timeout
For Live Queries, the Initialization Timeout setting enables you to set a period of time that the query is allowed to spend initializing; if it does not hit the Running state within that period, it will be stopped. For Live Queries, this defaults to No Timeout (or a value of 0), which means no timeout is enforced. You may want to change this setting if you want to limit the amount of time your query is allowed to spend initializing.
For Batch Queries, which require a maximum run time, this means how long the query can execute, and a value of 0 is not allowed. The Scheduling dialog will prompt you to select an appropriate value. A query that executes for longer than this timeout is stopped.
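The different treatment of a zero timeout for the two query types can be summarized as a small validation routine. The Python sketch below is illustrative only: the function names are hypothetical, the BatchQuery and LiveQuery labels are taken from the Type setting, and expressing the timeout in seconds is an assumption, since the unit is not stated here.

```python
def validate_timeout(query_type: str, timeout_seconds: int) -> int:
    """Check a timeout value against the rules for the given query type."""
    if timeout_seconds < 0:
        raise ValueError("timeout cannot be negative")
    if query_type == "BatchQuery" and timeout_seconds == 0:
        # Batch Queries require a maximum run time, so 0 is rejected.
        raise ValueError("Batch Queries require a non-zero maximum run time")
    return timeout_seconds

def timeout_enforced(timeout_seconds: int) -> bool:
    """A value of 0 means No Timeout, so nothing is enforced."""
    return timeout_seconds > 0

print(timeout_enforced(validate_timeout("LiveQuery", 0)))
print(timeout_enforced(validate_timeout("BatchQuery", 3600)))
```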
Click the drop-down menu to choose a preset option or enter a Custom Timeout value:
Error Restart Attempts
The Error Restart Attempts drop-down allows the query to automatically restart if it fails. An optional delay in minutes between restarts can be entered; if a query fails, the next restart attempt will be delayed by this time. The error count resets next time the query is started due to its scheduling, or 24 hours after the last failure.
If no delay is specified and a query has attempted to restart more than 10 times since the last time the error count was reset, the query will wait for a minute between retry attempts. This is to prevent a rapidly-failing query from consuming excessive system resources.
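The delay rule described above can be expressed as a small function. This Python sketch is a reading of the two paragraphs above rather than Deephaven's implementation: the function and parameter names are hypothetical, and treating a delay of 0 the same as "no delay specified" is an assumption.

```python
from typing import Optional

def restart_delay_minutes(configured_delay: Optional[float], attempts_since_reset: int) -> float:
    """Minutes to wait before the next automatic restart attempt."""
    # A user-specified delay always applies.
    if configured_delay is not None and configured_delay > 0:
        return float(configured_delay)
    # Without a delay, throttle to one minute once more than 10 restarts
    # have been attempted since the error count was last reset.
    if attempts_since_reset > 10:
        return 1.0
    return 0.0

print(restart_delay_minutes(5, 3))
print(restart_delay_minutes(None, 10))
print(restart_delay_minutes(None, 11))
```

The throttle only applies when no delay has been configured, so a rapidly failing query slows down after its tenth restart attempt instead of retrying immediately.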
Script
The Script tab allows you to write a new persistent query script, or - if you have the appropriate permissions - edit an existing script.
Runtime
Choose your preferred programming language, Groovy or Python.
Use Git For Script Source
While query scripts are often written and stored within the Script Editor window in Deephaven, you can also store your query script in a Git Repository that is accessible to Deephaven. When checked, you can select from the drop-down menu the appropriate script in that repository to associate with this persistent query.
Queries are read-only in the UI while the Use Git For Script Source checkbox is enabled.
Note
Please consult with your system administrator to see if access to a Git Repository is possible for your installation. See also Integrating Git with Deephaven.
Revert
This will undo any changes made since the previous save, and restore the script to the last saved version of the query.
Save
This button saves the current query configuration.
Caution
For new queries, you will need to click the Start button at the top of the Query Monitor to begin initialization.
Editing Persistent Queries
If an existing query has been edited, clicking the drop-down arrow within the Save button provides options to restart the query immediately or apply the changes upon the next scheduled restart. This option will only be available as long as the following are true:
- The Owner has not changed.
- The query type has not changed.
- The scheduler and scheduler properties have not changed.
- The Admin Groups have not been removed.
- The query is not a Temporary query.
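The conditions above amount to comparing the saved configuration with the edited one. The following Python sketch shows one way to model that check. The QueryConfig class, its fields, and the sample values are hypothetical, and reading "Admin Groups have not been removed" as "every previously saved admin group is still present" is an assumption.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class QueryConfig:
    owner: str
    query_type: str
    schedule: tuple           # scheduler plus its properties, e.g. ("Daily", "09:00")
    admin_groups: frozenset
    temporary: bool = False

def restart_options_available(saved: QueryConfig, edited: QueryConfig) -> bool:
    """True when the Save drop-down would offer the restart choices."""
    return (
        saved.owner == edited.owner
        and saved.query_type == edited.query_type
        and saved.schedule == edited.schedule
        and saved.admin_groups <= edited.admin_groups  # no admin group removed
        and not edited.temporary
    )

saved = QueryConfig("alice", "LiveQuery", ("Daily", "09:00"), frozenset({"ops"}))
print(restart_options_available(saved, replace(saved)))
print(restart_options_available(saved, replace(saved, owner="bob")))
```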