Persistent Query Controller configuration

Persistent queries are one of the core functions in Deephaven. These queries are defined by a user through the Deephaven console and then stored for future use. The Persistent Query Controller process stores all persistent queries and is responsible for starting and stopping them at the appropriate times.

Note

All further references to "controller" in this document refer to the Persistent Query Controller.

Controller configuration exists in both property files and XML files. The base property file for any Deephaven process is specified by the Configuration.rootFile property. XML files have their locations specified differently, which is explained later.

See the Persistent Query Controller runbook for further details.

Some aspects of the controller's configuration can be reloaded dynamically, without restarting the controller. These include the list of query and merge servers, the list of temporary queues, new JVM profiles, and the configuration types. To issue a reload command, see the Controller Configuration Reload section of the Persistent Query Controller tool documentation. The user issuing the command must belong to a group defined by the property:

configuration.reload.userGroups

By default, only superusers can issue reload commands.

Query and Merge Servers

All persistent queries must run on a database server, and the controller must have details on these servers. Each server is a Remote Query Dispatcher process which listens on a specific port for connections from the controller. Based on a query type and the defined server classes, the controller should refuse to start queries on an inappropriate query server.

These servers (dispatchers) are defined by a set of properties that the controller reads on startup. This list of available servers may need to be updated dynamically - for example, to add a new server or to change the address of a failed server - and is one of the configuration pieces that can be dynamically reloaded with the controller tool's reload option.

The list of query servers is always started by defining the number of available servers:

iris.db.nservers=<N>

This is followed by a list of server properties in the format:

iris.db.<server number>.<property>=<value>

where the server numbers run from 1 to N, the value defined by iris.db.nservers. The following properties can be defined for each server:

  • iris.db.<server number>.host - This required value defines the host name or IP for the server.
  • iris.db.<server number>.port - The port on which the dispatcher is listening. If not defined, the default port from the property RemoteQueryDispatcherParameters.queryPort is used; this usually points to 22013.
  • iris.db.<server number>.class - The server class, usually Query or Merge. If not defined, it uses the value from the iris.db.defaultServerClass property, which defaults to Query.
  • iris.db.<server number>.name - A name for the server, displayed in the console and stored with the queries. If it is not provided, it will be automatically generated by using <server class>_<number>, where the number starts with 1 and increments by 1 for each server of a given class.
  • iris.db.<server number>.consoleGroups - an optional list of ACL groups to which a user must belong for this server to be visible in the server list for console and Code Studio creation. In combination with RemoteQueryDispatcher.allowedGroups, this can be used to restrict the ability to start consoles (or web Code Studios) on dispatchers. If it is not provided, then no ACL restrictions are enforced in the console-creation dialogs.

For example, a basic configuration with one query server and one merge server, each running on the same local host, might look like the following. Since the first server is a query server listening on the default dispatcher port, only the host needs to be specified:

iris.db.nservers=2

iris.db.1.host=localhost
# default server name will be Query_1

iris.db.2.host=localhost
iris.db.2.port=30002
iris.db.2.class=Merge
# default server name will be Merge_1
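
The defaulting rules above can be pictured with a short Python sketch. This is only an illustration of the documented rules, not the controller's code: the function name resolve_servers is hypothetical, the default port 22013 is the usual value mentioned above, and the sketch assumes that the per-class counter used for generated names advances for every server of a class, including servers that have an explicit name.

```python
from collections import defaultdict

DEFAULT_PORT = 22013  # usual value of RemoteQueryDispatcherParameters.queryPort


def resolve_servers(props, default_class="Query", default_port=DEFAULT_PORT):
    """Expand iris.db.<n>.* properties into one dict per server (illustrative only)."""
    count = int(props["iris.db.nservers"])
    per_class_counter = defaultdict(int)
    servers = []
    for n in range(1, count + 1):
        prefix = f"iris.db.{n}."
        host = props[prefix + "host"]  # required; a missing host raises KeyError
        server_class = props.get(prefix + "class", default_class)
        # Assumption: the counter advances for every server of the class.
        per_class_counter[server_class] += 1
        generated_name = f"{server_class}_{per_class_counter[server_class]}"
        servers.append({
            "host": host,
            "port": int(props.get(prefix + "port", default_port)),
            "class": server_class,
            "name": props.get(prefix + "name", generated_name),
        })
    return servers


# The two-server configuration shown above, as a plain dictionary.
example = {
    "iris.db.nservers": "2",
    "iris.db.1.host": "localhost",
    "iris.db.2.host": "localhost",
    "iris.db.2.port": "30002",
    "iris.db.2.class": "Merge",
}

for server in resolve_servers(example):
    print(server)
```

Running the sketch against the example prints one entry named Query_1 on the default port and one named Merge_1 on port 30002, matching the comments in the property file.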

Console Server Classes

The server class referred to by the iris.db.<server number>.class property can also refer to a console-specific server class. These classes are for servers that should be available to appropriately privileged users in their interactive consoles, but not available for persistent queries.

Server classes are defined by adding properties that must be available to the persistent query controller, as shown below:

ConsoleServerClass.<server class>.allowedGroups=<ACL groups allowed>

For example, the following property defines a Historical query class that is available to users in the HistoricalQuery group:

ConsoleServerClass.Historical.allowedGroups=HistoricalQuery

Users can be added to the defined console server group by using the User/Groups tab in the ACL Editor. Adding users to the group ensures they have access to these servers in the console's Query Server box. To make the server available for everybody, use the "allusers" group as the ACL group in the server class property.

Dispatcher Group Restrictions

Remote Query Dispatchers (query servers and merge servers) can be restricted so that only clients that are members of specified ACL groups are allowed to start workers on them. On the dispatcher, use the following property to specify which groups can start workers:

RemoteQueryDispatcher.allowedGroups=<comma-delimited list of ACL groups>

If a client is not a member of one of these groups, it won't be able to start workers on this dispatcher. This may be useful even if GUI restrictions are in place, since authenticated clients can still directly ask a dispatcher to start a worker.

Note

If this property is used, and the group iris-superusers is not in the list of groups, persistent queries won't be able to start on that dispatcher.

See Query and Merge Servers for details on adding ACL group restrictions to the list of available servers when users create interactive consoles.

Example Configuration

The following example shows the properties for a 3-dispatcher configuration. The properties iris.db.<n>.host, iris.db.<n>.port, and iris.db.<n>.name are not shown here, as they will be different for each configuration.

The iris.db.<n> properties are usually in iris-endpoints.prop, and the other properties are usually in iris-environment.prop.

Server 1 is a query server with no restrictions. Anybody can create and start persistent queries and consoles on this server. In the iris_controller stanza, the following property may be defined for clarity, although it is optional because Query is the default server class:

  iris.db.1.class=Query

Optionally define a name that the dispatcher understands. If this is not defined, the dispatcher will use a default value.

[service.name=query_server] {
  RemoteQueryDispatcherParameters.name=query_server_1
}

Server 2 is a merge server. As well as standard Merge-class restrictions, it only allows users in the merge-console-group ACL group (as well as superusers) to create consoles. In the iris_controller stanza:

  iris.db.2.class=Merge
  iris.db.2.consoleGroups=merge-console-group,iris-superusers

In the stanza for the merge server (usually dbmerge) add a restriction to prevent anybody who's not in the worker start groups from starting a worker. The dispatcher is also given a unique name here. Since restrictions are being added, create a second group (merge-worker-group) to which to add users allowed to start persistent queries. iris-schemamanagers is included because it's the default allowed-group for merge servers.

  RemoteQueryDispatcher.allowedGroups=iris-schemamanagers,merge-console-group,merge-worker-group
  RemoteQueryDispatcherParameters.name=merge_server

Server 3 is another query server. It is intended for persistent queries, and only superusers and members of the query-console-group ACL group may create consoles on this server, as well as user2 since it is specifically listed. In the iris_controller stanza:

    iris.db.3.consoleGroups=query-console-group,user2,iris-superusers

For the second query server (server 3), only specific groups should be allowed to start workers, so create a second group (query-worker-group) for users who are allowed to start persistent queries. Assuming that this query server has the service name query_server2:

[service.name=query_server2] {
  RemoteQueryDispatcher.allowedGroups=query-console-group,query-worker-group,user2,iris-superusers
  RemoteQueryDispatcherParameters.name=query_server_2
}

Set up three users with the following ACL group memberships:

  • user1 - iris-dataimporters, merge-worker-group, query-worker-group
  • user2 - query-console-group, query-worker-group
  • user3 - merge-console-group, merge-worker-group, query-worker-group

Following are the available privileges for these users.

  • user1:
    • Create and run merge persistent queries of the types that iris-dataimporters can create (Live Query - Merge Server, In-Worker Service, Batch Query - Import Server, and several Import - PQ types); for these, the only available server in the DB Server list should be the merge server
    • Create and run batch queries or live queries on either query server
    • Should only have the option to start consoles on Query_Server_1 since they’re not in either of the console groups
  • user2:
    • Create and run batch queries or live queries on either query server
    • Create consoles on either query server
    • Should not be able to do anything on the merge server
  • user3:
    • Create and run batch queries or live queries on either query server
    • Create consoles on Query_Server_1 and Merge_Server_1

Automated Server Selection

The persistent query controller provides for automated server (dispatcher) selection through the use of a Server Selection Provider (SSP), for load-balancing and worker distribution among servers. The SSP provides a list of additional group names to the controller; these groups are available for a user to pick when they're selecting the server on which to run a persistent query or console.

The customer can create their own SSP to implement an algorithm of their choice. An SSP must implement the com.illumon.iris.controller.IServerSelectionProvider interface, including the required constructor. Full documentation of the required methods is provided in the IServerSelectionProvider class. A simple implementation is supplied in the com.illumon.iris.controller.SimpleServerSelectionProvider class.

A provider is specified by using the PersistentQueryController.ServerSelectionProvider property. For example:

PersistentQueryController.ServerSelectionProvider=com.illumon.iris.controller.SimpleServerSelectionProvider

Dispatcher Configuration

For server selection logic to work correctly, it is important to have each dispatcher's available heap configured correctly, as that's what the algorithms use. Each dispatcher uses the property RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB to determine the maximum amount of heap all of its workers are allowed to use. For installations where each dispatcher runs on its own dedicated server, this will typically be most of that server's memory; for installations where multiple dispatchers run on a single server, each dispatcher should have a reasonable amount configured. For example, the following stanza specifies that the merge servers should be allocated 40GB of heap for workers:

[service.name=dbmerge] {
  RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=40960
}

In addition, each dispatcher can specify the maximum allowed heap for a worker through the property RemoteQueryDispatcher.maxPerWorkerHeapMB.

Simple Server Selection Provider

The simple server selection provider uses a basic resource-comparison algorithm, comparing the percentage of heap utilization on each server, and using the number of running workers on the servers if there is a tie. The simple server selection provider is driven by the following properties.

The following property defines the groups for the provider. It is a comma-delimited list of group names; these names will be used to derive the rest of the properties:

SimpleServerSelectionProvider.ActiveGroups

The following example defines two groups called AutoQuery and AutoMerge:

SimpleServerSelectionProvider.ActiveGroups=AutoQuery,AutoMerge

The simple server selection provider then uses these group names to find the other properties:

SimpleServerSelectionProvider.Group.<group name>.<property name>=<value>

The following table defines the properties.

Property Suffix | Meaning
ServerClass | Specifies the server class for this group. Only servers of this class will be chosen when a server is requested. Typical values are Query and Merge. If this is not supplied, then the value specified by the property iris.db.defaultServerClass will be used, which is typically Query.
Servers | Specifies the servers in this group. If it's not specified, all servers of the specified class will be in this group.
MaxHeapMBPerWorker | Specifies the maximum heap in MB allowed for a worker for this group. If it's not specified or is 0, then the value specified in this group's DefaultHeapMBPerServer property will be used.
DefaultHeapMBPerServer | Specifies the default total heap for each server for this group. This will be updated by the controller after it gets a connection to each dispatcher with the dispatcher's real configured value.
ConsoleGroups | If defined, specifies a list of ACL groups. A user must belong to one of these groups to specify this server group when starting a console or Code Studio.

Here is an example configuration:

# Use the SimpleServerSelectionProvider
PersistentQueryController.ServerSelectionProvider=com.illumon.iris.controller.SimpleServerSelectionProvider

# Specify two groups, named AutoQuery and AutoMerge
SimpleServerSelectionProvider.ActiveGroups=AutoQuery,AutoMerge

# Properties for the AutoQuery group:
# It will use all available servers of the Query class
# The default total heap MB per server is 65536
# No workers can use this group if they request over 32768M in heap
SimpleServerSelectionProvider.Group.AutoQuery.ServerClass=Query
SimpleServerSelectionProvider.Group.AutoQuery.MaxHeapMBPerWorker=32768
SimpleServerSelectionProvider.Group.AutoQuery.DefaultHeapMBPerServer=65536

# Properties for the AutoMerge group:
# It will only use the servers named Merge_Server_1 and Merge_Server_2
# The default total heap MB per server is 65536
# No workers can use this group if they request over 32768M  in heap
SimpleServerSelectionProvider.Group.AutoMerge.ServerClass=Merge
SimpleServerSelectionProvider.Group.AutoMerge.Servers=Merge_Server_1,Merge_Server_2
SimpleServerSelectionProvider.Group.AutoMerge.MaxHeapMBPerWorker=32768
SimpleServerSelectionProvider.Group.AutoMerge.DefaultHeapMBPerServer=65536
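
The resource comparison described for the simple server selection provider can be pictured with the following Python sketch. It is not the SimpleServerSelectionProvider implementation. It assumes that the server with the lowest heap utilization percentage wins, that ties go to the server with the fewest running workers, and that a server is skipped when the request would not fit in its remaining heap. The ServerState class, the pick_server function, and Merge_Server_3 are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServerState:
    name: str
    total_heap_mb: int    # e.g. the dispatcher's maxTotalQueryProcessorHeapMB
    used_heap_mb: int
    running_workers: int


def pick_server(servers, requested_heap_mb, max_heap_mb_per_worker):
    """Return the name of the chosen server, or None if no server qualifies."""
    if max_heap_mb_per_worker and requested_heap_mb > max_heap_mb_per_worker:
        return None  # request exceeds the group's MaxHeapMBPerWorker
    # Assumption: a server is only a candidate if the request fits.
    candidates = [
        s for s in servers
        if s.used_heap_mb + requested_heap_mb <= s.total_heap_mb
    ]
    if not candidates:
        return None
    # Lowest utilization first; fewest running workers breaks ties.
    best = min(
        candidates,
        key=lambda s: (s.used_heap_mb / s.total_heap_mb, s.running_workers),
    )
    return best.name


group = [
    ServerState("Merge_Server_1", 65536, 32768, 4),
    ServerState("Merge_Server_2", 65536, 32768, 2),
    ServerState("Merge_Server_3", 65536, 49152, 1),  # hypothetical third server
]

print(pick_server(group, 8192, 32768))
print(pick_server(group, 40000, 32768))
```

In this example the first two servers are tied at 50% utilization, so the one with fewer running workers is chosen; the second request is rejected because it exceeds the per-worker limit.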

Controller Configuration Reloading

While the server selection provider type is not dynamically reloadable (i.e. you can't dynamically change the PersistentQueryController.ServerSelectionProvider property), the properties used by the SimpleServerSelectionProvider are reloadable, and providers should automatically adjust to any controller server changes dynamically reloaded (using the controller tool's reload capability). For instance, a newly added server should be made available to the algorithm as soon as it's running, and new groups and changes to the servers allowed within a class will be updated when the controller is told to reload its configuration.

Temporary Query Queues

Temporary queries are used for queries that run once, such as batch imports of historical data. Temporary queues are defined through controller properties. Temporary query queue properties can be dynamically reloaded by using the controller tool's reload command.

Each temporary query queue is defined by setting two properties in the controller's property file. As many temporary query queues can be defined as needed, and each one will have its own properties, based on the query queue's name. These properties define the resources that the temporary queue is allowed to consume. Both properties are required for each temporary query queue:

  • PersistentQueryController.temporaryQueryQueue.<queue_name>.maxConcurrentQueries - This defines the maximum number of concurrent queries allowed to run on the named temporary query queue.
  • PersistentQueryController.temporaryQueryQueue.<queue_name>.maxHeapMB - This defines the maximum heap in MB that the temporary queries are allowed to use for the temporary query queue.

These resource restrictions are both applied when determining whether or not the next temporary query can run on a queue - the next query must not cause either the maximum concurrent queries or the maximum heap to be exceeded. If either is exceeded, the query will not be run until sufficient resources are available on its queue. Queries are run in the order in which they were submitted to the queue.

The property PersistentQueryController.defaultTemporaryQueryQueue defines the default temporary query queue presented to the user when temporary scheduling is chosen.

Following is a simple default configuration, defining a single queue which allows one query to run at a time with a maximum heap of 20000 MB.

PersistentQueryController.temporaryQueryQueue.DefaultTemporaryQueue.maxConcurrentQueries=1
PersistentQueryController.temporaryQueryQueue.DefaultTemporaryQueue.maxHeapMB=20000
PersistentQueryController.defaultTemporaryQueryQueue=DefaultTemporaryQueue
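
The admission rule for a temporary query queue can be sketched in Python as follows. This is a minimal model, not the controller's code. It assumes strict first-in, first-out behavior, in which a query that does not yet fit also holds back the queries submitted after it. The class name TemporaryQueue and the query names are hypothetical.

```python
from collections import deque


class TemporaryQueue:
    def __init__(self, max_concurrent_queries, max_heap_mb):
        self.max_concurrent_queries = max_concurrent_queries
        self.max_heap_mb = max_heap_mb
        self.waiting = deque()  # (name, heap_mb) in submission order
        self.running = {}       # name -> heap_mb

    def submit(self, name, heap_mb):
        """Queue a query and return the names of any queries that started."""
        self.waiting.append((name, heap_mb))
        return self.start_ready()

    def finish(self, name):
        """Release a finished query's resources and start whatever now fits."""
        del self.running[name]
        return self.start_ready()

    def start_ready(self):
        started = []
        while self.waiting:
            name, heap_mb = self.waiting[0]
            too_many = len(self.running) + 1 > self.max_concurrent_queries
            too_much_heap = sum(self.running.values()) + heap_mb > self.max_heap_mb
            if too_many or too_much_heap:
                break  # assumption: later queries wait behind the head
            self.waiting.popleft()
            self.running[name] = heap_mb
            started.append(name)
        return started


# Mirrors the default configuration above: one query at a time, 20000 MB.
queue = TemporaryQueue(max_concurrent_queries=1, max_heap_mb=20000)
print(queue.submit("import_a", 8000))
print(queue.submit("import_b", 4000))
print(queue.finish("import_a"))
```

With the default limits, import_a starts immediately, import_b waits because only one query may run at a time, and import_b starts once import_a finishes. Note that in this model a query requesting more heap than maxHeapMB would never start.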

Persistent Query startup

The persistent query controller may be required to start a large number of queries at the same time at the start of a business day. It maintains a thread pool for this, and while extra threads will be added as needed, it may be helpful to increase these values on systems where the controller is expected to start and stop large numbers of queries at the same time. The following properties control this thread pool.

  • PersistentQueryController.queryStartThreadPoolCoreSize - this defines the minimum number of threads maintained for persistent query startup. The number of available threads will never drop below this value.
  • PersistentQueryController.queryStartThreadPoolKeepAliveMinutes - extra threads added for query startup will be removed if they are idle for this number of minutes. For example, on a system with a lot of queries that run every hour, this value can be updated to ensure that threads remain available for an hour or more.

Query types

Query configuration types (such as Live Query (Script), Batch Query (RunAndDone), and Data Merge) are defined in an XML configuration file, which is used by the controller and console to understand how to handle each type of query. The behavior of these query types is configurable, and new query types can be added by customers.

Warning

Modification of the existing Deephaven query types is not recommended.

The property iris.controller.configurationTypesXml defines a comma-delimited list of XML files that contain the query configuration type definitions. By default it uses PersistentQueryConfigurationTypes.xml.

Query type attributes

Each query type is defined in an XML ConfigurationType element, defining the following attributes:

  • allowedGroups - if defined, this is a comma-delimited list that restricts owners of the query to the specified user groups. If a user is not a member of one of the specified groups (or a superuser), the user will not be able to create a query of this type. If it is not defined, then there are no group restrictions on this query type. Suggested defaults for the existing query types are:
    • Script - all users
    • ReplayScript - all users
    • RevertHelper - superusers
    • RunAndDone - all users
    • ImportHelper - superusers
    • JdbcImport - schema-managers
    • CsvImport - schema-managers
    • XmlImport - schema-managers
    • BinaryImport - schema-managers
    • Merge - schema-managers
    • Validate - schema-managers
  • displayable - Whether or not the query type should be displayed in a console. If it is defined and false, this query type is only displayed to superusers. This is useful for internal query types such as the Deephaven helper queries.
  • enabled - If defined and false, this query type is disabled. A disabled query type is not available to users.
  • hasScript - Defines whether or not the query has a script. If a query does not have a script (hasScript="false"), then the console will not display a script panel when editing a query of this type. The default value is true.
  • name - The name of the query type (for example "Live Query (Script)"). This is required.
  • serverTypes - An optional comma-delimited list which restricts the server types on which a query can run. If it is not defined, then the default server types from the property iris.db.defaultServerClass will be used; unless changed, the default is server type Query.
  • stopTimeRequired - An optional attribute which defines whether scheduling of the query requires a stop time. Query types such as Live Query (Script) that run continuously require stop times, while query types such as Batch Query (RunAndDone), Data Merge, and Import do not require stop times as they terminate automatically when complete. The default value is true.
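
The attribute defaults described above can be illustrated with a small Python sketch that reads a single ConfigurationType element. The sample element, its attribute values, and the read_type function are hypothetical; the sketch does not reproduce the full layout of PersistentQueryConfigurationTypes.xml and is not how the controller parses the file. It only shows which value applies when an attribute is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical query type used only for illustration.
SAMPLE = """
<ConfigurationType name="Example Batch Type"
                   allowedGroups="example-group"
                   serverTypes="Query,Merge"
                   stopTimeRequired="false"/>
"""


def as_bool(value, default):
    """Interpret an optional true/false attribute."""
    return default if value is None else value.strip().lower() == "true"


def as_list(value):
    """Split an optional comma-delimited attribute into a list."""
    return [item.strip() for item in value.split(",") if item.strip()] if value else []


def read_type(xml_text, default_server_class="Query"):
    element = ET.fromstring(xml_text.strip())
    name = element.get("name")
    if not name:
        raise ValueError("ConfigurationType requires a name attribute")
    return {
        "name": name,
        "allowedGroups": as_list(element.get("allowedGroups")),  # empty = unrestricted
        "displayable": as_bool(element.get("displayable"), True),
        "enabled": as_bool(element.get("enabled"), True),
        "hasScript": as_bool(element.get("hasScript"), True),
        # Falls back to the iris.db.defaultServerClass value when omitted.
        "serverTypes": as_list(element.get("serverTypes")) or [default_server_class],
        "stopTimeRequired": as_bool(element.get("stopTimeRequired"), True),
    }


for key, value in read_type(SAMPLE).items():
    print(key, "=", value)
```

Because the sample omits hasScript, displayable, and enabled, the sketch reports them as true, while stopTimeRequired is false because it is set explicitly.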

Query sub-elements

Each ConfigurationType element can define the following sub-elements to further define behavior. The classes defined within these elements are dynamically created during the creation of queries by the console, controller, and dispatcher.

  • <SetupQuery name="Java setup class"> - This required element defines a Java class that will be used to create an instance of the query type. This class must extend the com.illumon.iris.db.tables.remotequery.ContextAwareRemoteQuery<com.illumon.iris.controller.PersistentQueryState> class. A query type is not valid without this setup class.
  • <ConfigChecker class="Java configuration checker class" /> - An optional Java configuration checker class that will be run to validate data before a query of this type can be saved. This class must implement the com.illumon.iris.controller.ConfigChecker interface. If it is not provided, no extra validation is performed on a query of this type before it is saved.
  • <ConfigPanelFactoryClass class="Java configuration panel factory class" /> - An optional Java class to provide a type-specific configuration panel to the console. This type-specific panel contains configuration-specific details. For example, a merge query requires parameters such as the table's namespace and table name. If this is not provided, then no type-specific panel will be created. The factory class must implement the com.illumon.iris.controller.TypeSpecificConfigPanelFactory interface, and the panels it creates must implement the com.illumon.iris.controller.TypeSpecificConfigPanel interface.
  • <PopupProvider class="Java pop-up provider class" /> - An optional Java class to provide additional context-sensitive (right-click) menu options in the console's query configuration view. For example, an import query in the query configuration panel that is right-clicked provides the option to create the corresponding merge query. The pop-up class must implement the com.illumon.iris.controller.PersistentQueryPanelPopupProvider interface. If it is not provided, no additional pop-up menu options are created.
  • <ExtraColumnGetter class="Java extra column getter class" /> - An optional Java class to provide extra columns to be displayed in the console's configuration panel for this configuration type. For example, a merge query displays the namespace and table name in the configuration panel. This Java class must implement the com.illumon.iris.controller.ExtraColumnGetter interface. If it is not provided, no extra columns are displayed.
  • <ExtraPanelColumn name="<column name>" /> - If ExtraColumnGetter is provided, one or more ExtraPanelColumn elements should be provided. These are the extra columns to be provided to the ExtraColumnGetter's getExtraColumnValue method.
  • <CustomActionProvider class="Java custom action provider class" /> - An optional Java class to produce pop-up menu items for the Custom Actions attached to a table in a script. This Java class must implement the com.illumon.iris.controller.CustomActionProvider interface.

Controller Cache

All persistent queries created by users are stored by the controller in etcd. See the Cache Backup and Restore Process runbook and Persistent Query Controller Tool documentation for details on how to back up and restore this cache.

Accessing the Controller Cache Directly

In normal operation this should never be needed. However, if the controller cache contains a persistent query that prevents the controller from starting, it may be necessary to delete a persistent query directly from the cache.

For example, an exception like the following could prevent the controller from starting.

com.fishlib.base.verify.AssertionFailure: Assertion failed: asserted configNameToConfig.put(config.getName(), config) == null, instead configNameToConfig.put(config.getName(), config) == WebClientData[1660929719014000000.1].
        at com.fishlib.base.verify.Assert.fail(Assert.java:101)
        at com.fishlib.base.verify.Assert.eqNull(Assert.java:1404)
        at com.illumon.iris.controller.PersistentQueryController.start(PersistentQueryController.java:869)
        at com.illumon.iris.controller.PersistentQueryController.main(PersistentQueryController.java:3091)

Every persistent query is uniquely identified by a serial number, and the serial of the problematic persistent query should be shown in the exception; in this case it is 1660929719014000000. When interacting with etcd, serials are always expressed in hexadecimal with lowercase letters, so this serial is 170ccec78ab80580. Most of the commands below take the serial as part of the key; the sample commands use a different serial (17cd90520125cec9), so substitute the actual hexadecimal serial before running them.
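
The decimal-to-hexadecimal conversion can be done with any tool; the following Python sketch shows one way. The function names are hypothetical, and the key prefix is taken from the etcdctl examples in this section. The sketch does not zero-pad the result, so compare its output against the key listing before using it in a delete command.

```python
KEY_PREFIX = "/main/data/persistent-query-v2/configuration/"


def serial_to_hex(serial):
    """Convert a decimal serial to lowercase hexadecimal with no prefix."""
    return format(int(serial), "x")


def configuration_key(serial):
    """Build the etcd key under which the query's configuration is stored."""
    return KEY_PREFIX + serial_to_hex(serial)


print(serial_to_hex(1660929719014000000))   # 170ccec78ab80580
print(configuration_key(1660929719014000000))
```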

All interaction with etcd is performed as the Deephaven administration system account, typically irisadmin. All the examples assume that etcd is configured with its default Deephaven configuration, so the appropriate keys are visible to irisadmin in the directory /etc/sysconfig/illumon.d/etcd/client/controller. The following command lists the persistent query serials in hexadecimal.

sudo -u irisadmin \
DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/controller /usr/illumon/latest/bin/etcdctl.sh get \
 --prefix /main/data/persistent-query-v2/ \
 --keys-only

The persistent query's details can be displayed with a command like the following. This will print out an XML view of the persistent query.

sudo -u irisadmin \
DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/controller /usr/illumon/latest/bin/etcdctl.sh get \
 /main/data/persistent-query-v2/configuration/17cd90520125cec9 \
 --print-value-only

A persistent query can be deleted with the following command. This is for emergencies only, and should only be done when the persistent query controller is stopped.

sudo -u irisadmin \
DH_ETCD_DIR=/etc/sysconfig/illumon.d/etcd/client/controller /usr/illumon/latest/bin/etcdctl.sh del \
 --prefix /main/data/persistent-query-v2/configuration/17cd90520125cec9

Git configuration

By default, the source code for a persistent query in the Script Editor tab is stored by the Persistent Query Controller as part of the query's configuration. However, with Git integration, a persistent query's source code is not stored in Deephaven. Rather, it is loaded directly from its associated Git repository.

Deephaven's use of Git for maintaining persistent query scripts is entirely as a consumer. When a persistent query is configured to use Git for the script source, the Controller will read the script file from the Git repository when the persistent query starts, or to display in the UI when creating or editing the persistent query configuration. Since Deephaven is a consumer of the Git-managed content, the script itself can not be edited in the Deephaven UI. To create or edit Git-managed persistent query scripts, save them to Groovy or Python (.groovy or .py) files, and push them to the Git repository using a Git client such as the git command line tool or a Git-integrated IDE.

Warning

Controller Git repositories should not be stored on NFS.

For more information on configuring persistent queries, see the Query Monitor documentation.

To enable Git integration, several properties must be set in the Deephaven controller's configuration. One global property must be set, followed by several properties for each repository.

The global property is:

iris.scripts.repos — a comma-separated list of Git repositories the controller should use

The additional properties for each repository are listed below:

  • iris.scripts.repo.<repo_name>.groups — a comma-separated list of the Deephaven groups who may access the repository.
  • iris.scripts.repo.<repo_name>.updateEnabled — Set to true to automatically update the repository (i.e., run a git pull) once per minute. This helps ensure that when a query runs, it uses the most recent version of the script available in the repository's remote origin.
  • iris.scripts.repo.<repo_name>.branch — the Git branch to check out; if this is not set, the controller's PersistentQueryController.defaultBranch property value is used.
  • iris.scripts.repo.<repo_name>.prefixDisplayPathsWithRepoName — If true, the "Choose Script" dialog of the Persistent Query Configuration Editor will include the repository's name next to each script path. This helps disambiguate scripts for users who have access to multiple repositories.
  • iris.scripts.repo.<repo_name>.root — the directory on the filesystem into which Deephaven will clone the Git repository. Each repository must have a distinct root directory. If a relative path is used, the path will be relative to the workspace directory of the Controller process. On Deephaven servers, this will normally be /db/TempFiles/irisadmin/iris_controller.
  • iris.scripts.repo.<repo_name>.paths - the paths to include, relative to the repository's root directory (for example, IrisQueries/groovy). Files in all other paths will not be available to Deephaven queries.
  • iris.scripts.repo.<repo_name>.uri — the SSH URI used to access the Git repository, such as git@git.illumon.com:illumon/iris.git.
  • iris.scripts.repo.<repo_name>.remote — sets the name of the remote alias. Defaults to origin.
  • iris.scripts.repo.<repo_name>.resetGitLockFiles - whether the controller should clear any Git lock files it finds when it starts a sync. Defaults to true. Normally only the controller runs Git commands in the irisadmin Git directory path, so left-over lock files should only occur if the controller is stopped during a Git operation, and it is beneficial to allow the controller to clear these locks automatically.

Note

By default, Deephaven uses a local Git repository and does not pull from a remote. For repository updates (i.e., git pull) to be enabled, the Persistent Query Controller must be configured not to use a local Git repository.

This behavior is controlled by the PersistentQueryController.useLocalGit property, which defaults to true. When it is true, the controller uses a local repository as the script source and does not check out a configured branch from the remote; repository updates are disabled globally, regardless of each repository's updateEnabled setting. Set it to false to allow the controller to clone and fetch from a remote Git server.
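
The way the global and per-repository Git properties interact can be sketched in Python. This is an illustration of the rules described in this section, not the controller's logic. The function name repo_settings and the repository name research are hypothetical, and the sketch assumes that updateEnabled is treated as false when it is not set.

```python
import posixpath

WORKSPACE = "/db/TempFiles/irisadmin/iris_controller"  # usual controller workspace


def repo_settings(props, workspace=WORKSPACE):
    """Resolve effective settings for each repository in iris.scripts.repos."""
    use_local_git = props.get("PersistentQueryController.useLocalGit", "true").lower() == "true"
    default_branch = props.get("PersistentQueryController.defaultBranch")
    names = [n.strip() for n in props.get("iris.scripts.repos", "").split(",") if n.strip()]
    settings, seen_roots = {}, {}
    for name in names:
        prefix = f"iris.scripts.repo.{name}."
        # Relative roots are resolved against the controller's workspace.
        root = posixpath.normpath(posixpath.join(workspace, props[prefix + "root"]))
        if root in seen_roots:
            raise ValueError(f"{name} and {seen_roots[root]} share root {root}")
        seen_roots[root] = name
        # Assumption: updateEnabled is false unless explicitly set to true.
        update_enabled = props.get(prefix + "updateEnabled", "false").lower() == "true"
        settings[name] = {
            "root": root,
            "branch": props.get(prefix + "branch", default_branch),
            "remote": props.get(prefix + "remote", "origin"),
            # useLocalGit=true disables updates regardless of updateEnabled.
            "updates": update_enabled and not use_local_git,
        }
    return settings


example = {
    "iris.scripts.repos": "research",
    "iris.scripts.repo.research.root": "git/research",
    "iris.scripts.repo.research.updateEnabled": "true",
    "PersistentQueryController.defaultBranch": "main",
}

print(repo_settings(example))
print(repo_settings({**example, "PersistentQueryController.useLocalGit": "false"}))
```

In the first call useLocalGit is left at its default of true, so updates are reported as disabled even though updateEnabled is true; the second call sets useLocalGit to false and updates become effective.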

Git garbage collection is controlled by the following property:

PersistentQueryController.gitGcEnabled - if defined and set to false, this disables Git garbage collection.

Git authentication

If a keypair is being used to authenticate with the Git server, a common way to configure this is to create the keypair in the .ssh directory under the irisadmin home directory - usually /db/TempFiles/irisadmin. The host on which Git is running also needs to be added to the known_hosts file for irisadmin. By default, there is no .ssh directory for irisadmin, so the full process of setting up irisadmin for ssh authentication to Git is:

  1. sudo su - irisadmin to switch to the irisadmin user context (or other name if a custom name is used instead of irisadmin).
  2. mkdir .ssh
  3. chmod 700 .ssh
  4. ssh-keygen -t rsa (accept all defaults - press enter each time - to create a new key pair with no passphrase.)
  5. ssh-keyscan -t rsa <fqdn_of_git_server> >> .ssh/known_hosts
  6. cat .ssh/known_hosts to verify that the host from step 5 was added; if not, try ssh-keyscan -H <fqdn_of_git_server> >> .ssh/known_hosts.

If the .ssh directory already exists, then steps 2 and 3 are not needed.
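The steps above can be collected into shell functions that are safe to rerun. This is only a sketch, not part of Deephaven: the function names ensure_ssh_dir and setup_git_ssh are hypothetical, and it assumes OpenSSH's ssh-keygen and ssh-keyscan are on the PATH and that it is run as irisadmin (or the custom equivalent).

```shell
# Hypothetical helpers; run as the irisadmin user.

# Steps 2-3: create <home>/.ssh with owner-only permissions (no-op if present).
ensure_ssh_dir() {
    mkdir -p "$1/.ssh" && chmod 700 "$1/.ssh"
}

# Steps 2-5. $1 = home directory of the account, $2 = FQDN of the Git server.
setup_git_ssh() {
    ensure_ssh_dir "$1" || return 1
    # Step 4: create a passphrase-less RSA key pair only if none exists yet.
    if [ ! -f "$1/.ssh/id_rsa" ]; then
        ssh-keygen -t rsa -N "" -f "$1/.ssh/id_rsa" -q || return 1
    fi
    # Step 5: record the server's host key unless the host is already listed.
    if ! ssh-keygen -F "$2" -f "$1/.ssh/known_hosts" >/dev/null 2>&1; then
        ssh-keyscan -t rsa "$2" >> "$1/.ssh/known_hosts"
    fi
}
```

A call such as setup_git_ssh /db/TempFiles/irisadmin <fqdn_of_git_server> would then cover steps 2 through 5. Checking known_hosts afterwards, as in step 6, is still worthwhile.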

Note

Since .ssh starts with a . for its name, it is a hidden directory which will not be displayed by default by ls. Use ls -a or ls -la to display hidden directories along with non-hidden ones.

The newly created public key for the irisadmin account can then be added to the Git server to allow ssh authentication for Git commands:

cat .ssh/id_rsa.pub to get the public key that needs to be provided to the Git server. This may be through a UI, for tools like GitLab, or, if the server is a simple Linux server, the public key can be appended to the .ssh/authorized_keys file for the Git user on the Git server.

Important

Some Git servers will not accept authentication using the JSch authentication module used by JGit by default. For better compatibility, it is recommended to configure the controller's Git client to use the system's ssh for authentication.

This change is included in newer builds of release 1.20221001. To check whether it is included, look for a change log entry titled Update jgit SshSessionFactory to a more modern/supported version. For builds that do not include the change, it can be manually implemented by editing /usr/illumon/latest/bin/launch. Note that this file is replaced any time an upgrade of Deephaven is run, so the change may need to be reapplied after an upgrade if the upgrade is to a build that does not yet include it.

sudo vi /usr/illumon/latest/bin/launch

On (or near) line 451 (use :set number in vi to enable line numbers), there is the line EXEC_ARGS="-a $PROCESS_NAME".

Above that line, inside the else block, add these lines:

    if [ "$PROCESS_NAME" == "iris_controller" ]; then
       export GIT_SSH="/usr/bin/ssh"
    fi
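
Because the launch script is replaced on upgrade, it can be useful to check whether the manual edit is still in place. The following is a minimal sketch; the function name launch_sets_git_ssh is hypothetical. It only reports whether an export of GIT_SSH appears in the file, and says nothing about whether the build already includes the updated SshSessionFactory.

```shell
# Hypothetical helper: succeeds if the given launch script exports GIT_SSH.
# $1 = path to the launch script, e.g. /usr/illumon/latest/bin/launch
launch_sets_git_ssh() {
    grep -q '^[[:space:]]*export GIT_SSH=' "$1"
}

# Example use (commented out so the sketch has no side effects):
# launch_sets_git_ssh /usr/illumon/latest/bin/launch \
#     && echo "GIT_SSH edit present" \
#     || echo "GIT_SSH edit not found"
```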

Example configuration

The example configuration below configures Deephaven to read scripts from three repositories: team1, team2, and shared. All users can access scripts in the shared repository, but the team1 and team2 repositories are restricted to specific users. All three repositories will use a branch called master. Also, the shared repository uses a different Git server than the other two.

PersistentQueryController.useLocalGit=false

iris.scripts.repos=shared,team1,team2

iris.scripts.repo.shared.groups=*
iris.scripts.repo.shared.updateEnabled=true
iris.scripts.repo.shared.branch=master
iris.scripts.repo.shared.prefixDisplayPathsWithRepoName=false
iris.scripts.repo.shared.root=../git/shared
iris.scripts.repo.shared.paths=IrisQueries/groovy,IrisUtils/groovy
iris.scripts.repo.shared.uri=git@git.mycompany.net:common-libs/shared.git

iris.scripts.repo.team1.groups=user1,user2,user3
iris.scripts.repo.team1.updateEnabled=true
iris.scripts.repo.team1.branch=master
iris.scripts.repo.team1.prefixDisplayPathsWithRepoName=false
iris.scripts.repo.team1.root=../git/team1
iris.scripts.repo.team1.paths=IrisQueries/groovy
iris.scripts.repo.team1.uri=git@gitlab.mycompany.net:team1/team1.git

iris.scripts.repo.team2.groups=user1,user4,user5
iris.scripts.repo.team2.updateEnabled=true
iris.scripts.repo.team2.branch=master
iris.scripts.repo.team2.prefixDisplayPathsWithRepoName=false
iris.scripts.repo.team2.root=../git/team2
iris.scripts.repo.team2.paths=IrisQueries/groovy
iris.scripts.repo.team2.uri=git@gitlab.mycompany.net:team2/team2.git
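
Since each repository must have a distinct root directory, a configuration like the one above can be checked for accidental reuse of a root. The following is a rough sketch that assumes the properties are written one per line with no spaces around the equals sign; the function name duplicate_repo_roots is hypothetical. It compares the values as plain text, so two different spellings of the same directory are not detected.

```shell
# Hypothetical helper: print every root value that is assigned to more than
# one iris.scripts.repo.<repo_name>.root property in the given file.
# $1 = path to a properties file
duplicate_repo_roots() {
    grep -E '^iris\.scripts\.repo\.[^.]+\.root=' "$1" \
        | cut -d= -f2- \
        | sort \
        | uniq -d
}
```

Empty output means no root value is repeated in the file.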

Troubleshooting

In most cases, if updates are enabled, and there is some problem with the Git integration configuration, or in connecting to Git, the Deephaven controller process fails to start, or shuts down shortly after starting. Details of such issues will be logged to the process startup log: /var/log/deephaven/iris_controller/iris_controller.log.<yyyy-mm-dd>.

If a non-fatal Git error occurs, this is generally logged to the controller process log: /var/log/deephaven/iris_controller/PersistentQueryController.log.<yyyy-mm-dd-hhmmss.mmm+/-hhmm>.

Note that ssh is used for authentication, but the URI should not include ssh:// as a prefix. https is not currently supported.

ssh -v <user>@<fqdn_of_git_server> may provide additional diagnostic information about the connection and authentication processes.
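
The URI rule above (no ssh:// prefix, no https) can be checked before a value is placed in a property file. This is a heuristic sketch only: the function name is_scp_style_git_uri is hypothetical, and the check is not the controller's own validation.

```shell
# Hypothetical helper: succeeds when the URI has the user@host:path form
# (such as git@git.illumon.com:illumon/iris.git) and carries no scheme prefix.
is_scp_style_git_uri() {
    case "$1" in
        *://*) return 1 ;;   # ssh://, https://, and other scheme prefixes
        *@*:*) return 0 ;;   # user@host:path
        *)     return 1 ;;
    esac
}
```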

Other properties

Other properties related to the controller's operation follow. These parameters are not reloadable.

  • critEmail - The email distribution list to which critical alerts will be sent. Currently, the only critical alert is for a hung script update job, which is used to refresh scripts from Git.
  • iris.authentication.keyfile - The keyfile used to authenticate the controller to the dispatchers.
  • PersistentQueryController.binaryLogTimeZone - If specified, the time zone that determines column partition values for the controller's data (PersistentQueryStateLog and PersistentQueryConfigurationLogV2). If not specified, then the server's default value is used.
  • PersistentQueryController.commitCheckpointImmediate - If specified as true, then the controller will force a cache checkpoint immediately following startup.
  • PersistentQueryController.defaultMaxHeapSizeGB - This is the maximum heap the controller will allow for a query before a dispatcher is contacted. By default it is 1024GB (1TB).
  • PersistentQueryController.host - The hostname on which the controller is running. This is not used by the controller, but is read by other processes that need to connect to the controller.
  • PersistentQueryController.keyPairFile - The keypair file used to encrypt sensitive information for the controller's use. This file should not be visible to users of the system.
  • PersistentQueryController.port - The port on which the controller listens for client connections, from user consoles or the controller tool.
  • queryScheduler.restartWhenRunning=Yes|No - This defines the default value populated in the persistent query scheduler’s “Restart when running” option. The default value is Yes, restart when running. This also defines the default behavior for persistent queries that were saved before this option was available.

In addition, the controller's logging behavior can be changed with the standard logging parameters. See the Deephaven Operations Guide Log-Related Properties section for further details.

Initial startup configuration

When the controller starts for the first time on any Deephaven installation it must create helper queries to assist with Deephaven operations. Properties are used to assist with the creation of these queries, and in a non-standard configuration it may be useful to override these properties.

The Revert Helper query assists Deephaven when a query is reverted to a previous version. Following are the available parameters that the controller uses when creating the initial revert helper query (the first time the controller is run), and their default values.

  • revertHelper.queryOwner=<superuser> - the owner of the revert helper query; this must be a superuser.
  • revertHelper.queryName=RevertHelperQuery - the name of the query.
  • revertHelper.dbServer=Query_1 - the server on which the revert helper will run. It should be a server with a type of Query; if custom-named servers are used, this will need to reflect a named query server.
  • revertHelper.heapSize=1 - the heap size in GB of the helper query.

The following parameter defines how far back the revert helper looks when a user requests to revert a query to a previous version:

  • revertHelper.lookbackDays=180

The following parameter defines the number of seconds for which a console request to revert a query will wait for a response from the helper before displaying an error:

  • revertHelper.waitQuerySeconds=30

The Import Helper query assists with import, merge, and validation queries. The initial query-creation parameters have the same meanings as for the revert helper.

  • importHelper.queryOwner=<username>
  • importHelper.queryName=ImportHelperQuery
  • importHelper.dbServer=Merge_1 - the import helper should run on a server with a type of Merge.
  • importHelper.heapSize=1