Basic installation

This guide offers detailed instructions for installing Deephaven directly on one or more host machines.

The steps presented use no third-party containerization or deployment tools. For information on installing Deephaven in other ways, see the guides for those installation methods.

Prerequisites

Before you start, make sure you have read and understood the planning guide and ensure your system(s) meet the minimum system requirements. You also need:

  • A host from which to run the installation scripts, referred to as the Installation Host. This host does not need to be part of the cluster; if it is not, Deephaven is not installed on it.
  • A user with SSH access to the installation targets, also called the service user. This guide uses dh_service as the service user.

Domain names and DNS records

It is strongly recommended that all machines have DNS records configured before beginning the installation. The Installation Host must be able to resolve all etcd and configuration server hostnames when the installation scripts are generated. The remaining hostnames do not need to resolve until the installation scripts run.
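To confirm resolution ahead of time, you can run a quick check from the Installation Host. The hostnames below are the example node names used later in this guide; substitute your own FQDNs.

# Each lookup should print an address; a warning means the DNS record is missing.
for host in dh-example-cluster-infra.mydomain.com \
            dh-example-cluster-query-1.mydomain.com \
            dh-example-cluster-query-2.mydomain.com; do
  getent hosts "$host" || echo "WARNING: cannot resolve $host"
done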

The cluster.cnf file must include mappings describing each node in the cluster. The cluster configuration section provides more details.

TLS certificates

Deephaven uses TLS certificates for the web IDE; these must be provided during installation unless valid web certificates have already been generated (for example, by a previous installation).

These certificates must:

  • Be stored as X.509 certificates in PEM format, without passwords.
  • Be named tls.crt (certificate) and tls.key (private key).
  • Have Extended Key Usage of TLS Web Server Authentication and TLS Web Client Authentication.
  • Have a Subject Alternative Name matching the fully qualified domain name (FQDN) of every host in the cluster. More details follow in the cluster configuration section.
  • Be readable by the user running the installation script(s).

Note

If you use wildcard certificates, note that they are only valid for one subdomain level; *.my.company suffices for test.my.company, but NOT for my.test.my.company.

This certificate should be obtained from an internal corporate certificate authority already trusted for use in the organization, or from an external provider such as DigiCert or Verisign. To use the Deephaven web IDE, the certificate must be trusted by web browsers, as well as by your organization's Java installation and the operating system's cacert truststore.

For PKI environments (certificates issued by internal Certificate Authorities), the tls.crt file should be a certificate chain that includes the root and issuing CA and the certificate for the Deephaven cluster. The PKI guide provides more information about how Deephaven uses certificates.
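As a quick sanity check before installing, you can inspect a candidate certificate with openssl (the -ext flag assumes OpenSSL 1.1.1 or later):

# Confirm the Extended Key Usage includes both server and client authentication.
openssl x509 -in tls.crt -noout -text | grep -A1 "Extended Key Usage"

# List the Subject Alternative Names; every cluster host's FQDN must be covered.
openssl x509 -in tls.crt -noout -ext subjectAltName

# Confirm the private key is not password-protected (this should not prompt).
openssl pkey -in tls.key -noout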

Prepare the cluster nodes

Make the following configuration changes on each node on which you want to install Deephaven.

Disable swapping

If swapping is permitted, the kernel may swap out a significant portion of the RAM allocated to Deephaven processes. This can dramatically reduce the performance of workers and services, causing timeouts and adversely impacting system stability. For this reason, disabling swap on servers hosting Deephaven is strongly recommended.

echo "vm.swappiness = 0" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
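After reloading sysctl, verify that the setting took effect:

# Should print: vm.swappiness = 0
sysctl vm.swappiness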

Set minimum process limits

Deephaven uses many processes for system services, running consoles, and Persistent Queries. The system must be configured so that Deephaven can create processes as required.

Add or adjust the following settings in the /etc/security/limits.conf file. This first group of settings ensures that Deephaven can start processes as required:

*          soft    nproc     204800
*          hard    nproc     204800
root       soft    nproc     unlimited

Set minimum open file limits

Deephaven workers read many files when accessing data and make network requests that require file descriptors. The system must be configured so that Deephaven processes can open as many files as required, or services and workers may fail unpredictably.

Add or adjust the following settings in the /etc/security/limits.conf file. These settings allow Deephaven to open as many files as it needs to access data for users:

*          soft    nofile     65535
*          hard    nofile     65535
root       soft    nofile     unlimited

Warning

If you choose to use specific users instead of * in the sections above, you must make sure that you include a copy of each line for each Deephaven system user (DH_MERGE_USER, DH_QUERY_USER, and DH_ADMIN_USER) so that sufficient process and file handles can be opened.
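These limits apply to new login sessions. To verify them, log in as one of the Deephaven users (or any user covered by the rules above) and check the effective limits in that shell:

# Both values should match the limits configured above.
ulimit -u   # max user processes
ulimit -n   # max open files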

Install a supported JDK (optional)

Deephaven requires a Java JDK installation. If your system has internet access, you may skip this step, as the installer automatically installs a supported JDK using the version you specify during configuration. If you install the JDK separately, install the same version you configure. See the Version Matrix for supported versions.
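For example, on a Rocky Linux 9 host you might install OpenJDK 17 from the distribution repositories. The package name below is the standard Rocky/RHEL 9 name; adjust it for your distribution and for the Java version you configure:

sudo dnf install -y java-17-openjdk-devel

# Confirm the version matches DH_JAVA_VERSION in your cluster configuration.
java -version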

Install Python (optional)

The Deephaven Installer uses the cluster configuration to decide which version of Python to use. If the requested version of Python is not detected and the Installer can access the internet, it downloads and installs that version from source code. If you do not allow the installer to access the internet, you must ensure that the correct version of Python is already installed, or the installation will fail. See the Python configuration options in the cluster configuration guide for more information.
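For example, if cluster.cnf sets DH_PYTHON_VERSION="3.10" (as in this guide's example configuration), you can confirm ahead of time that a matching interpreter is present on each node:

# Should print Python 3.10.x; a "command not found" error means the
# installer will need internet access to build Python from source.
python3.10 --version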

Prepare the installer host

The Deephaven installation process is carried out from a local directory on an Installation Host. This host does not need to be part of the cluster itself, but it must have SSH access to all the nodes within the cluster.

To perform the installation, you will need a service account. You can specify any user in the cluster.cnf file as [DH_SSH_USER](./cluster-config-guide.md#dh_ssh_user). If you do not specify DH_SSH_USER, the installer will use the current user by default.

Note

For production deployments, Deephaven recommends specifying both DH_SSH_USER and DH_SSH_KEY so authentication to the target hosts can be strictly controlled.

The service user needs some administrative permissions on the nodes to modify the filesystem, and sudo permissions to act as the various Deephaven user accounts. See Elevated Permissions for more information.
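Before continuing, it is worth confirming that the service user can reach every node non-interactively from the Installation Host. The hostnames and key path below are examples; substitute your own:

# Each command should print the remote hostname without prompting for a password.
ssh -i ~/.ssh/dh_service_key dh_service@dh-example-cluster-infra hostname
ssh -i ~/.ssh/dh_service_key dh_service@dh-example-cluster-query-1 hostname
ssh -i ~/.ssh/dh_service_key dh_service@dh-example-cluster-query-2 hostname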

Create a directory to place the installation files into:

mkdir -p /tmp/dh/install

Install a supported JDK

The installation process requires Java to install Deephaven on the cluster's nodes. Make sure that you install Java on the Installation Host. Deephaven recommends that you use the same version of Java that you are installing on the Deephaven nodes.

Define your cluster configuration

Deephaven clusters are defined by the cluster.cnf file. This file describes the cluster's topology, including which physical nodes are present, what services run on each host, and what software versions are installed.

For a complete description of the settings available in cluster.cnf files, see the Cluster Configuration guide.

Below is an example cluster configuration file for a typical three-node Deephaven system. You can use this file as a starting point. Place it in the installation directory you created and change the properties as needed.

Example cluster.cnf file
# The name of the cluster
DH_CLUSTER_NAME="dh-example-cluster"

# The local directory containing the installation files
DH_LOCAL_DIR="/tmp/dh/install"

# The directory on each host to upload and run the installation from
DH_REMOTE_DIR="/tmp/dh/install"

# The Deephaven version being installed
DH_VERSION="1.20240517.344"

# The root of the domain of the hosts
DH_DOMAIN_ROOT="mydomain.com"

# The Java version being used
DH_JAVA_VERSION="jdk17"

# Set to true to allow the installer to update the sudoers file with the proper rules automatically.
DH_MODIFY_ETC_SUDOERS="true"

# Replace this user with the user that runs the installation
DH_SSH_USER=dh_service

DH_OS="rocky9"

# Set to true to install Deephaven Python integration
DH_PYTHON="true"

# Set this to the version of Python you have installed. See the support matrix for supported versions
DH_PYTHON_VERSION="3.10"

# Set to true to configure Envoy automatically.
# Note that you must install Envoy separately. See the documentation for more information.
DH_CONFIGURE_ENVOY="false"

# Infrastructure node
DH_NODE_1_NAME="dh-example-cluster-infra"
DH_NODE_1_ROLE_INFRA="true"
DH_NODE_1_ROLE_QUERY="true"
DH_NODE_1_ROLE_ETCD="true"

# Query 1 Node
DH_NODE_2_NAME="dh-example-cluster-query-1"
DH_NODE_2_ROLE_QUERY="true"
DH_NODE_2_ROLE_ETCD="true"

# Query 2 Node
DH_NODE_3_NAME="dh-example-cluster-query-2"
DH_NODE_3_ROLE_QUERY="true"
DH_NODE_3_ROLE_ETCD="true"

Note

The cluster configuration file defines your system's topology. You should treat it as any other source code and place it under source control. You need it whenever you perform system updates.

Collect installation media

Next, download the installation media from the address provided by your Deephaven representative and place it directly into the installation directory on the Installation Host. The files you need are:

  • The Deephaven Installer

      Installer-1.20240517.344.jar
    
  • The Enterprise system archive

      deephaven-enterprise-jdk17-1.20240517.344.tar.gz
    
  • One or more Core+ archives

      deephaven-coreplus-0.37.4-1.20240517.344-jdk17.tgz
    
  • The etcd archive etcd-v3.5.12-linux-amd64.tar.gz

  • TLS Certificates. Create a directory for the certificates in the installation directory and place your certificate files there, as shown below.

    mkdir /tmp/dh/install/certs
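    For example, assuming your certificate and key were delivered to /path/to, copy them into the new directory using the required file names (the source paths are illustrative):

    cp /path/to/tls.crt /path/to/tls.key /tmp/dh/install/certs/
    # Ensure the key is readable by the user running the installation scripts.
    chmod 600 /tmp/dh/install/certs/tls.key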
    

Generate the installation scripts

Now that the nodes are configured, the certificates are ready, and the installation media is in the correct place, you must run the installation generator. The installation generator uses the cluster configuration file you created to generate a set of install scripts that install and configure Deephaven on each node.

cd /tmp/dh/install
java -jar ./Installer-1.20240517.344.jar

Note

The installation generator updates the cluster configuration file with the default settings for any parameters left unspecified.

A note on elevated permissions

The Deephaven installer requires elevated permissions to install a few system services (such as Monit, Java, and etcd), create Deephaven service users, and create database and configuration directories. All commands that need elevated permissions are contained in the dh_root_prepare.sh script, which allows your security team to audit the commands.

Caution

If your organization has restrictions on how elevated permissions may be used, a system administrator may perform these steps manually. See the Elevated permissions section. However, Deephaven strongly recommends using the generated dh_root_prepare.sh script for this purpose: executing these steps manually is error-prone, and mistakes at this stage can make the installation very difficult.

Install Deephaven

Run the Installer

To install Deephaven on all your nodes, simply run the generated master installation script:

./master_install.sh

See the troubleshooting sections if errors occur.

Install Envoy (Optional)

If you plan to use Envoy as a front proxy for Deephaven, you need to install and run Envoy separately before you can access the system externally. See the Envoy documentation for further details.

Configure an administrator

To start using the system, you need to configure an administrative user. Run the following commands on the infrastructure node of your cluster:

# Add a user `dh_admin` that is a member of each of the three administrative groups `iris-acleditors`, `iris-schemamanagers`, and `iris-superusers`
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig acl user add --name dh_admin --group iris-acleditors iris-schemamanagers iris-superusers

# Interactively set the password for the user `dh_admin`. This command prompts the user to enter a password.
sudo -u irisadmin /usr/illumon/latest/bin/dhconfig acl user set-password --name dh_admin

Install example data

Many of the example queries in the Deephaven documentation are based on the LearnDeephaven data set. This data set consists of three tables of historical data covering a few dates from 2017-08-21 to 2017-08-25, plus 2017-11-01. The LearnDeephaven data set and its corresponding namespace take about 350 MB on disk.

To install the LearnDeephaven data set, run the following command once on the cluster infrastructure node:

/usr/illumon/latest/install/learn_install.sh

Note

After installing LearnDeephaven, query workers need to be restarted before they can access it.


Install custom libraries

You can install your own custom Java libraries into Deephaven by placing them into /etc/sysconfig/illumon.d/java_lib on each host.
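For example (the jar name is hypothetical):

# Copy a custom jar into the Deephaven java_lib directory on each host.
sudo cp my-custom-library.jar /etc/sysconfig/illumon.d/java_lib/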

Verify the installation

Check services

On each node, run the following to check that all Deephaven services have started and are available.

/usr/illumon/latest/bin/dh_monit up --block

Note

If dh_monit up times out, use dh_monit summary to see which process has problems.

If Python is available, you can validate the installation with check-deephaven-cluster, which runs the following tests against the cluster to confirm that:

  • The passed-in user can authenticate.
  • The WebClientData query is running.
  • Groovy and Python Core+ workers can be created on each query and merge server.
  • Each query server is correctly writing the DbInternal.ProcessEventLog table to the DIS, and can read it back.

check-deephaven-cluster takes the URL of the installation's connection.json file, and one of three authentication methods: username/password, private key file, or SAML. The connection.json argument is optional. If not provided, the script uses the connection.json file in the standard location.

Note

Port 8000 in the commands below is the port used by installations with Envoy enabled. For systems without Envoy, the port is 8123.

/usr/illumon/latest/bin/check-deephaven-cluster --connection-json https://deephaven-host:8000/iris/connection.json --key-file /path/to/private-key.txt
/usr/illumon/latest/bin/check-deephaven-cluster --connection-json https://deephaven-host:8000/iris/connection.json --saml
/usr/illumon/latest/bin/check-deephaven-cluster -c https://deephaven-host:8000/iris/connection.json --user username

Check user interfaces

  • From a browser, connect to https://deephaven-host:8000/ or, if not using Envoy, https://deephaven-host:8123/, and attempt to log in and create a console.

    • Launch a Query Config panel and verify the three initial Persistent Queries (ImportHelperQuery, RevertHelperQuery, and WebClientData) are running.
  • Verify the Swing UI using the Deephaven Launcher.

Adding and removing nodes

You can add more nodes to a Deephaven cluster after it has been created. This can improve the system's fault tolerance and increase available compute resources. The most typical examples of adding nodes are:

  • Dedicated servers for query processing or data import processing. This allows scale-out of the system to handle more users and/or larger volumes of data.
  • Authentication server nodes. Having one or more backup authentication servers allows for continued use of the system in case the primary authentication server is inaccessible.
  • Configuration server nodes. Having one or more backup configuration servers allows for continued use of the system in case the primary configuration server is inaccessible. etcd provides high availability and fault tolerance for the storage of configuration data, but the configuration server is the interface through which this data is provided to other Deephaven services.
  • Persistent Query Controllers. Additional controllers provide redundancy.

Similarly, servers no longer needed can be removed from a cluster.

Note

This process does not support changes to the etcd nodes of a cluster.

To add or remove nodes, simply modify the cluster.cnf file, adding or removing DH_NODE_$N blocks for the nodes you wish to add or remove from the cluster. Then, run the installation process again, starting from the generation stage.
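For example, to add a third query server to the example cluster defined earlier, you might append a node block like the following (the names are illustrative), then re-run the installation generator and master_install.sh:

# Query 3 Node
DH_NODE_4_NAME="dh-example-cluster-query-3"
DH_NODE_4_ROLE_QUERY="true"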

Note

When removing nodes, be sure to re-number the values for the remaining nodes in the configuration. For example, if you had 5 nodes, DH_NODE_1 through DH_NODE_5, and you deleted node 3, you need to renumber DH_NODE_4 and DH_NODE_5 properties in the cluster.cnf file.

Appendices

Appendix A: Elevated permissions

Caution

Deephaven recommends allowing the installer to configure the system(s) automatically. However, if you cannot run the installation script with elevated permissions, you may run the following steps by hand with a privileged user.

Create Deephaven service accounts

Deephaven uses three service user accounts internally: irisadmin, dbmerge, and dbquery. All three accounts must be members of dbmergegrp and dbquerygrp, and have secondary membership in a group with the same name as the user. The user home of each account must be /db/TempFiles/$user_name.

Caution

The shell snippets here use the default Deephaven service account names. If you have customized DH_ADMIN_USER, DH_QUERY_USER, DH_MERGE_USER, or any of the corresponding group properties, you must use those values below!

Set up user home:

sudo mkdir -p /db/TempFiles/{dbquery,dbmerge,irisadmin}
sudo chmod 775 /db
sudo chmod 775 /db/TempFiles

Create local user accounts:

sudo groupadd -r -g 9000 dbmerge
sudo groupadd -r -g 9001 dbquery
sudo groupadd -r -g 9002 irisadmin
sudo groupadd -r -g 9003 dbmergegrp
sudo groupadd -r -g 9004 dbquerygrp
sudo useradd -r -g 9003 -G 9000,9004 -u 9000 -d /db/TempFiles/dbmerge -s /bin/bash dbmerge
sudo useradd -r -g 9003 -G 9001,9004 -u 9001 -d /db/TempFiles/dbquery -s /bin/bash dbquery
sudo useradd -r -g 9003 -G 9002,9004 -u 9002 -d /db/TempFiles/irisadmin -s /bin/bash irisadmin

If group membership is incorrect, it can be fixed via the usermod command:

sudo usermod -g dbmergegrp -a -G $user,dbmergegrp,dbquerygrp $user

Note

If you wish to use different user account names, set the cluster.cnf variables DH_*_USER and DH_*_GROUP, as explained in the cluster configuration guide.

Configure sudoers

The service user and Deephaven system users require certain sudo permissions to be configured. See the Sudoers appendix for more details.

Install third-party dependencies

# Debian/Ubuntu hosts:
apt install monit wget mariadb-server build-essential git bzip2 unzip which rsync="3.*"
# RHEL/CentOS-family hosts where redhat-lsb-core is available:
dnf install monit wget unzip glibc libgcc libgomp libstdc++ bzip2 git which ed openssl redhat-lsb-core
# Newer RHEL-family hosts (e.g., Rocky 9) where redhat-lsb-core is unavailable:
dnf install monit wget unzip glibc libgcc libgomp libstdc++ bzip2 git which ed openssl

Install etcd 3.5.12

  • Create the etcd user and groups:
    sudo groupadd --system etcd
    sudo useradd -g etcd --comment "etcd user" --shell /usr/sbin/nologin --base-dir /var/lib etcd
    sudo mkdir -p /etc/etcd
    sudo mkdir -p /var/lib/etcd
    sudo chown etcd:irisadmin /etc/etcd
    sudo chown etcd:irisadmin /var/lib/etcd
    
  • Download and install etcd:
    wget --timeout=60 --waitretry=1 -t 5 -q "https://storage.googleapis.com/etcd/v3.5.12/etcd-v3.5.12-linux-amd64.tar.gz" -O /tmp/etcd-v3.5.12-linux-amd64.tar.gz
    sudo tar -xzf /tmp/etcd-v3.5.12-linux-amd64.tar.gz -C /tmp
    sudo cp --preserve=mode /tmp/etcd-v3.5.12-linux-amd64/etcd /tmp/etcd-v3.5.12-linux-amd64/etcdctl /usr/bin
    

Create Deephaven directories and permissions:

  • Owned by dbmerge:dbmergegrp:

    sudo mkdir /db
    sudo mkdir /db/TempFiles
    
    sudo chown dbmerge:dbmergegrp /db
    sudo chown dbmerge:dbmergegrp /db/TempFiles
    
  • Owned by irisadmin:irisadmin:

    sudo mkdir /usr/illumon
    sudo mkdir /db/TempFiles/irisadmin
    sudo touch /db/TempFiles/irisadmin/.bashrc
    sudo touch /db/TempFiles/irisadmin/.bash_profile
    sudo mkdir /etc/sysconfig/deephaven
    sudo mkdir /etc/monit.d/deephaven
    sudo mkdir /var/log/deephaven/monit
    
    sudo chown irisadmin:irisadmin /usr/illumon
    sudo chown irisadmin:irisadmin /db/TempFiles/irisadmin
    sudo chown irisadmin:irisadmin /db/TempFiles/irisadmin/.bashrc
    sudo chown irisadmin:irisadmin /db/TempFiles/irisadmin/.bash_profile
    sudo chown irisadmin:irisadmin /etc/sysconfig/deephaven
    sudo chown irisadmin:irisadmin /etc/monit.d/deephaven
    sudo chown irisadmin:irisadmin /var/log/deephaven/monit
    
  • Owned by irisadmin:dbmergegrp:

    sudo mkdir /var/lib/illumon
    sudo mkdir /var/lib/deephaven
    sudo mkdir /var/log/deephaven
    sudo mkdir /var/log/deephaven/deploy_schema
    sudo mkdir /var/log/deephaven/tdcp
    sudo mkdir /etc/deephaven
    sudo mkdir /db/TempFiles/irisadmin/generate_loggers_listeners
    sudo mkdir /db/TempFiles/irisadmin/generate_loggers_listeners/java
    sudo mkdir /db/VEnvs
    
    sudo chown irisadmin:dbmergegrp /var/lib/illumon
    sudo chown irisadmin:dbmergegrp /var/lib/deephaven
    sudo chown irisadmin:dbmergegrp /var/log/deephaven
    sudo chown irisadmin:dbmergegrp /var/log/deephaven/deploy_schema
    sudo chown irisadmin:dbmergegrp /var/log/deephaven/tdcp
    sudo chown irisadmin:dbmergegrp /etc/deephaven
    sudo chown irisadmin:dbmergegrp /db/TempFiles/irisadmin/generate_loggers_listeners
    sudo chown irisadmin:dbmergegrp /db/TempFiles/irisadmin/generate_loggers_listeners/java
    sudo chown irisadmin:dbmergegrp /db/VEnvs
    
  • Owned by dbmerge:dbmerge:

    sudo mkdir /var/log/deephaven/merge_server
    sudo mkdir /var/log/deephaven/dis
    sudo mkdir /db/TempFiles/dbmerge
    sudo touch /db/TempFiles/dbmerge/.bashrc
    sudo touch /db/TempFiles/dbmerge/.bash_profile
    
    sudo chown dbmerge:dbmerge /var/log/deephaven/merge_server
    sudo chown dbmerge:dbmerge /var/log/deephaven/dis
    sudo chown dbmerge:dbmerge /db/TempFiles/dbmerge
    sudo chown dbmerge:dbmerge /db/TempFiles/dbmerge/.bashrc
    sudo chown dbmerge:dbmerge /db/TempFiles/dbmerge/.bash_profile
    
  • Owned by dbquery:dbquery:

    sudo mkdir /var/log/deephaven/ltds
    sudo mkdir /var/log/deephaven/query_server
    sudo mkdir /db/TempFiles/dbquery
    sudo touch /db/TempFiles/dbquery/.bashrc
    sudo touch /db/TempFiles/dbquery/.bash_profile
    
    sudo chown dbquery:dbquery /var/log/deephaven/ltds
    sudo chown dbquery:dbquery /var/log/deephaven/query_server
    sudo chown dbquery:dbquery /db/TempFiles/dbquery
    sudo chown dbquery:dbquery /db/TempFiles/dbquery/.bashrc
    sudo chown dbquery:dbquery /db/TempFiles/dbquery/.bash_profile
    
  • Owned by etcd:irisadmin (only on nodes running etcd server ROLE_ETCD=true):

    sudo mkdir /etc/etcd/dh
    sudo mkdir /var/lib/etcd/dh
    
    sudo chown etcd:irisadmin /etc/etcd/dh
    sudo chown etcd:irisadmin /var/lib/etcd/dh
    
Finally, create the configuration symlinks:

# Preferably, all links are owned by irisadmin; use chown -h irisadmin:irisadmin
sudo ln -nsf /etc/sysconfig/deephaven/illumon.confs.latest /etc/sysconfig/illumon.confs
sudo ln -nsf /etc/sysconfig/deephaven/illumon.d.latest /etc/sysconfig/illumon.d
sudo ln -nsf /etc/sysconfig/illumon.confs/illumon.iris.hostconfig /etc/sysconfig/illumon

# optional:
sudo ln -nsf /etc/sysconfig/illumon.confs/illumonRestart.cron /etc/cron.d/illumonRestart.cron

Alter monit system files

To avoid granting root permissions, the monit process supervision tool, which normally runs as the root user, must be reconfigured to run as DH_MONIT_USER (by default, irisadmin).

Before altering this system service, be sure to first stop monit via sudo systemctl stop monit.

  • Edit /etc/monit/monitrc (the location on Debian/Ubuntu systems) and make the following changes:

    • Ensure the file /etc/monit/monitrc is owned by DH_MONIT_USER

      • sudo chown irisadmin:irisadmin /etc/monit/monitrc
    • Replace set log /var/log/monit.log with set log /var/log/deephaven/monit/monit.log

    • Uncomment these four lines:

      set httpd port 2812 and
      use address localhost
      allow localhost
      allow admin:monit
      
  • Edit /etc/logrotate.d/monit and make the following changes:

    • Replace /var/log/monit.log with /var/log/deephaven/monit/monit.log
  • Edit /etc/monitrc (the location on RHEL-based systems) and make the following changes:

    • Ensure the file /etc/monitrc is owned by DH_MONIT_USER
    • Replace set log /var/log/monit.log or set log syslog with set log /var/log/deephaven/monit/monit.log
    • Uncomment these four lines:

      set httpd port 2812 and
      use address localhost
      allow localhost
      allow admin:monit
  • Invoke sudo systemctl edit monit.service, which creates and opens the file /etc/systemd/system/monit.service.d/override.conf for editing; set the user and group to DH_MONIT_USER and DH_MONIT_GROUP (default irisadmin):
[Service]
User=irisadmin
Group=irisadmin
  • Update monit system file ownership
  sudo chown irisadmin:irisadmin /var/lib/monit/state
  sudo chown irisadmin:irisadmin /var/lib/monit/events
  • Restart the monit process
sudo systemctl daemon-reload
sudo systemctl start monit
sudo -u irisadmin /bin/monit reload

Note

If monit runs successfully, but some script fails while attempting to call systemctl, you can set DH_SKIP_MONIT_CHECK=true in cluster.cnf to direct the installation to skip all monit checks and trust that monit is set up correctly.

Restart system services

# If you do not grant service account permissions to control the monit service,
# you must alter monit system files, and ensure the service is always on.
systemctl enable monit
systemctl start monit

# We use a custom service, dh-etcd, instead of the stock system etcd service.
# Fully rootless installs must enable, but not start, a dh-etcd systemd service file,
# plus set DH_ETCD_TOKEN in cluster.cnf to the token set in your dh-etcd.service.
# This is only needed on the 1, 3, or 5 machines used as your etcd servers (ROLE_ETCD)
systemctl stop etcd
systemctl disable etcd
# These dh-etcd commands will only work after the dh-etcd system service is set up.
# Additional details for manual setup are covered in the `sudoers` section, below.
systemctl enable dh-etcd
systemctl start dh-etcd

# After any change to any services, you must reload the systemctl daemon:
systemctl daemon-reload

Appendix B: Users and groups

Deephaven uses the following users and groups. You can customize these users using cluster.cnf settings. See Custom Users.

| Group Name | Suggested gid | Members |
| --- | --- | --- |
| dbquery | 9000 | dbquery |
| dbmerge | 9001 | dbmerge |
| irisadmin | 9002 | irisadmin |
| dbmergegrp | 9003 | dbquery, dbmerge, irisadmin |
| dbquerygrp | 9004 | dbquery, dbmerge, irisadmin |

| User Name | Suggested uid |
| --- | --- |
| dbquery | 9000 |
| dbmerge | 9001 |
| irisadmin | 9002 |

Caution

The uids and gids associated with the users and groups must be consistent across all hosts so users can read and write the proper locations.

Caution

Special care must be taken when using network-provided users, such as Active Directory accounts. monit must not be started until the users exist, so if your users are provided over the network, you must make sure the monit service waits to start until the users have been loaded.

Appendix C: Generated Scripts

When you run the installation generator, it generates the following scripts:

  • master_install.sh - Run this file from the Installer Host to install the entire cluster. It copies the tar.gz or rpm installer to each host and then runs all the other scripts below in the correct order on the correct machine.
  • dh_root_prepare.sh - This script is copied to each Remote Host and then run on that host. It is the only script that invokes sudo, and only if needed.
  • dh_install.sh - This script is copied to each Remote Host and then run there. It unpacks the Deephaven tar.gz or rpm and runs the post_install.sh script.
  • dh_keygen.sh - This script is run on the Installer Host. It copies or creates all necessary certificates on the correct machines and then cleans up all temporary files that might contain certificates or private keys.
  • etcd_configure.sh - This script is run on the Installer Host and tests whether the etcd cluster is healthy and correctly configured. If not, it generates and provisions all necessary files for the etcd cluster and then tests again that the cluster is healthy.
  • dh_node_finalize.sh - This script is run on the Installer Host and finalizes the cluster setup. It generates iris-endpoints.prop, imports all necessary files into etcd, and enables all monit services.
  • dh_log_sync.sh - This script is run on the Installer Host and copies all logs from the Remote Hosts to the Installer Host, making it simpler to debug failures.
  • dh_node_validate.sh - This script is copied to each Remote Host and performs some system validation to ensure that the product is correctly configured.
  • dh_node_cleanup.sh - This script is copied to each Remote Host and cleans up all temporary files and all logs.

Appendix D: Sudoers

Tip

If DH_MODIFY_ETC_SUDOERS=true in the cluster.cnf file, the Deephaven installer automatically configures these rules for you.

The service user and Deephaven system users require sudo permissions for specific commands. Use visudo -f /etc/sudoers.d/deephaven to create a sudoers file for Deephaven at /etc/sudoers.d/deephaven and place the following rules inside it.

Caution

Make sure that:

  • The line #includedir /etc/sudoers.d is enabled in your /etc/sudoers file.
  • dh_service is replaced with the user you configured as the install service user.
Deephaven sudoers file
# These rules are the minimum required for Deephaven
irisadmin ALL=(irisadmin:irisadmin) NOPASSWD: LOG_OUTPUT: ALL
irisadmin ALL=(irisadmin:dbmergegrp) NOPASSWD: LOG_OUTPUT: ALL
irisadmin ALL=(irisadmin:dbquerygrp) NOPASSWD: LOG_OUTPUT: ALL
irisadmin ALL=(dbmerge:dbmerge) NOPASSWD: LOG_OUTPUT: ALL
irisadmin ALL=(dbquery:dbquery) NOPASSWD: LOG_OUTPUT: ALL

# In the sections below, you may need to replace dh_service with your chosen service user
dh_service ALL=(irisadmin:irisadmin) NOPASSWD: LOG_OUTPUT: ALL
dh_service ALL=(irisadmin:dbquerygrp) NOPASSWD: LOG_OUTPUT: ALL
dh_service ALL=(dbmerge:dbmerge) NOPASSWD: LOG_OUTPUT: ALL
dh_service ALL=(dbquery:dbquery) NOPASSWD: LOG_OUTPUT: ALL
dh_service ALL=(etcd:irisadmin) NOPASSWD: LOG_OUTPUT: ALL
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl start dh-etcd
# Warning: this rule can grant elevated permissions to users with access to the dh_service account. See details below.
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl enable /etc/etcd/dh/*/dh-etcd.service

# If you have chosen a different monit user (DH_MONIT_USER=someuser in cluster.cnf) you must change dh_service to that user
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl start monit
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl status monit
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl stop monit
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl reload monit
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl restart monit

# These rows are only necessary on hosts that run ETCD. You may remove these on other hosts
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl disable etcd
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl stop etcd
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl status dh-etcd
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl stop dh-etcd
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl restart dh-etcd
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/systemctl reload dh-etcd

dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/journalctl -u monit --no-pager -[xefkqae]*
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/journalctl -u monit --no-pager
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/journalctl -u dh-etcd --no-pager -[xefkqae]*
dh_service ALL=(root) LOG_OUTPUT: NOPASSWD: /usr/bin/journalctl -u dh-etcd --no-pager

Warning

systemctl enable grants root access. Anyone with write permissions to dh-etcd.service can effectively become root on the system. Ensure strict control of write permissions on this file, or have your system administrator enable the service before performing the Deephaven installation, using a dh-etcd.service file that looks like the following:

[Unit]
Description=Etcd Server (ccfc147dd)
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/dh/ccfc147dd
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=8 /usr/bin/etcd --config-file '/etc/etcd/dh/etcd-token/config.yaml'"
Restart=on-failure
LimitNOFILE=65536
TimeoutStartSec=600

[Install]
WantedBy=multi-user.target

If you do pre-enable the dh-etcd service, be sure to replace the etcd-token placeholder with your own value and set DH_ETCD_TOKEN to that token in your cluster.cnf file.

Appendix E: Troubleshooting

If an error occurs during the installation, detailed log files can be found in the logs subdirectory of the directory from which you ran the installer. Additionally, all logs from remote hosts are copied to DH_LOCAL_DIR/logs/$node_hostname.

Tip

You can run the dh_log_sync.sh script to copy logs from remote machines to the local installer host. The log files are also contained in the remote installation directory on the host.

Installation configuration scripts write logs to /var/log/deephaven/install_configuration. Log files containing sensitive information are only visible to the irisadmin user. Before installation runs, log files in /var/log/deephaven/install_configuration are moved to /var/log/deephaven/previous_install, so only logs relevant to the last installation are found in install_configuration. After installation completes, Deephaven services write their logs to /var/log/deephaven/<service name>/.

Certificate problems

If the services fail because a matching Subject Alternative Name (SAN) cannot be found, check that /etc/sysconfig/illumon.d/dh-config/clients/single/host contains the correct name or IP address of the Configuration Server, and that /etc/sysconfig/deephaven/trust/truststore-iris.pem has a SAN for its configuration_server entry that matches the contents of the host file. Note that this requires reading the contents of the .pem file with something like OpenSSL that can decode the X509 formatting.

For example, run:

cat - | openssl x509 -text -noout

Then paste the configuration_server certificate from the .pem file to decode it.

Please refer to the TLS Certificates section to ensure that your certificates meet the minimum requirements.

Failures during dh_node_finalize.sh

If the dh_node_finalize.sh script does not complete normally, the dh_install.sh script may encounter problems trying to update files in etcd that have not yet been imported. If this happens:

  • Disable etcd imports:

    if [ -d /etc/sysconfig/deephaven/etcd/client ]; then
      sudo -u irisadmin touch /etc/deephaven/SKIP_ETCD_IMPORT
    fi
    
  • Re-run the installation. dh_node_finalize.sh should now succeed.

  • Enable etcd imports:

    sudo -u irisadmin rm /etc/deephaven/SKIP_ETCD_IMPORT
    

Errors in Auth Server Logs

It is normal to see Authentication Server failures in the Configuration Server log while the Authentication Server is starting up. The Configuration Server must start first, as it is a dependency for the Authentication Server. The Configuration Server then attempts to connect to the Authentication Server, and retries while waiting for the Authentication Server to begin accepting connections. This nuance is handled gracefully if processes are started using /usr/illumon/latest/bin/dh_monit up (passing the --block flag directs dh_monit to wait until all processes are online).

Appendix F: Migrating ACLs to etcd

The ACL migration tool (/usr/illumon/latest/bin/migrate_acls) is used to copy existing ACL data from a SQL store to etcd. This tool must be run with Deephaven service admin rights. If it is run against a system that already has some ACL data in etcd, it fails. The overwrite argument directs the tool to remove all existing etcd ACL data (only ACL data - not other data stored in etcd) and replace it with the ACL data from the SQL store.

To migrate ACL data from SQL into an etcd store that does not yet have any ACL data:

sudo -u [admin account (default is irisadmin)] /usr/illumon/latest/bin/migrate_acls

Or, if there is already etcd ACL data which should be replaced:

sudo -u [admin account (default is irisadmin)] /usr/illumon/latest/bin/migrate_acls overwrite

Migration of ACL data should be performed before reconfiguring the system to use etcd for ACL data.

The following property settings indicate that a system uses etcd for ACL data:

  • IrisDB.groupProvider=etcd
  • IrisDB.permissionFilterProvider=etcd
  • authentication.server.customauth.class=com.illumon.iris.db.v2.permissions.EtcdDbAclProvider

If the setting DH_ACLS_USE_ETCD=true (which is the default in versions 1.20231218 or later) is used in the cluster.cnf file during installation, the installer adds these settings to the iris-endpoints.prop configuration file. If manually enabling etcd ACL storage, these settings can be added to iris-endpoints.prop or iris-environment.prop, as long as they are not overridden later by other settings. iris-endpoints.prop entries take precedence over those in iris-environment.prop, and settings further down a file take precedence over those further up.

To allow password management for Deephaven logins in etcd, managed user authentication must also be enabled by iris.enableManagedUserAuthentication=true.

After migrating ACLs and importing the new properties, use /usr/illumon/latest/bin/dh_monit to restart all Deephaven services on all nodes. Once the system has restarted, if the installation was using a dedicated SQL service for ACLs, that service can be stopped, disabled, and, if desired, uninstalled.

If DH_ACLS_USE_ETCD=true was used during the upgrade from a system that had been using a SQL ACLs data store, the etcd ACL store is enabled and populated, but only with the initial minimal set of ACL data. In this case, after the upgrade, the ACL migration tool should be run with the overwrite option, as detailed above.