Kubernetes Configuration Settings
Configuration for a Kubernetes installation uses the same Deephaven customization and configuration processes as other deployment types. It also supports additional configuration via YAML properties provided to the Helm chart.
How to apply configuration changes via the Helm chart
Run a helm install or helm upgrade command as described in the Kubernetes installation or
upgrade guides to apply configuration changes set in your override YAML file (for example, my-values.yaml).
If your system is already installed, use the helm upgrade command.
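A minimal sketch of such an upgrade, assuming a release named my-dh-install installed from a local copy of the Deephaven chart; substitute your own release name, chart location, and namespace:

```bash
# Apply the settings in my-values.yaml to an existing installation.
helm upgrade my-dh-install /path/to/deephaven-chart \
  --namespace <namespace> \
  -f my-values.yaml
```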
How to configure properties not available in the Helm chart
The management-shell can be used to run command-line utilities in a Kubernetes environment. See How do I run commands on the command line
for details. The Deephaven Helm chart's values.yaml file contains many defaults that are configurable via your override YAML file. You can inspect those values to see what is available for configuration via the Helm chart.
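For example, you can dump the chart's default values to a file for review (the chart location is a placeholder):

```bash
# Write the chart's default values to a file for inspection.
helm show values /path/to/deephaven-chart > default-values.yaml
```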
Configuring non-worker process resources
Kubernetes resources to manage CPU, memory, and ephemeral storage for Deephaven non-worker processes are configurable via the Helm chart. The chart's values.yaml file contains defaults that may need to be adjusted depending on your use case and workload. Consult that file for the list of Deephaven processes that may be configured.
Each Deephaven service pod has two containers:
- The main container running the service (e.g., authserver).
- A second container running a tailer.
Note
We recommend keeping memory requests and limits equal to avoid unexpected out-of-memory (OOM) kills. If the request is lower than the limit, the JVM may grow beyond the requested size, and because a JVM cannot shrink once grown, the pod becomes subject to OOM kills by Kubernetes when another process scheduled on the same node needs memory. Keeping memory requests and limits equal helps ensure the pod has all the memory it requires when it is scheduled.
This example sets the Data Import Server (DIS) resources to 1 CPU core, 12Gi of memory, and 3Gi of ephemeral storage, and sets the tailer container resources to 500m CPU, 3Gi of memory, and 1Gi of ephemeral storage:
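(The grouping of keys in this sketch is illustrative; consult the chart's values.yaml for the exact structure used for each process and its tailer container.)

```yaml
# Illustrative structure -- confirm the exact keys in the chart's values.yaml.
resources:
  dis:
    requests:
      cpu: "1"
      memory: "12Gi"
      ephemeral-storage: "3Gi"
    limits:
      cpu: "1"
      memory: "12Gi"
      ephemeral-storage: "3Gi"
  tailer:
    requests:
      cpu: "500m"
      memory: "3Gi"
      ephemeral-storage: "1Gi"
    limits:
      cpu: "500m"
      memory: "3Gi"
      ephemeral-storage: "1Gi"
```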
Verification
To verify your non-worker resource settings have been applied:
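One approach is to print the resources block of each container in the relevant pod (the pod and namespace names are placeholders):

```bash
# Show the resource requests and limits for every container in the DIS pod.
kubectl get pod <dis-pod-name> -n <namespace> \
  -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.resources}{"\n"}{end}'
```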
Best practices for non-worker resources
- Monitor resource usage and adjust limits based on actual utilization patterns.
- For memory-intensive operations, increase limits on query and merge servers.
- Reserve more CPU for the controller service if managing many workers.
- Ensure binary log size is sufficient based on log verbosity and retention needs.
Configuring worker process resources
Memory for worker processes is configured in the Settings panel of the Persistent Query, where you can specify Java heap size and additional memory overhead. The worker memory settings specified there are also padded with additional memory overhead as described in Configuring Worker Heap Overhead.
Setting worker CPU limits
Kubernetes resource limits for workers can be configured when creating a Persistent Query or Code Studio. By default, no limit is set. To set CPU limits for worker processes, add the Kubernetes.setCpuLimitToRequest=true property to the iris-environment.prop file.
When CPU limits are set, Deephaven query workers will not be able to take advantage of unutilized excess capacity on the Kubernetes nodes. For some workloads, this can impede performance and reduce parallelism, but also leads to more predictable performance, since the query will not run significantly faster or slower based on the utilization of the node it's scheduled on.
This can be done automatically during installation by setting the Helm value dispatchers.setCpuLimitToRequest to true (either in a YAML values file or with --set dispatchers.setCpuLimitToRequest=true in the Helm command line).
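In a values file, this looks like the following (the nesting follows the dotted value name):

```yaml
dispatchers:
  setCpuLimitToRequest: true
```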
Note that if no default CPU request or limit is defined, and no CPU request is configured when creating a Persistent Query/Code Studio, then the corresponding worker will be created without any request or limit, regardless of the value of dispatchers.setCpuLimitToRequest.
Setting default CPU requests and limits
Default CPU requests and limits for workers can be configured via Helm values.
- If default CPU requests are set, but default CPU limits are not, and dispatcher.setCpuLimitToRequest is set to false (the default), then workers will be created with a CPU request but no CPU limit.
- If a default CPU limit is set, but a worker's request is greater than the default limit, then the worker's limit is set to the requested amount, regardless of the value of dispatcher.setCpuLimitToRequest.
- If dispatcher.setCpuLimitToRequest is set to true, the default CPU limit is always overridden by either the default CPU request (if configured) or a worker's specific CPU request (since the limit will always be set to the request).
The following values are used to configure default CPU requests and limits for workers:
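(The key names in this sketch are hypothetical stand-ins; the real names are defined in the chart's values.yaml.)

```yaml
# Hypothetical key names -- check the chart's values.yaml for the real ones.
dispatchers:
  defaultCpuRequest: "1"  # CPU request applied to workers that do not specify one
  defaultCpuLimit: "2"    # CPU limit applied to workers that do not specify one
```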
Best practices for worker processes
- If workers are being unexpectedly OOM killed, increase memory allocation.
- Do not set memory values higher than necessary, as this reduces the number of workers that can be allocated on the cluster.
- Monitor memory usage patterns to fine-tune these settings over time.
- For detailed control over memory allocation, see Configuring Worker Heap Overhead and Deephaven properties.
Configuring Data Import Server
Data Import Server (DIS) configuration ensures persistent storage for Intraday data, allowing data to survive pod restarts and providing appropriate storage characteristics for your data volume and access patterns.
The DataImportServer uses Kubernetes PersistentVolumeClaims to store Intraday data. By default, even if the Helm chart is uninstalled, the DIS persistent volumes are preserved to prevent data loss.
The Kubernetes service for the Data Import Server allows Deephaven tailers to connect to DIS to stream data from binary log files. A ClusterIP service is used by default, which restricts access to the DIS to other nodes within the cluster. This service kind can be changed to LoadBalancer to allow connections from remote tailers outside the Kubernetes cluster.
Configuration options
The default DIS storage values for the chart are:
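(The key names and values in this sketch are illustrative; the authoritative defaults are in the chart's values.yaml.)

```yaml
# Illustrative keys and defaults -- consult the chart's values.yaml.
dis:
  storage:
    storageClass: ""        # an empty value uses the cluster's default StorageClass
    intradaySize: 100Gi     # volume holding Intraday data
    intradayUserSize: 20Gi  # volume holding Intraday user data
    keepPvs: true           # preserve the PersistentVolumes if the chart is uninstalled
```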
Example
To use a faster storage class and increase volume sizes for a production environment:
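Using the same illustrative keys, a production override might look like:

```yaml
dis:
  storage:
    storageClass: premium-ssd  # name of an SSD-backed StorageClass in your cluster
    intradaySize: 500Gi
    intradayUserSize: 100Gi
```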
Verification
To verify your DIS storage settings:
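List the PersistentVolumeClaims in the namespace and confirm their storage class, capacity, and bound status:

```bash
kubectl get pvc -n <namespace>
```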
Best practices for DIS storage
- Set up monitoring for your volumes so that you are alerted before they run out of space.
- Use SSD-backed storage classes for better performance.
- Size volumes based on expected data volume plus 20-30% headroom.
- Consider enabling volume snapshots for backup if available in your cluster.
- Set keepPvs: false only in development environments where data loss is acceptable.
Configuring tolerations, node selectors and affinity
Kubernetes provides several mechanisms to determine where pods are scheduled. Deephaven's Helm chart does not interpret any of these values, but can pass them through from your values.yaml file to the various pods that are created by the installation (installation hooks, system-level processes, and workers). In particular, you can configure tolerations, node selectors, and affinity.
By default, no tolerations, selectors or affinity are added. To add tolerations to all created deployments, modify your values.yaml file to include a tolerations block, which is then copied into each pod. For example:
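(The taint key, value, and effect below are illustrative; use the taints defined on your own nodes.)

```yaml
tolerations:
  - key: "deephaven-only"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
```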
This adds the following tolerations to each pod (in addition to the default tolerations provided by the Kubernetes system):
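(This continues the illustrative toleration shown above.)

```yaml
# Excerpt from a resulting pod spec (kubectl get pod <pod-name> -o yaml):
tolerations:
  - key: "deephaven-only"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  # ...default tolerations added by Kubernetes (e.g., node.kubernetes.io/not-ready) follow
```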
Similarly, you can add a nodeSelector or an affinity block:
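(The node labels and zone below are illustrative; substitute the labels used in your cluster.)

```yaml
nodeSelector:
  disktype: ssd
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - us-east-1a
```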
Which results in pods containing node selectors like:
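(Using the illustrative label from the example above.)

```yaml
nodeSelector:
  disktype: ssd
```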
And affinity as follows:
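(Again using the illustrative zone constraint from the example above.)

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - us-east-1a
```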
Configuring system process command arguments and environment variables
System process command arguments
Note
The hostconfig files described in configuring process resources are used to configure process parameters outside of Kubernetes environments, and changing them does not have an effect in a Kubernetes Deephaven deployment.
Deephaven processes in Kubernetes are started with arguments that are found in the Helm chart's values.yaml file under the process key. To override a value, add a similar value to your override YAML file. For example, to change the default maximum memory for the controller, you would add this to your YAML file:
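(The exact key under process.controller that carries the -Xmx argument is defined in the chart's values.yaml; jvmArgs below is a stand-in for it.)

```yaml
process:
  controller:
    jvmArgs: "-Xmx6g"  # placeholder key -- copy the real key and argument list from values.yaml
```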
If the maximum JVM size for a process is increased with the -Xmx flag, you may also have to adjust the container resource limits.
In this example the controller's max JVM memory is increased from the default value of 4G to 6G, so the controller container should be adjusted accordingly with this
override for the resource limits in your my-values.yaml file:
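(The grouping of keys below is illustrative; use the same structure as the chart's values.yaml.)

```yaml
resources:
  controller:
    requests:
      memory: "7Gi"  # leave headroom above the 6G heap
    limits:
      memory: "7Gi"
```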
There is a common section with values that apply to all processes. You may redefine these under process by copying the original set of values and changing or adding to them. For example, process.common.logRoot contains a JVM property denoting the directory home for process logs, which you can override in your my-values.yaml file:
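(The JVM property name and path below are illustrative; copy the actual default from the chart's values.yaml before changing it.)

```yaml
process:
  common:
    logRoot: "-DlogDir=/other/log/path"
```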
If you need to provide an arbitrary property to a system process, you can use the process.jvmArgsUser key in your my-values.yaml override file. Adding it under common will affect all processes, or you can add it only to specific processes, similar to what is done for process arguments.
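For example, a sketch that passes an arbitrary system property to all processes (the property name and value are placeholders):

```yaml
process:
  common:
    jvmArgsUser: "-DmyCustomProperty=somevalue"
```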
Environment variables
Environment variables can be set under the userEnv key. The variable name will be the key. For example:
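(The sketch below assumes the same nesting as the process arguments section; the variable name and value are placeholders.)

```yaml
process:
  common:
    userEnv:
      MY_ENV_VAR: "some-value"
```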
As with system arguments, you can define environment variables for all processes under the common key, or under a process key like controller to affect only that process's environment. There is also a systemEnv key in Deephaven's values.yaml file; if a variable is defined in both, the userEnv value takes precedence.
Worker parameters
The Query Dispatcher allows you to change certain parameters when creating a worker (as described in the Code Studio Advanced Settings page). By default, the query server allows the user to specify the CPU shares, but not other parameters, as that would allow the user to execute arbitrary code on the cluster. The merge server allows specification of all values, as it is by default restricted to users in the iris-schemamanagers group.
To change the permitted Kubernetes control parameters, you can set the Kubernetes.workerValidator property. The three built-in values are:
- AllowCpu: CPU shares may be specified, but pod template and container image may not.
- AllowAll: All parameters are permitted.
- DenyAll: No changes are permitted.
Additionally, if the validator property begins with class:, the remainder is treated as the name of a class implementing the com.illumon.iris.db.tables.remotequery.process.K8SWorkerValidator interface, which is instantiated using its zero-argument constructor.
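In iris-environment.prop, this might look like either of the following (the custom class name is a placeholder):

```properties
Kubernetes.workerValidator=AllowAll
# or, for a custom implementation:
Kubernetes.workerValidator=class:com.example.MyWorkerValidator
```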
Configuring Envoy
Envoy is the central service through which all external clients connect. The envoyFrontProxyUrl tells clients
where to connect — this may be a load balancer directly pointing to the Envoy service installed by Deephaven,
or it may be a downstream proxy that clients connect to (such as a Kubernetes Ingress).
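For example, in your values file (the hostname and port are placeholders, and the key is shown at the top level; confirm its exact location in the chart's values.yaml):

```yaml
envoyFrontProxyUrl: "https://deephaven.example.com:8000"
```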
Configuring roles, role bindings, and service accounts
Deephaven uses two service accounts: one for creating query workers and one for the Helm install hooks used to set up the system. These service accounts grant permissions to Deephaven pods via Kubernetes role-based access control (RBAC). By default, the Helm installation will automatically create these two service accounts, as well as the appropriate roles and role bindings.
Automatic creation of the ServiceAccounts, Roles and RoleBindings can be disabled. In this case, the appropriate resources must be created manually.
The example values below are representative of the default values:
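(The create flags and rbac grouping in this sketch are assumptions; confirm the exact key names in the chart's values.yaml.)

```yaml
serviceAccount:
  create: true
  name: ""  # defaults to <RELEASE-NAME>-deephaven when left empty
hookServiceAccount:
  create: true
  name: ""  # defaults to hook-<RELEASE-NAME>-deephaven when left empty
rbac:
  create: true  # create Roles and RoleBindings automatically
```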
Defining roles and role bindings
In some clusters, Helm will not be able to create Kubernetes Roles or RoleBindings. In these cases, those objects must be created manually by an administrator with sufficient privileges in the Kubernetes namespace.
In the examples below, Helm values are configured to disable automatic creation of Kubernetes Roles and RoleBindings during the Helm install process. Examples of Kubernetes objects to add manually are provided. Note that the service accounts listed in the RoleBindings must match the serviceAccount.name and hookServiceAccount.name values set during installation, or their default values of <RELEASE-NAME>-deephaven and hook-<RELEASE-NAME>-deephaven.
The example below assumes a Helm release name of my-dh-install (i.e., it assumes the installation was run with helm install my-dh-install ...). The default service account names of my-dh-install-deephaven and hook-my-dh-install-deephaven are used for the RoleBindings.
Helm values
The following Helm values will disable automatic creation of Roles and RoleBindings.
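Assuming the illustrative rbac grouping shown above:

```yaml
rbac:
  create: false
```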
Role and RoleBinding objects
Be sure to replace <NAMESPACE> and <RELEASE-NAME> in the example below with the namespace and Helm release name you are using. If you create service accounts as described below, you must set the name in the subject section of the RoleBinding objects to the service account names you use.
The dh-worker-management role below includes a section for the cert-manager.io API group. This section may be omitted if cert-manager will not be used (that is, if the certmgr.enabled value is not explicitly set to true).
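A hedged sketch of the worker-management Role and its RoleBinding is shown below; the rules listed are illustrative, so derive the authoritative rule list from the Roles created by a default installation, and add a similar pair for the hook service account.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dh-worker-management
  namespace: <NAMESPACE>
rules:
  # Illustrative rules -- copy the authoritative list from a default installation.
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: ["cert-manager.io"]  # omit if cert-manager is not used
    resources: ["certificates"]
    verbs: ["get", "list", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dh-worker-management
  namespace: <NAMESPACE>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dh-worker-management
subjects:
  - kind: ServiceAccount
    name: <RELEASE-NAME>-deephaven
    namespace: <NAMESPACE>
```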
Defining ServiceAccounts
If service accounts cannot be created automatically, administrators can add them manually.
Helm values for ServiceAccounts
The following values can be used to disable automatic creation of ServiceAccounts. If you disable automatic creation of service accounts, you must also specify the names of the service accounts you create manually. If automatic creation of RoleBindings is disabled as well (as described above), you must take care that the RoleBindings you add manually match the service accounts you create.
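(The create flags follow the illustrative structure shown earlier; the account names match the my-dh-install example above.)

```yaml
serviceAccount:
  create: false
  name: my-dh-install-deephaven
hookServiceAccount:
  create: false
  name: hook-my-dh-install-deephaven
```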
ServiceAccount objects
The example YAML below can be used to manually create service accounts for use by Deephaven.
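The names below follow the my-dh-install release example; substitute your own names and namespace, and make sure they match the RoleBindings defined above.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-dh-install-deephaven
  namespace: <NAMESPACE>
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hook-my-dh-install-deephaven
  namespace: <NAMESPACE>
```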
Related documentation
- Kubernetes Quickstart
- Kubernetes IAP Integration
- Customizing Kubernetes Installation
- Kubernetes ETCD Recovery
- Troubleshooting Kubernetes
- Configuration Overview
- Deephaven Properties Files
- Worker Heap Size
- Kubernetes Resource Management
- Kubernetes Storage Classes
- Kubernetes Pod Affinity and Anti-Affinity
- Helm Values Files