Kubernetes Configuration Settings

A Kubernetes installation uses the same Deephaven customization and configuration processes for much of its setup. However, because the Helm chart also specifies the equivalent of the machine and operating-system resources that system administrators would manage in a bare-metal (or VM-based) installation, the standard configuration process is augmented by Helm values.

Configuring Worker Heap Overhead

The Deephaven system creates a new container for each worker, which hosts a single JVM process. This provides isolation between workers, and the Linux kernel's cgroup mechanism limits the resources that each container can use. The bulk of a JVM's memory usage is the heap, where most user-defined objects are allocated. However, the JVM also uses off-heap native memory for several purposes, including direct memory allocations, garbage collection bookkeeping, metaspace, loaded classes, compiler caches, and more. When creating the container for a worker, Deephaven provides two values to Kubernetes to size the container's memory: a limit and a request.

  • The request is the minimum amount of memory that Kubernetes reserves for the container.
  • The limit is the maximum amount of memory the container is permitted to use.

Deephaven sets both of these parameters to the same value. If the requested amount of memory is unavailable, then the worker cannot be scheduled by Kubernetes. If the JVM exceeds the limit, then the kernel terminates the process and the container is marked as "OOMKilled" (out of memory).
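For illustration, the resulting container spec is roughly equivalent to the following Kubernetes resources stanza, with the request and limit set to the same size (the 8Gi figure is only an example):

resources:
  requests:
    memory: 8Gi
  limits:
    memory: 8Gi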

If workers are being unexpectedly OOM killed, then their additional memory allocation should be increased. The disadvantage of increasing it beyond what is necessary is that fewer workers can be scheduled on the cluster. Configuring Worker Heap Overhead contains information on how to control the additional memory allocated to each worker beyond its heap size; these parameters are set using Deephaven properties.
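As a purely illustrative sketch (the property name below is hypothetical; consult the Configuring Worker Heap Overhead page for the actual keys in your version), such a property might look like:

# Hypothetical key shown for illustration only; see the Configuring
# Worker Heap Overhead page for the real property names.
Worker.heapOverheadMB=512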

Configuring Process Resources

Each Deephaven service pod has two containers: the main container running the service (e.g., authserver) and a second container running a tailer. The Helm chart provides values for memory, CPU, and the size of a shared directory that holds the binary logs produced by the main Deephaven process. The default values are:

resources:
  defaults:
    binlogsSize: 2Gi
    requests:
      cpu: 500m
      memory: 1Gi
      ephemeral-storage: 1Gi
    limits:
      cpu: 1
      memory: 4Gi
      ephemeral-storage: 1Gi
    tailer:
      requests:
        cpu: 250m
        memory: 1Gi
        ephemeral-storage: 1Gi
      limits:
        cpu: 1
        memory: 2Gi
        ephemeral-storage: 1Gi

You may override these defaults or provide resources for specific containers, including "authserver", "configuration-server", "controller", "las", "merge-server", "query-server", "tdcp", and "webapi". For example, to increase the binary log directory size to 4GB on the Log Aggregator service ("las"), which processes the logs for each worker, add the following to your Helm values.yaml file:

resources:
  las:
    binlogsSize: 4Gi
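Per-container overrides follow the same shape as the defaults block. For example, here is a sketch (the figures are illustrative, not recommendations) that raises the query server's CPU and memory:

resources:
  query-server:
    requests:
      cpu: 1
      memory: 2Gi
    limits:
      cpu: 2
      memory: 8Gi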

Data Import Server

The DataImportServer must have persistent storage for Intraday data. Because the data is stored on disk, the DIS persistent volumes are preserved by default even if the Helm chart is uninstalled; this can be changed by setting dis.keepPvs to false. You may also need to change the sizes to account for your data volume, or the storage class to provide a faster disk or a value suitable for your Kubernetes provider.

The following values are the DIS storage defaults for the chart:

dis:
  keepPvs: true
  storageClassName: 'standard-rwo'
  # How big /db/Intraday is
  intradaySize: 10Gi
  # How big /db/IntradayUser is
  intradayUserSize: 10Gi
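To override these, add the keys you want to change to your values.yaml file. For example, this sketch enlarges both volumes and switches the storage class (the class name is illustrative; use one your provider offers):

dis:
  intradaySize: 100Gi
  intradayUserSize: 50Gi
  storageClassName: 'premium-rwo'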

Configuring Tolerations, Node Selectors, and Affinity

Kubernetes provides several mechanisms to determine where pods are scheduled. Deephaven's Helm chart does not interpret any of these values, but it can pass them through from your values.yaml to the various pods created by the installation (installation hooks, system-level processes, and workers). In particular, you can configure tolerations, node selectors, and affinity.

By default, no tolerations, node selectors, or affinity are added. To add tolerations to all created deployments, modify your values.yaml file to include a tolerations block, which is then copied into each pod. For example:

tolerations:
  - key: 'foo'
    operator: 'Exists'
    effect: 'NoSchedule'
  - key: 'bar'
    value: 'baz'
    operator: 'Equal'
    effect: 'NoSchedule'

This adds the following tolerations to each pod (in addition to the default tolerations provided by the Kubernetes system):

Tolerations:                 bar=baz:NoSchedule
                             foo:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s

Similarly, you can add a nodeSelector or affinity block:

nodeSelector:
  key1: 'value1'
  key2: 'value2'

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
            - key: label
              operator: In
              values:
                - value1

These result in pods containing node selectors like:

Node-Selectors:              key1=value1
                             key2=value2

And affinity as follows:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
            - key: label
              operator: In
              values:
                - value1
        weight: 1

Configuring System Process Command Arguments and Environment Variables

System Process Command Arguments

Note

The hostconfig files described in Configuring Process Resources are used to configure process parameters outside of Kubernetes environments; changing them has no effect in a Kubernetes Deephaven deployment.

Deephaven processes in Kubernetes are started with arguments found in the Helm chart's values.yaml file under the process key. To override a value, add a corresponding value under the userProc key in your override YAML file. For example, to change the default maximum memory for the controller, add this to your YAML file:

userProc:
  controller:
    jvmArgsMemory: '-Xmx6g -Xms4g -Xmn32m'

If the maximum JVM heap size for a process is increased with the -Xmx flag, you may also have to adjust the container resource limits. In this example, the controller's maximum JVM memory is increased from the default value of 4G to 6G, so the controller container's resource limits should be adjusted accordingly with this override in your my-values.yaml file:

# Note - resource limit set higher than the -Xmx6g max JVM size to account for any overhead.
resources:
  controller:
    limits:
      memory: 6.5Gi

There is a common section with values that apply to all processes. You may redefine these under process by copying the original set of values and changing or adding to them. For example, process.common.logRoot contains a JVM property that sets the root directory for process logs:

process:
  common:
    logRoot: '/var/log/deephaven'

You may redefine this value in your my-values.yaml file:

process:
  common:
    logRoot: '/new/logroot/directory'

If an arbitrary property needs to be provided to a system process, a process.jvmArgsUser key may be populated in your my-values.yaml override file. Adding it under common affects all processes; alternatively, it may be added only to specific processes, similar to what is done for process arguments.

process:
  common:
    jvmArgsUser: '-Dproperty.common=set.on.all.processes'
  controller:
    jvmArgsUser: '-Dproperty.controller=set.on.controller.only'

Environment Variables

Environment variables can be set under the userEnv key, with the variable name as the key. For example:

userEnv:
  common:
    MY_ENV_VAR: 'MY_ENV_VAR_VALUE'

Similar to the system arguments, environment variables may be provided for all processes under the common key, or under a process key such as controller to affect only that process's environment. Deephaven's chart also defines a systemEnv key; if a variable is defined in both, the userEnv value takes precedence.
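For example, following the same pattern as the JVM arguments above (the variable names here are illustrative):

userEnv:
  common:
    SET_ON_ALL_PROCESSES: 'common-value'
  controller:
    SET_ON_CONTROLLER_ONLY: 'controller-value'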

Worker Parameters

The Query Dispatcher allows you to change certain parameters when creating a worker (as described on the Code Studio Advanced Settings page). By default, the query server allows the user to specify CPU shares, but not other parameters, as those would allow the user to execute arbitrary code on the cluster. The merge server allows specification of all values, as by default it is restricted to users in the iris-schemamanagers group.

To change the permitted Kubernetes control parameters, you can set the Kubernetes.workerValidator property. The three built-in values are:

  • AllowCpu: CPU shares may be specified, but pod template and container image may not
  • AllowAll: All parameters are permitted
  • DenyAll: No changes are permitted

Additionally, if the validator property begins with class:, the named class, which must implement com.illumon.iris.db.tables.remotequery.process.K8SWorkerValidator, is instantiated using its zero-argument constructor.
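For example, to use one of the built-in validators, set the property to its name; to plug in a custom implementation, prefix the class name with class: (the class name below is a hypothetical placeholder):

# Use a built-in validator:
Kubernetes.workerValidator=AllowAll

# Or a custom implementation (hypothetical class name):
Kubernetes.workerValidator=class:com.example.CustomK8SWorkerValidator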