Kubernetes Configuration Settings
Configuration for a Kubernetes installation uses the same Deephaven customization and configuration processes as other installation types for most settings. However, because the Helm chart also specifies the equivalent of the machine and operating system resources that system administrators would manage in a bare-metal (or VM-based) installation, the standard configuration process is augmented by Helm values.
Configuring Worker Heap Overhead
The Deephaven system creates a new container for each worker, which hosts a single JVM process. This provides isolation between workers, and the container runtime uses Linux kernel control groups (cgroups) to limit the resources each container can use. The bulk of a JVM's memory usage is the heap, which is where most user-defined objects are allocated. However, the JVM also uses off-heap native memory for several purposes, including direct memory allocations, garbage collection structures, metaspace, loaded classes, compiler caches, and more. When creating the container for a worker, Deephaven provides two values to Kubernetes to size the container's memory: a request and a limit.
- The request is the amount of memory that Kubernetes guarantees to the container; a worker is only scheduled on a node with that much memory available.
- The limit is the maximum amount the process is permitted to use.
Deephaven sets both of these parameters to the same value. If the requested amount of memory is unavailable, then Kubernetes cannot schedule the worker. If the JVM exceeds the limit, then the kernel terminates the process and the container is marked as "OOMKilled" (out-of-memory killed).
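For illustration, the request and limit surface to Kubernetes as an ordinary container resources stanza. The following is a minimal sketch of what a worker container's memory settings might look like; the numbers are illustrative, and the actual pod layout is managed by Deephaven:

  # Sketch of a worker container's memory settings as Kubernetes sees them.
  # Request and limit are equal: the pod is only scheduled where the full
  # amount is available, and the process is OOMKilled if it exceeds it.
  resources:
    requests:
      memory: 5Gi # e.g., a 4Gi heap plus overhead (illustrative)
    limits:
      memory: 5Gi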
If workers are being unexpectedly OOMKilled, then the additional memory allocated to each worker should be increased. The disadvantage of increasing it beyond what is necessary is that fewer workers can be scheduled on the cluster. Configuring Worker Heap Overhead contains information on how to control the additional memory allocated to each worker beyond the heap size; these parameters are set using Deephaven properties.
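Those overhead parameters live in Deephaven configuration files rather than Helm values. As a rough sketch only — the property name below is a hypothetical placeholder, and the real keys are documented on the Configuring Worker Heap Overhead page:

  # NOTE: hypothetical property name for illustration only; see the
  # Configuring Worker Heap Overhead page for the actual keys.
  worker.memoryOverheadMB=1024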
Configuring Process Resources
Each Deephaven service pod has two containers: the main container running the service (e.g., authserver) and a second container running a tailer. The Helm chart provides values for each container's memory, CPU, and ephemeral storage, as well as the size of a shared directory for binary logs produced by the main Deephaven process. The default values are:
resources:
  defaults:
    binlogsSize: 2Gi
    requests:
      cpu: 500m
      memory: 1Gi
      ephemeral-storage: 1Gi
    limits:
      cpu: 1
      memory: 4Gi
      ephemeral-storage: 1Gi
  tailer:
    requests:
      cpu: 250m
      memory: 1Gi
      ephemeral-storage: 1Gi
    limits:
      cpu: 1
      memory: 2Gi
      ephemeral-storage: 1Gi
You may override these defaults or provide resources for specific containers, including "authserver", "configuration-server", "controller", "las", "merge-server", "query-server", "tdcp", and "webapi". For example, to increase the binary log directory size to 4Gi on the Log Aggregator Service ("las"), which processes the logs for each worker, add the following to your Helm values.yaml file:
resources:
  las:
    binlogsSize: 4Gi
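You can also combine an override of the defaults with per-container values. For example, a sketch that raises the default memory limit for all services while giving the query server extra CPU and memory (the numbers are illustrative):

  resources:
    defaults:
      limits:
        memory: 6Gi
    query-server:
      requests:
        cpu: 1
      limits:
        cpu: 2
        memory: 8Gi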
Data Import Server
The DataImportServer must have persistent storage for Intraday data. Because the data is stored on disk, the DIS persistent volume is preserved by default even if the Helm chart is uninstalled; this can be changed by setting dis.keepPvs to false. You may also need to change the size to account for your data volume, and the storage class to provide a faster disk or a value suitable for your Kubernetes provider.
The following values are the DIS storage defaults for the chart:
dis:
  keepPvs: true
  storageClassName: 'standard-rwo'
  # How big /db/Intraday is
  intradaySize: 10Gi
  # How big /db/IntradayUser is
  intradayUserSize: 10Gi
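As an example, a values.yaml override that enlarges both intraday volumes and selects a faster storage class might look like the following; the class name depends on your Kubernetes provider, and 'premium-rwo' here is illustrative:

  dis:
    storageClassName: 'premium-rwo'
    intradaySize: 100Gi
    intradayUserSize: 20Gi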
Configuring Tolerations, Node Selectors and Affinity
Kubernetes provides several mechanisms to determine where pods are scheduled. Deephaven's Helm chart does not interpret any of these values, but it can pass them through from your values.yaml to the various pods created by the installation (installation hooks, system-level processes, and workers). In particular, you can configure tolerations, node selectors, and affinity.
By default, no tolerations, selectors, or affinity are added. To add tolerations to all created deployments, modify your values.yaml file to include a tolerations block, which is then copied into each pod. For example:
tolerations:
  - key: 'foo'
    operator: 'Exists'
    effect: 'NoSchedule'
  - key: 'bar'
    value: 'baz'
    operator: 'Equal'
    effect: 'NoSchedule'
This adds the following tolerations to each pod (in addition to the default tolerations provided by the Kubernetes system):
Tolerations:  bar=baz:NoSchedule
              foo:NoSchedule op=Exists
              node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Similarly, you can add a nodeSelector or affinity block:
nodeSelector:
  key1: 'value1'
  key2: 'value2'
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
            - key: label
              operator: In
              values:
                - value1
These result in pods containing node selectors like:
Node-Selectors:  key1=value1
                 key2=value2
and affinity as follows:
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
            - key: label
              operator: In
              values:
                - value1
        weight: 1
Configuring System Process Command Arguments and Environment Variables
System Process Command Arguments
Note
The hostconfig files described in configuring process resources are used to configure process parameters outside of Kubernetes environments; changing them has no effect in a Kubernetes Deephaven deployment.
Deephaven processes in Kubernetes are started with arguments that are found in the Helm chart's values.yaml file under the process key. To override a value, add a similar value to your override yaml file. For example, to change the default max memory for the controller, you would add this to your yaml file:
process:
  controller:
    jvmArgsMemory: '-Xmx6g -Xms4g -Xmn32m'
If the maximum JVM size for a process is increased with the -Xmx flag, you may also have to adjust the container resource limits. In this example, the controller's max JVM memory is increased from the default value of 4G to 6G, so the controller container's limits should be adjusted accordingly with this override for the resource limits in your my-values.yaml file:
# Note - resource limit set higher than the -Xmx6g max JVM size to account for any overhead.
resources:
  controller:
    limits:
      memory: 6.5Gi
There is a common section with values that apply to all processes. You may redefine it under process by copying the original set of values and changing or adding to them. For example, process.common.logRoot contains a JVM property denoting the directory home for process logs:
process:
  common:
    logRoot: '/var/log/deephaven'
You may redefine this value in your my-values.yaml file:
process:
  common:
    logRoot: '/new/logroot/directory'
In the event an arbitrary property should be provided to a system process, there is a process.jvmArgsUser key that may be populated in your my-values.yaml override file. Adding it under common affects all processes, or it may be added only to specific processes, similar to what is done for process arguments:
process:
  common:
    jvmArgsUser: '-Dproperty.common=set.on.all.processes'
  controller:
    jvmArgsUser: '-Dproperty.controller=set.on.controller.only'
Environment Variables
Environment variables can be set under the userEnv key; the variable name is the key. For example:
userEnv:
  common:
    MY_ENV_VAR: 'MY_ENV_VAR_VALUE'
Similar to the system arguments, environment variables may be provided for all processes under the common key, or under a process key like controller to affect only that process's environment. There is also a systemEnv key in Deephaven's chart.yaml file; if a value is defined in both, the userEnv value takes precedence.
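For example, a sketch that sets one variable for every process and another for the controller alone (the variable names are illustrative):

  userEnv:
    common:
      SHARED_VAR: 'visible-to-all-processes'
    controller:
      CONTROLLER_ONLY_VAR: 'visible-to-controller-only'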
Worker Parameters
The Query Dispatcher allows you to change certain parameters when creating a worker (as described on the Code Studio Advanced Settings page). By default, the query server allows the user to specify the CPU shares, but not other parameters, as that would allow the user to execute arbitrary code on the cluster. The merge server allows specification of all values, as it is by default restricted to users in the iris-schemamanagers group.
To change the permitted Kubernetes control parameters, you can set the Kubernetes.workerValidator property. The three built-in values are:
- AllowCpu: CPU shares may be specified, but the pod template and container image may not.
- AllowAll: All parameters are permitted.
- DenyAll: No changes are permitted.
Additionally, if the validator property begins with class:, then the remainder is treated as the name of a class implementing com.illumon.iris.db.tables.remotequery.process.K8SWorkerValidator, which is instantiated using its zero-argument constructor.
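For example, these property settings use a documented built-in value and the class: prefix (the class name in the second line is a hypothetical placeholder for your own implementation):

  # Allow users to adjust CPU shares only:
  Kubernetes.workerValidator=AllowCpu
  # Or delegate validation to a custom implementation (hypothetical class name):
  # Kubernetes.workerValidator=class:com.example.MyK8SWorkerValidator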