Controlling query worker heap size

The Remote Query Dispatcher controls worker heap sizes based on a combination of properties. These properties should be updated in a customer-owned file such as iris-environment.prop so they are not overwritten during the next upgrade. Use dhconfig to update properties.

To restrict properties to a specific dispatcher, place them in a stanza for that specific server. You can find examples by looking in the installer-generated iris-endpoints.prop file. For example:

/usr/illumon/latest/bin/dhconfig properties export --file iris-endpoints.prop | grep "\[host="

To restrict the properties described here, place them in a stanza similar to those found above.

[host=fqdn-1|ip-1] {
    RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=49152
}

Remote query dispatcher properties

Each Remote Query Dispatcher instance can be configured to limit the resources its workers may consume. Dispatchers running on larger servers may allow more resources to be consumed, while dispatchers on small servers, or on servers that also run many other processes such as the Data Import Server, may have fewer resources available. Each dispatcher keeps a running total of the memory used by all of its workers and compares it to the configured property value to determine whether new workers can be started.

Calculation-based heap properties

The following properties tell a dispatcher how much worker heap it is allowed to allocate, according to its own calculation of the requested worker heaps plus overhead:

| Configuration Property | Description | Default |
|---|---|---|
| RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB | The total available heap for all worker usage. The combined heap for all workers cannot exceed this value in MiB. | 354304 |
| RemoteQueryDispatcher.maxPerWorkerHeapMB | The maximum heap size allowed for any single worker on this dispatcher. If not defined, it defaults to the total available heap size for the dispatcher. | None |
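
The interaction of these two limits can be sketched in a few lines of Python. This is only an illustration of the behavior described here, not the dispatcher's actual implementation; the class and function names (DispatcherHeapLimits, can_start_worker) are hypothetical, and the sketch assumes that a request exactly filling the remaining heap is still accepted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DispatcherHeapLimits:
    """Hypothetical holder for the two calculation-based properties."""
    max_total_heap_mb: int                         # maxTotalQueryProcessorHeapMB
    max_per_worker_heap_mb: Optional[int] = None   # maxPerWorkerHeapMB (may be unset)

    def per_worker_limit_mb(self) -> int:
        # When maxPerWorkerHeapMB is not defined, the per-worker limit
        # falls back to the dispatcher's total available heap.
        if self.max_per_worker_heap_mb is None:
            return self.max_total_heap_mb
        return self.max_per_worker_heap_mb


def can_start_worker(limits: DispatcherHeapLimits,
                     heap_in_use_mb: int,
                     requested_heap_mb: int) -> bool:
    """Return True if a new worker fits within both limits.

    heap_in_use_mb is the running total the dispatcher keeps for its
    existing workers.
    """
    if requested_heap_mb > limits.per_worker_limit_mb():
        return False
    # Assumption: the combined total may equal, but not exceed, the limit.
    return heap_in_use_mb + requested_heap_mb <= limits.max_total_heap_mb
```

For example, with a total of 49152 and 40000 already in use, a request for 8192 fits, while the same request with 45000 in use does not.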

Server-based heap properties

These properties prevent a dispatcher from allocating more memory than the server has available based on the operating system's statistics:

| Configuration Property | Description | Default |
|---|---|---|
| RemoteQueryDispatcher.reservedAvailableMemoryMB | When set to a non-negative value, the dispatcher subtracts the property's value from the machine's available memory (as reported by the MemAvailable field of /proc/meminfo) and verifies that the worker heap is less than this value before creating the worker. When set to a negative value, no additional checks are performed beyond the maxTotalQueryProcessorHeapMB value. This property is ignored for administrative users (members of the groups in RemoteQueryDispatcher.adminGroups). This property is not available on Kubernetes. | 2048 |
| RemoteQueryDispatcher.adminReservedAvailableMemoryMB | This property has the same effect as RemoteQueryDispatcher.reservedAvailableMemoryMB, but is applied for administrative users (as defined by the RemoteQueryDispatcher.adminGroups property). | 1024 |
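
As a rough illustration of the server-based check, the sketch below parses MemAvailable from the text of /proc/meminfo and applies the reserve. It is not the dispatcher's code: the function names are hypothetical, the conversion from kB to MB by integer division is an assumption, and the default reserves simply mirror the table above.

```python
import re


def mem_available_mb(meminfo_text: str) -> int:
    """Extract MemAvailable (reported in kB) from /proc/meminfo text, in MB."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable field not found")
    # Assumption: truncate to whole MB when converting from kB.
    return int(match.group(1)) // 1024


def reserve_for_user(is_admin: bool,
                     reserved_mb: int = 2048,
                     admin_reserved_mb: int = 1024) -> int:
    """Pick the reserve that applies: admins use the admin property instead."""
    return admin_reserved_mb if is_admin else reserved_mb


def passes_available_memory_check(requested_heap_mb: int,
                                  available_mb: int,
                                  reserve_mb: int) -> bool:
    """Apply the reserve; a negative reserve disables the check."""
    if reserve_mb < 0:
        return True
    # The worker heap must be strictly less than (available - reserve).
    return requested_heap_mb < available_mb - reserve_mb
```

On a Linux host, the text passed to mem_available_mb would typically come from reading /proc/meminfo; the parsing is kept separate from file access so the logic can be exercised with sample text.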

Configuring worker heap overhead

The Deephaven system creates a new JVM for each worker, specifying the maximum allowed heap through the -Xmx parameter based on the user's specification. The majority of a JVM's memory usage is for the heap, where most user-defined objects are allocated. However, the JVM also uses off-heap native memory for several purposes, including direct memory allocations, garbage collection information, meta-space, classes, compiler caches, and more.

For dispatchers running many small workers, this off-heap usage can push actual memory consumption above the configured values, so the dispatcher underestimates real memory usage when enforcing the RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB restriction. If this happens and the server runs out of memory, the operating system may kill arbitrary processes.

Deephaven provides properties to adjust how the dispatcher calculates memory usage, increasing every worker's assumed memory usage if these properties are set:

| Configuration Property | Description | Default |
|---|---|---|
| RemoteQueryDispatcher.memoryOverheadMB | Adds this value to the heap calculation for every worker. | 500 |
| RemoteQueryDispatcher.memoryOverheadMultiplier | Multiplies the requested heap by this value, rounds up to the nearest integer, and adds that to the memory-used calculation for every worker. | 0.05 |

As an example, assume the following values:

RemoteQueryDispatcher.memoryOverheadMB=300
RemoteQueryDispatcher.memoryOverheadMultiplier=0.05

If a heap of 1 GB (1,024 MB) is requested, the dispatcher assumes the worker process uses 1,376 MB of memory and subtracts that amount from the available heap.

  • .05 (from RemoteQueryDispatcher.memoryOverheadMultiplier) * 1,024 is 51.2 MB, which is rounded up to 52 MB. 1,024 + 52 is 1,076 MB.
  • 1,076 MB + 300 MB from RemoteQueryDispatcher.memoryOverheadMB is 1,376 MB.

For a 32 GB worker (32,768 MB), the dispatcher accounts for 34,707 MB of memory.

  • .05 (from RemoteQueryDispatcher.memoryOverheadMultiplier) * 32,768 is 1,638.4 MB, which is rounded up to 1,639 MB. 32,768 + 1,639 is 34,407 MB.
  • 34,407 MB + 300 MB from RemoteQueryDispatcher.memoryOverheadMB is 34,707 MB.
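
The arithmetic in the two examples above can be reproduced with a short Python sketch. The function name is hypothetical and this is not the dispatcher's implementation; the multiplier is passed as a string and handled with Decimal so that the round-up step is not affected by binary floating-point error.

```python
from decimal import Decimal, ROUND_CEILING


def accounted_worker_memory_mb(requested_heap_mb: int,
                               overhead_mb: int,
                               overhead_multiplier: str) -> int:
    """Memory the dispatcher is assumed to charge for one worker, in MB.

    overhead_mb         -> RemoteQueryDispatcher.memoryOverheadMB
    overhead_multiplier -> RemoteQueryDispatcher.memoryOverheadMultiplier
    """
    # Multiply the requested heap by the multiplier and round up
    # to the nearest whole MB.
    proportional = (Decimal(requested_heap_mb) * Decimal(overhead_multiplier)
                    ).to_integral_value(rounding=ROUND_CEILING)
    # Add the proportional overhead and the fixed overhead to the heap.
    return requested_heap_mb + int(proportional) + overhead_mb


print(accounted_worker_memory_mb(1024, 300, "0.05"))    # 1376
print(accounted_worker_memory_mb(32768, 300, "0.05"))   # 34707
```

Because the fixed overhead is charged per worker, it weighs more heavily on small workers: 300 MB is about 29% of a 1,024 MB heap but under 1% of a 32,768 MB heap.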

If workers are unexpectedly killed for running out of memory, increase these properties. The drawback of raising them beyond what is necessary is that the dispatcher can create fewer workers.

Other properties

| Configuration Property | Description | Default |
|---|---|---|
| minimumWorkerHeapSizeMB | The minimum allowed heap size for a worker. | 512 |
| RemoteProcessingRequest.defaultQueryHeapMB | The default heap size displayed by the web interface for a new Persistent Query or Code Studio. | 4096 |

Worker maximum heap default

The controller supports a default maximum worker size through the PersistentQueryController.defaultMaxHeapSizeGB property. The controller's server selection provider uses this value as the default for a dispatcher until that dispatcher connects and reports its actual value. If the property is not defined, the only restriction applied is the maximum worker size, currently 1 TB.