Controlling query worker heap size

The Remote Query Dispatcher (dispatcher) controls worker heap sizes based on a combination of properties. This page describes how to configure those properties and how they fit into overall dispatcher configuration. It is intended for system administrators who tune dispatcher memory usage and worker limits.

Worker heap properties are part of the normal dispatcher property set and are delivered from the configuration server. In most environments you should:

  • Keep installer-owned files unchanged. Use a customer-owned file such as iris-environment.prop for overrides so they are not overwritten during upgrades.
  • Use dhconfig to view and update properties. See managing configuration.
  • Use stanzas when you need different heap limits for different dispatchers (for example, a large query server versus a smaller merge server).

Before changing worker heap settings, review the Remote Query Dispatcher configuration guide. That guide explains dispatcher roles, base properties, and how stanzas scope configuration to individual dispatchers.

Note

Prerequisites:

Before changing worker heap settings, you should have:

  • An understanding of dispatcher roles and topology in your installation (query servers versus merge servers).
  • Access to dhconfig and the configuration server used for your environment.
  • Knowledge of which dispatcher hosts or services you want to tune.

If you are new to these topics, start with the Remote Query Dispatcher configuration and Deephaven properties files guides.

Typical workflow

Configuring worker heap limits usually follows this pattern:

  1. Inspect current dispatcher properties.
  2. Decide on new heap limits.
  3. Add or update properties in a customer-owned file, optionally inside a stanza that targets a specific dispatcher.
  4. Deploy the updated configuration to the configuration server.
  5. Restart the affected dispatcher processes.

Step 1: Inspect current dispatcher properties

Use dhconfig to export the properties that currently apply to dispatchers. For example, to inspect installer-provided endpoint stanzas:

/usr/illumon/latest/bin/dhconfig properties export --file iris-endpoints.prop | grep "\[host="

The output lists the [host=...] stanza headers, which scope properties to dispatchers running on specific hosts.
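If you export a whole properties file, it can help to see which dispatcher properties sit in which scope. The following Python sketch scans property text and reports each RemoteQueryDispatcher property with its enclosing stanza. The function name and the sample text are hypothetical, and the parsing assumes only the simple, non-nested `[scope] { ... }` layout shown on this page; it is not a full implementation of the properties file format.

```python
def dispatcher_properties_by_scope(prop_text):
    """Return (scope, name, value) for each RemoteQueryDispatcher property.

    Assumes flat, non-nested stanzas of the form "[scope] {" ... "}".
    """
    scope = "(global)"
    found = []
    for raw in prop_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith("[") and line.endswith("{"):
            scope = line[:-1].strip()  # entering a stanza
        elif line == "}":
            scope = "(global)"  # leaving the stanza
        elif line.startswith("RemoteQueryDispatcher."):
            name, _, value = line.partition("=")
            found.append((scope, name.strip(), value.strip()))
    return found


# Hypothetical sample text, standing in for an exported properties file.
SAMPLE = """\
RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=262144
# a comment line
[host=fqdn-1|ip-1] {
    RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=49152
}
RemoteQueryDispatcher.memoryOverheadMB=300
"""

for scope, name, value in dispatcher_properties_by_scope(SAMPLE):
    print(f"{scope:22} {name} = {value}")
```

To use it on real output, save the exported properties to a file and pass the file's contents to the function.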

Step 2: Choose worker heap limits

Decide how much total heap each dispatcher should make available to workers, and whether any per-worker limits are required. The most common properties are:

  • RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB – total heap across all workers on this dispatcher.
  • RemoteQueryDispatcher.maxPerWorkerHeapMB – maximum heap for any single worker on this dispatcher.

The sections below describe these and related properties in more detail.

Step 3: Configure properties in a customer-owned file

Add or update the properties in a customer-owned file such as iris-environment.prop. To apply settings to all dispatchers, place the properties outside any stanza. To apply settings only to a specific dispatcher, place them inside a stanza that matches that dispatcher.

For example, to set a higher total heap limit for a dispatcher running on fqdn-1:

[host=fqdn-1|ip-1] {
    RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=49152
}

This stanza applies only to dispatchers whose host matches fqdn-1 or ip-1. For more details on scoped stanzas and matching rules, see Deephaven properties file format.

Step 4: Deploy configuration

After editing the customer-owned file, upload or apply it using dhconfig as described in managing configuration.

Step 5: Restart affected dispatchers

Dispatcher processes read their properties at startup, so restart any affected dispatchers after changing worker heap properties.

Validate your changes

After restarting dispatchers, validate that the new configuration behaves as expected:

  • Start test workers with specific heap sizes and confirm they are accepted or rejected according to your new limits.
  • Check dispatcher logs for messages that indicate workers were refused due to heap limits or available memory checks.
  • If enabled, use the dispatcher web server or monitoring tools to confirm the current heap limits and observed worker memory usage. See metrics and monitoring.

If workers are still failing with out-of-memory errors, see the guidance below on overhead and OS-based memory checks, and refer to the out-of-memory FAQ.

Conceptual overview

At a high level, each dispatcher decides whether it can start a new worker based on three factors:

  • Total worker heap across this dispatcher – controlled by RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB.
  • Maximum heap per worker – controlled by RemoteQueryDispatcher.maxPerWorkerHeapMB.
  • Safety margin versus operating system reported memory – controlled by RemoteQueryDispatcher.reservedAvailableMemoryMB and RemoteQueryDispatcher.adminReservedAvailableMemoryMB.

The dispatcher keeps a running estimate of how much memory its current workers use, including overhead. When a new worker is requested, the dispatcher:

  1. Calculates the worker's estimated memory usage (requested heap plus overhead).
  2. Adds that to the current total and checks against the dispatcher-wide limits.
  3. Checks the operating system's available memory and reserved margin, if those properties are enabled.

Only if all applicable checks pass will the worker be started.
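The decision sequence above can be sketched in Python. This is an illustrative model only, not the dispatcher's actual code: the class, function, and field names are hypothetical, and the exact comparison operators follow the property descriptions on this page (total not exceeded, requested heap strictly less than available memory minus the reserved margin).

```python
import math
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class DispatcherLimits:
    """Hypothetical container mirroring the dispatcher properties."""
    max_total_heap_mb: int
    max_per_worker_heap_mb: Optional[int] = None  # None: fall back to the total
    overhead_mb: int = 0
    overhead_multiplier: str = "0"
    reserved_available_mb: int = -1  # negative: OS-based check disabled


def estimate_usage_mb(heap_mb, limits):
    """Requested heap plus proportional overhead (rounded up) plus fixed overhead."""
    proportional = math.ceil(Decimal(heap_mb) * Decimal(limits.overhead_multiplier))
    return heap_mb + proportional + limits.overhead_mb


def can_start_worker(heap_mb, current_total_mb, os_available_mb, limits):
    """Return True only if every applicable check passes."""
    per_worker_cap = limits.max_per_worker_heap_mb
    if per_worker_cap is None:
        per_worker_cap = limits.max_total_heap_mb
    if heap_mb > per_worker_cap:
        return False  # exceeds the per-worker limit
    if current_total_mb + estimate_usage_mb(heap_mb, limits) > limits.max_total_heap_mb:
        return False  # exceeds the dispatcher-wide total
    if limits.reserved_available_mb >= 0:
        if heap_mb >= os_available_mb - limits.reserved_available_mb:
            return False  # not enough OS-reported memory after the margin
    return True


limits = DispatcherLimits(
    max_total_heap_mb=8192,
    max_per_worker_heap_mb=4096,
    overhead_mb=300,
    overhead_multiplier="0.05",
    reserved_available_mb=2048,
)
print(can_start_worker(1024, 0, 16384, limits))     # True: all checks pass
print(can_start_worker(5000, 0, 16384, limits))     # False: per-worker limit
print(can_start_worker(1024, 7000, 16384, limits))  # False: total limit
print(can_start_worker(1024, 0, 3000, limits))      # False: OS margin
```

The multiplier is passed as a string so that Decimal arithmetic avoids binary floating-point surprises when rounding up.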

Remote Query Dispatcher properties

Use these properties to implement the heap limits you designed in the preceding sections.

Each Remote Query Dispatcher instance can be configured to limit the resources its workers consume. Dispatchers on larger servers can allow more, while dispatchers on small servers, or on servers that also run many other processes such as the Data Import Server, may have fewer resources available. Each dispatcher keeps a running total of the memory used by all its workers and compares it to the property values to decide whether new workers can be started.

Calculation-based heap properties

The following properties tell a dispatcher how much worker heap it may allocate according to its own calculations, which use the requested worker heaps plus overhead:

| Configuration Property | Description | Default |
| --- | --- | --- |
| RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB | The total heap available to all workers. The combined heap of all workers cannot exceed this value, in MiB. | 354304 |
| RemoteQueryDispatcher.maxPerWorkerHeapMB | The maximum heap size allowed for any single worker on this dispatcher. If not defined, it defaults to the total available heap size for the dispatcher. | None |

Server-based heap properties

These properties prevent a dispatcher from allocating more memory than the server has available based on the operating system's statistics:

| Configuration Property | Description | Default |
| --- | --- | --- |
| RemoteQueryDispatcher.reservedAvailableMemoryMB | When set to a non-negative value, the dispatcher subtracts the property's value from the machine's available memory (as reported by the MemAvailable field of /proc/meminfo) and verifies that the worker heap is less than the result before creating the worker. When set to a negative value, no additional checks are performed beyond the maxTotalQueryProcessorHeapMB value. This property is ignored for administrative users (members of the groups in RemoteQueryDispatcher.adminGroups). This property is not available on Kubernetes. | 2048 |
| RemoteQueryDispatcher.adminReservedAvailableMemoryMB | Has the same effect as RemoteQueryDispatcher.reservedAvailableMemoryMB, but is applied for administrative users (as defined by the RemoteQueryDispatcher.adminGroups property). | 1024 |

When evaluating a worker request, the dispatcher enforces both sets of limits:

  • The estimated total worker usage, including overhead, must not exceed maxTotalQueryProcessorHeapMB.
  • If reservedAvailableMemoryMB is non-negative, the requested worker must also fit under MemAvailable - reservedAvailableMemoryMB (or under MemAvailable - adminReservedAvailableMemoryMB for administrative users).

These properties are dispatcher-local; each dispatcher instance applies these checks based on its own view of available memory.
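To make the OS-based check concrete, here is a minimal Python sketch that reads MemAvailable from /proc/meminfo-style text and applies a reserved margin. The function names and sample values are hypothetical. /proc/meminfo reports values in kB; converting to MB by integer division by 1024 is an assumption about how the dispatcher treats the value, not confirmed behavior.

```python
def mem_available_mb(meminfo_text):
    """Extract MemAvailable from /proc/meminfo-style text, converted from kB to MB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) // 1024
    raise ValueError("MemAvailable not found")


def fits_os_margin(heap_mb, available_mb, reserved_mb):
    """A negative reserved value disables the check; otherwise the requested
    heap must be less than available memory minus the reserved margin."""
    if reserved_mb < 0:
        return True
    return heap_mb < available_mb - reserved_mb


# Hypothetical sample; on a real host, read the text from /proc/meminfo.
SAMPLE = """\
MemTotal:       65536000 kB
MemFree:         1200000 kB
MemAvailable:   10485760 kB
"""

available = mem_available_mb(SAMPLE)  # 10240 MB
# Regular user with reservedAvailableMemoryMB=2048: 8192 is not less than 8192.
print(fits_os_margin(8192, available, 2048))  # False
# Administrative user with adminReservedAvailableMemoryMB=1024: 8192 < 9216.
print(fits_os_margin(8192, available, 1024))  # True
```

The same request is refused for a regular user but accepted for an administrative user because the smaller admin margin leaves more usable memory.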

Configuring worker heap overhead

The Deephaven system creates a new JVM for each worker, specifying the maximum allowed heap through the -Xmx parameter based on the user's specification. The majority of a JVM's memory usage is for the heap, where most user-defined objects are allocated. However, the JVM also uses off-heap native memory for several purposes, including direct memory allocations, garbage collection information, meta-space, classes, compiler caches, and more.

For dispatchers running many small workers, this off-heap usage can push actual memory consumption well above the configured heap values, so the dispatcher underestimates real usage when it enforces the RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB restriction. If this happens and the server runs out of memory, the operating system may kill processes unpredictably.

Deephaven provides properties to adjust how the dispatcher calculates memory usage, increasing every worker's assumed memory usage if these properties are set:

| Configuration Property | Description | Default |
| --- | --- | --- |
| RemoteQueryDispatcher.memoryOverheadMB | Adds this value to the heap calculation for every worker. | 500 |
| RemoteQueryDispatcher.memoryOverheadMultiplier | Multiplies the requested heap by this value, rounds up to the nearest integer, and adds that to the memory-used calculation for every worker. | 0.05 |

As an example, assume the following values:

RemoteQueryDispatcher.memoryOverheadMB=300
RemoteQueryDispatcher.memoryOverheadMultiplier=0.05

If a heap of 1 GB (1,024 MB) is requested, the dispatcher assumes the worker process uses 1,376 MB of memory and subtracts it from the available heap.

  • 0.05 (from RemoteQueryDispatcher.memoryOverheadMultiplier) * 1,024 is 51.2 MB, which is rounded up to 52 MB. 1,024 + 52 is 1,076 MB.
  • 1,076 MB + 300 MB from RemoteQueryDispatcher.memoryOverheadMB is 1,376 MB.

For a 32 GB worker (32,768 MB), the dispatcher would account for 34,707 MB of memory.

  • 0.05 (from RemoteQueryDispatcher.memoryOverheadMultiplier) * 32,768 is 1,638.4 MB, which is rounded up to 1,639 MB. 32,768 + 1,639 is 34,407 MB.
  • 34,407 MB + 300 MB from RemoteQueryDispatcher.memoryOverheadMB is 34,707 MB.
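The arithmetic above can be reproduced with a short Python sketch. The function name is hypothetical, and the formula is simply the one described on this page (requested heap, plus the rounded-up multiplier term, plus the fixed overhead); it is not taken from the dispatcher's source.

```python
import math
from decimal import Decimal


def estimated_worker_memory_mb(requested_heap_mb, overhead_mb=300, overhead_multiplier="0.05"):
    """Requested heap + ceil(heap * multiplier) + fixed overhead, all in MB."""
    # Decimal keeps 0.05 exact, so rounding up is not affected by float error.
    proportional = math.ceil(Decimal(requested_heap_mb) * Decimal(overhead_multiplier))
    return requested_heap_mb + proportional + overhead_mb


print(estimated_worker_memory_mb(1024))   # 1376
print(estimated_worker_memory_mb(32768))  # 34707
```

Note that the fixed 300 MB dominates for the small worker (about 29% of the requested heap), while the multiplier term dominates for the large one.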

If the operating system is unexpectedly killing workers for lack of memory, increase these properties. Start with small changes, observe the impact, and adjust as needed. Increasing these values too far reduces how many workers a dispatcher can start and may cause otherwise valid worker requests to be refused.

As a general guideline:

  • On dispatchers that run few large workers, the default overhead values are often sufficient, and aggressive increases can reduce overall capacity.
  • On dispatchers that run many small workers, consider gradually increasing memoryOverheadMB or memoryOverheadMultiplier so that the estimated usage better matches real JVM memory usage, reducing the chance of OS-level out-of-memory events.
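To see the capacity trade-off, the following Python sketch estimates how many identical workers fit under a total limit for different overhead settings. The function name, the total of 262,144 MB, and the overhead combinations are illustrative assumptions; the per-worker estimate uses the formula described in this section.

```python
import math
from decimal import Decimal


def workers_that_fit(max_total_mb, heap_mb, overhead_mb, overhead_multiplier):
    """Number of identical workers whose estimated usage fits under the total."""
    per_worker = (
        heap_mb
        + math.ceil(Decimal(heap_mb) * Decimal(overhead_multiplier))
        + overhead_mb
    )
    return max_total_mb // per_worker


TOTAL_MB = 262144  # hypothetical maxTotalQueryProcessorHeapMB

for overhead_mb, multiplier in [(0, "0"), (300, "0.05"), (600, "0.10")]:
    small = workers_that_fit(TOTAL_MB, 1024, overhead_mb, multiplier)
    large = workers_that_fit(TOTAL_MB, 32768, overhead_mb, multiplier)
    print(f"overhead={overhead_mb} MB, multiplier={multiplier}: "
          f"{small} x 1 GB workers or {large} x 32 GB workers")
```

Under these assumptions, moving from no overhead to 300 MB plus 0.05 drops the count of 1 GB workers from 256 to 190, while the count of 32 GB workers only drops from 8 to 7, which is why overhead settings matter most on dispatchers running many small workers.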

Other properties

| Configuration Property | Description | Default |
| --- | --- | --- |
| minimumWorkerHeapSizeMB | The minimum allowed heap size for a worker. | 512 |
| RemoteProcessingRequest.defaultQueryHeapMB | The default heap size displayed by the web interface for a new Persistent Query or Code Studio. | 4096 |

Worker maximum heap default

The controller allows a default maximum worker size through the PersistentQueryController.defaultMaxHeapSizeGB property. This value is the default requested worker size when no explicit heap is specified. It is used by the controller's server selection provider as a default for a dispatcher until that dispatcher connects and tells the controller its actual value.

Dispatcher-side limits (such as RemoteQueryDispatcher.maxPerWorkerHeapMB and RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB) still apply. If a user or the controller requests a worker heap larger than the dispatcher allows, the dispatcher will refuse to start that worker even if the controller default is higher.

If PersistentQueryController.defaultMaxHeapSizeGB is not defined, the controller only enforces the overall maximum worker size, currently 1 TB. Dispatcher properties should still be used to enforce practical limits per dispatcher.

Tuning example

The following example shows how to combine global and stanza-scoped properties to tune different dispatchers.

Assume the following:

  • Most dispatchers should allow a moderate total heap for workers.
  • A dedicated query server on fqdn-big can allow more worker heap.
  • A merge server should have conservative limits because it shares resources with other services.

You can configure this as follows in a customer-owned properties file such as iris-environment.prop:

# Global defaults for all dispatchers
# 256 GiB total worker heap per dispatcher
RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=262144
# 64 GiB maximum per worker
RemoteQueryDispatcher.maxPerWorkerHeapMB=65536

# Higher limits for a large query server (384 GiB total)
[host=fqdn-big] {
    RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=393216
}

# More conservative limits for a merge server (128 GiB total, 32 GiB per worker)
[service.name=dbmerge] {
    RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB=131072
    RemoteQueryDispatcher.maxPerWorkerHeapMB=32768
}

In this configuration:

  • All dispatchers start from the global defaults.
  • The dispatcher running on fqdn-big uses the larger total heap limit defined in its host stanza.
  • Merge servers use more conservative limits defined in the [service.name=dbmerge] stanza.

Choose actual values based on the machine's physical memory, other workloads on the host, and your expected worker sizes.
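As a way to reason about which limits a given dispatcher ends up with, here is a simplified Python model of the configuration above. It is a sketch under stated assumptions: the data structures and function name are hypothetical, scopes are matched by exact equality, and later stanzas override earlier ones. The real matching and precedence rules are defined by the Deephaven properties file format, so treat this only as a thinking aid.

```python
TOTAL = "RemoteQueryDispatcher.maxTotalQueryProcessorHeapMB"
PER_WORKER = "RemoteQueryDispatcher.maxPerWorkerHeapMB"

# Properties outside any stanza.
GLOBAL_PROPS = {TOTAL: 262144, PER_WORKER: 65536}

# (scope, overrides) pairs, in file order.
STANZAS = [
    ({"host": "fqdn-big"}, {TOTAL: 393216}),
    ({"service.name": "dbmerge"}, {TOTAL: 131072, PER_WORKER: 32768}),
]


def effective_properties(context):
    """Start from the globals, then apply every stanza whose scope matches.

    Assumption: a later matching stanza overrides an earlier one.
    """
    props = dict(GLOBAL_PROPS)
    for scope, overrides in STANZAS:
        if all(context.get(key) == value for key, value in scope.items()):
            props.update(overrides)
    return props


for context in [
    {"host": "fqdn-2", "service.name": "dbquery"},
    {"host": "fqdn-big", "service.name": "dbquery"},
    {"host": "fqdn-2", "service.name": "dbmerge"},
]:
    props = effective_properties(context)
    print(context, "->", props[TOTAL], props[PER_WORKER])
```

A merge server running on fqdn-big would match both stanzas; which one wins in a real installation depends on the actual precedence rules, so verify the exported configuration rather than relying on this model.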