Your query runs fine on one core, but what happens when Deephaven tries to split the work across eight? Or sixteen? In version 41, we've made a fundamental change to how filters and selectables behave - and it's going to make your queries faster without you lifting a finger.
Starting with Deephaven 41, queries run in parallel by default. In previous versions, Deephaven assumed all formulas required sequential processing. Now it assumes they can run in parallel. For most users, this change is invisible — your queries simply run faster. But if you have code that modifies shared variables or depends on row order, you'll need to make a small update.
What changed and why it matters
Deephaven's query engine can process different segments of a column in parallel during both initialization and updates. There's a catch, though. The engine can only parallelize stateless operations — those that produce the same output regardless of execution order or which thread runs them.
Stateless operations don't maintain internal state between invocations, making them safe for concurrent execution across multiple cores.
In versions before 41, Deephaven assumed all formulas required sequential processing by default. This was safe, but conservative — the engine couldn't automatically parallelize work across threads. With stateless as the new default, the engine can automatically parallelize more operations without any configuration.
The performance impact
Consider a simple filter operation on a 10-million-row table:
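As a sketch (the `trades` table and its `Price` column are hypothetical names, not from the release):

```python
from deephaven import empty_table

# Hypothetical 10-million-row table of trades
trades = empty_table(10_000_000).update(["Price = randomDouble(0.0, 100.0)"])

# A stateless filter: each row's result depends only on that row's Price value
expensive = trades.where("Price > 50.0")
```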
In previous versions, Deephaven would process this sequentially. Now it automatically divides the rows among CPU cores, with each core evaluating the filter for its assigned rows simultaneously.
The same applies to column calculations in update and select:
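For instance (again a sketch; `trades`, `Price`, and `Size` are hypothetical):

```python
# Each row's Total depends only on that row's Price and Size,
# so the engine can compute different chunks of rows on different cores
result = trades.update(["Total = Price * Size"])
picked = trades.select(["Price", "Size", "Total = Price * Size"])
```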
Do you need to change your code?
Most users: No. If your filters and column expressions are pure calculations — they depend only on their inputs and don't maintain state between rows — you don't need to do anything. Your queries will simply run faster.
Some users: Yes. If your code relies on stateful behavior, you'll need to update it.
Quick check: Does your code use global variables, depend on row order, or modify external state? If yes, keep reading.
What makes an operation stateless?
An operation is stateless if it:
- Doesn't read or modify global variables.
- Doesn't depend on which row is processed first.
- Produces the same output for the same input, regardless of when or how it runs.
These are all stateless and parallelize safely:
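For example (sketches against a hypothetical `source` table):

```python
# Pure arithmetic: depends only on the row's own Value
t1 = source.update("Doubled = Value * 2")

# Stateless built-in functions: same input, same output, on any thread
t2 = source.where("abs(Value) < 100")

# Row-local string manipulation: no shared state between rows
t3 = source.update("UpperSym = Symbol.toUpperCase()")
```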
Example: A stateful operation that breaks
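Here is a plain-Python simulation of the problem; the `next_id` helper is illustrative, and the thread pool stands in for the engine evaluating a formula like `Id = next_id()` across rows in parallel:

```python
from concurrent.futures import ThreadPoolExecutor

counter = 0  # shared global state

def next_id():
    """Stateful: reads and writes a global, so results depend on thread timing."""
    global counter
    current = counter      # another thread may read the same value here...
    counter = current + 1  # ...and both threads then write the same id
    return current + 1

# Simulate evaluating the formula for 100,000 rows on eight threads
with ThreadPoolExecutor(max_workers=8) as pool:
    ids = list(pool.map(lambda _: next_id(), range(100_000)))

# Sequential execution would yield exactly 1..100,000 with no repeats;
# parallel execution can produce duplicates and gaps instead.
```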
With parallel execution, multiple threads increment `counter` simultaneously. You'll see gaps in the sequence, duplicate values, or values that don't follow the expected pattern.
How to force sequential execution
If you have legitimately stateful operations, use .with_serial() to force rows to be processed one at a time, in order:
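A minimal sketch, assuming `Selectable` is importable from `deephaven.table` and `next_id` is a stateful helper in the query scope:

```python
from deephaven.table import Selectable

# Force the Id column to be computed one row at a time, in row order
result = source.update([Selectable.parse("Id = next_id()").with_serial()])
```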
For filters, construct a Filter object explicitly:
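For example (sketch; `seen_before` is a hypothetical stateful function, and the import path is assumed to be `deephaven.filters`):

```python
from deephaven.filters import Filter

# Build the filter explicitly so .with_serial() can be applied to it
serial_filter = Filter.from_("seen_before(Symbol)").with_serial()
result = source.where(serial_filter)
```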
Caution
You cannot use .with_serial() with view or update_view. These operations compute values on-demand when cells are accessed, so they cannot guarantee processing order. Use select or update instead when you need serial execution.
When operations depend on each other: Barriers
Sometimes you need one operation to complete before another starts — for example, when column A populates data that column B reads. Use barriers to control this ordering:
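As a sketch of the pattern, assuming any unique object can serve as the barrier token (the `populate_cache` and `read_cache` helpers are hypothetical):

```python
from deephaven.table import Selectable

barrier = object()  # barrier token shared by the two columns

result = source.update([
    # Column A declares the barrier: it must finish before the barrier is passed
    Selectable.parse("A = populate_cache(Key)").with_declared_barriers(barrier),
    # Column B respects the barrier: it won't start until A has completed
    Selectable.parse("B = read_cache(Key)").with_respected_barriers(barrier),
])
```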
With this barrier, column A processes all rows completely before column B starts. Both columns can still be parallelized internally — the barrier only controls the ordering between them.
Note
Serial operations automatically create barriers between each other by default. If you have two serial columns in the same update, the first finishes completely before the second starts.
Rethinking stateful patterns
Before marking operations as serial, consider whether you can restructure your logic to be stateless. Stateless operations are:
- Faster: They parallelize automatically.
- Simpler: No hidden dependencies between rows.
- Safer: No race conditions or ordering issues.
Alternative: Use table operations for accumulation
Instead of accumulating state within a filter, use Deephaven's built-in aggregation and windowed operations:
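For example, a running total that might otherwise tempt you into a global accumulator (sketch; `source`, `Amount`, and `Sym` are hypothetical):

```python
from deephaven.updateby import cum_sum

# The engine maintains the running sum per Sym; no mutable globals needed
result = source.update_by(ops=[cum_sum("RunningTotal = Amount")], by="Sym")
```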
This approach keeps each operation stateless while achieving the same result — and it's more explicit about what your query is doing.
Understanding the parallelization model
Deephaven parallelizes queries in two phases:
Initialization parallelism
When you first run a table operation, the engine splits the work across the Operation Initialization Thread Pool. For a where or update on a large table, different chunks of rows are processed simultaneously on different cores.
Update parallelism
For ticking tables, the Update Graph Processor Thread Pool handles ongoing updates. This pool parallelizes in two ways:
- Within each operation: Rows are divided among cores, just like during initialization.
- Across operations: Independent tables in the update graph are computed simultaneously.
Both thread pools benefit from stateless operations. You can configure their sizes:
| Property | Default | Description |
|---|---|---|
| `OperationInitializationThreadPool.threads` | -1 (all cores) | Threads for parallel initialization |
| `PeriodicUpdateGraph.updateThreads` | -1 (all cores) | Threads for parallel update processing |
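For example, to cap both pools at eight threads, set the properties from the table above in your Deephaven configuration:

```
OperationInitializationThreadPool.threads=8
PeriodicUpdateGraph.updateThreads=8
```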
Configuration properties
You can change the default behavior using these properties:
- `QueryTable.statelessSelectByDefault` — controls the default for `select` and `update`.
- `QueryTable.statelessFiltersByDefault` — controls the default for filters.
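For example, to restore the pre-41 serial defaults while you migrate (assuming these are boolean properties):

```
QueryTable.statelessSelectByDefault=false
QueryTable.statelessFiltersByDefault=false
```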
Caution
In Python builds that use the GIL (global interpreter lock), parallelizing filters and selectables can negatively impact query performance. To prevent performance regressions, even stateless operations that use Python objects are not parallelized unless the Python build is free-threaded.
Quick reference
| Scenario | Solution | Why |
|---|---|---|
| Pure column math | Default (parallel) | Thread-safe, no shared state |
| Global counter | .with_serial() | Needs sequential row processing |
| Column A must finish before Column B | Barriers | Controls cross-operation ordering |
| File I/O or logging | .with_serial() | Serialize access to shared resource |
| Non-thread-safe library | .with_serial() | Forces single-threaded access |
Migration checklist
If you're upgrading to Deephaven 41:
- Quick check: Does your code use global variables, depend on row order, or modify external state?
- Test thoroughly: Run your existing queries and verify that results match expectations.
- Add `.with_serial()`: For operations that need sequential processing, use `Selectable.parse(...).with_serial()`.
- Add barriers: If one operation must complete before another starts, use `Barrier` with `.with_declared_barriers()` and `.with_respected_barriers()`.
- Consider refactoring: Where possible, restructure stateful logic to use table operations like `update_by` instead.
The bigger picture
This change is part of Deephaven's ongoing work to maximize performance automatically. By making stateless the default, we're:
- Reducing cognitive load: You don't need to think about parallelization for most queries.
- Improving performance: More operations parallelize out of the box.
- Encouraging best practices: Stateless operations are generally cleaner and more predictable.
The best code change is one you don't have to make. With stateless defaults, your existing queries get faster automatically.
Next steps
- Review the version 41 release notes for the complete list of changes.
- Learn more about parallelization in Deephaven.
- Explore update-by operations for stateless alternatives to accumulation patterns.
- See the ConcurrencyControl API for full details on `.with_serial()` and barriers.
Questions about migrating your queries? Join our Slack community — we're happy to help you take advantage of better parallelism.