Vectorization and the recipe paradigm
Deephaven's query engine uses vectorized operations and a declarative "recipe" paradigm to achieve high performance on both static and real-time data. This guide explains the technical foundations of this approach and why it matters for your queries.
The recipe paradigm: Instead of writing step-by-step instructions that process data one element at a time, you define what result you want — like a recipe that describes the finished dish. Deephaven's engine then figures out how to compute it efficiently, processing data in optimized batches.
The paradigm shift: Imperative vs declarative
Traditional programming: Imperative SISD
In traditional programming, you write imperative code that processes one data element at a time. This is Single Instruction, Single Data (SISD) — each instruction operates on a single value:
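As a rough illustration of this style, here is a minimal Java sketch. The class and method names are hypothetical and are not part of Deephaven; the point is only the shape of the loop.

```java
import java.util.ArrayList;
import java.util.List;

final class ElementAtATime {
    /** Doubles every value with an explicit loop, one element per iteration. */
    static List<Integer> doubleAll(List<Integer> xs) {
        List<Integer> ys = new ArrayList<>(xs.size());
        for (Integer x : xs) {
            // Unbox x, multiply once, then box the result into an Integer again.
            ys.add(x * 2);
        }
        return ys;
    }

    public static void main(String[] args) {
        System.out.println(doubleAll(List.of(1, 2, 3)));
    }
}
```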
This approach:
- Executes instructions sequentially.
- Processes one data element per instruction.
- Requires explicit loops for multiple elements.
- Creates intermediate objects for each value.
Deephaven: Declarative and chunked
Deephaven uses a declarative approach. The engine processes data in optimized chunks, applying one operation across many values at once:
See emptyTable and update for more details.
This approach:
- Specifies what to compute, not how.
- Processes data in optimized chunks (vectorization).
- Avoids intermediate objects.
Why the recipe approach is faster
The recipe approach avoids the overhead of interpreted loops for data-processing work:
- Vectorization - Processes multiple values per CPU instruction.
- No interpreter overhead - Computation stays in compiled code.
- Better memory access - Sequential columnar reads are cache-friendly.
- Parallelization - Engine can split work across cores.
What is vectorization?
CPU-level vectorization
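Modern CPUs have SIMD (Single Instruction, Multiple Data) instructions that operate on multiple data elements simultaneously. For example, instead of issuing one add instruction per pair of numbers, a vectorized CPU can add several pairs of numbers with a single instruction.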
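The difference can be sketched in plain Java. This is a conceptual illustration only: `addFourLanes` is a hypothetical, hand-unrolled method that mimics what a 4-wide SIMD add does to the data, assuming a lane width of four. In practice you would write the simple scalar loop and leave it to the JIT compiler to decide whether to emit SIMD instructions.

```java
final class SimdIdea {
    /** Scalar form: one addition per loop iteration. */
    static void addScalar(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    /** Conceptual 4-lane form: each iteration handles four adjacent elements. */
    static void addFourLanes(float[] a, float[] b, float[] out) {
        int limit = out.length - (out.length % 4);
        int i = 0;
        for (; i < limit; i += 4) {
            // A SIMD unit would perform these four additions as one instruction.
            out[i] = a[i] + b[i];
            out[i + 1] = a[i + 1] + b[i + 1];
            out[i + 2] = a[i + 2] + b[i + 2];
            out[i + 3] = a[i + 3] + b[i + 3];
        }
        // Tail loop for lengths that are not a multiple of four.
        for (; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }
}
```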
How Deephaven enables vectorization
Deephaven's engine is designed to enable CPU vectorization:
- Columnar storage - Data for a column is stored contiguously in memory.
- Chunk-oriented processing - Operations work on blocks of data at once.
- Type-specific operations - Specialized code for each data type avoids type checks in inner loops.
- JIT compilation - The JVM can optimize and vectorize hot code paths.
By structuring engine operations as chunk-oriented kernels, Deephaven allows the JVM's JIT compiler to vectorize computations where possible.
The Chunk architecture
Deephaven moves data using a structure called a Chunk:
When you write a formula such as Y = X * 2:
The engine:
- Reads column X in chunks (e.g., 4096 values at a time).
- Applies the operation to each chunk (vectorized multiplication).
- Writes results to column Y in chunks.
This approach:
- Amortizes memory access costs.
- Enables vectorization.
- Reduces per-element overhead.
- Works efficiently with CPU caches.
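The read, apply, and write steps can be sketched in plain Java. This is not Deephaven's actual Chunk API: the class, the method names, and the use of plain long arrays as chunk buffers are assumptions made for illustration.

```java
final class ChunkedDoubler {
    // Chunk size taken from the example figure in the text above.
    static final int CHUNK_SIZE = 4096;

    /** Kernel: doubles the first `size` values of one chunk. */
    static void doubleChunk(long[] in, long[] out, int size) {
        for (int i = 0; i < size; i++) {
            out[i] = in[i] * 2;
        }
    }

    /** Drives the kernel over a whole column, one chunk at a time. */
    static long[] doubleColumn(long[] x) {
        long[] y = new long[x.length];
        // The two buffers are allocated once and reused for every chunk.
        long[] inChunk = new long[CHUNK_SIZE];
        long[] outChunk = new long[CHUNK_SIZE];
        for (int start = 0; start < x.length; start += CHUNK_SIZE) {
            int size = Math.min(CHUNK_SIZE, x.length - start);
            System.arraycopy(x, start, inChunk, 0, size);   // read a chunk of X
            doubleChunk(inChunk, outChunk, size);           // apply the operation
            System.arraycopy(outChunk, 0, y, start, size);  // write a chunk of Y
        }
        return y;
    }
}
```

The per-chunk setup cost (bounds computation, buffer handling) is paid once per 4096 values rather than once per value, which is the amortization the list above refers to.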
The recipe paradigm: How it works
Recipes are specifications
When you write a Deephaven query:
See timeTable for more details.
You're creating a specification (recipe) that says "Y should always equal X times 2". You're not executing a loop or directly computing values.
Lazy evaluation and dependency tracking
The engine builds a Directed Acyclic Graph (DAG) of dependencies:
When data ticks:
- New rows arrive in t1.
- Engine detects that t2 depends on t1.
- Engine automatically computes Y for the new rows.
- Updates propagate through the DAG.
Achieving the same behavior with imperative loops requires significant additional infrastructure: because a loop executes once and stops, you need to build your own subscription and recomputation logic.
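To make the idea of dependency-driven propagation concrete, here is a toy Java sketch. It is far simpler than Deephaven's update graph, and every name in it (`TinyDag`, `Node`, `derive`, `append`) is hypothetical. It uses boxed lists for brevity, which a real engine would avoid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongUnaryOperator;

final class TinyDag {
    /** One single-column "table"; children derive their values from it. */
    static final class Node {
        final List<Long> values = new ArrayList<>();
        private final List<Node> children = new ArrayList<>();
        private final LongUnaryOperator formula; // null for a source node

        Node(LongUnaryOperator formula) {
            this.formula = formula;
        }

        /** Registers a dependent node whose rows equal f(parent row). */
        Node derive(LongUnaryOperator f) {
            Node child = new Node(f);
            for (long v : values) {
                child.values.add(f.applyAsLong(v)); // initial snapshot of existing rows
            }
            children.add(child);
            return child;
        }

        /** Appends one row and pushes only that row through the dependents. */
        void append(long v) {
            values.add(v);
            for (Node c : children) {
                c.append(c.formula.applyAsLong(v));
            }
        }
    }
}
```

With `t1` as a source, `t2 = t1.derive(x -> x * 2)`, and `t3 = t2.derive(y -> y + 1)`, a single `t1.append(5)` fills in the matching rows of `t2` and `t3` without any further calls from user code.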
Update propagation example
See updateBy and CumSum for more details.
Watch this table in the UI. Every second:
- A new row arrives in source.
- XSquared is computed for the new row.
- SumX is updated for the new row.
Write the recipe once, and it runs forever.
Real-world example: Time operations
Here's a more complex example that demonstrates multiple concepts working together - time manipulation, chained operations, and Java function integration:
This example illustrates several key concepts:
- Declarative recipes - Each .update() specifies what to compute, not how to loop.
- Automatic propagation - All three tables (t1, t2, t3) update every second.
- Chained operations - Tables build on each other through the DAG.
- Real-time execution - New rows trigger automatic recomputation.
- Java integration - Using epochNanosToInstant() from DateTimeUtils.
- Type conversions - Converting between epoch nanos, Instants, and timestamps.
Every second, a new row arrives and all formulas execute automatically. The engine handles:
- Dependency tracking between t1 → t2 → t3.
- Type conversions and time arithmetic.
- Efficient execution of all operations.
Query compilation
Under the hood, Deephaven:
- Parses your query string into an Abstract Syntax Tree (AST).
- Analyzes the AST to determine dependencies and types.
- Generates optimized Java code (or uses pre-compiled classes for simple operations).
- Compiles the generated code.
- Executes the compiled code on chunks of data.
For example, "Y = X * 2" might become:
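The following Java sketch shows the general shape such code could take. It is an assumption for illustration, not Deephaven's actual generated code: the `IntChunkKernel` interface and the constant name are hypothetical, and X is assumed to be an int column.

```java
final class GeneratedFormulaSketch {
    /** Hypothetical shape of a per-chunk formula kernel. */
    interface IntChunkKernel {
        void apply(int[] xChunk, int[] yChunk, int size);
    }

    /** A tight loop over primitives, with no per-row parsing or type checks. */
    static final IntChunkKernel Y_EQUALS_X_TIMES_2 = (xChunk, yChunk, size) -> {
        for (int i = 0; i < size; i++) {
            yChunk[i] = xChunk[i] * 2;
        }
    };
}
```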
This compiled code:
- Has no interpreter overhead.
- Can be JIT-optimized by the JVM.
- Can be auto-vectorized by the JIT into SIMD instructions where the loop shape allows.
- Runs as native machine code once JIT-compiled.
Real-time processing: The killer feature
Why recipes enable real-time
The recipe paradigm makes real-time processing trivial. Compare:
Loop approach (doesn't work for real-time):
Recipe approach (automatically handles updates):
Incremental computation
The engine is smart about updates. It doesn't recompute everything - it only processes what changed:
When a new row arrives:
- Only the new row is processed.
- All formulas are evaluated for that row.
- Results are appended to output columns.
- Nothing else is recomputed.
For updates or modifications:
- Only affected rows are recomputed.
- Dependencies are tracked automatically.
- Downstream tables update accordingly.
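The append-only case can be sketched with a running cumulative sum in Java. This is a simplified illustration with hypothetical names, not engine code: it keeps the running total as state so that each update touches only the newly arrived rows.

```java
import java.util.Arrays;

final class IncrementalCumSum {
    private long[] sums = new long[16];
    private int size = 0;
    private long running = 0;
    /** Total number of rows ever processed, kept to show that old rows are not revisited. */
    int rowsProcessed = 0;

    /** Processes only the newly arrived values; earlier results stay untouched. */
    void append(long... newValues) {
        int needed = size + newValues.length;
        if (needed > sums.length) {
            sums = Arrays.copyOf(sums, Math.max(sums.length * 2, needed));
        }
        for (long v : newValues) {
            running += v;
            sums[size++] = running;
            rowsProcessed++;
        }
    }

    long get(int row) {
        return sums[row];
    }
}
```

After `append(1, 2, 3)` followed by `append(4)`, the fourth output row is 10 and `rowsProcessed` is 4: the second call did one row of work, not four.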
Example: Live aggregations
This query:
- Processes streaming trade data.
- Maintains separate rolling averages per symbol.
- Updates automatically as new data arrives.
- Would be extremely difficult to implement with loops.
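To give a sense of the state a hand-written version would have to manage, here is a minimal Java sketch of a per-symbol rolling average. The class name, the `onTrade` method, and the row-count window are assumptions for illustration, and the sketch handles appended rows only, with no modifications, removals, or downstream notification.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

final class RollingAverageBySymbol {
    private final int window;
    private final Map<String, ArrayDeque<Double>> recent = new HashMap<>();
    private final Map<String, Double> sums = new HashMap<>();

    RollingAverageBySymbol(int window) {
        this.window = window;
    }

    /** Records one trade and returns the symbol's average over its last `window` prices. */
    double onTrade(String symbol, double price) {
        ArrayDeque<Double> prices = recent.computeIfAbsent(symbol, s -> new ArrayDeque<>());
        double sum = sums.getOrDefault(symbol, 0.0) + price;
        prices.addLast(price);
        if (prices.size() > window) {
            sum -= prices.removeFirst(); // drop the price that left the window
        }
        sums.put(symbol, sum);
        return sum / prices.size();
    }
}
```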
Memory efficiency
No intermediate objects
Loop approach creates objects:
Recipe approach stays in native memory:
Column sharing and copy-on-write
Deephaven uses smart memory management:
See where for more details.
Deephaven tables can share their RowSet with other tables in the same update graph that contain the same row keys. This sharing avoids copying data unnecessarily.
Columnar vs row-oriented storage
Row-oriented (like lists of maps):
- Accessing column X requires skipping Y values.
- Poor cache locality for column operations.
- Can't vectorize efficiently.
Columnar (like Deephaven):
- Column X is contiguous in memory.
- Excellent cache locality.
- Enables vectorization.
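The two layouts can be compared with a small Java sketch. The names are hypothetical and the example only illustrates the access pattern; it makes no claim about Deephaven's internal column classes.

```java
import java.util.List;
import java.util.Map;

final class Layouts {
    /** Row-oriented: each row is a map, so reading X costs a lookup and an unboxing per row. */
    static double sumXRows(List<Map<String, Double>> rows) {
        double total = 0;
        for (Map<String, Double> row : rows) {
            total += row.get("X");
        }
        return total;
    }

    /** Columnar: X is one contiguous primitive array, read sequentially. */
    static double sumXColumn(double[] x) {
        double total = 0;
        for (double v : x) {
            total += v;
        }
        return total;
    }
}
```

Both methods return the same sum, but only the columnar loop walks memory in order over primitives, which is the shape a JIT compiler can vectorize.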
Common patterns: Technical details
Pattern: Element-wise operations
Engine execution:
- Reads X and Y columns in chunks.
- Applies vectorized operations chunk-by-chunk.
- Writes results to Z column.
- No interpreter overhead, no intermediate objects.
Pattern: Conditional operations
The ternary operator compiles to generated Java code with no interpreter overhead.
Pattern: Cross-row operations
These operations:
- Maintain state efficiently.
- Update incrementally when data ticks.
- Would require significantly more code to implement with loops, and would lose automatic real-time propagation.
- Are highly optimized in the engine.
When loops ARE appropriate
Valid use case: Data extraction
This is extraction, not transformation. The data is leaving Deephaven.
Valid use case: Control flow
You're using loops to control table creation, not to transform table data.
Invalid use case: Column transformations
Use .update() instead!
Performance best practices
1. Let the engine vectorize
✅ Good - Vectorizable:
⚠️ Careful - Complex logic in query strings may not vectorize:
2. Minimize cross-language calls
❌ Slow - Calls closure for every row:
✅ Fast - Stays in compiled code:
3. Use appropriate operations
For rolling calculations, use updateBy:
For aggregations, use dedicated methods:
4. Filter early
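The reasoning is that a formula applied after a filter runs only on the surviving rows. The Java sketch below counts formula evaluations to show the effect. It is a generic illustration with hypothetical names (`expensive`, `keep`), not Deephaven code.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class FilterEarly {
    /** Counts how many times the (pretend) expensive formula runs. */
    static final AtomicInteger calls = new AtomicInteger();

    static long expensive(int x) {
        calls.incrementAndGet();
        return (long) x * x;
    }

    static boolean keep(int x) {
        return x % 10 == 0;
    }

    /** Filter first: the formula runs only for rows that pass the filter. */
    static long sumFilterFirst(int[] xs) {
        long total = 0;
        for (int x : xs) {
            if (keep(x)) {
                total += expensive(x);
            }
        }
        return total;
    }

    /** Compute first: the formula runs for every row, then most results are discarded. */
    static long sumComputeFirst(int[] xs) {
        long[] squares = new long[xs.length];
        for (int i = 0; i < xs.length; i++) {
            squares[i] = expensive(xs[i]);
        }
        long total = 0;
        for (int i = 0; i < xs.length; i++) {
            if (keep(xs[i])) {
                total += squares[i];
            }
        }
        return total;
    }
}
```

For the inputs 0 through 99, both methods return the same total, but the filter-first version evaluates the formula 10 times and the compute-first version 100 times.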
Advanced: JVM and vectorization
JIT compilation
The Java Virtual Machine (JVM) uses Just-In-Time (JIT) compilation to optimize hot code paths. For Deephaven queries:
- Initial execution - Code is interpreted.
- Profiling - JVM identifies hot methods.
- Compilation - Hot methods are compiled to native code.
- Optimization - Compiler applies vectorization, loop unrolling, etc.
This means:
- First execution may be slower (compilation overhead).
- Subsequent executions are much faster.
- Long-running queries benefit most.
Key takeaways
- Think declaratively - Specify what to compute, not how to iterate.
- Recipes enable real-time - Declarative queries update automatically.
- Vectorization = performance - Chunked operations process multiple elements at once.
- No interpreter overhead - Computation stays in compiled code.
- Use loops for extraction, not transformation - Get data out, don't transform inside loops.
The paradigm shift:
- Old way: "For each row, multiply X by 2 and store in Y".
- Deephaven way: "Y should always equal X times 2".
This shift unlocks:
- High performance through vectorization.
- Automatic real-time updates.
- Cleaner, more maintainable code.
- Efficient memory usage.