Recipes, not loops!

If you're coming from pandas, traditional Python, or other data processing tools, you're likely accustomed to writing loops to transform data. Stop! Deephaven works fundamentally differently, and understanding this difference early will save you countless hours of frustration and help you write better, faster code.

The fundamental paradigm shift: recipes, not loops

How you might be thinking

In pandas or traditional Python, you tell the computer exactly how to process each row:
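A minimal sketch of that row-by-row style, using a small hypothetical list `xs` in place of real data:

```python
# Hypothetical input data; any list of numbers would do.
xs = [1, 2, 3, 4, 5]

# Imperative style: visit each element and build the result by hand.
x_squared = []
for x in xs:
    x_squared.append(x * x)

print(x_squared)  # [1, 4, 9, 16, 25]
```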

This loop processes one element at a time and builds a new list. You're giving step-by-step instructions for how to process the data.

How to think in Deephaven

In Deephaven, you specify what you want, not how to compute it. You write a recipe that describes the transformation, and the Deephaven engine figures out the optimal way to execute it:

Notice:

  • No loops: you describe the relationship (XSquared = X * X) and the engine applies it to every row automatically.
  • You specify what to compute, not how to iterate.
  • This works the same way for arbitrarily complex operations.

Why this matters

For static data

Even for static, one-time calculations, the recipe approach has advantages:

  1. Clearer code - Declarative recipes are easier to read than imperative loops.
  2. Faster execution - The engine can evaluate a recipe over whole blocks of rows at once instead of running one Python-level iteration per row.
  3. Less error-prone - No manual loop management or index tracking.

For real-time data

This is the critical difference. Loops execute once and stop. Recipes update automatically.

Watch what happens:

  • The table keeps updating - new rows appear every second.
  • Your recipe runs automatically on every new row.
  • You wrote it once, but it executes forever.

With a loop approach:
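A rough illustration in plain Python, with a hypothetical list `xs` standing in for a live data feed:

```python
# Hypothetical snapshot of the data at the moment the loop runs.
xs = [0, 1, 2]

# The loop runs once, over whatever exists right now.
x_squared = []
for x in xs:
    x_squared.append(x * x)

# A new value arrives later...
xs.append(3)

# ...but the result is stale until you re-run the loop yourself.
print(len(xs), len(x_squared))  # 4 3
```

Nothing connects `x_squared` to `xs` after the loop finishes, so every new value needs another manual pass.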

The recipe paradigm explained

Recipes are specifications, not instructions

When you write:

You're not saying:

  • "Start at row 0"
  • "Read X from row 0"
  • "Multiply by 2"
  • "Store in Y at row 0"
  • "Go to row 1"
  • "Repeat..."

You're saying:

  • "For every row, Y should equal X times 2"

The engine decides:

  • How to chunk the data for optimal performance.
  • Whether to parallelize the operation.
  • How to handle updates efficiently.
  • Which rows need recomputation when data changes.

The engine is smart about updates

  1. Tracks dependencies - It knows that Y depends on X.
  2. Computes incrementally - Only new or changed rows are processed.
  3. Updates automatically - Results update without you doing anything.

Getting the same behavior from loops requires significant additional infrastructure: a loop executes once and stops, so you would need to build your own subscription and recomputation logic.

Bridging pandas and Deephaven

Many users need to work with both pandas and Deephaven. Here's how to think about the transition:

Key principle: Once you're in Deephaven, think in recipes. Save loops for when you convert back to pandas.

When loops ARE appropriate

There are valid uses for loops in Deephaven, and one common misuse:

✅ Extracting data from Deephaven

See the table iteration guide for details.

✅ Control flow in your Python code

❌ Transforming table columns

Common patterns: Wrong vs Right

Pattern: Create a column from another column

Wrong (loop approach):
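A plain-Python sketch of the habit to avoid, using a hypothetical list of row dictionaries:

```python
# Hypothetical rows; each dict stands in for one table row.
rows = [{"X": 1}, {"X": 2}, {"X": 3}]

# Row-by-row mutation: works once, on this snapshot only.
for row in rows:
    row["Y"] = row["X"] + 10
```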

Right (recipe approach):

Pattern: Conditional logic

Wrong (loop approach):
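A plain-Python sketch of the loop version, with hypothetical values and an arbitrary threshold of 5:

```python
# Hypothetical input values.
xs = [3, 8, 5, 10]

# Branch inside the loop, one element at a time.
labels = []
for x in xs:
    if x > 5:
        labels.append("High")
    else:
        labels.append("Low")
```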

Right (recipe with ternary operator):

Pattern: Running calculations

Wrong (loop with accumulator):
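A plain-Python sketch of a running total kept by hand, using hypothetical values:

```python
# Hypothetical input values.
xs = [1, 2, 3, 4, 5]

# Manual accumulator: state lives in a Python variable and is updated per element.
running_total = []
total = 0
for x in xs:
    total += x
    running_total.append(total)

print(running_total)  # [1, 3, 6, 10, 15]
```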

Right (use update_by):

Quick reference: Migration guide

pandas/Python pattern            Deephaven recipe
df.apply(func)                   .update("Y = func(X)")
for row in df.iterrows():        ❌ Don't! Use .update()
df['Y'] = df['X'] * 2            .update("Y = X * 2")
df[df['X'] > 5]                  .where("X > 5")
df.rolling(window=10).mean()     .update_by(rolling_avg_tick(...))
df.groupby('G').sum()            .sum_by("G")

Next steps