Skip to main content
Version: Python

How to programmatically generate query strings with Python

The Deephaven Query Language allows users to write very powerful queries to filter and modify tables of data. As an example, consider this case that uses a formula to compute a new column and another formula to filter the resulting table.

from deephaven import new_table
from deephaven.column import int_col

source = new_table([int_col("Value", [0, 1, 2, 3, 4, 5, 6])])

result = source.update("X = sqrt(Value) + i").where("2 < X && X < 8")

At their heart, Deephaven query strings are just Python strings. As such, all of the power of Python can be used to generate query strings. This can be convenient when working with complex queries. Let's work though a few examples that are simplified by using Python to generate query strings.

note

This guide assumes you are familiar with the use of strings, f-strings, loops, and list comprehension in Python. If not, please refer to the Python documentation for more information.

Many columns

In practice, queries may have a large number of inputs, making it inconvenient to type in each column name. Other times, the input column names are determined by user inputs and are not known when the query is written. Both of these situations can be addressed by using a list of column names to generate queries.

In the following example, an f-string and str.join are used to create a query string to sum up all of the columns and then take the square root.

from deephaven import new_table
from deephaven.column import int_col

cols = ["A", "B", "C", "D"]

source = new_table([int_col(c, [0, 1, 2, 3, 4, 5, 6]) for c in cols])

result = source.update(f"X = sqrt(sum({','.join(cols)}))")

If the list of columns changes, the query string programatically adapts:

from deephaven import new_table
from deephaven.column import int_col

cols = ["A", "B", "C", "D", "E"]

source = new_table([int_col(c, [0, 1, 2, 3, 4, 5, 6]) for c in cols])

result = source.update(f"X = sqrt(sum({','.join(cols)}))")

Repeated logic

Some queries repeat the same logic -- with minor tweaks. For example, a query may add columns containing data from 1, 5, and 10 minutes ago. Generated query strings can also help simplify these situations.

In the following example, an f-string is used to create columns of data from 1, 5, and 10 rows before.

from deephaven import empty_table

source = empty_table(100).update("X = ii")

offsets = [1, 5, 10]

result = source

for offset in offsets:
result = result.update(f"X{offset} = X_[ii-{offset}]")

This can be simplified further by using a list comprehension.

from deephaven import empty_table

source = empty_table(100).update("X = ii")

offsets = [1, 5, 10]
result = source.update([f"X{offset} = X_[ii-{offset}]" for offset in offsets])

Be creative!

Programatically generating query strings works for all Deephaven operations, not just update. For example, this case uses multiple programatically generated query strings while performing a join.

from deephaven import empty_table

source = empty_table(100).update(["X = ii", "Y = X", "Z = sqrt(X)"])

offsets = [1, 5, 10]

result = source

for offset in offsets:
result = result.natural_join(
source.update(f"XOffset = X+{offset}"),
on="X=XOffset",
joins=[f"Y{offset}=Y", f"Z{offset}=Z"],
)