Programmatically generate query strings with Python
The Deephaven Query Language allows users to write very powerful queries to filter and modify tables of data. Consider the following query, which uses a formula to add a new column to a table and a filter to filter the resulting table.
Deephaven query strings are passed into table operations as Python strings. As such, all of the power of Python can be used to generate query strings. This can be convenient when working with complex queries. Let's work though a few examples that are simplified by using Python to generate query strings.
Note
This guide assumes you are familiar with the use of strings, f-strings, loops, and list comprehension in Python. If not, please refer to the Python documentation for more information.
Many columns
In practice, queries may have a large number of inputs, making it inconvenient to type in each column name. Other times, the input column names are determined by user inputs and are not known when the query is written. Both of these situations can be addressed by using a list of column names to generate queries.
In the following example, an f-string and str.join are used to create a query string to sum up all of the columns and then take the square root.
If the list of columns changes, the query string programmatically adapts:
Repeated logic
Some queries repeat the same logic -- with minor tweaks. For example, a query may add columns containing data from 1, 5, and 10 minutes ago. Generated query strings can also help simplify these situations.
In the following example, an f-string is used to create columns of data from 1, 5, and 10 rows before.
This can be simplified further by using a list comprehension.
Data analysis, particularly in finance, often involves binning data into time buckets for analysis. These queries rarely use a single time bucket to analyze the data - they often use several or more. Python's f-strings make queries shorter and more readable. Consider first, a query that places data into 9 different temporal buckets without f-strings:
Not only was that query tedious to write, but the formatting is long and repetitive, and doesn't take advantage of Python's power. Consider the following query, which does the same thing, but with f-strings and list comprehension.
This query is shorter, faster to write, and easier to read for a Python programmer. It also makes future updates easier to write, and changes to the bin_sizes list are automatically reflected in the result table.
Be creative!
Programatically generating query strings works for all Deephaven operations, not just update. For example, this case uses multiple programatically generated query strings while performing a join.
Related documentation
- Built-in query language constants
- Built-in query language variables
- Built-in query language functions
- Create a new table
- Create an empty table
- Formulas in query strings
- Filters in query strings
- Operators in query strings
- Python variables in query strings
- Python functions in query strings
- Python classes in query strings
- Think like a Deephaven ninja