How to use formulas
This guide will show you how to work with formulas in your query strings.
Formulas can be used in two contexts:
- To filter data, such as with
where
,where_in
, etc. In this case, they must return booleans (e.g.,"x%3 == 1"
). - To assign values, such as with update, view, etc. In this case, a result is set equal to a column name (e.g.,
"A = x%3"
).
Regardless of the formula usage, there are common language features used to construct a formula:
- operators (
+
,-
,*
,/
,%
,_
,.
,[]
,()
) - functions
- objects
- variables
- special variables
Filter data
You can use formulas to filter your data to show only what you want using where
.
Boolean formulas in filter methods are also known as conditional filters. These formulas are designed to return a boolean value (true
or false
) to narrow data sets to only desired values.
In the following example, operators are used to limit values in result tables.
from deephaven import new_table
from deephaven.column import string_col, int_col
source = new_table([
string_col("X", ["A", "B", "C", "D", "E", "F", "G"]),
int_col("Y", [1, 2, 3, 4, 5, 6, 7]),
int_col("Z", [2, 3, 1, 2, 3, 1, 2])
])
result = source.where(filters=["Y > 5"])
- source
- result
Assign data
You can use formulas to add and assign new data. When a formula returns a value, it can be used with selection methods to create columns.
In the following example, operators are used to create new columns of values based on prior data.
from deephaven import new_table
from deephaven.column import string_col, int_col
source = new_table([
string_col("X", ["A", "B", "C", "D", "E", "F", "G"]),
int_col("Y", [1, 2, 3, 4, 5, 6, 7]),
int_col("Z", [2, 3, 1, 2, 3, 1, 2])
])
result = source.update(formulas=["Sum = Y + Z"])
- source
- result
More complex formulas
Formulas can be used with the full power of the query language, such as with variables, functions, and objects.
In this example, functions and objects are used to both filter and assign values.
from deephaven import new_table
from deephaven.column import string_col, int_col
def f(a, b):
return a * b
class MyObj:
def __init__(self, a, b, c):
self.a = a
self.b = b
self.c = c
def compute(self, value1):
return self.a + value1
obj = MyObj(1, 2, 3)
source = new_table([
string_col("X", ["A", "B", "C", "D", "E", "F", "G"]),
int_col("Y", [1, 2, 3, 4, 5, 6, 7]),
int_col("Z", [2, 3, 1, 2, 3, 1, 2])
])
result = source.where(filters=["(int)f(Y, Z) > 9"]).update(formulas=["A = obj.a", "B = obj.compute(Y)"])
- source
- result