How to use regular expressions in Deephaven
This guide shows you how to filter data based on string values. String columns in Deephaven hold Java String objects, so the methods of that class are available in query strings, including the ability to match against Java regular expressions (regex). A full treatment of regex is beyond the scope of this guide.
Filter with built-in methods
In this example, startsWith is used with where to limit the table to only the rows where X starts with A.
from deephaven import new_table
from deephaven.column import string_col, int_col
source = new_table([
    string_col("X", ["AA", "A x", "BaA", "5A", "a3B", "A"]),
    int_col("Y", [3, 2, 1, 5, 6, 4])
])
result = source.where(filters=["X.startsWith(`A`)"])
In this example, startsWith is used with || to limit the table to only the rows where X starts with A or X starts with a.
result = source.where(filters=["X.startsWith(`A`) || X.startsWith(`a`)"])
startsWith is just one of the methods available. See the Javadocs for more built-in methods.
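As a rough way to preview which rows a few of these methods would keep, the sketch below mirrors startsWith, endsWith, and contains on the same X values. It is plain Python, not Deephaven query code: the Python string methods are stand-ins for the Java String methods, and the corresponding filter strings in the comments are shown only for orientation.

```python
# X values from the source table above
x_values = ["AA", "A x", "BaA", "5A", "a3B", "A"]

# Plain-Python stand-ins for Java String methods (all case-sensitive).
# The Deephaven filter string each one corresponds to is given in the comment.
starts_with_a = [x for x in x_values if x.startswith("A")]  # X.startsWith(`A`)
ends_with_a = [x for x in x_values if x.endswith("A")]      # X.endsWith(`A`)
contains_3 = [x for x in x_values if "3" in x]              # X.contains(`3`)

print(starts_with_a)
print(ends_with_a)
print(contains_3)
```

Checking the expected rows this way can help when a filter returns more or fewer rows than anticipated, since the comparison is case-sensitive in both languages.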
Filter with regex
If no built-in method finds the strings you are looking for, you can filter with regex.
In this example, a RegexFilter is used to limit the table to the rows where X has exactly three characters.
import jpy
# Look up the Java RegexFilter class through jpy
dh_regex_filter = jpy.get_type("io.deephaven.engine.table.impl.select.RegexFilter")
result = source.where([dh_regex_filter("X", "...")])
In this example, a RegexFilter is used to limit the table to the rows where X contains a digit, with any number (zero or more) of characters before and after it.
result = source.where([dh_regex_filter("X", ".*[0-9].*")])
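To see which X values these patterns select, the minimal sketch below applies them with Python's re module. It assumes the filter must match the entire value, which is what the leading and trailing .* in the digit pattern suggests, and it uses Python's regex engine rather than Java's; for patterns this simple the two should behave the same. The helper full_matches and the pattern [Aa].* are hypothetical additions for illustration.

```python
import re

# X values from the source table above
x_values = ["AA", "A x", "BaA", "5A", "a3B", "A"]

def full_matches(pattern, values):
    """Return the values that the pattern matches in their entirety."""
    compiled = re.compile(pattern)
    return [v for v in values if compiled.fullmatch(v)]

# Exactly three characters of any kind (a space counts as a character)
three_chars = full_matches("...", x_values)

# At least one digit, with anything before and after it
has_digit = full_matches(".*[0-9].*", x_values)

# Hypothetical regex counterpart of the startsWith `A` or `a` filter
starts_a_any_case = full_matches("[Aa].*", x_values)

print(three_chars)
print(has_digit)
print(starts_a_any_case)
```

Testing a pattern against a handful of known values like this is a quick way to catch a missing .* or an unintended match before applying it to a table.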