The Fluent Interface

The Deephaven client has numerous methods that take expressions (e.g. TableHandle::select or TableHandle::view), boolean conditions (e.g. TableHandle::where), column names (e.g. TableHandle::sort), and so on. These methods generally come in two flavors: a string version and a more structured typed version. The reason both flavors exist is because the string versions are convenient and simple to use for small programs, whereas the typed versions are typically more maintainable in larger programs.

Consider these two ways of doing a TableHandle::where, using literal strings versus using the “fluent” syntax.

auto table = tableManager.fetchTable("trades");
auto filtered1 = table.where("ImportDate == `2017-11-01` && Ticker == `AAPL`");

auto (importDate, ticker) = table.getColumns<StrCol, StrCol>("ImportDate", "Ticker");
var filtered2 = table.where(importDate == "2017-11-01" && ticker == "AAPL");

The advantage of the filtered1 query is that it is simple and compact. On the other hand, the advantage of the filtered2 query is that it is able to do much more error checking at compile time. Consider the following query syntax errors:

// typo in Ticker
auto filtered1 = table.where(
    "ImportDate == `2017-11-01` && Thicker == `AAPL`");
// nonsensical string multiplication
auto filtered1 = table.where(
    "ImportDate == `2017-11-01` && Ticker * 12 == `AAPL`");
// extra closing parenthesis
auto filtered1 = table.Where(
    "(ImportDate == `2017-11-01`) && (Ticker == `AAPL`))");

Because the code is using the literal string syntax, these errors would not be caught until the server attempted to parse and execute them. However, none of the corresponding fluent versions will even compile!

auto (importDate, ticker) =
    table.getColumns<StrCol, StrCol>("ImportDate", "Ticker");
// typo in Ticker
auto filtered2 = table.where(
    importDate == "2017-11-01" && thicker == "AAPL");
// nonsensical string multiplication
auto filtered2 = table.where(
    importDate == "2017-11-01" && ticker * 12 == "AAPL");
// extra closing parenthesis
auto filtered2 = table.where(
    (importDate == "2017-11-01") && (ticker * 12) == "AAPL"));

How the fluent syntax works

The fluent syntax uses certain C++ types along with operator overloading to build up an abstract syntax tree of your expression on the client side. Then, library methods pass that tree to the server to be executed. Because the fluent syntax is built on top of C++ syntax, it needs to be legal according to the rules of C++. One advantage of following C# syntax is that many potential errors are caught at compile time, or even sooner, e.g. by the programmer’s IDE.

Consider the following code fragment:

auto (a, b, c, d) =
    table.getColumns<NumCol, NumCol, NumCol, NumCol>("A", "B", "C", "D");
auto filtered = table.where(a + b + c <= d);

The transformation of the expression into an abstract syntax tree is done automatically by the compiler. Basically, infix operators like + and <= are transformed into method calls, and certain implicit type conversions are performed. Below is a sketch of the equivalent code after the infix operators are transformed to method calls:

NumericExpression temp1 = operator+(a, b);
NumericExpression temp2 = operator+(temp1, c);
BooleanExpression temp3 = operator<=(temp2, d);

Building expressions with the fluent syntax

The fluent syntax is designed to capture the kinds of “natural” expressions one would write in a programming language. Rather than formally describing the syntax here, we instead provide an informal description.

There are basically four kinds of expressions in the system: NumericExpression, StringExpression, DateTimeExpression, and BooleanExpression. These model the four types of expressions we want to represent in the system.

In typical usage, client programs do not explicitly declare variables of these types. Instead, these objects are created as anonymous temporaries (as the intermediate results of overloaded operators) which are then consumed by other operators or by Deephaven methods like TableHandle::select or TableHandle::where.

Local vs Remote Evaluation

Because the fluent syntax interoperates with ordinary C++ expression syntax, it might not be readily apparent which parts of a complicated C++ expression are executed locally on the client machine, and which parts are participating in an expression tree to be evaluated on the server. Generally, the rules are:

Evaluated locally

  • Numeric literals

  • Variables

  • Method calls

  • Unary and binary operators involving the above

Evaluated at the server

  • Column terminals

  • Local values implicitly converted into Fluent values

  • Unary operators, binary operators, and certain special methods involving Fluent expressions

Note that both of these definitions are intentionally recursive in nature. Also note that when one of the arguments to a binary operator is a Fluent expression, the other argument will be implicitly converted to a Fluent expression.

Consider the following examples:

auto table = tableManager.fetchTable("trades");
auto (importDate, ticker, close) =
    table.GetColumns<StrCol, StrCol, NumCol>("ImportDate", "Ticker", "Close");
auto t0 = table.where(importDate == "2017-11-01" && ticker == "AAPL");

var x = 1;

int myFunc(int arg)
{
    return arg + 10;
}

// Equivalent Deephaven Code Studio expression is "Result = 100 + Close"
var t1a = t0.select((100 + close).as("Result"));
// Equivalent Deephaven Code Studio expression is "Result = 300 + Close"
var t2a = t0.select((100 + 200 + close).as("Result"));
// Equivalent Deephaven Code Studio expression is "Result = 101 + Close"
var t3a = t0.select((100 + x + close).as("Result"));
// Equivalent Deephaven Code Studio expression is "Result = 111 + Close"
var t4a = t0.select((100 + myFunc(x) + close).as("Result"));

A binary operator with at least one NumericExpression yields a NumericExpression. Because binary operators like left-to-right associativity, mathematically equivalent but differently-ordered expressions get sent to the server as a different tree:

// Equivalent Deephaven Code Studio expression is "Result = Close + 100"
auto t1b = t0.select((close + 100).as("Result"));
// Equivalent Deephaven Code Studio expression is "Result = (Close + 100) + 200"
auto t2b = t0.select((close + 100 + 200).as("Result"));
// Equivalent Deephaven Code Studio expression is "Result = (Close + 100) + 1"
auto t3b = t0.select((close + 100 + x).as("Result"));
// Equivalent Deephaven Code Studio expression is "Result = (Close + 100) + 11"
auto t4b = t0.Select((close + 100 + myFunc(x)).as("Result"));

Note that the library is does not collapse (Close + 100) + 11 into the mathematically-equivalent (Close + 111). This difference is largely of academic interest, because the final result is the same due to the commutative property of addition. It would probably matter only in cases of numeric over/underflow.

Building Fluent Expressions

In more advanced use cases, users may want to write methods that derive fluent expressions from other fluent expressions. Some programming languages call such methods “combinators”. In this simple example we write an add5 function that yields the fluent expression e + 5 for whatever expression e is passed into it:

NumericExpression add5(NumericExpression e)
{
    return e + 5;
}

// Equivalent Deephaven Code Studio expression is "Result = (Close * Volume) + 5"
auto t1 = t0.select(add5(close * volume).as("Result"));

NumericExpression

NumericExpression objects are either Numeric terminals or the result of an operator applied to some combination of Numeric terminals and NumericExpression objects.

Numeric terminals are:

  • C# numeric literals of various primitive types such as 3 and -8.2

  • Client-side numeric variables such as int x` or double x

  • Client-side numeric expressions such as x * 2 + 5

  • Numeric columns, which are typically obtained from a call like getCols.

The operators are the the usual unary arithmetic operators +, -, ~, and the usual binary operators +, -, *, /, %, &, |, ^.

In this example, the table t1 contains two columns: the Ticker column and a Result columns which holds the product Price * Volume + 12. Notice that in a TableHandle::select statement, when we are creating a new column that is the result of a calculation, we need to give that new column a name (using the Expression::as method). In general, the fluent syntax expr.as("X") corresponds to Deephaven Code Studio expression X = expr.

auto table = tableManager.fetchTable("trades");
auto (importDate, ticker, close, volume) =
    table.getColumns<StrCol, StrCol, NumCol, NumCol>("ImportDate", "Ticker",
    "Close", "Volume");
auto t0 = table.where(importDate == "2017-11-01" && ticker == "AAPL");
auto t1 = t0.select(ticker, (close * volume).As("Result"));
// string literal equivalent
auto t1_literal = t0.Select("Ticker", "Result = Close * Volume");

StringExpression

StringExpression objects are either String terminals or the result of the + operator applied to some combination of String terminals and StringExpression objects.

String terminals are:

  • C++ numeric literals like "hello".

  • Client-side string variables such as string x.

  • Client-side string expressions such as x + "QQQ"

  • String columns, which are typically obtained from a call like getCols.

Example:

auto t2 = t0.select(ticker, (ticker + "XYZ").as("Result"));
auto t2_literal = t0.select("Ticker", "Result = Ticker + `XYZ`");

StringExpression provides four additional methods that work on StringExpression objects. These operations have the semantics described in the Deephaven documentation, and they yield BooleanExpression (described in the BooleanExpression subsection). For example:

var t1 = t0.where(ticker.startsWith("AA"));
var t1_literal = t0.where("ticker.startsWith(`AA`)");
var t2 = t0.where(ticker.matches(".*P.*"));
var t2_literal = t0.where("ticker.matches(`.*P.*`)");

DateTimeExpression

DateTime terminals are:

  • C++ string literals, variables or string expressions in Deephaven StringExpression format, e.g. "2020-03-01T09:45:00.123456 NY".

  • Client-side variables/expressions of type StringExpression

StringExpression is the standard Deephaven Date/Time type, representing nanoseconds since January 1, 1970 UTC.

BooleanExpression

BooleanExpression objets can be used to represent expressions involving boolean-valued columns (e.g. !boolCol1 || boolCol2) but more commonly, they are used to represent the result of relational operators applied to other expression types. BooleanExpression objects support the unary !, as well as the binary operators && and || and their cousins & and |.

Note that the shortcutting operators && and || do not exhibit their usual shortcutting behavior when used with Deephaven fluent expressions. Because the value of either side of the expression isn’t knowable until it is evaluated at the server, it is not possible (nor even particularly meaningful) to do shortcutting on the client. As a consequence of this, && is a synonym for the (non-shortcutting) boolean & operator; likewise || is a synonym for the non-shortcutting boolean | operator.

For example, in t1 = t0.where(col0 < 5 && col1 > 12) we would send the whole expression to the server for evaluation. There would be no attempt to first determine the “truth” of col0 < 5 (a concept that doesn’t even make much sense anyway in the context of a full column of data) in order to try shortcut the evaluation of col1 > 12.

This example creates two boolean-valued columns and does simplistic filtering on them:

// TODO(kosak): This example doesn't work yet. Need BoolCol and boolean literals
auto empty = manager.emptyTable(5, {}, {});
auto t = empty.update( ((BooleanExpression)true).as("A"),
    ((BooleanExpression)false).as("B"));
// Deephaven Code Studio equivalent
auto t_literal = empty.Update("A = true", "B = false");
auto (a, b) = t.GetColumns<BoolCol, BoolCol>("A", "B");
auto t2 = t.where(a);
auto t3 = t.where(a && b);

More commonly, BooleanExpression are created as the result of relational operators on other expressions. For example we might say

std::vector<int> aValues{10, 20, 30};
std::vector<std::string> sValues{"x", "y", "z"};
TableMaker tm;
tm.addColumn("A", aValues);
tm.addColumn("S", sValues);
auto temp = tm.MakeTable(manager);
auto a = temp.getNumCol("A");
auto result = temp.where(a > 15);

Here a > 15 applies the > operator to two NumericExpression objects yielding a BooleanExpression suitable for passing to the TableHandle method and being evaluated on the server. The library supports the usual relational operators (<, <=, ==, >=, >, !=) on NumericExpression, StringExpression, and DateTimeExpression; meanwhile BooleanExpression itself supports only == and !=.

Column Terminals

A Column Terminal is used to represent a database column symbolically, so it can be used in a fluent invocation such as t.where(a > 5). To do this, the program needs to know the name of the database column (in this example, “A”) as well as its type (in this example, NumCol).

auto a = temp.getCol<NumCol>("A");

The Column Terminal types are:

  • NumCol

  • StrCol

  • DateTimeCol

  • BoolCol

Note that the single fluent type NumCol stands in for all the numeric types (short, int, double, and so on). This does not mean that the server represents all these types as the same thing, or that there is some kind of loss of precision involved. Rather it is simply a reflection of the fact that the numeric types generally interoperate with each other and support all the same operators; from the point of view of the fluent layer, when building an abstract syntax tree for an expression like x + y for evaluation at the server, it’s not necessary to know the exact types of x and y at this point, other than knowing that they behave like numbers.

The syntax for creating a single Column Terminal is

auto col = table.getXXX(name);

where getXXX is one of ` getNumCol, getStrCol, getDateTimeCol, or getBoolCol, and name is the name of the column.

To conveniently bind more than one column at a time, the program can use getCols.

For example this statement binds three columns at once:

auto (importDate, ticker, close) =
  table.getCols<StrCol, StrCol, NumCol>("ImportDate", "Ticker", "Close");

SelectColumns

A SelectColumn is an object suitable to be passed to a select, update, view, or updateView method. It either needs to either refer to an already-existing column, or it is an expression bound to a column name, which will cause a new column to be created. Examples:

// Assume "close" is already a column, so we can use it directly
auto t1 = t0.select(close);
// "100 + close" is an expression; to turn it into a SelectColumn
// we need to bind it to a new column name with the "as" method.
auto t2 = t0.select((100 + close).as("Result"));
// The above would be expressed in the Deephaven Code Studio as:
var t2_literal = t0.select("Result = 100 + Close")