Intro Guide to Working With Deephaven’s Query Language

Programmers and developers are always striving to make clean and efficient code. Creating the best version of a program often leads to a lot of frustration because doing this requires a deep knowledge of various functions to get to the programer’s end goal. However, this is not the case when coding within the Deephaven database. This is because the functions in the Deephaven Query Language, or DQL for short, are tailor-made for manipulating data in tables and plotting that data. Users can use one premade function to achieve a result that in most languages would take multiple functions to accomplish. DQL is always accessible to Deephaven users regardless of if they are coding in Python or Groovy. This article highlights some basic functions and illustrates how staying within DQL can be more efficient and produce better results than trying to code in straight Python.

The Deephaven Query Language

When I first started coding in the Deephaven database I used mainly the Python language in my programs because I didn't yet realize how much DQL could help me in my coding endeavors. After learning many of Deephaven's distinctive functions, my eyes were open to the new coding possibilities. Before that, my programs were filled with many unnecessary loops and I was pulling data out of the query scope that did not need to be pulled out. Once I switched over to DQL, my code was easier for readers to follow, looked cleaner overall, and ran in less time.

Old Python Code vs Improved DQL Code

To see the direct comparison between the Python language and DQL, I made the following two programs. Both of these programs create a new table with row values equal to the index of that row. Then a cumulative sum for a column is done and that value is saved as a new column. However, the first one uses the Python language and the second one uses DQL.

Python

from deephaven.TableTools import emptyTable

def pythonCumulativeSum(array):
    sum = 0
    for i in range(len(array)):
        sum += array[i]
    return sum

empty = emptyTable(10). \
    update("Column = i")
pythonArray = empty.getColumn("Column").getDirect()
pythonSum = pythonCumulativeSum(pythonArray)
empty = empty.update("Sum = pythonSum")

DQL

from deephaven.TableTools import emptyTable

def queryCumulativeSum(array):
    sum = 0
    for i in range(array.size()):
        sum += array.get(i)
    return sum

empty = emptyTable(10). \
    update("Column = i"). \
    by(). \
    update("Sum = (int)queryCumulativeSum.call(new Object[]{Column})"). \
    ungroup()

img

Besides being more elegant and efficient, the DQL program runs faster than the Python program. The Python-only query takes longer because the data in a table cannot be manipulated while the data is in the table - a column must first be taken out of the query scope, manipulated with the Python language, and then returned to the query scope. This process can be very time consuming and leads to extra, unnecessary lines of code. In contrast, using DQL means that code never has to leave the query scope. Whatever a user’s language preference is, all programmers strive for reduced runtime and simplicity. Using the DQL accomplishes this.

Useful DQL Functions

After I learned most of the beginner level functions within DQL, I realized that there are some fundamental functions that all Deephaven users should know. You'll notice these in my program above. I hope these functions help you as much as they helped me!

emptyTable()

The foundation of any Deephaven program is creating tables. I found that this can be done with the emptyTable(numRows) function:

from deephaven.TableTools import emptyTable
    empty = emptyTable(10). \
    update("Column = i")

Once created, users can update the table and add columns and other useful data.

img

call()

The call() function, which enables users to call functions that they created. In order to properly call a function using DQL, the inputs for the code should look like: call(new Object[]{x}), where x is the variable being imputed into the user-created function. This function call will always return a data type of Object. In order to get a return with a different data type (for example, double), the full function call would look like:

Column = (double)arraySize.call(new Object[]{Column})

This also works with int and other data types.

size()

The size() function works with two different input types. The first input type is a table, which returns the number of rows in the table. The second input type is an array within a table, which returns the number of elements within that array. This function is very similar to the Python len() function, but the size() function works within the query scope while len() does not.

from deephaven.TableTools import emptyTable

def arraySize(array):
    return array.size()

empty = emptyTable(10). \
    update("Column = i", "Size = empty.size()"). \
    by(). \
    update("ArraySize = arraySize.call(new Object[]{Column})")

img

get()

The get() function works with an array within a table or a column in a table. The input for the get() function is the index of the variable that the user wants to retrieve from the array. The function returns the element at that location in the array. The Python equivalent of this function is the [x] function.

from deephaven.TableTools import emptyTable

def cumulativeSum(array):
    sum = 0
    for i in range(array.size()):
        sum += array.get(i)
   return sum

empty = emptyTable(10). \
    update("Column = i"). \
    by().
    update("Sum = cumulativeSum.call(new Object[]{Column})")

img

Ternary If Statement

One important part of the DQL that I learned was the if-then statement format for the DQL. This format is x ? y : z. In words, this means "if x then y else z". So if x is true, then the function returns y and if x is false then the function returns z. In the example below, the if statement checks to see if the row is even or odd. If the row is even, then the function returns “Even”, and if the row is odd, then the function returns “Odd”.

from deephaven.TableTools import emptyTable

empty = emptyTable(10). \
    update("Column = i", "EvenOrOdd = (i % 2 == 0) ? `Even` : `Odd`")

img