Skip to main content
Version: Python

Create and populate static tables with new_table and empty_table

Deephaven is often used to read table data from Parquet, Kafka, or other external sources, but it can also generate static or ticking tables from scratch. There are two functions for creating static tables: empty_table and new_table. This guide will show you how to use these functions to create static tables and columns, and how to add data to those tables.

empty_table

The empty_table function takes a single argument - an int representing the number of rows in the new table. The resulting table has no columns and the specified number of rows. In the following example, we create a table with 10 rows and no columns:

from deephaven import empty_table

table = empty_table(10)

Calling empty_table on its own generates a table with no data, but it can easily be populated with columns and data using update or another selection method. This can be done in the same line that creates the table, or at any time afterward.

In the following example, we create a table with 10 rows and a single column X with values 0 through 9 by using the special variable i to represent the row index. Then, the table is updated again to add a column Y with values equal to X squared:

from deephaven import empty_table

table = empty_table(10).update("X = i")

table = table.update("Y = X * X")

Deephaven's update and other selection methods can take user-defined functions as arguments and harness the power of the Deephaven Query Language to handle complex data transformations. For more information, see the select, view, and update guide.

DQL supports logical operators, Java functions, user-defined functions, and more. In the following example, we'll create a table with 100 rows, then create four columns:

from deephaven import empty_table

source = empty_table(100).update(
formulas=[
# mathematical operations are supported
"X = 0.1 * i",
# many built-in functions are provided to cover common operations
"SinX = sin(X)",
# in-line logical operations and comparison operators are supported
"PositiveSinX = SinX > 0 ? true : false",
# and they can all be combined
"TransformedX = PositiveSinX == true ? 5 * X : 0",
]
)

DQL is a powerful, versatile tool for table transformations. For more information, see the formula documentation.

new_table

Deephaven's new_table function allows you to create a new table and manually populate it with data. new_table accepts a list of Deephaven column objects. The following query creates a new table with a string column and an int column.

from deephaven import new_table

from deephaven.column import string_col, int_col

result = new_table(
[
string_col(
"NameOfStringCol", ["Data String 1", "Data String 2", "Data String 3"]
),
int_col("NameOfIntCol", [4, 5, 6]),
]
)

Here, we create an example with two integer columns. Then, we update the table to add a new column X via a formula that uses a variable, a user-defined function, an auto-imported Java function, and various operators:

from deephaven import new_table
from deephaven.column import int_col

var = 3


def f(a, b) -> int:
return a + b


source = new_table([int_col("A", [1, 2, 3, 4, 5]), int_col("B", [10, 20, 30, 40, 50])])

result = source.update(formulas=["X = A + 3 * sqrt(B) + var + f(A, B)"])

Array columns

new_table can also be used to create columns from Python Sequences. This cannot be done with methods like int_col and string_col. In these cases, you must use the InputColumn class directly along with the dtypes package.

The following example creates a new table with a single integer array column.

from deephaven.column import InputColumn
from deephaven import new_table
from deephaven import dtypes
import numpy as np

int_array = dtypes.array(dtypes.int32, np.array([1, 2, 3], dtype=np.int32))
int_array_col = InputColumn("IntArrayCol", dtypes.int32_array, input_data=[int_array])

source = new_table([int_array_col])

Create new columns in a table

Here, we will go into detail on creating new columns in your tables.

Selection methods -- such as select, view, update, update_view, and lazy_update -- and formulas are used to create new columns:

  • The selection method determines which columns will be in the output table and how the values are computed.
  • The formulas are the recipes for computing the cell values.

In the following examples, we use a table of student test results. Using update, we create a new Total column containing the sum of each student's math, science, and art scores. Notice that update includes the columns from the source table in the output table.

from deephaven import new_table
from deephaven.column import string_col, int_col

scores = new_table(
[
string_col("Name", ["James", "Lauren", "Zoey"]),
int_col("Math", [95, 72, 100]),
int_col("Science", [100, 78, 98]),
int_col("Art", [90, 92, 96]),
]
)

total = scores.update(formulas=["Total = Math + Science + Art"])

Now we make the example a little more complicated by adding a column of average test scores.

average = scores.update(formulas=["Average = (Math + Science + Art) / 3 "])

For the next example, we have the students' test results in various subjects and the class averages. We want to see which students scored higher than the class average. We can use the select method to create a table containing the Name and Subject columns from the source table, plus a new column indicating if the score is above average.

from deephaven import new_table
from deephaven.column import string_col, int_col

class_average = new_table(
[
string_col(
"Name",
[
"James",
"James",
"James",
"Lauren",
"Lauren",
"Lauren",
"Zoey",
"Zoey",
"Zoey",
],
),
string_col(
"Subject",
[
"Math",
"Science",
"Art",
"Math",
"Science",
"Art",
"Math",
"Science",
"Art",
],
),
int_col("StudentAverage", [95, 100, 90, 72, 78, 92, 100, 98, 96]),
int_col("ClassAverage", [86, 90, 95, 86, 90, 95, 86, 90, 95]),
]
)

above_average = class_average.select(
formulas=["Name", "Subject", "AboveAverage = StudentAverage > ClassAverage"]
)

Column types

Deephaven supports the following column types:

Data TypeMethod
booleanbool_col
bytebyte_col
charchar_col
java.time.Instantdatetime_col
doubledouble_col
floatfloat_col
intint_col
java.lang.Objectjobj_col
longlong_col
Python Objectpyobj_col
shortshort_col
Stringstring_col

As demonstrated above, Deephaven can handle types outside of this list by using InputColumn.