Create tables in Deephaven

Deephaven excels as an interface for ingesting data, parsing it as a table, and manipulating the data. However, Deephaven also includes a range of versatile methods for creating tables from scratch. This guide shows how to create tables in Deephaven with new_table, empty_table, ring_table, time_table, and input_table, and how to replay historical data with TableReplayer.

new_table

The most direct way to create a table in Deephaven is with the new_table method. This method initializes a table, and columns are placed into the table through one or more column methods such as int_col. Each column contains one data type. For example, int_col creates a column of Java primitive int values.

The following query creates a new in-memory table with a string column and an int column.

from deephaven import new_table

from deephaven.column import string_col, int_col

result = new_table(
    [
        string_col(
            "Name_Of_String_Col", ["Data String 1", "Data String 2", "Data String 3"]
        ),
        int_col("Name_Of_Int_Col", [4, 5, 6]),
    ]
)

This produces a table with a String column, an integer column, and three rows.

For a more in-depth look at creating tables with new_table, see the new_table guide.

empty_table

Unlike a table created with new_table, an empty table contains no columns by default. The empty_table function takes a single argument, an int giving the number of rows, and returns a table with that many rows and no columns:

from deephaven import empty_table

table = empty_table(10)

Calling empty_table on its own generates a table with no data, but it can easily be populated with columns and data using update or another selection method. This can be done in the same line that creates the table, or at any time afterward.

In the following example, we create a table with 10 rows and a single column X with values 0 through 9 by using the special variable i to represent the row index. Then, the table is updated again to add a column Y with values equal to X squared:

from deephaven import empty_table

table = empty_table(10).update("X = i")

table = table.update("Y = X * X")
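
For intuition about the values these two formulas produce, the following plain-Python sketch computes the same columns outside Deephaven. It assumes, as described above, that i is a zero-based row index; the lists x and y are only stand-ins for the table columns and are not part of the Deephaven API.

```python
# Plain-Python stand-in for the two update calls above; not Deephaven code.
num_rows = 10

x = [i for i in range(num_rows)]      # mirrors "X = i"
y = [value * value for value in x]    # mirrors "Y = X * X"

# Pair the columns up row by row, as they would appear in the table.
rows = list(zip(x, y))
```

Each entry of rows pairs an X value with its square, which is what the two-column table should contain.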

For a more in-depth look at creating tables with empty_table, see the empty_table guide.

ring_table

The ring_table method creates a ring table from a blink table or an append-only table. A ring table retains only the most recent rows of its parent, up to a fixed capacity.

In this example, we'll create a ring table with a three-row capacity from a simple append-only time table.

from deephaven import time_table, ring_table

source = time_table("PT00:00:01")
result = ring_table(parent=source, capacity=3)
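
The capacity behavior can be pictured with a bounded queue. The sketch below is a conceptual analogy in plain Python using collections.deque, not Deephaven code: it assumes rows arrive one at a time from the parent, and the integer tick values are stand-ins for timestamp rows.

```python
from collections import deque

# Conceptual stand-in for a ring table with capacity=3; not Deephaven code.
capacity = 3
ring = deque(maxlen=capacity)

snapshots = []
for tick in range(5):           # five rows arrive from the append-only parent
    ring.append(tick)           # once full, the oldest row is dropped automatically
    snapshots.append(list(ring))

# After five appends, only the three most recent rows remain.
latest = list(ring)
```

The parent keeps growing, while the ring never holds more than three rows at a time.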

For a more in-depth look at creating tables with ring_table, see the ring_table guide.

time_table

A time table is a ticking, in-memory table that adds new rows at a regular, user-defined interval. Its sole column is a timestamp column.

The time_table method creates a table that ticks at the input period. The period can be passed in as nanoseconds:

from deephaven import time_table

minute = 1_000_000_000 * 60
result = time_table(period=minute)

Or as a duration string:

from deephaven import time_table

result = time_table(period="PT2S")
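
When a nanosecond period is built by hand, a small helper can reduce unit mistakes. This is a minimal standard-library sketch; the helper name period_nanos is hypothetical and not part of Deephaven. The integer it returns is the kind of value passed as period in the nanosecond example above.

```python
from datetime import timedelta


def period_nanos(interval: timedelta) -> int:
    """Convert a timedelta to whole nanoseconds (hypothetical helper)."""
    # timedelta stores days, seconds, and microseconds as integers,
    # so integer arithmetic avoids floating-point rounding.
    total_micros = (
        (interval.days * 86_400 + interval.seconds) * 1_000_000
        + interval.microseconds
    )
    return total_micros * 1_000


minute = period_nanos(timedelta(minutes=1))        # same value as 1_000_000_000 * 60
two_seconds = period_nanos(timedelta(seconds=2))   # same interval as "PT2S"
```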

For an in-depth look at creating tables with time_table, see the time_table guide.

input_table

Input tables allow users to enter new data into tables in two ways: programmatically and manually through the UI.

In the first case, data is added to a table with add, an input table-specific method similar to merge. In the second case, data is added to a table through the UI by clicking on cells and typing in the contents, similar to a spreadsheet program like MS Excel.

Input tables can be:

  • Append-only - any entered data is added to the bottom of the table.
  • Keyed - contents can be modified or deleted; allows access to rows by key.

In this guide, we'll create some simple append-only input tables. For a full guide to input_table, see the input_table guide.
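
The difference between the two kinds of input table can be pictured with ordinary Python containers. This is a conceptual model only: the column names Sym and Qty are made up for illustration, and it assumes that adding a row whose key already exists in a keyed table replaces the existing row. None of this is Deephaven API.

```python
# Conceptual model only; the structures below are illustrative, not Deephaven's API.

# Append-only: every added row goes to the bottom, and earlier rows never change.
append_only = []
append_only.append({"Sym": "A", "Qty": 1})
append_only.append({"Sym": "A", "Qty": 5})   # a second row for "A", not an update

# Keyed (assumed semantics): at most one row per key.
keyed = {}
keyed["A"] = {"Qty": 1}
keyed["A"] = {"Qty": 5}                      # replaces the earlier row for "A"
keyed["B"] = {"Qty": 2}
del keyed["B"]                               # keyed rows can also be deleted
```

The append-only list ends with two rows for "A", while the keyed mapping ends with a single, updated row.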

Here, we'll create an input table from a pre-existing table:

from deephaven import empty_table, input_table

source = empty_table(10).update(["X = i"])

result = input_table(init_table=source)

Here, we will create an input table from column definitions. The definitions must be supplied as a dictionary that maps each column name to its data type.

from deephaven import input_table
from deephaven import dtypes as dht

my_col_defs = {"Integers": dht.int32, "Doubles": dht.double, "Strings": dht.string}

result = input_table(col_defs=my_col_defs)

The resulting table is initially empty, and ready to receive data.

For a more in-depth look at creating tables with input_table, see the input_table guide.

Replay historical data

Deephaven's TableReplayer allows you to replay historical data as if it were live data. This is useful for experimenting with live tables against known data, among other purposes. In this guide, we'll show you how to replay historical data as real-time data based on timestamps in a table.

Get a historical data table

To replay historical data, we need a table with a timestamp column. Let's grab one from Deephaven's examples repository. We'll use data from a 100 km bike ride in a file called metriccentury.csv.

from deephaven import read_csv

metric_century = read_csv(
    "https://media.githubusercontent.com/media/deephaven/examples/main/MetricCentury/csv/metriccentury.csv"
)

Replay the data

The data is in memory. We can replay it with the following steps:

  • Import TableReplayer.
  • Set a start and end time for data replay. These times correspond to those in the table itself.
  • Create the replayer using the start and end time.
  • Call add_table to prepare the replayed table.
    • This takes two inputs: the table and the name of its timestamp column.
  • Call start to start replaying data.

from deephaven.replay import TableReplayer
from deephaven.time import to_j_instant

start_time = to_j_instant("2019-08-25T15:34:55Z")
end_time = to_j_instant("2019-08-25T17:10:22Z")

replayer = TableReplayer(start_time, end_time)
replayed_table = replayer.add_table(metric_century, "Time")
replayer.start()

The data will now replay in real time.
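
A rough mental model for timestamp-based replay: a row becomes visible once as much wall-clock time has passed since the replay began as separates the row's timestamp from the start time. The plain-Python sketch below illustrates that assumption with made-up sample timestamps. It is not how TableReplayer is implemented, and the function name visible_rows is hypothetical.

```python
from datetime import datetime, timedelta, timezone

start_time = datetime(2019, 8, 25, 15, 34, 55, tzinfo=timezone.utc)

# Illustrative row timestamps (not taken from metriccentury.csv).
row_times = [start_time + timedelta(seconds=s) for s in (0, 1, 3, 10)]


def visible_rows(elapsed: timedelta) -> int:
    """Count rows whose timestamp is at or before start_time + elapsed."""
    cutoff = start_time + elapsed
    return sum(1 for t in row_times if t <= cutoff)


after_two_seconds = visible_rows(timedelta(seconds=2))
after_ten_seconds = visible_rows(timedelta(seconds=10))
```

Under this model, rows with timestamps outside the start and end window never appear, which is why those times should correspond to the timestamps in the table.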

For a more in-depth look at replaying historical data, see the guide on replaying data.