Merge

Deephaven's merge method combines two or more tables into one aggregate table, which essentially stacks the tables one on top of the other:

t3 = merge(t1, t2)

  • The resulting table's rows will maintain the same order as the source tables. If the source tables tick, rows will be inserted within the merged table where they appear in the source rather than at the end.
  • The tables to be merged should have the same columns, and they should be of the same type.
  • Null inputs are skipped.
  • Python users should import the appropriate module: from deephaven.TableTools import merge

Examples

Let's start with the following source tables:

t1 = newTable(col("Source", "T1", "T1"), col("Val",1, 2))
t2 = newTable(col("Source", "T2", "T2"), col("Val", 24, 25))
t3 = newTable(col("Source", "T3", "T3"), col("Val", 13, 14))
from deephaven.TableTools import merge, mergeSorted, newTable, col

t1 = newTable(col("Source", "T1", "T1"), col("Val",1, 2))
t2 = newTable(col("Source", "T2", "T2"), col("Val", 24, 25))
t3 = newTable(col("Source", "T3", "T3"), col("Val", 13, 14))

Merge tables

The examples below merge the source tables in various ways. You can see that the tables may be individually listed or contained within an array, with the same results.

t4 = merge(t1,t2)

t5 = merge([t1,t2,t3])

ts=[t1, t2, t3]
t6 = merge(ts)

The example below uses mergeSorted to both merge the tables, and then sort the values rather than place them in stacked order.

t7 = mergeSorted("Val", t1, t2, t3)

Null values

When merging tables, null values are ignored.

In this case, both tables t6 and t7 will contain the same data as t1.

t8 = merge(null, t1)
t9 = merge([null, t1])
t8 = merge(None, t1)
t9 = merge([None, t1])

However, if there are only null values, the result table will return a null because it doesn't have enough information to produce a result table.

t10 = merge([null])
t10 = merge([None])

Reducing the amount of merge operations

The next two examples produce the same results; however, the second query is preferable as it requires only one merge, whereas the first one is doing a loop equivalent to t11 = merge(t5, merge(t4, merge(t3, merge(t2, merge(t1, null))))).

# Method 1: this works but is slightly inefficient

t11 = None
for i in range(5):
   t = newTable(col("Source", f"T{i}", f"T{i}"), col("Val",i, 10*i))
   t11 = merge(t11, t)
// Method 1: this works but is slightly inefficient
t11 = null
for(i=0; i<5; i++) {
   t = newTable(col("Source", "T${i}", "T${i}"), col("Val",i, 10*i))
   t11 = merge(t11, t)
}
# Method 2: a more efficient implementation
ts = []
for i in range(5):
   t = newTable(col("Source", f"T{i}", f"T{i}"), col("Val",i, 10*i))
   ts.append(t)
t12 = merge(ts)
// Method 2: a more efficient implementation
ts = []
for(i=0; i<5; i++) {
   t = newTable(col("Source", "T${i}", "T${i}"), col("Val",i, 10*i))
   ts.add(t)
}
t12 = merge(ts)

Refreshing data

The new rows are inserted in the order of the source table; in other words, rows from tt1 are inserted in the middle of table tt3, above the second stacked table, rather than at the end.

tt1 = db.timeTable("00:00:02").view("Table=`T1`", "I=i")
tt2 = db.timeTable("00:00:05").view("Table=`T2`", "I=i")

tt3 = merge(tt1,tt2)

Note

mergeSorted does not work with ticking tables at this time.