
raj

raj (reverse-as-of join) joins data from a pair of tables, a left table and a right table, based on one or more match columns. The match columns establish key identifiers in the left table that are used to find data in the right table. Any data types can be chosen as keys.

When using raj, the first N-1 match columns are matched exactly. The last match column is used to find the key value in the right table that is closest to the left table's value without being less than it. For example, if the right table contains the value 5 and the left table contains the values 4 and 6, the right table's 5 is matched to the left table's 4. The left table's 6 has no match, because no right-table value is equal to or greater than it.
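
The rule for the last match column can be sketched in plain Python with the standard `bisect` module. This only illustrates the semantics on a sorted list of right-side keys; it is not how Deephaven implements the join, and `reverse_asof_match` is a hypothetical helper name.

```python
from bisect import bisect_left

def reverse_asof_match(left_value, sorted_right_keys):
    """Return the smallest right key >= left_value, or None if there is none."""
    i = bisect_left(sorted_right_keys, left_value)
    return sorted_right_keys[i] if i < len(sorted_right_keys) else None

# The example from the text: right table holds 5, left table holds 4 and 6.
right_keys = [5]
matches = {left: reverse_asof_match(left, right_keys) for left in (4, 6)}
print(matches)  # {4: 5, 6: None}
```

An equal key also counts as a match, so `reverse_asof_match(5, [5])` returns 5.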

The output table contains all of the rows and columns of the left table plus additional columns containing data from the right table. For columns appended to the left table, row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table, as defined above. If there is no matching key in the right table, appended row values are NULL.

Syntax

left.raj(table: Table, on: List[str], joins: List[str]=[]) -> Table

Parameters

Parameter | Type | Description

table | Table | The table from which data is added (the right table).

on | List[str] | Columns from the left and right tables to join on.

  • ["A = B"] will join when column A from the left table matches column B from the right table.
  • ["X"] will join on column X from both the left and right table. Equivalent to "X = X".
  • ["X, A = B"] will join when column X matches from both the left and right tables, and when column A from the left table matches column B from the right table.

The first N-1 match columns are matched exactly. The last match column is used to find the key value in the right table that is closest to the left table's value without being less than it.
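
As a rough illustration of how the match expressions above can be read, the following sketch splits them into (left column, right column) pairs and separates the exact-match columns from the final closest-match column. `parse_matches` is a hypothetical helper written for this page, not part of the Deephaven API, and it only handles the `"X"` and `"A = B"` forms listed above.

```python
def parse_matches(on):
    """Split match expressions into (left_column, right_column) pairs."""
    pairs = []
    for expr in on:
        # A single string may hold several comma-separated expressions.
        for part in expr.split(","):
            left, sep, right = part.partition("=")
            left = left.strip()
            # "X" alone means the same column name on both sides.
            right = right.strip() if sep else left
            pairs.append((left, right))
    return pairs

pairs = parse_matches(["X, A = B"])
exact, closest = pairs[:-1], pairs[-1]
print(exact)    # [('X', 'X')]
print(closest)  # ('A', 'B')
```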

joins (optional) | List[str] | Columns from the right table to add to the left table, selected by the matched key:

  • [] will add all columns from the right table to the left table (default).
  • ["X"] will add column X from the right table to the left table as column X.
  • ["Y = X"] will add column X from the right table to the left table and rename it to Y.
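
The selection and renaming rules above can be modeled on a single matched right-table row, represented here as a plain dict. This is a sketch of the described behavior only; `select_joined_columns` is a hypothetical name, and the sketch does not model how the real join treats match columns or name collisions when `joins` is empty.

```python
def select_joined_columns(right_row, joins):
    """Pick and rename values from a matched right-table row."""
    if not joins:
        # Empty list: take every column present in this row.
        return dict(right_row)
    out = {}
    for expr in joins:
        new_name, sep, source = expr.partition("=")
        new_name = new_name.strip()
        # "X" alone keeps the column name; "Y = X" renames X to Y.
        source = source.strip() if sep else new_name
        out[new_name] = right_row[source]
    return out

quote = {"Bid": 3.4, "BidSize": 20, "Ask": 3.4, "AskSize": 33}
print(select_joined_columns(quote, ["Bid", "Offer = Ask"]))
# {'Bid': 3.4, 'Offer': 3.4}
```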

Returns

A new table containing all of the rows and columns of the left table plus additional columns containing data from the right table. For columns appended to the left table, row values equal the row values from the right table where the keys from the left table most closely match the keys from the right table, as defined above. If there is no matching key in the right table, appended row values are NULL.

Examples

These examples look at stock quotes and trades. Quotes are the published prices and sizes at which people are willing to trade a security, while trades are the prices and sizes of actual trades. Here, raj finds, for each trade, the first quote at or after the trade's timestamp.

The following example joins all quote columns onto the trade table.

from deephaven import new_table
from deephaven.column import string_col, int_col, double_col, datetime_col
from deephaven.time import to_datetime

# Trades: prices and sizes of executed trades (the left table).
trades = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "AAPL", "IBM", "IBM"]),
    datetime_col("Timestamp", [
        to_datetime("2021-04-05T09:10:00 NY"),
        to_datetime("2021-04-05T09:31:00 NY"),
        to_datetime("2021-04-05T16:00:00 NY"),
        to_datetime("2021-04-05T16:00:00 NY"),
        to_datetime("2021-04-05T16:30:00 NY"),
    ]),
    double_col("Price", [2.5, 3.7, 3.0, 100.50, 110]),
    int_col("Size", [52, 14, 73, 11, 6]),
])

# Quotes: published bid and ask prices and sizes (the right table).
quotes = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "IBM", "IBM", "IBM"]),
    datetime_col("Timestamp", [
        to_datetime("2021-04-05T09:11:00 NY"),
        to_datetime("2021-04-05T09:30:00 NY"),
        to_datetime("2021-04-05T16:00:00 NY"),
        to_datetime("2021-04-05T16:30:00 NY"),
        to_datetime("2021-04-05T17:00:00 NY"),
    ]),
    double_col("Bid", [2.5, 3.4, 97, 102, 108]),
    int_col("BidSize", [10, 20, 5, 13, 23]),
    double_col("Ask", [2.5, 3.4, 105, 110, 111]),
    int_col("AskSize", [83, 33, 47, 15, 5]),
])

# Match Ticker exactly, then take the first quote at or after each trade's Timestamp.
result = trades.raj(table=quotes, on=["Ticker", "Timestamp"])
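
To see which quote each trade should pick up, the matching can be traced in plain Python without Deephaven. The sketch below is a model of the rule described on this page, not actual Deephaven output: it keeps only the Ticker, Timestamp, and Bid values from the tables above and abbreviates timestamps to zero-padded "HH:MM" strings, which is an assumption made so they compare correctly as text.

```python
from bisect import bisect_left
from collections import defaultdict

trades = [  # (Ticker, Timestamp)
    ("AAPL", "09:10"), ("AAPL", "09:31"), ("AAPL", "16:00"),
    ("IBM", "16:00"), ("IBM", "16:30"),
]
quotes = [  # (Ticker, Timestamp, Bid)
    ("AAPL", "09:11", 2.5), ("AAPL", "09:30", 3.4),
    ("IBM", "16:00", 97), ("IBM", "16:30", 102), ("IBM", "17:00", 108),
]

# Group quotes by the exact-match key, sorted by the closest-match key.
by_ticker = defaultdict(list)
for ticker, ts, bid in sorted(quotes, key=lambda q: (q[0], q[1])):
    by_ticker[ticker].append((ts, bid))

matched_bids = []
for ticker, ts in trades:
    rows = by_ticker.get(ticker, [])
    times = [t for t, _ in rows]
    i = bisect_left(times, ts)  # index of the first quote time >= trade time
    matched_bids.append(rows[i][1] if i < len(rows) else None)

print(matched_bids)  # [2.5, None, None, 97, 102]
```

The two later AAPL trades have no quote at or after their timestamps, so their appended values are NULL (None in this model). The IBM trades match quotes with identical timestamps.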

The following example illustrates joining on columns of different names as well as joining a subset of columns, some with renames.

from deephaven import new_table
from deephaven.column import string_col, int_col, double_col, datetime_col
from deephaven.time import to_datetime

# Trades: the time column is named TradeTime (the left table).
trades = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "AAPL", "IBM", "IBM"]),
    datetime_col("TradeTime", [
        to_datetime("2021-04-05T09:10:00 NY"),
        to_datetime("2021-04-05T09:31:00 NY"),
        to_datetime("2021-04-05T16:00:00 NY"),
        to_datetime("2021-04-05T16:00:00 NY"),
        to_datetime("2021-04-05T16:30:00 NY"),
    ]),
    double_col("Price", [2.5, 3.7, 3.0, 100.50, 110]),
    int_col("Size", [52, 14, 73, 11, 6]),
])

# Quotes: the time column is named QuoteTime (the right table).
quotes = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "IBM", "IBM", "IBM"]),
    datetime_col("QuoteTime", [
        to_datetime("2021-04-05T09:11:00 NY"),
        to_datetime("2021-04-05T09:30:00 NY"),
        to_datetime("2021-04-05T16:00:00 NY"),
        to_datetime("2021-04-05T16:30:00 NY"),
        to_datetime("2021-04-05T17:00:00 NY"),
    ]),
    double_col("Bid", [2.5, 3.4, 97, 102, 108]),
    int_col("BidSize", [10, 20, 5, 13, 23]),
    double_col("Ask", [2.5, 3.4, 105, 110, 111]),
    int_col("AskSize", [83, 33, 47, 15, 5]),
])

# Match TradeTime against QuoteTime; add only Bid, plus Ask renamed to Offer.
result = trades.raj(table=quotes, on=["Ticker", "TradeTime = QuoteTime"], joins=["Bid", "Offer = Ask"])