aj
aj, as-of join, joins data from a pair of tables, a left table and a right table, based upon one or more match columns. The match columns establish key identifiers in the left table that will be used to find data in the right table. Any data types can be chosen as keys.
When using aj, the first N-1 match columns are exactly matched. The last match column is used to find the key values from the right table that are closest to the values in the left table without going over the left value. For example, if the right table contains a value 5 and the left table contains values 4 and 6, the right table's 5 will be matched to the left table's 6, and the left table's 4 will have no match.
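The "closest without going over" rule for the last match column can be sketched in plain Python. This is only an illustration of the rule described above, not Deephaven's implementation; the helper name as_of_match is hypothetical, and the sketch assumes the right-table keys are sorted in ascending order.

```python
from bisect import bisect_right

def as_of_match(right_keys, left_value):
    """Return the largest right key <= left_value, or None if there is none.

    Hypothetical helper; assumes right_keys is sorted ascending.
    """
    # bisect_right gives the count of right keys that are <= left_value.
    idx = bisect_right(right_keys, left_value)
    return right_keys[idx - 1] if idx > 0 else None

# The example from the text: the right table holds 5, the left table holds 4 and 6.
right_keys = [5]
matches = {left: as_of_match(right_keys, left) for left in [4, 6]}
print(matches)  # {4: None, 6: 5}
```

The left value 6 picks up the right value 5, while the left value 4 finds nothing at or below it and stays unmatched.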
The output table contains all of the rows and columns of the left table plus additional columns containing data from the right table. In the appended columns, each row takes its values from the right-table row whose keys most closely match the keys of the left-table row, as defined above. If there is no matching key in the right table, appended row values are NULL.
Syntax
left.aj(table: Table, on: List[str], joins: List[str]=[]) -> Table
Parameters
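Parameter | Type | Description |
---|---|---|
table | Table | The table data is added from (the right table). |
on | List[str] | Columns from the left and right tables used to join on. The first N-1 match columns are exactly matched; the last match column is matched to the closest right-table value that does not go over the left-table value. |
joins (optional) | List[str] | Columns from the right table to be added to the left table based on key. If omitted, all columns from the right table are added. |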
Returns
A new table containing all of the rows and columns of the left table, plus additional columns containing data from the right table. In the appended columns, each row takes its values from the right-table row whose keys most closely match the keys of the left-table row, as defined above. If there is no matching key in the right table, appended row values are NULL.
Examples
These examples look at stock quotes and trades. Quotes are the published prices and sizes people are willing to trade a security at, while trades are the prices and sizes of actual trades. aj is used to find the quote at the time of a trade.
The following example joins all quote columns onto the trade table.
from deephaven import new_table
from deephaven.column import string_col, int_col, double_col, datetime_col
from deephaven.time import to_datetime

# Left table: executed trades
trades = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "AAPL", "IBM", "IBM"]),
    datetime_col("Timestamp", [to_datetime("2021-04-05T09:10:00 NY"), to_datetime("2021-04-05T09:31:00 NY"), to_datetime("2021-04-05T16:00:00 NY"), to_datetime("2021-04-05T16:00:00 NY"), to_datetime("2021-04-05T16:30:00 NY")]),
    double_col("Price", [2.5, 3.7, 3.0, 100.50, 110]),
    int_col("Size", [52, 14, 73, 11, 6])
])

# Right table: published quotes
quotes = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "IBM", "IBM", "IBM"]),
    datetime_col("Timestamp", [to_datetime("2021-04-05T09:11:00 NY"), to_datetime("2021-04-05T09:30:00 NY"), to_datetime("2021-04-05T16:00:00 NY"), to_datetime("2021-04-05T16:30:00 NY"), to_datetime("2021-04-05T17:00:00 NY")]),
    double_col("Bid", [2.5, 3.4, 97, 102, 108]),
    int_col("BidSize", [10, 20, 5, 13, 23]),
    double_col("Ask", [2.5, 3.4, 105, 110, 111]),
    int_col("AskSize", [83, 33, 47, 15, 5]),
])

# Exact match on Ticker, as-of match on Timestamp; all quote columns are added
result = trades.aj(table=quotes, on=["Ticker", "Timestamp"])
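To see which quote each trade should pick up, the matching rule can be traced by hand in plain Python. The following is a minimal sketch under stated assumptions, not Deephaven's implementation: as_of_join is a hypothetical helper, tables are modeled as lists of dicts, only a few of the example's columns are kept, and timestamps are written as zero-padded "HH:MM" strings (all on 2021-04-05) so that they sort correctly as text.

```python
from bisect import bisect_right
from collections import defaultdict

def as_of_join(left_rows, right_rows, exact_key, asof_key, add_cols):
    """Hypothetical helper: exact match on exact_key, as-of match on asof_key."""
    # Group right rows by the exact-match key, each group sorted by the as-of key.
    groups = defaultdict(list)
    for row in sorted(right_rows, key=lambda r: r[asof_key]):
        groups[row[exact_key]].append(row)

    result = []
    for left in left_rows:
        candidates = groups.get(left[exact_key], [])
        keys = [r[asof_key] for r in candidates]
        # Index of the last right row whose as-of key is <= the left value.
        idx = bisect_right(keys, left[asof_key]) - 1
        match = candidates[idx] if idx >= 0 else None
        out = dict(left)  # every left row and column is kept
        for col in add_cols:
            out[col] = match[col] if match is not None else None  # None stands in for NULL
        result.append(out)
    return result

trades = [
    {"Ticker": "AAPL", "Timestamp": "09:10", "Price": 2.5},
    {"Ticker": "AAPL", "Timestamp": "09:31", "Price": 3.7},
    {"Ticker": "AAPL", "Timestamp": "16:00", "Price": 3.0},
    {"Ticker": "IBM", "Timestamp": "16:00", "Price": 100.50},
    {"Ticker": "IBM", "Timestamp": "16:30", "Price": 110.0},
]
quotes = [
    {"Ticker": "AAPL", "Timestamp": "09:11", "Bid": 2.5, "Ask": 2.5},
    {"Ticker": "AAPL", "Timestamp": "09:30", "Bid": 3.4, "Ask": 3.4},
    {"Ticker": "IBM", "Timestamp": "16:00", "Bid": 97.0, "Ask": 105.0},
    {"Ticker": "IBM", "Timestamp": "16:30", "Bid": 102.0, "Ask": 110.0},
    {"Ticker": "IBM", "Timestamp": "17:00", "Bid": 108.0, "Ask": 111.0},
]

result = as_of_join(trades, quotes, "Ticker", "Timestamp", ["Bid", "Ask"])
for row in result:
    print(row["Ticker"], row["Timestamp"], row["Bid"], row["Ask"])
```

Under this rule, the 09:10 AAPL trade precedes every AAPL quote and so gets no quote values, the 09:31 and 16:00 AAPL trades both pick up the 09:30 quote, and each IBM trade picks up the quote with the identical timestamp.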
The following example joins on columns with different names (TradeTime in the left table, QuoteTime in the right table) and adds only a subset of the right table's columns, renaming Ask to Offer.
from deephaven import new_table
from deephaven.column import string_col, int_col, double_col, datetime_col
from deephaven.time import to_datetime

# Left table: executed trades, timestamped in TradeTime
trades = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "AAPL", "IBM", "IBM"]),
    datetime_col("TradeTime", [to_datetime("2021-04-05T09:10:00 NY"), to_datetime("2021-04-05T09:31:00 NY"), to_datetime("2021-04-05T16:00:00 NY"), to_datetime("2021-04-05T16:00:00 NY"), to_datetime("2021-04-05T16:30:00 NY")]),
    double_col("Price", [2.5, 3.7, 3.0, 100.50, 110]),
    int_col("Size", [52, 14, 73, 11, 6])
])

# Right table: published quotes, timestamped in QuoteTime
quotes = new_table([
    string_col("Ticker", ["AAPL", "AAPL", "IBM", "IBM", "IBM"]),
    datetime_col("QuoteTime", [to_datetime("2021-04-05T09:11:00 NY"), to_datetime("2021-04-05T09:30:00 NY"), to_datetime("2021-04-05T16:00:00 NY"), to_datetime("2021-04-05T16:30:00 NY"), to_datetime("2021-04-05T17:00:00 NY")]),
    double_col("Bid", [2.5, 3.4, 97, 102, 108]),
    int_col("BidSize", [10, 20, 5, 13, 23]),
    double_col("Ask", [2.5, 3.4, 105, 110, 111]),
    int_col("AskSize", [83, 33, 47, 15, 5]),
])

# Exact match on Ticker, as-of match of TradeTime against QuoteTime;
# only Bid and Ask are added, with Ask renamed to Offer
result = trades.aj(table=quotes, on=["Ticker", "TradeTime = QuoteTime"], joins=["Bid", "Offer = Ask"])