Discover, filter, sort, and decorate data
This section is designed to introduce you to the basics of the Deephaven Query Language. First we'll access data, then show you simple ways to manipulate that data within a query.
Access Data
Everything in Deephaven is a table or table update.
The syntax to query a static table of historical data (db.t
) is generally the same as to query an updating table of real-time data (db.i
).
Your First Query
Below is a query that opens a table of historical data and filters to the date and symbol you care about, and creates a new column based on that data.
stockTrades=db.t("LearnDeephaven", "StockTrades")\
.where("Date=`2017-08-25`", "Sym=`AAPL`")\
.updateView("Dollars = Size * Last")
To break this down in more detail, the Deephaven Query Language tells Deephaven to:
- open a historical table (
db.t
) - named "StockTrades" stored in the "LearnDeephaven" namespace,
- filter the data only to Apple trades on August 25, 2017,
- and create a Dollars column calculated from Size * Last.
Important
It is best practice to always filter the partitioning column (in this case "Date") when opening a table, and this filter should be first in order after the call to open the table as shown above.
Discover
Deephaven lets you discover namespaces and tables are available in two ways:
- Type
t = db.getCatalog()
and a table (“t”) will show up listing all of the stuff available to you. You can filter from there. - Use autocomplete while writing scripts.
Important
Persistent queries should always contain some sort of filter addressing any partitioned column of a source table:
db.t("Namespace", "TableName")
.where("inRange(Date, `2020-09-01`, `2020-09-08`)")
The GIF below shows autocomplete in the console to find the StockTrades table. It then also shows the user filtering the table to a new date.
Access, filter, sort and decorate data
You may want to filter, sort, or decorate your data:
t = db.t("LearnDeephaven" , "StockTrades")\
.where("Date=`2017-08-23`")\
.sort("Sym")\
.renameColumns("Symbol = Sym")\
.view("Timestamp", "Symbol", "Last", "Size")\
.head(10)
Producing...
The operations may be self-evident, but here is an explanation of the above:
-
where
- This is the dominant filtering mechanism. Our Filter page details advanced and conditional filter methods.
- Proper practices: Always start your queries with a filter if relevant or possible, particularly on “partitioning” columns. Use
meta = tableName.getMeta()
to find those...but usually "Date" is a great assumption.
-
sort
- The most basic sort arranges your data from A-Z.
- You can also
.sortDescending("FieldName")
. - Do multiple sorts →
.sort("ThisField", "ThenThatField")
. - .reverse just turns the head on its table -- in a cheap way (from a computer science POV).
-
renameColumns
- As obvious, simply changes the column name.
- Somewhat related →
.moveUpColumns("Desired1stColumn", "Desired2ndColumn")
-
view
- Show only a subset of the original columns - in an “on-demand / just-in-time” way.
view()
can include a calculation, like.view("C1", "C2", "C3 = C1 + C2")
.- The operations
.select("C1", "C2", "C3")
or.select("C1", "C2", "C3 = C1 + C2")
are similar, but preemptively retrieve the columns to memory. updateView()
is very similar toview()
... but it adds new columns to the table, rather than simply identifying the total list of columns you want to see.update()
is analogous toselect()
as presented in the bullet point above.- Very worth reading → For a complete how-to on choosing the right selection method, see Selection.
-
head
- Trim down your data. head returns the number of rows at the beginning of the table.
- Similarly .tail(10) to see the last 10 records.
- These work on ticking tables, which is an advantage of working in Deephaven.
- Also
.headPct(0.15)
and.tailPct(0.15)
to see the first or last 15% of rows. - And finally
.headBy(10, "SomeKey")
will yield the first 10 rows for each key in a table. Example: “First 10 trades for each Symbol” might be:firstTen = tradesTable.headBy(10, "Symbol")