Ultimate Table Operations Cheat Sheet
Access Data
Tip
The import and named tables in these lines are used throughout this page, and should be run first.
Sourcing Tables
Merging Tables
Viewing metadata of table
Printing information
Filtering
Most queries benefit by starting with filters. Less data generally means better performance.
For SQL developers : In Deephaven, Joins are not a primary operation for filtering. Use where(), whereIn(), and whereNotIn().
Note
Backticks ` are used for strings and single quotes ' are used for timestamps and characters
Date & Time examples
Filtering first by partitioning values (which is most often the set of Dates) is both a best practice, and necessary for some downstream use cases.
String Examples
Number examples
Multiple Filters
WhereIn / WhereNotIn
Nulls and NaNs
Head and Tail
Sort
Select And Create New Columns
Option 1: Choose and add new columns -- Calculate and write to memory
Use select() and update() when it is expensive to calculate or accessed frequently.
Option 2: Choose and add new columns -- Reference a formula and calc on the fly
Use view() and updateView() when formula is quick or only a portion of the data is used at a time. Minimizes RAM used.
Option 3: Add new columns -- Reference a formula and calc on the fly
Use lazyUpdate() when there are a small-ish number of unique values; on-demand formula results are stored in cache and re-used.
Using the updateView
(or equivalently update, select, view) methods
Getting the row number
Doing Math
Handling Arrays
Calculate Percentile in a series
Manipulate Time And Calendars
Bin Data
Manipulate Strings
Use Ternaries; If-Thens
Create And Use A Custom Function
Manipulate Columns
Group And Aggregate
Simple Grouping
Un-Grouping
Aggregations
Join Data From Multiple Tables
For SQL developers: Joins in Deephaven are used to extend result sets by joining data from other tables, not as much for filtering.
Joins that get used a lot
Natural Join
Important
The right table of the join needs to have only one match based on the key(s).
Multiple keys
AJ (As-Of Join)
As-of joins are the time series joins vital to the capital markets' use cases. It is often wise to make sure the Right-table is sorted (based on the key). aj is designed to find "the exact match" of the key or "the record just before". For timestamp aj-keys, this means "that time or the record just before.
Less common joins
Other Useful Methods
Use columns as arrays and cells as variable
Read and write csv
It is very easy to import CSVs via the Code Studio UI and to export any Table to a CSV via the Table Widget (in Code Studios and Dashboard Tables).
CSVs imported from the client need to be done via Code Studio UI.
CSVs imported from a server-side directory should be done via the script below.
Write a Table to a Namespace
Though the below uses a namespace called ExampleNamespace, it is best practice for teams to establish a naming protocol for these directories.
Do Cum-Sum and Rolling Average
Another Example of Creating a Rolling Sum
Setting up an EMA
Use CumulativeUtil
Use Numpy And Pandas
Format Tables
Datetime Formatting
Number Formatting
Color Formatting
List of canned colors found here.
Plot Programmatically
Substantial documentation about plotting exists. The following example intends to show the basics, and everything about styling and labeling is omitted.
Note
Run the following first, to create some tables to use in the plot examples below.
Time series plots
Multiple series on the same axis
Bar chart
Plot-by-some key
Stacked Area
Area graph
Business time
Scatter
Histogram
Open-high-low-close
Error Bar Plot
One-click plotting
One-click plotting allows a script writer to designate a key that users can enter into a one-click or linker experience, thereby toggling the plot to filter on the fly.
Users will need to add an "Input Filter" or configure the Linker tool to their UI to manipulate the One-Click plots. See web controls
Another one-click example