Use table snapshots
Deephaven tables are updated on a cycle: when a data source ticks, all tables that depend on that data source are updated. You may want to reduce the update frequency for a given table, either to improve query performance or to prevent excessive changes from propagating to downstream consumers. Table snapshots also provide a mechanism for creating a static table from a ticking table.
Ticking at a specified interval
Tables like market data may tick continually at a high frequency, yet parts of your query may operate more efficiently at a lower frequency. For example, if you are calculating desired positions for an order management system (OMS) consumer, you may want a slower update frequency to avoid rapidly changing orders. Likewise, if complex operations sit downstream of the market data table, throttling the frequency of updates reduces the CPU required to compute them.
The cost of the snapshot is that an in-memory copy of the source table is made at each interval. The snapshotted columns must therefore be read from disk or have their formulas computed, and the snapshot result requires enough memory to hold the entire table.
In this example, the "StockTrades" and "volume" tables tick on the query's default interval (generally once per second). The lastBy and sumBy operations are calculated on each tick.
However, the "snappedPrice" and "snappedVolume" tables only tick each minute. The downstream sortDescending and naturalJoin thus require fewer recalculations than if they were computed each second. This is especially important when computationally expensive models are embedded into a query.
# The allStockTrades table should be ticking roughly once per second.
# In the demo system, only the historical data is available.
# For a real ticking dataset, you would replace "db.t" with "db.i".
allStockTrades=db.t("LearnDeephaven", "StockTrades").where()
# The lastTrades and volume tables would also tick each second.
lastTrades=allStockTrades.lastBy("Sym").renameColumns("TradeTime=Timestamp")
volume=allStockTrades.view("Sym", "Size").sumBy("Sym").renameColumns("Volume=Size")
# This time table adds a row once each minute.
tt=timeTable("00:01:00")
# The snappedPrice and snappedVolume tables tick only when tt ticks,
# once per minute.
snappedPrice=tt.snapshot(lastTrades)
snappedVolume=tt.snapshot(volume)
# The lastByVolume only depends on snappedVolume and snappedPrice,
# so it need only be recomputed on a one minute interval.
lastByVolume=snappedVolume.dropColumns("Timestamp").sortDescending("Volume").naturalJoin(snappedPrice, "Sym")
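The effect of throttling can also be illustrated without a Deephaven session. The following is a minimal plain-Python sketch, not the Deephaven API: a dictionary stands in for the ticking lastBy table, the symbol "AAPL" and its prices are made up, and the source is assumed to tick once per second while the trigger fires once per minute. It counts how often the source changes against how often the downstream work would be recomputed.

```python
# Toy model of snapshot throttling in plain Python (not the Deephaven API).
# Assumptions: the source ticks once per second, the trigger once per
# snapshot_interval seconds, and "AAPL" with its prices is invented data.

def simulate(total_seconds, snapshot_interval):
    """Return (source_updates, downstream_recomputations, last_snapshot)."""
    last_price = {}   # stands in for the ticking lastBy table
    snapshot = {}     # stands in for the snapshotted table
    source_updates = 0
    downstream_recomputations = 0
    for second in range(1, total_seconds + 1):
        # The source table changes every second.
        last_price["AAPL"] = 100.0 + second
        source_updates += 1
        # The trigger fires only at the snapshot interval.
        if second % snapshot_interval == 0:
            # An in-memory copy of the source is made at each interval.
            snapshot = dict(last_price)
            # Downstream operations see a change only now.
            downstream_recomputations += 1
    return source_updates, downstream_recomputations, snapshot


updates, recomputes, snap = simulate(total_seconds=300, snapshot_interval=60)
print(updates, recomputes, snap)
```

Over five simulated minutes the source changes 300 times while the downstream work runs only 5 times. Each of those 5 runs pays for a full copy of the source, which is the trade-off described above.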
Creating a static snapshot
When the left-hand side of a snapshot operation is static, the resulting table is also static. This can be useful if you would like to freeze a table in time by making an in-memory copy of it. The TableTools emptyTable method creates a static table with zero columns and a specified number of rows. A single-row, zero-column table can be used to make an exact (but static) copy of a ticking table, as shown below. This example uses one of Deephaven's internal tables:
from deephaven import ttools
pqsl=db.i("DbInternal", "PersistentQueryStateLog").where("Date=currentDateNy()")
snappqsl=ttools.emptyTable(1).snapshot(pqsl, True)
print("PQSL Refreshing: " + str(pqsl.isLive()))
print("SnapPQSL Refreshing: " + str(snappqsl.isLive()))
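The idea of freezing a table in time can be sketched in plain Python, independent of Deephaven. In this illustrative model, which is an assumption and not how Deephaven stores tables, a table is a dictionary of column lists, the "Status" column and all values are hypothetical, and the snapshot is a deep copy taken at one moment.

```python
import copy

# Hypothetical stand-in for a ticking table: column name -> list of values.
live_table = {"SerialNumber": [1, 2], "Status": ["Running", "Running"]}

# Taking the snapshot: a full in-memory copy of the current state.
frozen = copy.deepcopy(live_table)

# The live table keeps changing after the snapshot was taken.
live_table["SerialNumber"].append(3)
live_table["Status"].append("Initializing")
live_table["Status"][0] = "Stopped"

# The frozen copy still reflects the moment of the snapshot.
print(len(live_table["SerialNumber"]), len(frozen["SerialNumber"]))
print(frozen["Status"])
```

The copy is unaffected by later changes to the source, at the price of holding a second full copy in memory.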
snapshotHistory
The snapshotHistory operation provides a history of the right-hand side table at each interval. For example, the following snippet records the current state of each persistent query every minute in the "pqslHistory" table.
pqsl=db.i("DbInternal", "PersistentQueryStateLog").where("Date=currentDateNy()").lastBy("SerialNumber")
tt=timeTable("00:01:00").renameColumns("SnapshotTime=Timestamp")
pqslHistory=tt.snapshotHistory(pqsl)
The history table grows in memory on each interval, because the previous versions must be kept alongside each new version. Take this memory utilization into account when choosing the snapshot interval and sizing the query heap.
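A rough way to reason about this growth is to add up the rows captured at each interval. The sketch below is plain Python with made-up numbers: the count of 500 persistent queries, the eight-hour run, and the 200 bytes per row are all assumptions for illustration. It assumes every interval appends a full copy of the right-hand side table; actual heap usage depends on column types and on how Deephaven stores the data.

```python
def history_rows(source_rows_per_snapshot):
    """Cumulative row count of a snapshot history after each interval."""
    totals = []
    running = 0
    for rows in source_rows_per_snapshot:
        # Each interval appends a copy of the source's current rows.
        running += rows
        totals.append(running)
    return totals


def estimate_bytes(total_rows, bytes_per_row):
    """Back-of-the-envelope size; bytes_per_row is a guess, not a measurement."""
    return total_rows * bytes_per_row


# Hypothetical workload: 500 rows per snapshot, one snapshot per minute,
# running for 8 hours (480 snapshots).
snapshots = [500] * (8 * 60)
totals = history_rows(snapshots)

print("rows after 1 hour: ", totals[59])
print("rows after 8 hours:", totals[-1])
print("approx. bytes:     ", estimate_bytes(totals[-1], 200))
```

Under these assumptions, halving the snapshot interval doubles the number of rows retained over the same period, so the interval and the heap size have to be chosen together.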